How to Integrate an OpenStack Cluster with an External Ceph Storage Cluster
Introduction
Images: OpenStack Glance manages images for VMs. Images are immutable. OpenStack treats images as binary blobs and downloads them accordingly.
Volumes: Volumes are block devices. OpenStack uses volumes to boot VMs, or to attach volumes to running VMs. OpenStack manages volumes using Cinder services.
Guest Disks: Guest disks are guest operating system disks. By default, when you boot a virtual machine, its disk appears as a file on the filesystem of the hypervisor (usually under /var/lib/nova/instances/<uuid>/). Prior to OpenStack Havana, the only way to boot a VM in Ceph was to use the boot-from-volume functionality of Cinder. However, now it is possible to boot every virtual machine inside Ceph directly without using Cinder, which is advantageous because it allows you to perform maintenance operations easily with the live-migration process. Additionally, if your hypervisor dies, it is also convenient to trigger nova evacuate and run the virtual machine elsewhere almost seamlessly.
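For reference, a minimal sketch of the maintenance operations mentioned above using the standard nova CLI; the instance and host names are placeholders:
# live-migrate an instance off a hypervisor that needs maintenance
nova live-migration test-instance ems-vm-compute2
# if a hypervisor dies, rebuild its instances on another host
nova evacuate test-instance ems-vm-compute2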
Nodes
openstack controller: 10.195.231.213
openstack compute1: 10.195.231.214
openstack compute2: 10.195.231.215
ceph1: 10.195.231.201 (mgr, mon, osd1)
ceph2: 10.195.231.202 (osd2)
ceph3: 10.195.231.203 (osd3)
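If DNS is not already set up, a minimal /etc/hosts sketch for these nodes could look like the following; it assumes the ems-vm-* FQDNs used later in this guide map to the controller and compute IPs above:
10.195.231.213 ems-vm-controller.es.equinix.com ems-vm-controller
10.195.231.214 ems-vm-compute1.es.equinix.com ems-vm-compute1
10.195.231.215 ems-vm-compute2.es.equinix.com ems-vm-compute2
10.195.231.201 ceph1
10.195.231.202 ceph2
10.195.231.203 ceph3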
Creating Ceph Pools on mgr node
# 1. Create osd pool on ceph mgr node
# By default, Ceph block devices use the rbd pool. You may use any available pool.
# We recommend creating a pool for Cinder and a pool for Glance. Ensure your Ceph cluster is running, then create the pools.
# 128 is the number of placement groups; it should be calculated based on the number of OSD disks (see the sizing sketch after this block).
ceph osd pool create volumes 128
ceph osd pool create backups 128
ceph osd pool create images 128
ceph osd pool create vms 128
# Newly created pools must be initialized prior to use. Use the rbd tool to initialize the pools:
# Init Pools
rbd pool init volumes
rbd pool init images
rbd pool init backups
rbd pool init vms
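As a rough sizing sketch, the usual rule of thumb is (number of OSDs x 100) / replica size, rounded up to a power of two and spread across the pools; with 3 OSDs and a replica size of 3 that is about 128 PGs in total, so smaller per-pool values (or the pg_autoscaler on recent Ceph releases) may be more appropriate. The values can be checked and adjusted later, for example:
# check the current pg_num and the pool replica size
ceph osd pool get volumes pg_num
ceph osd pool get volumes size
# adjust pg_num if needed (pgp_num follows automatically on recent releases)
ceph osd pool set volumes pg_num 64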
Configure OpenStack Nodes as Ceph Clients
# Install on all OpenStack nodes (controller + compute)
yum install python-rbd -y
yum install ceph-common -y
The nodes running glance-api, cinder-volume, nova-compute and cinder-backup act as Ceph clients. Each requires the ceph.conf file:
Copy the ceph.conf file from the Ceph mgr node (ceph1) to all OpenStack nodes:
# Copy /etc/ceph/ceph.conf from the mgr node (ceph1) to all OpenStack nodes
ssh ems-vm-controller.es.equinix.com sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf
ssh ems-vm-compute1.es.equinix.com sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf
ssh ems-vm-compute2.es.equinix.com sudo tee /etc/ceph/ceph.conf </etc/ceph/ceph.conf
Set Up Cephx Authentication
# Ceph1 node (mgr node)
# If you have cephx authentication enabled, create a new user for Nova/Cinder and Glance. Execute the following:
ceph auth get-or-create client.cinder mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rx pool=images'
ceph auth get-or-create client.cinder-backup mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=backups'
ceph auth get-or-create client.glance mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=images'
# Add the keyrings for client.cinder, client.glance, and client.cinder-backup to the appropriate nodes and change their ownership:
ceph auth get-or-create client.cinder | ssh 10.195.231.213 sudo tee /etc/ceph/ceph.client.cinder.keyring
ssh 10.195.231.213 sudo chown cinder:cinder /etc/ceph/ceph.client.cinder.keyring
ceph auth get-or-create client.cinder-backup | ssh 10.195.231.213 sudo tee /etc/ceph/ceph.client.cinder-backup.keyring
ssh 10.195.231.213 sudo chown cinder:cinder /etc/ceph/ceph.client.cinder-backup.keyring
ceph auth get-or-create client.glance | ssh 10.195.231.213 sudo tee /etc/ceph/ceph.client.glance.keyring
ssh 10.195.231.213 sudo chown glance:glance /etc/ceph/ceph.client.glance.keyring
ceph auth get-or-create client.cinder | ssh 10.195.231.214 sudo tee /etc/ceph/ceph.client.cinder.keyring
ceph auth get-or-create client.cinder | ssh 10.195.231.215 sudo tee /etc/ceph/ceph.client.cinder.keyring
# The compute nodes also need the raw client.cinder key so it can be stored in libvirt:
ceph auth get-key client.cinder | ssh 10.195.231.214 tee /etc/ceph/client.cinder.key
ceph auth get-key client.cinder | ssh 10.195.231.215 tee /etc/ceph/client.cinder.key
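Once the keyrings are in place, a quick sanity check from the controller node that the client users can reach the cluster:
# verify the glance and cinder users can talk to the Ceph cluster
ceph -s --id glance
ceph -s --id cinder
# list the images pool as the glance user (empty output is expected at this point)
rbd ls images --id glance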
Create Secret
# On all compute nodes; the same UUID value can be used across all compute nodes
# Then, on the compute nodes, add the secret key to libvirt and remove the temporary copy of the key:
uuidgen # generate a UUID; the rest of the OpenStack nodes can reuse the same value
bb3df3eb-7ac9-4964-b2dc-c254d9c71448
cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
<uuid>bb3df3eb-7ac9-4964-b2dc-c254d9c71448</uuid>
<usage type='ceph'>
<name>client.cinder secret</name>
</usage>
</secret>
EOF
sudo virsh secret-define --file secret.xml
sudo virsh secret-set-value --secret bb3df3eb-7ac9-4964-b2dc-c254d9c71448 --base64 $(cat /etc/ceph/client.cinder.key) && sudo rm /etc/ceph/client.cinder.key secret.xml
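A quick sanity check that libvirt stored the secret correctly:
# the UUID should appear in the list and the value should match the cinder key
sudo virsh secret-list
sudo virsh secret-get-value bb3df3eb-7ac9-4964-b2dc-c254d9c71448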
Configure OpenStack to Use Ceph
Glance
Edit /etc/glance/glance-api.conf and add under the [glance_store] section:
# Controller node
[root@ems-vm-controller ~]# vim /etc/glance/glance-api.conf
[glance_store]
stores = rbd
default_store = rbd
rbd_store_pool = images
rbd_store_user = glance
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_chunk_size = 8
# If you want to enable copy-on-write cloning of images, also add under the [DEFAULT] section:
show_image_direct_url = True
# Disable the Glance cache management to avoid images getting cached under /var/lib/glance/image-cache/, assuming your configuration file has flavor = keystone+cachemanagement:
[paste_deploy]
flavor = keystone
# Recommended image properties
# add the virtio-scsi controller and get better performance and support for discard operation
hw_scsi_model=virtio-scsi
# connect every Cinder block device to that controller
hw_disk_bus=scsi
# enable the QEMU guest agent
hw_qemu_guest_agent=yes
# send fs-freeze/thaw calls through the QEMU guest agent
os_require_quiesce=yes
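These hw_*/os_* settings are Glance image properties rather than glance-api.conf options; one way to apply them with the OpenStack CLI, assuming an image called bionic-raw, is:
# apply the recommended properties to an existing image (the image name is a placeholder)
openstack image set \
  --property hw_scsi_model=virtio-scsi \
  --property hw_disk_bus=scsi \
  --property hw_qemu_guest_agent=yes \
  --property os_require_quiesce=yes \
  bionic-raw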
Cinder
OpenStack requires a driver to interact with Ceph block devices. You must also specify the pool name for the block device. On your OpenStack node, edit /etc/cinder/cinder.conf by adding:
# Controller node
# rbd_secret_uuid is the UUID we created earlier. Note: do not put inline comments next to the parameters below, otherwise the configuration will break.
# remove the [lvm] section
# vim /etc/cinder/cinder.conf
[DEFAULT]
...
enabled_backends = ceph
glance_api_version = 2
...
[ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph
rbd_pool = volumes
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_store_chunk_size = 4
rados_connect_timeout = -1
rbd_user = cinder
rbd_secret_uuid = bb3df3eb-7ac9-4964-b2dc-c254d9c71448
backup_driver = cinder.backup.drivers.ceph
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_user = cinder-backup
backup_ceph_chunk_size = 134217728
backup_ceph_pool = backups
backup_ceph_stripe_unit = 0
backup_ceph_stripe_count = 0
restore_discard_excess_bytes = true
OpenStack uses iscsi as the default volume type; after changing to Ceph, we need to change default_volume_type.
[root@ems-vm-controller ~(keystone_admin)]# openstack volume type create --public --property volume_backend_name="ceph" ceph
# "ceph" here is just a name; it needs to match the volume_backend_name above
[root@ems-vm-controller ~]# vim /etc/cinder/cinder.conf
default_volume_type=ceph
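To double-check the new volume type and its backend property:
openstack volume type list
openstack volume type show ceph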
Nova
In order to attach Cinder devices (either normal block or by issuing a boot from volume), you must tell Nova (and libvirt) which user and UUID to refer to when attaching the device. libvirt will refer to this user when connecting and authenticating with the Ceph cluster.
# With just this configuration, VM ephemeral disks still live on each local compute node.
# compute node
[libvirt]
...
rbd_user = cinder
rbd_secret_uuid = bb3df3eb-7ac9-4964-b2dc-c254d9c71448
Full configuration on [libvirt] section:
# Compute node
[libvirt]
virt_type = qemu
images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = bb3df3eb-7ac9-4964-b2dc-c254d9c71448
disk_cachemodes="network=writeback"
inject_password = false
inject_key = false
inject_partition = -2
live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST"
On all compute nodes, add a [client] section to the /etc/ceph/ceph.conf file.
[root@ems-vm-compute1 ~]# vim /etc/ceph/ceph.conf
[client]
rbd cache = true
rbd cache writethrough until flush = true
rbd concurrent management ops = 20
admin socket = /var/run/ceph/guests/$cluster-$type.$id.$pid.$cctid.asok
log file = /var/log/ceph/qemu-guest-$pid.log
Configure the permissions of these paths:
mkdir -p /var/run/ceph/guests/ /var/log/ceph/
chown qemu:libvirt /var/run/ceph/guests /var/log/ceph/
Restart services
systemctl restart openstack-cinder-volume openstack-cinder-api openstack-cinder-scheduler openstack-cinder-backup openstack-glance-api
systemctl restart openstack-nova-compute
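Once everything is restarted, an end-to-end check (a sketch; the volume name is a placeholder) is to create a test volume and confirm the corresponding RBD images show up in the Ceph pools:
# on the controller node
openstack volume create --size 1 test-ceph-volume
openstack volume list
# on any node with the cinder keyring: the volume should appear as volume-<id>
rbd ls volumes --id cinder
# after booting an instance, its ephemeral disk should appear in the vms pool as <instance-id>_disk
rbd ls vms --id cinder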
Note
Using QCOW2 for hosting a virtual machine disk is NOT recommended. If you want to boot virtual machines in Ceph (ephemeral backend or boot from volume), please use the raw image format within Glance.
Here is the qemu-img command to convert the image format:
qemu-img convert -f {source-format} -O {output-format} {source-filename} {output-filename}
For example:
# convert from qcow2 to raw format
[root@ems-sv4-centos7 Downloads]# qemu-img convert -f qcow2 -O raw bionic-server-cloudimg-amd64.img bionic-server-cloudimg-amd64.raw
# check the resulting file format
[root@ems-sv4-centos7 Downloads]# qemu-img info bionic-server-cloudimg-amd64.raw
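After converting, the raw image can be uploaded to Glance so that copy-on-write cloning works; the image name bionic-raw is a placeholder:
# upload the converted raw image
openstack image create --disk-format raw --container-format bare \
  --file bionic-server-cloudimg-amd64.raw --public bionic-raw
# on the controller node, the image should now be stored as an RBD image in the images pool
rbd ls images --id glance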