Some notes about Ceph
Laurent Barbe @SIB

Test Ceph NVMe-oF

A quick test of NVMe over Fabrics (NVMe/TCP) and VMware...

NVMe over Fabrics Ceph VMware

Create an RBD volume

# Create a pool "nvmeof_pool01" and an RBD image inside:
ceph osd pool create nvmeof_pool01
rbd pool init nvmeof_pool01
rbd -p nvmeof_pool01 create nvme_image --size 50G

# Deploy nvmeof instance (version 1.0.0)
ceph config set …
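Not covered by the truncated snippet above, but once a gateway subsystem is exposed, a Linux initiator can typically reach the volume with standard nvme-cli commands. The gateway IP and subsystem NQN below are placeholders, not values from this post:

```shell
# Hypothetical gateway address; replace with your nvmeof gateway IP.
GW_IP=192.168.0.10

# Discover NVMe/TCP subsystems exposed on the default port 4420:
nvme discover -t tcp -a "$GW_IP" -s 4420

# Connect to a subsystem by NQN (example NQN, replace with yours):
nvme connect -t tcp -a "$GW_IP" -s 4420 -n nqn.2016-06.io.spdk:cnode1

# The RBD-backed namespace then shows up as a regular NVMe block device:
nvme list
```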

RBD journal offloading

If you are using the RBD journaling feature (for example for RBD mirroring), in some cases it can be useful to offload the journal to a specific pool, for example if your RBD pool is on HDD drives and you also have SSDs or NVMe devices.

external rbd journal

To change rbd journaling pool, there are 2 …
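As a hedged illustration of the per-image approach (pool and image names are made up here), the journal pool can be changed by disabling and re-enabling the journaling feature:

```shell
# Assumed names for illustration: image "myimage" in pool "rbd",
# with a faster pool "ssd_pool" to hold the journal.
rbd feature disable rbd/myimage journaling
rbd feature enable rbd/myimage journaling --journal-pool ssd_pool

# Check where the journal objects now live:
rbd journal info --pool rbd --image myimage
```

Note that disabling the feature discards the existing journal, so if the image is mirrored, expect a resync afterwards.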

How much data movement when I add a replica?

Make a simple simulation!

Use your own crushmap:

$ ceph osd getcrushmap -o crushmap

got crush map from osdmap epoch 28673

Or create a sample crushmap:

$ crushtool --outfn crushmap --build --num_osds 36 host straw 12 root straw 0

2017-07-28 15:01:16.240974 7f4dda123760  1 
-4 …
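One way to run the simulation (a sketch; rule id 0 and the `--show-mappings` output format `CRUSH rule 0 x 0 [1,7]` are assumed here) is to map the same inputs with 2 and then 3 replicas and compare:

```shell
# Map a sample of inputs with 2 and then 3 replicas:
crushtool -i crushmap --test --rule 0 --num-rep 2 --show-mappings > rep2.txt
crushtool -i crushmap --test --rule 0 --num-rep 3 --show-mappings > rep3.txt

# Keep only the mapping column (last field, e.g. "[1,7]"):
awk '{print $NF}' rep2.txt > map2.txt
awk '{print $NF}' rep3.txt > map3.txt

# A mapping implies data movement if the 2-replica set is not a prefix
# of the 3-replica set (i.e. an existing copy changed OSD):
paste map2.txt map3.txt |
  awk '{ if (index($2, substr($1, 1, length($1) - 1)) != 1) moved++ }
       END { print moved + 0, "of", NR, "mappings moved existing data" }'
```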

Dealing with some osd timeouts

In some cases, some operations may take a little longer to be processed by the OSD. The operation may then fail, or even cause the OSD to commit suicide. There are many parameters for these timeouts. Some examples:

Thread suicide timed out

heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f1ee3ca7700' had suicide timed …
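The message above comes from the OSD op thread exceeding its suicide timeout. On recent releases these thresholds can be adjusted centrally (the values below are examples, not recommendations):

```shell
# osd_op_thread_timeout: warning threshold for a stuck op thread;
# osd_op_thread_suicide_timeout: the OSD aborts if an op thread is
# stuck longer than this.
ceph config set osd osd_op_thread_timeout 30
ceph config set osd osd_op_thread_suicide_timeout 300

# Verify the running value:
ceph config get osd osd_op_thread_suicide_timeout
```

On older releases the same options would be set in ceph.conf or injected with `ceph tell osd.* injectargs`.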

Erasure code on small clusters

Erasure coding is rather designed for clusters of sufficient size. However, if you want to use it with a small number of hosts, you can also adapt the crushmap for a distribution better matched to your needs.

Here is a first example for distributing data with 1 host OR 2 …
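For scale, a minimal erasure-coded pool on a small cluster might look like this (the profile and pool names and the k/m values are illustrative; with very few hosts you may have to drop the failure domain from host to osd, at the cost of host-level resilience):

```shell
# k=2 data chunks + m=1 coding chunk: tolerates one failure,
# and needs at least 3 failure domains.
ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
ceph osd pool create ecpool01 erasure ec21

# Inspect the profile actually attached to the pool:
ceph osd pool get ecpool01 erasure_code_profile
```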