WARNING: libcurl doesn't support curl_multi_wait()
WARNING: cross zone / region transfer performance may be affected
If you have encountered this error at RadosGW startup, the problem is the version of libcurl that radosgw was built against.
To enable support for curl_multi_wait(), you will need to compile radosgw against libcurl >= 7.28.0:
http://curl.haxx.se/libcurl/c/curl_multi_wait.html
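To quickly check which libcurl your radosgw binary is actually linked against, you can inspect it on the host. This is only a sanity check; paths and package names vary by distribution, and radosgw is assumed to be in the PATH:

# libcurl version installed on the host (requires the libcurl development package)
curl-config --version

# libcurl library the radosgw binary is linked against
ldd $(which radosgw) | grep -i curl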
With each key in the bucket index weighing between 100 and 250 bytes, this makes a very big object for RADOS (several GB), especially when it is migrated from one OSD to another (this will block all writes); moreover, the OSD holding this object will use a lot of memory.
Since the Hammer release it is possible to shard the bucket index. However, you cannot shard the index of an existing bucket; you can only enable sharding for newly created buckets.
This is a very good thing for scalability.
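As a sketch, assuming a single-zone setup, sharding for newly created buckets can be enabled with the rgw override bucket index max shards option in ceph.conf on the RadosGW nodes. The section name and the value 8 below are only examples; size the shard count to the number of objects you expect per bucket:

# ceph.conf, in the RadosGW client section
[client.radosgw.gateway]
rgw override bucket index max shards = 8

RadosGW must be restarted for the option to take effect, and existing buckets keep their unsharded index.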
You have probably already been faced with migrating all objects from one pool to another, especially to change parameters that cannot be modified on an existing pool: for example, to migrate from a replicated pool to an EC pool, to change the EC profile, or to reduce the number of PGs…
There are different methods, depending on the contents of the pool (RBD, objects), size…
The simple way
The simplest and safest method is to copy all objects with the “rados cppool” command.
However, it requires that the pool be read-only during the copy.
For example, to migrate to an EC pool:
pool=testpool
ceph osd pool create $pool.new 4096 4096 erasure default
rados cppool $pool $pool.new
ceph osd pool rename $pool $pool.old
ceph osd pool rename $pool.new $pool
But it does not work in all cases. For example, with EC pools: “error copying pool testpool => newpool: (95) Operation not supported”.
Using Cache Tier
This must be used with caution; test it before using it on a production cluster. It worked for my needs, but I cannot say that it works in all cases.
I find this method interesting because it allows a transparent operation, reduces downtime, and avoids duplicating all the data. The principle is simple: use the cache tier mechanism, but in reverse, with the old pool acting as a cache tier in front of the new one.
At the beginning, we have 2 pools: the current “testpool”, and the new one, “newpool”.
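The tier configuration itself is not shown here; as a sketch using the pool names above, the standard cache tiering commands would be the following. --force-nonempty is required because testpool already contains data, and forward mode redirects client operations to the backing pool (on recent releases this mode may additionally ask for --yes-i-really-mean-it):

# make the old pool a cache tier of the new pool
ceph osd tier add newpool testpool --force-nonempty

# forward client operations to the backing (new) pool
ceph osd tier cache-mode testpool forward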
In ceph osd dump you should see something like this:
--> pool 58 'testpool' replicated size 3 .... tier_of 80
Now, all new objects will be created on the new pool.
Now we can force all remaining objects to be flushed to the new pool:
rados -p testpool cache-flush-evict-all
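To follow the progress of the flush, you can watch how many objects remain in the old pool, for example:

# objects still present in the old pool
rados -p testpool ls | wc -l

# per-pool object counts and sizes
rados df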
Switch all clients to the new pool
(You can also do this step earlier, for example just after setting up the cache tier.)
Until all the data has been flushed to the new pool, you need to set an overlay so that objects are also looked up in the old pool:
ceph osd tier set-overlay newpool testpool
In ceph osd dump you should see something like this:
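The exact output depends on the cluster, but the new pool should now reference the old one as its tier, along these lines (illustrative, pool ids reused from above):

--> pool 80 'newpool' .... tiers 58 read_tier 58 write_tier 58

Once cache-flush-evict-all reports that nothing is left in the old pool, the overlay and the tier can be removed with the standard cache tiering teardown commands (a sketch):

# remove the overlay and detach the old pool from the new one
ceph osd tier remove-overlay newpool
ceph osd tier remove newpool testpool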
It is not always easy to know how to organize your data in the Crushmap, especially when trying to distribute the data geographically while separating different types of disks, e.g. SATA, SAS and SSD.
Let’s see what we can imagine as a Crushmap hierarchy.
Take a simple example of a distribution across two datacenters.
(Model 1.1)
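As a purely illustrative sketch (the bucket and host names below are invented, not the author's model), such a two-datacenter hierarchy can be built with the crush CLI:

# create two datacenter buckets under the default root
ceph osd crush add-bucket datacenter-1 datacenter
ceph osd crush add-bucket datacenter-2 datacenter
ceph osd crush move datacenter-1 root=default
ceph osd crush move datacenter-2 root=default

# attach the hosts (and their OSDs) to their datacenter
ceph osd crush move host-01 datacenter=datacenter-1
ceph osd crush move host-02 datacenter=datacenter-2

A CRUSH rule can then use datacenter as the failure domain so that replicas end up in different datacenters.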