/etc/ceph/ceph.conf
$ zypper in ntp yast2-ntp-client
$ systemctl enable ntpd.service
$ systemctl start ntpd.service
$ useradd -m cephadm
$ passwd cephadm
Defaults:cephadm !requiretty
cephadm ALL = (root) NOPASSWD:ALL
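A sketch of installing these sudoers entries in a dedicated drop-in file (the file name is an assumption):
$ cat > /etc/sudoers.d/cephadm << 'EOF'
Defaults:cephadm !requiretty
cephadm ALL = (root) NOPASSWD:ALL
EOF
$ chmod 0440 /etc/sudoers.d/cephadm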
$ ssh-copy-id cephadm@node1
$ ssh-copy-id cephadm@node2
$ ssh-copy-id cephadm@node3
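Optionally, an ~/.ssh/config entry for the cephadm user on the admin node lets ceph-deploy connect without passing --username each time (host names taken from above):
Host node1 node2 node3
    User cephadm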
$ ceph-deploy purge node1 node2 node3
$ ceph-deploy purgedata node1 node2 node3
$ ceph-deploy forgetkeys
$ ceph-deploy disk zap node:vdb
/etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
$ zypper in ceph ceph-deploy
$ ceph-deploy install node1 node2 node3
$ ceph-deploy new node1 node2 node3
$ ceph-deploy mon create-initial
$ ceph-deploy admin node1 node2 node3
$ ceph-deploy osd prepare node1:vdb
$ ceph-deploy osd activate node1:vdb1
$ ceph-deploy osd prepare --zap node:vdb
The default filesystem is xfs. The ceph-deploy osd prepare and ceph-deploy osd activate commands can be replaced by ceph-deploy osd create, as shown below.
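A sketch of the combined form, reusing the node/device naming from the commands above:
$ ceph-deploy osd create node1:vdb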
$ ceph osd pool create gigix-data 128
$ ceph osd pool create gigix-metadata 128
$ ceph fs new gigifs gigix-metadata gigix-data
$ ceph auth get-or-create client.gigix-client mon 'allow r' mds 'allow rw' osd 'allow rw pool=gigix-metadata,allow rw pool=gigix-data'
$ ceph-deploy mds create node1 node2
$ mount.ceph node1,node2,node3:/ /mnt -o name=gigix-client,secretfile=/etc/ceph/ceph.client.gigix-client.keyring
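If the mount is refused, note that the kernel client's secretfile option generally expects a file containing only the key, not a full keyring; a sketch (the secret file path is an assumption):
$ ceph auth get-key client.gigix-client > /etc/ceph/gigix-client.secret
$ mount.ceph node1,node2,node3:/ /mnt -o name=gigix-client,secretfile=/etc/ceph/gigix-client.secret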
Edit /etc/ceph/ceph.conf and add the following lines:
osd pool default size = 2
osd pool default min size = 1
Devices section of the CRUSH map:
#devices
device 0 osd.0 class ssd
device 1 osd.1 class hdd
device 2 osd.2
device 3 osd.3
ID | Name | Description |
---|---|---|
0 | OSD | An OSD daemon (osd.1, osd.2, etc.). |
1 | Host | A host name containing one or more OSDs. |
2 | Chassis | Chassis of which the rack is composed. |
3 | Rack | A computer rack. The default is unknownrack. |
4 | Row | A row in a series of racks. |
5 | Pdu | Power distribution unit. |
6 | Pod | |
7 | Room | A room containing racks and rows of hosts. |
8 | Data Center | A physical data center containing rooms. |
9 | Region | |
10 | Root |
You can define your own bucket hierarchy, for example:
host ceph-osd-server-1 {
        id -17
        alg straw
        hash 0
        item osd.0 weight 1.00
        item osd.1 weight 1.00
}
row rack-1-row-1 {
        id -16
        alg straw
        hash 0
        item ceph-osd-server-1 weight 2.00
}
rack rack-3 {
        id -15
        alg straw
        hash 0
        item rack-3-row-1 weight 2.00
        item rack-3-row-2 weight 2.00
        item rack-3-row-3 weight 2.00
        item rack-3-row-4 weight 2.00
        item rack-3-row-5 weight 2.00
}
rack rack-2 {
        id -14
        alg straw
        hash 0
        item rack-2-row-1 weight 2.00
        item rack-2-row-2 weight 2.00
        item rack-2-row-3 weight 2.00
        item rack-2-row-4 weight 2.00
        item rack-2-row-5 weight 2.00
}
rack rack-1 {
        id -13
        alg straw
        hash 0
        item rack-1-row-1 weight 2.00
        item rack-1-row-2 weight 2.00
        item rack-1-row-3 weight 2.00
        item rack-1-row-4 weight 2.00
        item rack-1-row-5 weight 2.00
}
room server-room-1 {
        id -12
        alg straw
        hash 0
        item rack-1 weight 10.00
        item rack-2 weight 10.00
        item rack-3 weight 10.00
}
datacenter dc-1 {
        id -11
        alg straw
        hash 0
        item server-room-1 weight 30.00
        item server-room-2 weight 30.00
}
pool data {
        id -10
        alg straw
        hash 0
        item dc-1 weight 60.00
        item dc-2 weight 60.00
}
Placement rule for a pool:
rule rulename {
        ruleset ruleset
        type type
        min_size min-size
        max_size max-size
        step step
}
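A sketch of a concrete rule based on this template, distributing replicas across rooms (the rule name, ruleset number and sizes are illustrative):
rule replicated-by-room {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type room
        step emit
}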
[mon]
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 64
osd pool default pgp num = 64
Install the ceph packages on the clients:
$ ceph-deploy install node1 node2
Configure a metadata server (for CephFS only):
$ ceph-deploy mds create node1
Copy the configuration to the nodes:
$ ceph-deploy config push host-name [host-name]...
Create a RADOS gateway:
$ ceph-deploy --overwrite-conf rgw create ceph-node1:rgw.gateway1
List the RADOS gateways:
$ ceph-deploy rgw list
Delete a RADOS gateway:
$ ceph-deploy --overwrite-conf rgw delete ceph-node1:rgw.gateway1
Create a keyring:
$ ceph-authtool --create-keyring /path/to/keyring
Default keyring paths searched by Ceph:
/etc/ceph/cluster.name.keyring
/etc/ceph/cluster.keyring
/etc/ceph/keyring
/etc/ceph/keyring.bin
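A sketch of creating a keyring with a newly generated key for a client, then importing the admin keyring into it (paths and the client name are illustrative):
$ ceph-authtool --create-keyring /etc/ceph/ceph.keyring --gen-key -n client.gigix
$ ceph-authtool /etc/ceph/ceph.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring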
Display Ceph's health:
$ ceph health detail
HEALTH_OK
Display the status (more detailed than health):
$ ceph status cluster b7b11ce7-76c7-41c1-bbf3-b4283590a187 health HEALTH_OK monmap e1: 3 mons at {osd1=192.168.122.11:6789/0,osd2=192.168.122.12:6789/0,osd3=192.168.122.13:6789/0} election epoch 56, quorum 0,1,2 osd1,osd2,osd3 fsmap e17: 1/1/1 up {0=osd1=up:active} osdmap e94: 3 osds: 3 up, 3 in flags sortbitwise,require_jewel_osds pgmap v713: 67 pgs, 3 pools, 74530 bytes data, 21 objects 110 MB used, 15216 MB / 15326 MB avail 67 active+clean
Use ceph --watch-warn or ceph --watch-error to display only warnings or only errors.
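To follow the cluster log continuously rather than filtering, ceph -w can be used alongside the options above:
$ ceph -w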
Display the quorum status:
$ ceph quorum_status {"election_epoch":56,"quorum":[0,1,2],"quorum_names":["osd1","osd2","osd3"],"quorum_leader_name":"osd1","monmap":{"epoch":1,"fsid":"b7b11ce7-76c7-41c1-bbf3-b4283590a187","modified":"2017-04-09 22:14:42.911169","created":"2017-04-09 22:14:42.911169","mons":[{"rank":0,"name":"osd1","addr":"192.168.122.11:6789\/0"},{"rank":1,"name":"osd2","addr":"192.168.122.12:6789\/0"},{"rank":2,"name":"osd3","addr":"192.168.122.13:6789\/0"}]}}
Display the MON status:
$ ceph mon_status {"name":"osd3","rank":2,"state":"peon","election_epoch":56,"quorum":[0,1,2],"outside_quorum":[],"extra_probe_peers":[],"sync_provider":[],"monmap":{"epoch":1,"fsid":"b7b11ce7-76c7-41c1-bbf3-b4283590a187","modified":"2017-04-09 22:14:42.911169","created":"2017-04-09 22:14:42.911169","mons":[{"rank":0,"name":"osd1","addr":"192.168.122.11:6789\/0"},{"rank":1,"name":"osd2","addr":"192.168.122.12:6789\/0"},{"rank":2,"name":"osd3","addr":"192.168.122.13:6789\/0"}]}}
Display a daemon's runtime configuration (via the admin socket):
$ ceph daemon osd.0 config show
$ ceph daemon mon.osd1 config show
$ ceph daemon mon.osd1 config get keyring
* Display the difference between the current configuration and the default configuration:
$ ceph daemon osd.0 config diff
$ ceph daemon osd.0 config get debug_osd
{ "debug_osd": "0\/5" }
$ ceph daemon osd.0 config set debug_osd "5\/5"
{ "success": "" }
$ ceph daemon osd.0 config get debug_osd
{ "debug_osd": "5\/5" }
Modify the configuration on the fly:
$ ceph tell osd.0 injectargs -- --debug_osd="5/5"
$ ceph daemon osd.0 config get debug_osd
$ ceph tell osd.* injectargs -- --osd_scrub_sleep=0.1 --osd_max_scrubs=1
Display the servers by type (mds, mon or osd):
$ ceph node ls all { "mon": { "osd1": [ 0 ], "osd2": [ 1 ], "osd3": [ 2 ] }, "osd": { "osd1": [ 0 ], "osd2": [ 1 ], "osd3": [ 2 ] }, "mds": { "osd1": [ 0 ] } } $ ceph node ls mds { "osd1": [ 0 ] } $ ceph node ls mon { "osd1": [ 0 ], "osd2": [ 1 ], "osd3": [ 2 ] } $ ceph node ls osd { "osd1": [ 0 ], "osd2": [ 1 ], "osd3": [ 2 ] }
Display pool usage (add the detail option as below for more information):
$ ceph df detail GLOBAL: SIZE AVAIL RAW USED %RAW USED OBJECTS 15326M 15217M 109M 0.72 21 POOLS: NAME ID CATEGORY QUOTA OBJECTS QUOTA BYTES USED %USED MAX AVAIL OBJECTS DIRTY READ WRITE RAW USED rbd 0 - N/A N/A 0 0 5071M 0 0 0 0 0 cephfs_data 2 - N/A N/A 1815 0 7607M 1 1 0 262 3630 cephfs_metadata 3 - N/A N/A 72715 0 7607M 20 20 55 116 142k
Display the MON state:
$ ceph mon stat e1: 3 mons at {osd1=192.168.122.11:6789/0,osd2=192.168.122.12:6789/0,osd3=192.168.122.13:6789/0}, election epoch 56, quorum 0,1,2 osd1,osd2,osd3
Display MON information:
$ ceph mon dump dumped monmap epoch 1 epoch 1 fsid b7b11ce7-76c7-41c1-bbf3-b4283590a187 last_changed 2017-04-09 22:14:42.911169 created 2017-04-09 22:14:42.911169 0: 192.168.122.11:6789/0 mon.osd1 1: 192.168.122.12:6789/0 mon.osd2 2: 192.168.122.13:6789/0 mon.osd3
Put an OSD into maintenance (its PGs are rebalanced to other OSDs):
$ ceph osd out 0
marked out osd.0.
Put the OSD back into production:
$ ceph osd in 0
marked in osd.0.
Stop I/O on the OSDs:
$ ceph osd pause
$ ceph osd set pause
Resume I/O on the OSDs:
$ ceph osd unpause
$ ceph osd unset pause
Set cluster flags:
$ ceph osd set noup     # prevent OSDs from getting marked up
$ ceph osd set nodown   # prevent OSDs from getting marked down
Clear cluster flags:
$ ceph osd unset noup
$ ceph osd unset nodown
Display disk usage per OSD:
$ ceph osd df tree ID WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME -1 0.01469 - 15326M 129M 15197M 0.84 1.00 0 root default -2 0.00490 - 5108M 47668k 5062M 0.91 1.08 0 host osd1 0 0.00490 1.00000 5108M 47668k 5062M 0.91 1.08 43 osd.0 -3 0.00490 - 5108M 36584k 5073M 0.70 0.83 0 host osd2 1 0.00490 1.00000 5108M 36584k 5073M 0.70 0.83 45 osd.1 -4 0.00490 - 5108M 47960k 5062M 0.92 1.09 0 host osd3 2 0.00490 1.00000 5108M 47960k 5062M 0.92 1.09 46 osd.2 TOTAL 15326M 129M 15197M 0.84 MIN/MAX VAR: 0.83/1.09 STDDEV: 0.10
Show the OSD state:
$ ceph osd stat
osdmap e95: 3 osds: 3 up, 3 in flags sortbitwise,require_jewel_osds
Display the OSDs with the fewest and the most PGs:
$ ceph osd utilization avg 44.6667 stddev 12.6051 (expected baseline 5.4569) min osd.0 with 29 pgs (0.649254 * mean) max osd.1 with 38 pgs (0.850746 * mean)
Display information about the OSDs:
$ ceph osd dump epoch 95 fsid b7b11ce7-76c7-41c1-bbf3-b4283590a187 created 2017-04-09 22:14:59.459412 modified 2017-04-21 21:06:27.344080 flags sortbitwise,require_jewel_osds pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0 pool 2 'cephfs_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1 pgp_num 1 last_change 60 flags hashpspool crash_replay_interval 45 stripe_width 0 pool 3 'cephfs_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2 pgp_num 2 last_change 58 flags hashpspool stripe_width 0 max_osd 3 osd.0 up in weight 1 up_from 89 up_thru 93 down_at 82 last_clean_interval [80,81) 192.168.122.11:6801/3148 192.168.122.11:6802/3148 192.168.122.11:6803/3148 192.168.122.11:6804/3148 exists,up b98a967f-069d-4998-8b23-2fd16d145718 osd.1 up in weight 1 up_from 86 up_thru 93 down_at 85 last_clean_interval [80,83) 192.168.122.12:6800/1565 192.168.122.12:6801/1565 192.168.122.12:6802/1565 192.168.122.12:6803/1565 exists,up 43337286-307d-41bb-9e74-e3d130d348e5 osd.2 up in weight 1 up_from 93 up_thru 93 down_at 92 last_clean_interval [78,83) 192.168.122.13:6800/1525 192.168.122.13:6801/1525 192.168.122.13:6802/1525 192.168.122.13:6803/1525 exists,up 4ef5c79f-eecb-403d-ad4c-21ef3522fcff
CRUSH map:
$ ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 0.01469 root default -2 0.00490 host osd1 0 0.00490 osd.0 up 1.00000 1.00000 -3 0.00490 host osd2 1 0.00490 osd.1 up 1.00000 1.00000 -4 0.00490 host osd3 2 0.00490 osd.2 up 1.00000 1.00000
Light scrub (scheduled daily); checks object integrity using object sizes and attributes:
$ ceph osd scrub 1
osd.1 instructed to scrub
Deep scrub (scheduled weekly); checks data integrity by computing checksums:
$ ceph osd deep-scrub 1
osd.1 instructed to deep-scrub
Locate an OSD in the CRUSH map:
$ ceph osd find 0 { "osd": 0, "ip": "192.168.122.11:6801\/2146", "crush_location": { "host": "osd1", "rack": "56-rack1", "room": "56", "root": "default" } }
List the pools:
$ ceph osd lspools
0 rbd,2 cephfs_data,3 cephfs_metadata,
Create a pool:
$ ceph osd pool create pool_name pg_num pgp_num pool_type crush_ruleset_name expected_num_objects
$ ceph osd pool create gigix-data 128
Warning: by default a replicated pool is created. If you want an erasure-coded pool (i.e. a RAID-like layout that uses less disk space), you must specify it explicitly (the pool type can be checked with $ ceph osd pool ls detail). Warning: a priori this is only usable with RGW!
Warning: when creating a new pool, if no profile is specified, Ceph uses the default profile with k=2 (number of data chunks) and m=1 (number of coding chunks). These values cannot be changed once the pool has been created!
$ ceph osd erasure-code-profile ls default $ ceph osd erasure-code-profile get default k=2 m=1 plugin=jerasure technique=reed_sol_van
To create a new profile, for example with k=10 and m=2 (meaning 12 chunks are written ⇒ 10+2):
$ ceph osd erasure-code-profile set gigix-erasure k=10 m=2 ruleset-failure-domain=room $ ceph osd erasure-code-profile get gigix-erasure jerasure-per-chunk-alignment=false k=10 m=2 plugin=jerasure ruleset-failure-domain=room ruleset-root=default technique=reed_sol_van w=8
Now create the erasure-coded pool:
$ ceph osd pool create gigix-erasure 128 128 erasure gigix-erasure
Display and change the number of replicas of a pool:
$ ceph osd pool get rbd size
size: 3
$ ceph osd pool set rbd size 2
set pool 0 size to 2
$ ceph osd pool set rbd min_size 1
set pool 0 min_size to 1
$ ceph osd pool set-quota pool-name max_objects obj-count max_bytes bytes
$ ceph osd pool set-quota cephfs_data max_bytes 10000000
set-quota max_bytes = 10000000 for pool cephfs_data
$ ceph osd pool get-quota cephfs_data
quotas for pool 'cephfs_data':
  max objects: N/A
  max bytes  : 9765kB
$ dd if=/dev/zero of=gigix-object-file bs=200M count=1
1+0 records in
1+0 records out
209715200 bytes (210 MB, 200 MiB) copied, 0,353353 s, 594 MB/s
$ rados -p cephfs_data put gigix-object gigix-object-file
2017-05-01 19:33:57.073792 7fd005812a40  0 client.435355.objecter  FULL, paused modify 0x55cdbb0fdcb0 tid 0
Create a pool snapshot:
$ ceph osd pool mksnap rbd rbd-snapshot1
created pool rbd snap rbd-snapshot1
Delete a pool snapshot:
$ ceph osd pool rmsnap rbd rbd-snapshot1
removed pool rbd snap rbd-snapshot1
Delete a pool:
$ ceph osd pool delete pool-name pool-name --yes-i-really-really-mean-it
Rename a pool:
$ ceph osd pool rename current-pool-name new-pool-name
Display pool statistics:
$ ceph osd pool stats
pool rbd id 0
  nothing is going on
pool cephfs_data id 2
  nothing is going on
pool cephfs_metadata id 3
  nothing is going on
Add or move an OSD in the CRUSH map:
$ ceph osd crush set id_or_name weight root=pool-name bucket-type=bucket-name ...
$ ceph osd crush set osd.0 1.0 root=data datacenter=dc1 room=room1 row=foo rack=bar host=foo-bar-1
Change the CRUSH weight of an OSD:
$ ceph osd crush reweight osd.0 2.0
Remove an OSD from the CRUSH map:
$ ceph osd crush remove osd.0
Move a bucket within the CRUSH hierarchy:
$ ceph osd crush move bucket-name bucket-type=bucket-name ...
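A sketch with bucket names from the examples on this page (the target rack and root are illustrative):
$ ceph osd crush move osd1 rack=rack-1 root=default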
Display the CRUSH rules:
$ ceph osd crush rule dump
$ ceph osd crush rule ls
* Create a new placement rule and assign it to a pool:
$ ceph osd crush rule dump room-crush
{
    "rule_id": 1,
    "rule_name": "room-crush",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        { "op": "take", "item": -1, "item_name": "default" },
        { "op": "chooseleaf_firstn", "num": 0, "type": "room" },
        { "op": "emit" }
    ]
}
$ ceph osd pool set cephfs_data crush_ruleset 1
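The rule dumped above can be created beforehand with create-simple and then assigned to the pool (the root bucket and failure domain match this example):
$ ceph osd crush rule create-simple room-crush default room firstn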
Save the CRUSH map in binary format:
$ ceph osd getcrushmap -o crush-compiled.map
Display the PGs:
$ ceph pg ls
$ ceph pg dump | egrep -v '^(0\.|1\.|2\.|3\.)' | egrep -v '(^pool\ (0|1|2|3))' | column -t dumped all in format plain version 1415 stamp 2017-04-29 16:53:49.853359 last_osdmap_epoch 157 last_pg_scan 157 full_ratio 0.95 nearfull_ratio 0.85 pg_stat objects mip degr misp unf bytes log disklog state state_stamp v reported up up_primary acting acting_primary last_scrub scrub_stamp last_deep_scrub deep_scrub_stamp sum 42 0 0 0 0 25644782 976 976 osdstat kbused kbavail kb hb in hb out 2 58528 5173068 5231596 [0,1] [] 1 36868 5194728 5231596 [0,2] [] 0 58464 5173132 5231596 [1,2] [] sum 153860 15540928 15694788
Create an object:
$ dd if=/dev/zero of=gigix-object-file bs=10M count=1
$ rados -p cephfs_data put gigix-object gigix-object-file
$ rados -p cephfs_data stat gigix-object
cephfs_data/gigix-object mtime 2017-04-29 16:50:40.000000, size 10485760
Display the placement of a pool's objects:
$ ceph osd map cephfs_data 3.1 osdmap e158 pool 'cephfs_data' (2) object '3.1' -> pg 2.3f1ee5f6 (2.0) -> up ([0,2], p0) acting ([0,2], p0) $ ceph pg map 3.1 osdmap e158 pg 3.1 (3.1) -> up [1,0] acting [1,0] $ ceph osd map cephfs_data gigix-object osdmap e157 pool 'cephfs_data' (2) object 'gigix-object' -> pg 2.6482e6fe (2.0) -> up ([0,2], p0) acting ([0,2], p0)
Attempt to repair a PG reported as inconsistent:
$ ceph pg repair 3.1
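To find candidates for repair, rados can list scrub inconsistencies (pool and PG id reuse the examples on this page):
$ rados list-inconsistent-pg cephfs_data
$ rados list-inconsistent-obj 3.1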
Display the PGs:
$ ceph pg dump
Display all PGs, or only the PGs in specific states:
$ ceph pg ls
$ ceph pg ls undersized degraded
Display the mapping of PG 3.1:
$ ceph pg map 3.1
osdmap e490 pg 3.1 (3.1) -> up [3,0] acting [3,0]
The PG is located on OSD 3 (primary) and on OSD 0 (replica).
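For the full detail of a PG (acting set, peering state, scrub timestamps), a query sketch:
$ ceph pg 3.1 query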
Display the primary PGs located on OSD 0:
$ ceph pg ls-by-primary 0
Display the primary or secondary PGs located on OSD 0:
$ ceph pg ls-by-osd 0
Display the PGs of a given pool:
$ ceph pg ls-by-pool cephfs_data
Run a scrub on a PG:
$ ceph pg scrub 3.1
instructing pg 3.1 on osd.3 to scrub
Create a new filesystem:
$ ceph fs new gigifs gigix-metadata gigix-data
List the filesystems:
$ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
Display information about a filesystem:
$ ceph fs get cephfs Filesystem 'cephfs' (2) fs_name cephfs epoch 22 flags 0 created 2017-04-11 22:36:59.346001 modified 2017-04-11 22:36:59.346001 tableserver 0 root 0 session_timeout 60 session_autoclose 300 max_file_size 1099511627776 last_failure 0 last_failure_osd_epoch 102 compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2} max_mds 1 in 0 up {0=194098} failed damaged stopped data_pools 2 metadata_pool 3 inline_data disabled 194098: 192.168.122.11:6800/1355 'osd1' mds.0.19 up:active seq 5
Same as ceph fs ls, but dumps the details for all filesystems:
$ ceph fs dump
Display the permissions and keys:
$ ceph auth list installed auth entries: mds.osd1 key: AQALQO1YsZg7HRAAEz1SWK4F5PAQg8zD5ou3mw== caps: [mds] allow caps: [mon] allow profile mds caps: [osd] allow rwx osd.0 key: AQA6nepYBo9VLRAAXbj/Ujoj5LpjVpCcol+11A== caps: [mon] allow profile osd caps: [osd] allow * osd.1 key: AQAMqOpY7049KhAAtGWNoGZN77s06RXZiVevog== caps: [mon] allow profile osd caps: [osd] allow * osd.2 key: AQA/qOpY4NeWMhAAdrIjujFX5XM8HLd8umrdTQ== caps: [mon] allow profile osd caps: [osd] allow * client.admin key: AQBMlupYMBKGDBAAmDpq0QeAgN7jETvG8qN7Pw== caps: [mds] allow * caps: [mon] allow * caps: [osd] allow * client.bootstrap-mds key: AQBMlupYuPVVGBAA8NP4FK0BWgBmCim142o6vg== caps: [mon] allow profile bootstrap-mds client.bootstrap-osd key: AQBMlupYBOdfJRAArJOfVe39w5rSsuuqgHNDPA== caps: [mon] allow profile bootstrap-osd client.bootstrap-rgw key: AQBMlupYv5etMhAAhIjWSYtRdMiGI5VoX3NVog== caps: [mon] allow profile bootstrap-rgw client.gigix key: AQA8Ge1YhRdzJhAAVvowynbYa7Ge5NBoroyOvg== caps: [mds] allow rw caps: [mon] allow r caps: [osd] allow rw pool=gigix,allow
Display a specific entry:
$ ceph auth get client.gigix exported keyring for client.gigix [client.gigix] key = AQA8Ge1YhRdzJhAAVvowynbYa7Ge5NBoroyOvg== caps mds = "allow rw" caps mon = "allow r" caps osd = "allow rw pool=gigix,allow"
Display the key of an entry:
$ ceph auth get-key client.gigix
AQA8Ge1YhRdzJhAAVvowynbYa7Ge5NBoroyOvg==
Add a user:
$ ceph auth add client.gigix mon 'allow r' mds 'allow rw' osd 'allow rw pool=gigix-metadata,allow rw pool=gigix-data'
Like ceph auth add, but returns the user and the key (even if the user already exists):
$ ceph auth get-or-create client.gigix-client mon 'allow r' mds 'allow rw' osd 'allow rw pool=gigix-metadata,allow rw pool=gigix-data'
Like ceph auth get-or-create, but returns only the key (even if the user already exists):
$ ceph auth get-or-create-key client.gigix-client mon 'allow r' mds 'allow rw' osd 'allow rw pool=gigix-metadata,allow rw pool=gigix-data' -o gigix.key
Display a user's key:
$ ceph auth print-key client.gigix
AQA8Ge1YhRdzJhAAVvowynbYa7Ge5NBoroyOvg==
Modify the capabilities of an account:
$ ceph auth caps client.gigix mon 'allow r' osd 'allow rw pool=test'
Remove capabilities:
$ ceph auth caps client.gigix mon ' ' osd ' '
Export all keys or a single one:
$ ceph auth export client.gigix -o gigix.key
Import users:
$ ceph auth import -i gigix.key
Delete a user:
$ ceph auth del client.gigix
Same as ceph auth del.
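The command this refers to is presumably ceph auth rm, available on recent releases (an assumption; older versions may only provide ceph auth del):
$ ceph auth rm client.gigix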
Measure the performance of an OSD:
$ ceph tell osd.0 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "bytes_per_sec": 234195125
}
Same as ceph df:
$ rados df pool name KB objects clones degraded unfound rd rd KB wr wr KB cephfs_data 2 1 0 0 0 0 0 262 524291 cephfs_metadata 79 20 0 0 0 101 286 123 145 rbd 0 0 0 0 0 0 0 0 0 total used 109920 21 total avail 15584868 total space 15694788
Create a block device:
$ rbd create new-libvirt-image --size 2048 -p cephfs_data
Display block device information:
$ rbd info new-libvirt-image -p cephfs_data rbd image 'new-libvirt-image': size 2048 MB in 512 objects order 22 (4096 kB objects) block_name_prefix: rbd_data.432b374b0dc51 format: 2 features: layering flags:
Display the images of a pool:
$ rbd -p cephfs_data ls -l NAME SIZE PARENT FMT PROT LOCK new-libvirt-image 2048M 2
Clone a snapshot:
$ rbd clone pool1/image1@snapshot1 pool1/image2
List the children of a snapshot:
$ rbd --pool pool1 children --image image1 --snap snapshot1
$ rbd children pool1/image1@snapshot1
To delete a cloned snapshot, you must first remove its reference to its parent (flatten):
$ rbd --pool pool1 flatten --image image1
$ rbd flatten pool1/image1
Create a snapshot:
$ rbd --pool rbd snap create --snap snapshot1 image1
$ rbd snap create rbd/image1@snapshot1
Display the snapshots:
$ rbd --pool rbd snap ls image1
$ rbd snap ls rbd/image1
Roll back to a snapshot:
$ rbd --pool pool1 snap rollback --snap snapshot1 image1
$ rbd snap rollback pool1/image1@snapshot1
Protect a snapshot against deletion:
$ rbd --pool pool1 snap protect --image image1 --snap snapshot1
$ rbd snap protect pool1/image1@snapshot1
Unprotect a snapshot so that it can be deleted:
$ rbd --pool pool1 snap unprotect --image image1 --snap snapshot1
$ rbd snap unprotect pool1/image1@snapshot1
Delete a snapshot:
$ rbd --pool pool1 snap rm --snap snapshot1 image1
$ rbd snap rm pool1/image1@snapshot1
Delete all snapshots of an image:
$ rbd --pool pool1 snap purge image1
$ rbd snap purge pool1/image1
Example with two iSCSI gateways.
$ rbd create cephfs_data/gigix-iscsi -s 3G
$ zypper in -t pattern ceph_iscsi
$ systemctl enable lrbd.service
{ "auth": [ { "target": "iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi", "authentication": "none" } ], "targets": [ { "target": "iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi", "hosts": [ { "host": "osd1", "portal": "portal-56" }, { "host": "osd3", "portal": "portal-66" } ] } ], "portals": [ { "name": "portal-56", "addresses": [ "192.168.122.11" ] }, { "name": "portal-66", "addresses": [ "192.168.122.13" ] } ], "pools": [ { "pool": "cephfs_data", "gateways": [ { "target": "iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi", "tpg": [ { "image": "gigix-iscsi" } ] } ] } ] }
$ lrbd rbd -p cephfs_data --name client.admin map gigix-iscsi /dev/rbd0 targetcli /backstores/rbd create name=cephfs_data-gigix-iscsi dev=/dev/rbd/cephfs_data/gigix-iscsi Created RBD storage object cephfs_data-gigix-iscsi using /dev/rbd/cephfs_data/gigix-iscsi. targetcli /iscsi create iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi Created target iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi. Selected TPG Tag 1. Created TPG 1. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi create 2 Created TPG 2. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg2 disable The TPG has been disabled. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg1 disable The TPG has been disabled. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg1/luns create /backstores/rbd/cephfs_data-gigix-iscsi Selected LUN 0. Created LUN 0. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg2/luns create /backstores/rbd/cephfs_data-gigix-iscsi Selected LUN 0. Created LUN 0. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg1/portals create 192.168.122.11 Using default IP port 3260 IP address 192.168.122.11 does not exist on this host. Created network portal 192.168.122.11:3260. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg2/portals create 192.168.122.13 Using default IP port 3260 Created network portal 192.168.122.13:3260. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg2 set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 Parameter authentication is now '0'. Parameter demo_mode_write_protect is now '0'. Parameter generate_node_acls is now '1'. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg1 set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 Parameter authentication is now '0'. Parameter demo_mode_write_protect is now '0'. Parameter generate_node_acls is now '1'. targetcli /iscsi/iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi/tpg2 enable The TPG has been enabled.
or
$ systemctl start lrbd
$ targetcli ls o- / ......................................................................................................................... [...] o- backstores .............................................................................................................. [...] | o- fileio ................................................................................................... [0 Storage Object] | o- iblock ................................................................................................... [0 Storage Object] | o- pscsi .................................................................................................... [0 Storage Object] | o- rbd ...................................................................................................... [1 Storage Object] | | o- cephfs_data-gigix-iscsi ...................................................... [/dev/rbd/cephfs_data/gigix-iscsi activated] | o- rd_mcp ................................................................................................... [0 Storage Object] o- ib_srpt ........................................................................................................... [0 Targets] o- iscsi .............................................................................................................. [1 Target] | o- iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi .................................................................. [2 TPGs] | o- tpg1 ........................................................................................................... [disabled] | | o- acls ........................................................................................................... [0 ACLs] | | o- luns ............................................................................................................ [1 LUN] | | | o- lun0 ................................................. [rbd/cephfs_data-gigix-iscsi (/dev/rbd/cephfs_data/gigix-iscsi)] | | o- portals ...................................................................................................... [1 Portal] | | o- 192.168.122.11:3260 ............................................................................... [OK, iser disabled] | o- tpg2 ............................................................................................................ [enabled] | o- acls ........................................................................................................... [0 ACLs] | o- luns ............................................................................................................ [1 LUN] | | o- lun0 ................................................. [rbd/cephfs_data-gigix-iscsi (/dev/rbd/cephfs_data/gigix-iscsi)] | o- portals ...................................................................................................... [1 Portal] | o- 192.168.122.13:3260 ............................................................................... [OK, iser disabled] o- loopback .......................................................................................................... [0 Targets] o- qla2xxx ........................................................................................................... [0 Targets] o- tcm_fc ............................................................................................................ [0 Targets] o- vhost ............................................................................................................. [0 Targets]
$ iscsiadm -m discovery -t sendtargets -p 192.168.122.11 192.168.122.11:3260,1 iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi 192.168.122.13:3260,2 iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi
$ iscsiadm -m node -p 192.168.122.11 --login Logging in to [iface: default, target: iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi, portal: 192.168.122.11,3260] (multiple) Login to [iface: default, target: iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi, portal: 192.168.122.11,3260] successful. $ iscsiadm -m node -p 192.168.122.13 --login Logging in to [iface: default, target: iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi, portal: 192.168.122.13,3260] (multiple) Login to [iface: default, target: iqn.2003-01.org.linux-iscsi.iscsi.x86:gigix-iscsi, portal: 192.168.122.13,3260] successful.
$ lsscsi -s [0:0:0:0] cd/dvd QEMU QEMU DVD-ROM 2.5+ /dev/sr0 - [2:0:0:0] disk SUSE RBD 4.0 /dev/sda 3.22GB [3:0:0:0] disk SUSE RBD 4.0 /dev/sdb 3.22GB
/etc/multipath.conf:
defaults {
        user_friendly_names yes
}
devices {
        device {
                vendor "(LIO-ORG|SUSE)"
                product "RBD"
                path_grouping_policy "multibus"
                path_checker "tur"
                features "0"
                hardware_handler "1 alua"
                prio "alua"
                failback "immediate"
                rr_weight "uniform"
                no_path_retry 12
                rr_min_io 100
        }
}
$ multipath -ll mpatha (3600140571dc15dc9fa13437ae8840470) dm-2 SUSE ,RBD size=3.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw `-+- policy='service-time 0' prio=50 status=active |- 2:0:0:0 sda 8:0 active ready running `- 3:0:0:0 sdb 8:16 active ready running
$ parted /dev/mapper/mpatha mklabel gpt mkpart primary xfs 0% 100%
$ parted /dev/mapper/mpatha print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/mpatha: 3221MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      4194kB  3217MB  3213MB               primary
$ mkfs.xfs /dev/mapper/mpatha1 meta-data=/dev/mapper/mpatha1 isize=512 agcount=9, agsize=97280 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=0, rmapbt=0, reflink=0 data = bsize=4096 blocks=784384, imaxpct=25 = sunit=1024 swidth=1024 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=8 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0
Save the decompiled CRUSH map:
$ ceph osd getcrushmap | crushtool -d - -o crush.map
Edit the generated CRUSH map file and recompile it:
$ crushtool -c crush.map -o crush-compiled.map
Import the new map:
$ ceph osd setcrushmap -i crush-compiled.map
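Before injecting it, the compiled map can be sanity-checked with crushtool's test mode (rule id and replica count are illustrative):
$ crushtool -i crush-compiled.map --test --show-statistics --rule 0 --num-rep 2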
Display which journal device each OSD uses:
$ ceph-deploy disk list $(ceph node ls osd | awk -F'"' '{print $2 }' | xargs echo) 2>&1 | egrep 'journal /dev/'
[osd1][DEBUG ] /dev/vdb1 ceph data, active, cluster ceph, osd.0, journal /dev/vdb2
[osd2][DEBUG ] /dev/vdb1 ceph data, active, cluster ceph, osd.1, journal /dev/vdb2
[osd3][DEBUG ] /dev/vdb1 ceph data, active, cluster ceph, osd.2, journal /dev/vdb2
[osd4][DEBUG ] /dev/vdb1 ceph data, active, cluster ceph, osd.3, journal /dev/vdb2
$ rbd create -s 3G cephfs_data/gigix $ rbd info cephfs_data/gigix rbd image 'gigix': size 3072 MB in 768 objects order 22 (4096 kB objects) block_name_prefix: rbd_data.2fb9b74b0dc51 format: 2 features: layering flags: $ rbd ls -p cephfs_data gigix rados -p cephfs_data ls rbd_data.2fb9b74b0dc51.00000000000000fe rbd_id.gigix rbd_data.2fb9b74b0dc51.00000000000001f8 rbd_data.2fb9b74b0dc51.000000000000013b rbd_data.2fb9b74b0dc51.00000000000001ff rbd_directory rbd_data.2fb9b74b0dc51.0000000000000276 rbd_data.2fb9b74b0dc51.00000000000001b9 rbd_data.2fb9b74b0dc51.000000000000003f rbd_data.2fb9b74b0dc51.0000000000000001 10000000005.00000000 rbd_data.2fb9b74b0dc51.000000000000007e rbd_data.2fb9b74b0dc51.0000000000000237 rbd_header.2fb9b74b0dc51 rbd_data.2fb9b74b0dc51.00000000000000fd rbd_data.2fb9b74b0dc51.00000000000002f4 rbd_data.2fb9b74b0dc51.000000000000017a rbd_data.2fb9b74b0dc51.0000000000000000 rbd_data.2fb9b74b0dc51.00000000000002b5 rbd_data.2fb9b74b0dc51.00000000000000bd rbd_data.2fb9b74b0dc51.00000000000000fc
The ID of this image is 2fb9b74b0dc51.
Add authentication to restrict access:
$ ceph auth get-or-create client.gigix mon 'allow r' osd 'allow rwx object_prefix rbd_data.2fb9b74b0dc51; allow rwx object_prefix rbd_header.2fb9b74b0dc51; allow rx object_prefix rbd_id.gigix' -o /etc/ceph/ceph.client.gigix.keyring
$ cat /etc/ceph/ceph.client.gigix.keyring
[client.myclient]
        key = AQB+EgZZBouuJBAATiBOKU+gYeNZgRB3qmT/Pg==
The client is then only allowed to map the image:
$ rbd -p cephfs_data --id gigix ls
$ rbd -p cephfs_data --id gigix create -s 5G gigix
rbd: create error: (1) Operation not permitted
2017-04-30 18:53:44.976605 7f040383ee80 -1 librbd: Could not tell if gigix already exists
rbd: list: (1) Operation not permitted
$ rbd -p cephfs_data --id gigix map gigix
/dev/rbd0
$ rbd -p cephfs_data --id gigix unmap gigix
To shut down the whole cluster:
1. Stop the clients
2. The cluster must be in the HEALTH_OK state
3. Set the noout, norecover, norebalance, nobackfill, nodown and pause flags
$ ceph osd set noout
$ ceph osd set norecover
$ ceph osd set norebalance
$ ceph osd set nobackfill
$ ceph osd set nodown
$ ceph osd set pause
4. Stop the OSDs one by one
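For step 4, a sketch of stopping a single OSD daemon with systemd (unit names assume a systemd-managed installation; repeat per OSD id):
$ systemctl stop ceph-osd@0.service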
To power the cluster back on:
1. Power on the admin servers
2. Power on the MON servers
3. Power on the OSD servers
4. Wait for all nodes and services to be up
5. Remove the noout, norecover, norebalance, nobackfill, nodown and pause flags
$ ceph osd unset noout
$ ceph osd unset norecover
$ ceph osd unset norebalance
$ ceph osd unset nobackfill
$ ceph osd unset nodown
$ ceph osd unset pause
6. Check that the cluster is in the HEALTH_OK state
See: https://ceph.com/geen-categorie/ceph-pool-migration/
pool=testpool
$ ceph osd pool create $pool.new 4096 4096 erasure default
$ rados cppool $pool $pool.new
$ ceph osd pool rename $pool $pool.old
$ ceph osd pool rename $pool.new $pool
Display, per pool, the PGs that are not in the 'active+clean' state:
$ for pool in $(ceph osd pool ls); do echo -e "\n\n\n========== $pool ==========" && ceph pg ls-by-pool $pool | egrep -v 'active\+clean'; done
Display, per OSD, the PGs that are not in the 'active+clean' state (note: this also shows OSDs that only hold an affected secondary PG):
$ for osd in $(ceph osd ls); do echo -e "\n\n\n========== osd.$osd ==========" && ceph pg ls-by-osd $osd | egrep -v 'active\+clean'; done
or
$ ceph health detail
CephFS directory quotas (set via extended attributes on a mounted directory):
$ setfattr -n ceph.quota.max_bytes -v 100000000 /some/dir  # 100 MB
$ setfattr -n ceph.quota.max_files -v 10000 /some/dir      # 10,000 files
Read the quotas back:
$ getfattr -n ceph.quota.max_bytes /some/dir
$ getfattr -n ceph.quota.max_files /some/dir
Remove the quotas (a value of 0 disables them):
$ setfattr -n ceph.quota.max_bytes -v 0 /some/dir
$ setfattr -n ceph.quota.max_files -v 0 /some/dir
Food for thought for a server with 24 drive bays:
HDD option (22/24 bays used):
  2 x OS
  4 x SSD journal
  16 x HDD data (2 TB) => 32 TB / server
  64 GB RAM?
SSD option (24/24 bays used):
  2 x OS
  4 x SSD journal + 18 x SSD data (1 TB), or 22 x SSD data if no separate journal => 18 to 22 TB / server
  18 to 22 GB RAM?
For the HDD solution:
RDMA protocol for:
Tuning:
[global]
...
ms_type=async+rdma
ms_async_rdma_device_name=mlx5_0