Ceph Storage: Operating It as a PaaS Service
Hyun Ha @ naver
In-House Platform: Security / Storage / DB
In-House PaaS Platform - PASTA
Goal: run containerized (stateless / stateful) services on PASTA
Mission: provide a Persistent Volume for (stateful) containers
[Diagram: containers storing their data on Ceph]
Why Ceph?
Jenkins Farm (RBD)
Elasticsearch Farm (RBD)
etc.
Use Case #1: Service Creation Flow
[Diagram components: Client, PASTA, Keystone, Cinder, Docker Swarm, Docker Registry, Host (Container, Docker plugin, /dev/rbd0), Ceph Volume]
1. Login
2. Authentication / authorization (Keystone)
3. Service creation request
4. Container creation (Docker Swarm)
5. Image download / container creation (Docker Registry)
6. Volume creation request (Docker plugin)
7. Authentication / authorization (Keystone / Cinder)
8. Volume creation (Ceph)
9. Attach (/dev/rbd0)
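Roughly the RBD-side work behind steps 6-9, using the stock rbd CLI; the pool/image names (volumes/vol01) come from later slides, while the size, filesystem, and mount point are assumptions for illustration (in production the kernel sysfs mapping path described later is used instead of rbd map):
$ rbd create volumes/vol01 --size 10240      # 10 GB volume (steps 6/8)
$ rbd map volumes/vol01                      # attach -> /dev/rbd0 (step 9)
$ mkfs.xfs /dev/rbd0                         # first use only
$ mkdir -p /mnt/vol01 && mount /dev/rbd0 /mnt/vol01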
Use Case #2: Ceph UI
Use Case #3: Ceph CLI
$ ceph-cli -h
NAME:
ceph-cli - use ceph volume for your PM/VM!
USAGE:
ceph-cli [global options] command [command options] [arguments...]
COMMANDS:
auth Certify with `TOKEN`
show Gets detailed information about the given `VOLUME ID` or `VOLUME NAME`
list Get volume list
create Create volume
delete Delete Volume by `VOLUME ID` or `VOLUME NAME`
attach Attach volume
detach Detach volume
extend Extend volume
reset Reset volume
GLOBAL OPTIONS:
--debug, -d Enable debug logging
--help, -h show help
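A hypothetical usage sketch based only on the command list above; the exact arguments of the in-house ceph-cli are not shown in the help, so the token, volume name, and size below are placeholders:
$ ceph-cli auth ${TOKEN}        # certify with a Keystone token
$ ceph-cli create vol01 10      # create a volume (assumed name/size arguments)
$ ceph-cli attach vol01         # map it on the current host
$ ceph-cli show vol01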
Ceph - Component
[Diagram: MONs, an MDS, and many OSDs]
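Standard commands to inspect each component on a running cluster (not specific to this deployment):
$ ceph mon stat     # monitor quorum
$ ceph osd tree     # OSDs grouped by host
$ ceph mds stat     # MDS state (only relevant for CephFS)
$ ceph -s           # overall cluster health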
Ceph – Pool
[Diagram: one Ceph cluster split into an HDD Pool (HDD OSD.2, OSD.3, OSD.4, OSD.5, ...) and a Hybrid Pool]
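A sketch of how a device-class-scoped pool can be created on Luminous or later; hdd_rule and hdd_pool are placeholder names, and a hybrid pool (e.g. SSD primary with HDD replicas) needs a hand-written CRUSH rule that is omitted here:
$ ceph osd crush rule create-replicated hdd_rule default host hdd   # rule limited to HDD OSDs
$ ceph osd pool create hdd_pool 128 128 replicated hdd_rule         # pool placed by that rule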
[Diagram: container volumes mapped on Host #1; after a host failure the same volume can end up mapped on more than one host (multi-map)]
Getting the system back to normal
Prevent multi-map by adding the failed host to the blacklist
• Add blacklist
$ ceph osd blacklist add ${client_ip} 10
blacklisting ${client_ip}:0/0 until 2018-05-02 (10 sec)
• Automation: read the image's watchers and call an action that adds the blacklist entry (see the sketch below)
Watchers:
watcher=${client_host_1_ip}:0/1015181303 client.259635408 cookie=18446462598732840991
watcher=${client_host_2_ip}:0/4152018459 client.201522571 cookie=18446462598732841309
• Monitoring / alarm
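A minimal sketch of that automation, assuming the pool/image name volumes/vol01 and that the first listed watcher is the stale client; rbd status prints the watcher list shown above:
# blacklist the stale watcher briefly so the volume can be re-mapped safely
image=volumes/vol01
stale_ip=$(rbd status ${image} | awk -F'[=:]' '/watcher=/ {print $2; exit}')
ceph osd blacklist add ${stale_ip} 10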
ISSUE#2 : Upgrade Ceph
Upgrade Policy:
• ceph/src/vstart.sh
$ MON=1 MDS=1 ../src/vstart.sh -d -n -x
• Ceph-ansible (https://github.com/ceph/ceph-ansible)
• Kolla (https://github.com/openstack/kolla)
(https://review.openstack.org/#/c/566810/)
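For the ceph-ansible path, a rolling upgrade can be driven with the rolling_update playbook shipped in that repository; the inventory file name below is a placeholder:
$ ansible-playbook -i hosts infrastructure-playbooks/rolling_update.yml -e ireallymeanit=yes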
Caveats: Kolla
• Health check
Caveats: Configuration
[Diagram: hosts running the libceph kernel client with mapped /dev/rbd0 devices]
http://tracker.ceph.com/issues/20927#change-96952
Same issue in Rook (Storage Orchestration for Kubernetes)
# in-flight I/O
$ rbd unmap -o full-force
https://github.com/rook/rook/pull/1179/files#diff-dabdd325e9ee838bb51e4f3f6c5b046cR142
How did we solve it?
(Linux kernel 4.11 or later)
Use the kernel rbd map (sysfs interface) instead of rbd map, with the "osd_request_timeout" option applied
(3600 seconds)
http://docs.ceph.com/docs/argonaut/rbd/rbd-ko/
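A sketch of that mapping path, in the same sysfs format used later in this deck; the secret is redacted and the pool/image names are the ones from these slides (osd_request_timeout is a libceph map option, in seconds, available from kernel 4.11):
$ echo "${mon_ip} name=admin,secret=***,osd_request_timeout=3600 volumes vol01" > /sys/bus/rbd/add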
Issue 1 with rbd kernel map
: the keyring (secret) is exposed as-is on the client
$ cat /sys/bus/rbd/devices/0/config_info
[mon_ip] name=admin,secret=AQDnZHxxAAy4OUSreyDDE6YMwKOT4Bug==
→ Solution: apply keyutils
$ cat config_info
[mon_ip] name=admin,key=client.admin volumes vol01 -
Issue 2 with rbd kernel map
: each device gets a different major number
$ ls -al /dev/rbd*
brw------- 1 root root 252, 0 Dec 5 21:16 /dev/rbd0
brw------- 1 root root 251, 0 Dec 5 21:16 /dev/rbd1
brw------- 1 root root 242, 0 Dec 5 21:17 /dev/rbd10
brw------- 1 root root 241, 0 Dec 7 15:43 /dev/rbd11
brw------- 1 root root 240, 0 Dec 18 11:49 /dev/rbd12
→ Solution: apply single_major
$ echo "${mon_ip} name=admin,secret=*** volumes vol01" > /sys/bus/rbd/add_single_major
$ ls -la /dev/rbd*
brw-rw----. 1 root disk 252, 0 Feb 8 02:14 /dev/rbd0
brw-rw----. 1 root disk 252, 16 Feb 8 02:13 /dev/rbd1
brw-rw----. 1 root disk 252, 32 Feb 8 02:14 /dev/rbd2
brw-rw----. 1 root disk 252, 48 Feb 8 02:20 /dev/rbd3
brw-rw----. 1 root disk 252, 64 Feb 8 02:29 /dev/rbd4
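single_major is a read-only module parameter of the rbd kernel module, so it has to be set when the module is loaded; once it is on, the add_single_major / remove_single_major sysfs files used above become available:
$ modprobe rbd single_major=Y
$ cat /sys/module/rbd/parameters/single_major
Y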
ISSUE #4 : scrub / deep-scrub
Performance impact during deep scrub
→ Configuration
→ Set noscrub/nodeep-scrub
→ Manual schedule (see the sketch below)
• Scrub: every PG is scrubbed once every 2 days
• Deep scrub: every PG is deep-scrubbed once every 34 days (at most one at a time)
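A sketch of the manual-schedule approach with stock ceph commands; the PG id below (3.72) is just the example that appears later in this deck:
$ ceph osd set noscrub            # pause automatic scrubbing cluster-wide
$ ceph osd set nodeep-scrub
$ ceph pg scrub 3.72              # then walk the PGs from cron at a controlled pace
$ ceph pg deep-scrub 3.72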
ISSUE #5 : RBD Image Recovery
How Ceph stores objects – directory layout
/var/lib/ceph/osd/ceph-4/current/3.72_head/
rbd_data.2576d643c9869.0000000000000000__head_22269772__3
$ hexdump -C /var/lib/ceph/osd/ceph-4/current/3.7a_head/rbd\\uid.vol01__head_E10E397A__3
00000000 0d 00 00 00 32 35 37 36 64 36 34 33 63 39 38 36 |....2576d643c986|
00000010 39 |9|
Block_name_prefix : 2576d643c9869
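Given the block_name_prefix, an image can in principle be reassembled by copying every rbd_data.<prefix>.* object found on the OSDs into its offset in a sparse file. A minimal sketch, assuming the default 4 MB object size and object file names like the one above (FileStore may escape characters in the name, e.g. the rbd\uid form, so the glob may need adjusting):
# run inside the OSD PG directories, e.g. /var/lib/ceph/osd/ceph-*/current/*_head/
prefix=2576d643c9869                 # block_name_prefix recovered above
objsize=$((4 * 1024 * 1024))         # assumes the default 4 MB RBD object size
for obj in rbd_data.${prefix}.*__head_*; do
    num=${obj#rbd_data.${prefix}.}   # the 16 hex digits after the prefix are the object number
    num=${num%%__head_*}
    dd if="$obj" of=vol01.img bs=${objsize} seek=$((16#${num})) conv=notrunc,sparse
done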
MON failure scenario
[Diagram: with the MONs healthy → Service: Fine; with the MONs down → Service: Failure]
(Client I/O continues normally, but rbd map/unmap is not possible)
MON failure recovery
What if every MON is dead? Recover from the backup!
Backup policy:
• MON backup source: /var/lib/ceph/mon/ceph-{mon_id}
• Local backup: /local_backup
• Remote backup: /dev/rbd
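A minimal sketch of the local-backup step, assuming a systemd deployment and the paths above; stopping the MON keeps its store consistent while it is copied, and with three or more MONs the quorum survives one member being down briefly:
$ systemctl stop ceph-mon@${mon_id}
$ tar czf /local_backup/mon-${mon_id}-$(date +%F).tar.gz /var/lib/ceph/mon/ceph-${mon_id}
$ systemctl start ceph-mon@${mon_id}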
하 현 / Hyun Ha
hyun.ha@navercorp.com