1: Using RadosGW Object Storage
1.1 Introduction to RadosGW Object Storage
RadosGW is one implementation of object storage (OSS, Object Storage Service). The RADOS gateway, also called the Ceph Object Gateway, RADOSGW, or RGW, is a service that lets clients access a Ceph cluster through standard object storage APIs; it supports both the AWS S3 API and the Swift API. RGW runs on top of librados. Since Ceph 0.80 it has used the embedded Civetweb web server to answer API requests, and nginx or Apache can be used instead. Clients talk to RGW over HTTP/HTTPS through a RESTful API, while RGW talks to the Ceph cluster through librados. An RGW client authenticates to the gateway as an RGW user via the S3 or Swift API, and the gateway in turn authenticates to the Ceph cluster on the user's behalf using cephx.
S3 was launched by Amazon in 2006; its full name is Simple Storage Service. S3 defined object storage and is its de facto standard. In a sense, S3 is object storage and object storage is S3: it dominates the object storage market, and later object storage systems have all modeled themselves on S3.
1.2 Characteristics of Object Storage
Object storage stores data as objects; besides the data itself, each object carries its own metadata. Objects are retrieved by Object ID: they cannot be accessed directly through file paths and file names as in an ordinary file system, only through an API or a third-party client (which is itself a wrapper around the API). Objects are not organized into a directory tree but are stored in a flat namespace; Amazon S3 calls this flat namespace a bucket, while Swift calls it a container. Neither buckets nor containers can be nested. A bucket must be authorized before it can be accessed; one account can be granted access to multiple buckets, each with different permissions. Object storage scales out easily and retrieves data quickly, but it does not support client-side mounting, the client must specify the object name when accessing it, and it is not well suited to workloads that modify or delete files very frequently.
Ceph uses buckets as storage containers (storage spaces) to store object data and isolate users. Data is stored in buckets, and user permissions are also granted per bucket; a user can be given different permissions on different buckets to implement access control.
1.3 Deploying the RadosGW Service
Deploy the ceph-mgr1 and ceph-mgr2 servers as a highly available radosGW service.
1.3.1 Install and initialize the radosgw service
# mgr nodes
test@ceph-mgr1:~$ sudo apt install radosgw
test@ceph-mgr2:~$ sudo apt install radosgw
# deploy node
# On the ceph-deploy server, initialize ceph-mgr1 and ceph-mgr2 as radosGW services
test@ceph-deploy:~/ceph-cluster$ ceph-deploy rgw create ceph-mgr1
test@ceph-deploy:~/ceph-cluster$ ceph-deploy rgw create ceph-mgr2
1.3.2 Verify the radosgw service status
# deploy node
test@ceph-deploy:~/ceph-cluster$ ceph -s
cluster:
id: 635d9577-7341-4085-90ff-cb584029a1ea
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 2h)
mgr: ceph-mgr2(active, since 20h), standbys: ceph-mgr1
mds: 2/2 daemons up, 2 standby
osd: 12 osds: 12 up (since 2h), 12 in (since 2d)
rgw: 2 daemons active (2 hosts, 1 zones) # 2 daemons running
data:
volumes: 1/1 healthy
pools: 10 pools, 329 pgs
objects: 372 objects, 314 MiB
usage: 1.8 GiB used, 238 GiB / 240 GiB avail
pgs: 329 active+clean
1.3.3 Verify the radosgw service process
# mgr node
test@ceph-mgr1:~$ ps -ef|grep radosgw
ceph 608 1 0 06:43 ? 00:00:27 /usr/bin/radosgw -f --cluster ceph --name client.rgw.ceph-mgr1 --setuser ceph --setgroup ceph
1.3.4 Access the radosgw service
# deploy node
test@ceph-deploy:~/ceph-cluster$ curl http:
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
test@ceph-deploy:~/ceph-cluster$ curl http:
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
1.4 radosgw Service Configuration
1.4.1 radosgw high availability architecture
1.4.2 Custom port
The configuration file on the radosgw servers (ceph-mgr1, ceph-mgr2) must stay consistent with the one on the deploy server: either modify it on the ceph-deploy server and push it to all nodes, or modify each radosgw server's configuration individually so that they end up identical.
# deploy node
test@ceph-deploy:~/ceph-cluster$ cat ceph.conf
[global]
fsid = 635d9577-7341-4085-90ff-cb584029a1ea
public_network = 10.0.0.0/24
cluster_network = 192.168.133.0/24
mon_initial_members = ceph-mon1
mon_host = 10.0.0.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
mon clock drift allowed = 2
mon clock drift warn backoff = 30
[mds.ceph-mgr2]
#mds_standby_for_fscid = mycephfs
mds_standby_for_name = ceph-mgr1
mds_standby_replay = true
[mds.ceph-mon3]
mds_standby_for_name = ceph-mon2
mds_standby_replay = true
[client.rgw.ceph-mgr1]
rgw_host = ceph-mgr1
rgw_frontends = civetweb port=9900
[client.rgw.ceph-mgr2]
rgw_host = ceph-mgr2
rgw_frontends = civetweb port=9900
# Push the configuration
test@ceph-deploy:~/ceph-cluster$ scp ceph.conf root@10.0.0.104:/etc/ceph/
test@ceph-deploy:~/ceph-cluster$ scp ceph.conf root@10.0.0.105:/etc/ceph/
# mgr nodes
# Restart the services
test@ceph-mgr1:/etc/ceph$ sudo systemctl restart ceph-radosgw@rgw.ceph-mgr1.service
test@ceph-mgr2:~$ sudo systemctl restart ceph-radosgw@rgw.ceph-mgr2.service
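As a quick sanity check (not part of the original transcript), the new port can be confirmed from the command line; the host names below are the ones defined in /etc/hosts elsewhere in this lab:
# Assumed verification step: civetweb should now be listening on 9900 and answering anonymous S3 requests
test@ceph-mgr1:~$ sudo ss -tnl | grep 9900
test@ceph-deploy:~/ceph-cluster$ curl http://ceph-mgr1:9900
test@ceph-deploy:~/ceph-cluster$ curl http://ceph-mgr2:9900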
1.4.3 Enable SSL
Generate a self-signed certificate and configure radosgw to enable SSL.
1.4.3.1 Self-signed certificate
# mgr2 node
test@ceph-mgr2:~$ cd /etc/ceph/
test@ceph-mgr2:/etc/ceph$ sudo mkdir certs
test@ceph-mgr2:/etc/ceph$ cd certs/
test@ceph-mgr2:/etc/ceph/certs$ sudo openssl genrsa -out civetweb.key 2048
test@ceph-mgr2:/etc/ceph/certs$ sudo openssl req -new -x509 -key civetweb.key -out civetweb.crt -subj "/CN=rgw.magedu.net"
root@ceph-mgr2:/etc/ceph/certs# cat civetweb.key civetweb.crt > civetweb.pem
root@ceph-mgr2:/etc/ceph/certs# ls
civetweb.crt civetweb.key civetweb.pem
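To double-check the certificate that was just generated, it can be inspected with openssl (an optional extra step, not shown in the original output):
root@ceph-mgr2:/etc/ceph/certs# openssl x509 -in civetweb.crt -noout -subject -dates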
1.4.3.2 SSL configuration
# mgr node
root@ceph-mgr2:/etc/ceph# cat ceph.conf
[global]
fsid = 635d9577-7341-4085-90ff-cb584029a1ea
public_network = 10.0.0.0/24
cluster_network = 192.168.133.0/24
mon_initial_members = ceph-mon1
mon_host = 10.0.0.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
mon clock drift allowed = 2
mon clock drift warn backoff = 30
[mds.ceph-mgr2]
#mds_standby_for_fscid = mycephfs
mds_standby_for_name = ceph-mgr1
mds_standby_replay = true
[mds.ceph-mon3]
mds_standby_for_name = ceph-mon2
mds_standby_replay = true
[client.rgw.ceph-mgr1]
rgw_host = ceph-mgr1
rgw_frontends = civetweb port=9900
[client.rgw.ceph-mgr2]
rgw_host = ceph-mgr2
rgw_frontends = "civetweb port=9900+9443s ssl_certificate=/etc/ceph/certs/civetweb.pem"
# Restart the service
root@ceph-mgr2:/etc/ceph# systemctl restart ceph-radosgw@rgw.ceph-mgr2.service
1.4.3.3 Verify port 9443
# mgr node
root@ceph-mgr2:/etc/ceph# ss -tln
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 127.0.0.1:6010 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9443 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9900 0.0.0.0:*
LISTEN 0 128 10.0.0.105:6800 0.0.0.0:*
LISTEN 0 128 10.0.0.105:6801 0.0.0.0:*
LISTEN 0 128 127.0.0.53%lo:53 0.0.0.0:*
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 127.0.0.1:43447 0.0.0.0:*
LISTEN 0 128 [::1]:6010 [::]:*
LISTEN 0 128 [::]:22 [::]:*
1.4.3.4 Verify access
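The original verification was done in a browser; an equivalent command-line check (assuming the self-signed certificate, hence -k, and the ceph-mgr2 address 10.0.0.105 used above) might look like this:
test@ceph-deploy:~/ceph-cluster$ curl -k https://10.0.0.105:9443
# should return the same anonymous ListAllMyBucketsResult XML as the HTTP endpoint on port 9900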
1.4.3.5 Optimize the configuration
# mgr node
# Create the log directory
test@ceph-mgr2:~$ sudo mkdir /var/log/radosgw
test@ceph-mgr2:~$ sudo chown -R ceph:ceph /var/log/radosgw
# Modify the configuration
test@ceph-mgr2:~$ cat /etc/ceph/ceph.conf
[client.rgw.ceph-mgr2]
rgw_host = ceph-mgr2
rgw_frontends = "civetweb port=9900+9443s ssl_certificate=/etc/ceph/certs/civetweb.pem error_log_file=/var/log/radosgw/civetweb.error.log access_log_file=/var/log/radosgw/civetweb.access.log request_timeout_ms=30000 num_threads=200"
# Restart the service
test@ceph-mgr2:~$ sudo systemctl restart ceph-radosgw@rgw.ceph-mgr2.service
# Access test
test@ceph-mgr2:~$ curl -k https:
test@ceph-mgr2:~$ curl -k https:
# Check the logs
test@ceph-mgr2:~$ tail /var/log/radosgw/civetweb.access.log
10.0.0.105 - - [31/Aug/2021:14:44:47 +0800] "GET / HTTP/1.1" 200 414 - curl/7.58.0
10.0.0.105 - - [31/Aug/2021:14:44:48 +0800] "GET / HTTP/1.1" 200 414 - curl/7.58.0
10.0.0.105 - - [31/Aug/2021:14:44:50 +0800] "GET / HTTP/1.1" 200 414 - curl/7.58.0
Note: perform the same steps on ceph-mgr1.
1.5 Testing Data Reads and Writes
1.5.1 Create an RGW account
# deploy node
test@ceph-deploy:~/ceph-cluster$ radosgw-admin user create --uid="user1" --display-name="test user"
{
"user_id": "user1",
"display_name": "test user",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"subusers": [],
"keys": [
{
"user": "user1",
"access_key": "6LO8046SQ3DVGVKS84LX",
"secret_key": "iiVFHXC6qc4iTnKVcKDVJaOLeIpl39EbQ2OwueRV"
}
],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"default_storage_class": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "rgw",
"mfa_ids": []
}
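If the keys are ever misplaced, they can be displayed again with a standard radosgw-admin subcommand (an optional extra step, not part of the original run):
test@ceph-deploy:~/ceph-cluster$ radosgw-admin user info --uid="user1"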
1.5.2 Install the s3cmd client
s3cmd is a free command-line client for uploading, retrieving, and managing data on Amazon S3 and on other cloud storage providers that speak the S3 protocol (such as JD Cloud OSS). It suits advanced users who are comfortable with command-line tools, and it is also a good fit for batch scripts and automated S3 backups triggered by cron and similar schedulers.
# deploy node
test@ceph-deploy:~/ceph-cluster$ sudo apt-cache madison s3cmd
s3cmd | 2.0.1-2 | https:
s3cmd | 2.0.1-2 | https:
test@ceph-deploy:~/ceph-cluster$ sudo apt install s3cmd
1.5.3 Configure the client environment
1.5.3.1 Add name resolution for the s3cmd client
# deploy node
test@ceph-deploy:~/ceph-cluster$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 ubuntu
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.0.0.100 ceph-deploy.example.local ceph-deploy
10.0.0.101 ceph-mon1.example.local ceph-mon1
10.0.0.102 ceph-mon2.example.local ceph-mon2
10.0.0.103 ceph-mon3.example.local ceph-mon3
10.0.0.104 ceph-mgr1.example.local ceph-mgr1
10.0.0.105 ceph-mgr2.example.local ceph-mgr2
10.0.0.106 ceph-node1.example.local ceph-node1
10.0.0.107 ceph-node2.example.local ceph-node2
10.0.0.108 ceph-node3.example.local ceph-node3
10.0.0.109 ceph-node4.example.local ceph-node4
10.0.0.105 rgw.test.net
1.5.3.2 Configure s3cmd
# deploy node
test@ceph-deploy:~/ceph-cluster$ s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: 6LO8046SQ3DVGVKS84LX # access key created with the user
Secret Key: iiVFHXC6qc4iTnKVcKDVJaOLeIpl39EbQ2OwueRV # secret key created with the user
Default Region [US]:
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: rgw.test.net:9900
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: rgw.test.net:9900/%(bucket)
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: No
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: 6LO8046SQ3DVGVKS84LX
Secret Key: iiVFHXC6qc4iTnKVcKDVJaOLeIpl39EbQ2OwueRV
Default Region: US
S3 Endpoint: rgw.test.net:9900
DNS-style bucket+hostname:port template for accessing a bucket: rgw.test.net:9900/%(bucket)
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: False
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] Y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Not configured. Never mind.
Save settings? [y/N] y
Configuration saved to '/home/test/.s3cfg'
1.5.3.3 Create a bucket and verify permissions
# deploy node
test@ceph-deploy:~/ceph-cluster$ s3cmd la
test@ceph-deploy:~/ceph-cluster$ s3cmd mb s3://test
Bucket 's3://test/' created
test@ceph-deploy:~/ceph-cluster$ s3cmd ls
2021-08-31 08:08 s3://test
1.5.3.4 Verify uploading data
# deploy node
# Upload a file
test@ceph-deploy:~$ s3cmd put /home/test/test.pdf s3://test/pdf/
upload: '/home/test/test.pdf' -> 's3://test/pdf/test.pdf' [1 of 1]
4809229 of 4809229 100% in 1s 2.47 MB/s done
# List files
test@ceph-deploy:~$ s3cmd la
DIR s3://test/pdf/
# Show file details
test@ceph-deploy:~$ s3cmd ls s3://test/pdf/
2021-08-31 08:25 4809229 s3://test/pdf/test.pdf
1.5.3.5 Verify downloading a file
# deploy node
test@ceph-deploy:~$ sudo s3cmd get s3://test/pdf/test.pdf /opt/
download: 's3://test/pdf/test.pdf' -> '/opt/test.pdf' [1 of 1]
4809229 of 4809229 100% in 0s 171.89 MB/s done
test@ceph-deploy:~$ ll /opt/
total 4708
drwxr-xr-x 2 root root 4096 Aug 31 16:43 ./
drwxr-xr-x 23 root root 4096 Aug 22 15:29 ../
-rw-r--r-- 1 root root 4809229 Aug 31 08:25 test.pdf
1.5.3.6 Delete a file
# deploy node
test@ceph-deploy:~$ s3cmd ls s3://test/pdf/
2021-08-31 08:25 4809229 s3://test/pdf/test.pdf
test@ceph-deploy:~$ s3cmd rm s3://test/pdf/test.pdf
delete: 's3://test/pdf/test.pdf'
test@ceph-deploy:~$ s3cmd ls s3://test/pdf/
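To finish cleaning up, the now-empty bucket could also be removed with s3cmd's rb subcommand (an optional step not shown in the original run):
test@ceph-deploy:~$ s3cmd rb s3://test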
2: Ceph CRUSH Advanced Topics
The CRUSH algorithm determines how data is stored and retrieved by computing storage locations. CRUSH lets Ceph clients communicate with OSDs directly rather than through a centralized server or broker. By using an algorithmically determined method of storing and retrieving data, Ceph avoids single points of failure, performance bottlenecks, and physical limits to its scalability.
2.1 Adjusting the PG-to-OSD Mapping
A PG is a logical collection of objects; replicating it across different OSDs is what gives the storage system its reliability. Depending on a Ceph pool's replication level, each PG's data is replicated and distributed to multiple OSDs in the cluster. A PG can be thought of as a logical container holding multiple objects, and this container is mapped onto multiple OSDs.
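To make this mapping concrete, ceph osd map reports which PG a given object name hashes to and which OSDs that PG is placed on; the pool and object names below are placeholders, not names from this cluster:
# Placeholder names; substitute a real pool and any object name
test@ceph-deploy:~/ceph-cluster$ ceph osd map <poolname> <objectname>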
2.1.1 Check the current state
# deploy node
test@ceph-deploy:~/ceph-cluster$ ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.01949 1.00000 20 GiB 102 MiB 81 MiB 14 KiB 21 MiB 20 GiB 0.50 0.86 89 up
1 hdd 0.01949 1.00000 20 GiB 130 MiB 95 MiB 27 KiB 35 MiB 20 GiB 0.63 1.10 98 up
2 hdd 0.01949 1.00000 20 GiB 129 MiB 96 MiB 6 KiB 34 MiB 20 GiB 0.63 1.09 83 up
3 hdd 0.01949 1.00000 20 GiB 106 MiB 71 MiB 13 KiB 35 MiB 20 GiB 0.52 0.90 87 up
4 hdd 0.01949 1.00000 20 GiB 128 MiB 94 MiB 8 KiB 33 MiB 20 GiB 0.62 1.08 91 up
5 hdd 0.01949 1.00000 20 GiB 123 MiB 88 MiB 23 KiB 35 MiB 20 GiB 0.60 1.04 91 up
6 hdd 0.01949 1.00000 20 GiB 121 MiB 86 MiB 8 KiB 35 MiB 20 GiB 0.59 1.02 84 up
7 hdd 0.01949 1.00000 20 GiB 119 MiB 91 MiB 18 KiB 28 MiB 20 GiB 0.58 1.00 95 up
8 hdd 0.01949 1.00000 20 GiB 72 MiB 43 MiB 18 KiB 29 MiB 20 GiB 0.35 0.61 91 up
9 hdd 0.01949 1.00000 20 GiB 129 MiB 93 MiB 6 KiB 37 MiB 20 GiB 0.63 1.09 92 up
10 hdd 0.01949 1.00000 20 GiB 141 MiB 111 MiB 11 KiB 30 MiB 20 GiB 0.69 1.19 106 up
11 hdd 0.01949 1.00000 20 GiB 120 MiB 87 MiB 17 KiB 33 MiB 20 GiB 0.59 1.01 100 up
TOTAL 240 GiB 1.4 GiB 1.0 GiB 175 KiB 384 MiB 239 GiB 0.58
MIN/MAX VAR: 0.61/1.19 STDDEV: 0.08
2.1.2 Modify the WEIGHT and verify
# deploy node
# The change takes effect immediately; data is redistributed according to the CRUSH algorithm, and how long it takes depends on the amount of data
test@ceph-deploy:~/ceph-cluster$ ceph osd crush reweight osd.10 1.5
test@ceph-deploy:~/ceph-cluster$ ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.01949 1.00000 20 GiB 102 MiB 81 MiB 14 KiB 21 MiB 20 GiB 0.50 0.86 87 up
1 hdd 0.01949 1.00000 20 GiB 134 MiB 95 MiB 27 KiB 39 MiB 20 GiB 0.65 1.13 96 up
2 hdd 0.01949 1.00000 20 GiB 133 MiB 96 MiB 6 KiB 38 MiB 20 GiB 0.65 1.12 85 up
3 hdd 0.01949 1.00000 20 GiB 111 MiB 71 MiB 13 KiB 40 MiB 20 GiB 0.54 0.94 86 up
4 hdd 0.01949 1.00000 20 GiB 128 MiB 94 MiB 8 KiB 33 MiB 20 GiB 0.62 1.08 92 up
5 hdd 0.01949 1.00000 20 GiB 123 MiB 88 MiB 23 KiB 35 MiB 20 GiB 0.60 1.04 92 up
6 hdd 0.01949 1.00000 20 GiB 121 MiB 86 MiB 8 KiB 35 MiB 20 GiB 0.59 1.02 82 up
7 hdd 0.01949 1.00000 20 GiB 119 MiB 91 MiB 18 KiB 28 MiB 20 GiB 0.58 1.00 92 up
8 hdd 0.01949 1.00000 20 GiB 72 MiB 43 MiB 18 KiB 29 MiB 20 GiB 0.35 0.61 92 up
9 hdd 0.01949 1.00000 20 GiB 114 MiB 93 MiB 6 KiB 21 MiB 20 GiB 0.56 0.96 93 up
10 hdd 1.50000 1.00000 20 GiB 141 MiB 111 MiB 11 KiB 31 MiB 20 GiB 0.69 1.19 106 up
11 hdd 0.01949 1.00000 20 GiB 125 MiB 87 MiB 17 KiB 37 MiB 20 GiB 0.61 1.05 99 up
TOTAL 240 GiB 1.4 GiB 1.0 GiB 175 KiB 387 MiB 239 GiB 0.58
2.1.3 Modify the REWEIGHT and verify
The REWEIGHT value ranges from 0 to 1; the smaller the value, the fewer PGs are mapped to the OSD.
# deploy node
test@ceph-deploy:~/ceph-cluster$ ceph osd reweight 9 0.6
reweighted osd.9 to 0.6 (9999)
test@ceph-deploy:~/ceph-cluster$ ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
0 hdd 0.01949 1.00000 20 GiB 226 MiB 96 MiB 14 KiB 130 MiB 20 GiB 1.10 0.89 87 up
1 hdd 0.01949 1.00000 20 GiB 213 MiB 98 MiB 27 KiB 115 MiB 20 GiB 1.04 0.84 97 up
2 hdd 0.01949 1.00000 20 GiB 303 MiB 154 MiB 6 KiB 149 MiB 20 GiB 1.48 1.20 82 up
3 hdd 0.01949 1.00000 20 GiB 304 MiB 137 MiB 13 KiB 167 MiB 20 GiB 1.48 1.20 90 up
4 hdd 0.01949 1.00000 20 GiB 170 MiB 69 MiB 8 KiB 101 MiB 20 GiB 0.83 0.67 83 up
5 hdd 0.01949 1.00000 20 GiB 248 MiB 123 MiB 23 KiB 125 MiB 20 GiB 1.21 0.98 86 up
6 hdd 0.01949 1.00000 20 GiB 232 MiB 99 MiB 8 KiB 133 MiB 20 GiB 1.13 0.92 88 up
7 hdd 0.01949 1.00000 20 GiB 301 MiB 154 MiB 18 KiB 147 MiB 20 GiB 1.47 1.19 90 up
8 hdd 0.01949 1.00000 20 GiB 145 MiB 42 MiB 18 KiB 103 MiB 20 GiB 0.71 0.57 89 up
9 hdd 0.01949 0.59999 20 GiB 199 MiB 91 MiB 6 KiB 108 MiB 20 GiB 0.97 0.79 54 up
10 hdd 0.01949 1.00000 20 GiB 544 MiB 303 MiB 11 KiB 240 MiB 19 GiB 2.66 2.15 144 up
11 hdd 0.01949 1.00000 20 GiB 145 MiB 70 MiB 17 KiB 75 MiB 20 GiB 0.71 0.57 96 up
TOTAL 240 GiB 3.0 GiB 1.4 GiB 175 KiB 1.6 GiB 237 GiB 1.23
2.2 Managing the CRUSH Map
Export the cluster's CRUSH map with the appropriate tools, edit it, and then import it back.
2.2.1 Export the CRUSH map
The exported CRUSH map is in binary format and must be converted to text with the crushtool utility before it can be edited.
# deploy node
test@ceph-deploy:~/ceph-cluster$ sudo mkdir /data/ceph -p
test@ceph-deploy:~/ceph-cluster$ sudo ceph osd getcrushmap -o /data/ceph/crushmap
77
2.2.2 Convert the CRUSH map to text
The exported map cannot be edited directly; it has to be converted to text before it can be viewed and edited.
# deploy node
root@ceph-deploy:~# apt install -y ceph-base
root@ceph-deploy:~# crushtool -d /data/ceph/crushmap > /data/ceph/crushmap.txt
root@ceph-deploy:~# file /data/ceph/crushmap.txt
test@ceph-deploy:~/ceph-cluster$ sudo vim /data/ceph/crushmap.txt
# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 6
step take default
step chooseleaf firstn 0 type host
step emit
}
2.2.3 Convert the text back into CRUSH binary format
# deploy node
test@ceph-deploy:~/ceph-cluster$ sudo crushtool -c /data/ceph/crushmap.txt -o /data/ceph/newcrushmap
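Before importing, the compiled map can optionally be dry-run with crushtool's test mode to confirm that the rule still produces sensible mappings (an extra safety step, not part of the original procedure):
test@ceph-deploy:~/ceph-cluster$ sudo crushtool -i /data/ceph/newcrushmap --test --rule 0 --num-rep 3 --show-mappings | head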
2.2.4 Import the new CRUSH map
The imported map immediately replaces the existing map and takes effect.
# deploy node
test@ceph-deploy:~/ceph-cluster$ ceph osd setcrushmap -i /data/ceph/newcrushmap
78
2.2.5 Verify that the new CRUSH map is in effect
# deploy node
test@ceph-deploy:~/ceph-cluster$ ceph osd crush rule dump
[
{
"rule_id": 0,
"rule_name": "replicated_rule",
"ruleset": 0,
"type": 1,
"min_size": 1,
"max_size": 6,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]
3: Ceph Dashboard and Monitoring
A dashboard is a module that combines gauges, charts, reports, and other components into a single panel for unified display. It offers flexible component and panel definitions plus a large number of preset component templates, so users can pick what they need quickly and work more efficiently. It makes analysis results intuitive and easy to understand, helps operators grasp the state of the system at a glance, and gives decision makers better data to act on.
3.1 Enable the dashboard plugin
Ceph mgr is a modular (multi-plugin) component whose modules can be enabled or disabled individually.
Newer releases require installing the dashboard package separately, and it must be installed on the mgr nodes.
# mgr node
test@ceph-mgr1:/etc/ceph$ apt-cache madison ceph-mgr-dashboard
ceph-mgr-dashboard | 16.2.5-1bionic | https:
ceph-mgr-dashboard | 16.2.5-1bionic | https:
test@ceph-mgr1:/etc/ceph$ sudo apt install ceph-mgr-dashboard
# deploy node
# List all modules
test@ceph-deploy:~/ceph-cluster$ ceph mgr module ls
# Enable the dashboard module
test@ceph-deploy:~/ceph-cluster$ ceph mgr module enable dashboard
Note: the module cannot be accessed right after it is enabled; you must first disable (or enable) SSL and set the listen address.
3.2 Configure the dashboard module
The Ceph dashboard is configured on the mgr node, and SSL can be enabled or disabled.
# deploy node
# Disable SSL
test@ceph-deploy:~/ceph-cluster$ ceph config set mgr mgr/dashboard/ssl false
# Set the dashboard listen address
test@ceph-deploy:~/ceph-cluster$ ceph config set mgr mgr/dashboard/ceph-mgr1/server_addr 10.0.0.104
# Set the dashboard listen port
test@ceph-deploy:~/ceph-cluster$ ceph config set mgr mgr/dashboard/ceph-mgr1/server_port 9009
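The values just set can be read back from the cluster configuration database to make sure they were stored (an optional check, not in the original transcript):
test@ceph-deploy:~/ceph-cluster$ ceph config dump | grep mgr/dashboard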
# Verify the cluster status
test@ceph-deploy:~/ceph-cluster$ ceph -s
cluster:
id: 635d9577-7341-4085-90ff-cb584029a1ea
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 4h)
mgr: ceph-mgr1(active, since 3m), standbys: ceph-mgr2
mds: 2/2 daemons up, 2 standby
osd: 12 osds: 12 up (since 4h), 12 in (since 3d)
rgw: 2 daemons active (2 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 10 pools, 329 pgs
objects: 372 objects, 314 MiB
usage: 1.2 GiB used, 239 GiB / 240 GiB avail
pgs: 329 active+clean
3.3 Verify the port and process on the mgr node
# mgr node
# Check that the mgr service is running and inspect the listening ports; if it is not running properly, restart the service
test@ceph-mgr1:~$ sudo systemctl restart ceph-mgr@ceph-mgr1.service
test@ceph-mgr1:~$ ss -tnl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:54113 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9443 0.0.0.0:*
LISTEN 0 128 127.0.0.1:42569 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9900 0.0.0.0:*
LISTEN 0 128 0.0.0.0:111 0.0.0.0:*
LISTEN 0 5 10.0.0.104:9009 0.0.0.0:*
LISTEN 0 128 127.0.0.53%lo:53 0.0.0.0:*
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 127.0.0.1:6010 0.0.0.0:*
LISTEN 0 128 [::ffff:0.0.0.0]:2049 *:*
LISTEN 0 128 [::]:43399 [::]:*
LISTEN 0 128 [::]:111 [::]:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 128 [::1]:6010 [::]:*
3.4 Verify dashboard access
3.5 Set the dashboard username and password
# deploy node
test@ceph-deploy:~/ceph-cluster$ sudo touch pass.txt
test@ceph-deploy:~/ceph-cluster$ echo "123456" > pass.txt
test@ceph-deploy:~/ceph-cluster$ ceph dashboard set-login-credentials test -i pass.txt
******************************************************************
*** WARNING: this command is deprecated. ***
*** Please use the ac-user-* related commands to manage users. ***
******************************************************************
Username and password updated
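Since the command above is deprecated, newer releases point to the ac-user-* commands instead; a roughly equivalent invocation (same user name and password file, with the administrator role) would be:
test@ceph-deploy:~/ceph-cluster$ ceph dashboard ac-user-create test -i pass.txt administrator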
3.6 Login page
3.7 Page after a successful login
3.8 Dashboard SSL
To access the dashboard over SSL, a signed certificate must be configured. The certificate can be generated either with the ceph command or with the openssl command.
3.8.1 Ceph self-signed certificate
# deploy node
# Generate the certificate
test@ceph-deploy:~/ceph-cluster$ ceph dashboard create-self-signed-cert
Self-signed certificate created
# Enable SSL
test@ceph-deploy:~/ceph-cluster$ ceph config set mgr mgr/dashboard/ssl true
# Check the current dashboard status
test@ceph-deploy:~/ceph-cluster$ ceph mgr services
{
"dashboard": "http://10.0.0.104:9009/"
}
# mgr node
# Restart the mgr service
test@ceph-mgr1:~$ sudo systemctl restart ceph-mgr@ceph-mgr1
# Verify the dashboard status again
test@ceph-deploy:~/ceph-cluster$ ceph mgr services
{
"dashboard": "https://10.0.0.104:9009/"
}
3.9 Monitor the Ceph Node Hosts with Prometheus
3.9.1 Deploy Prometheus
# mgr node
test@ceph-mgr1:~$ sudo mkdir /apps
test@ceph-mgr1:~$ cd /apps
test@ceph-mgr1:/apps$ sudo tar xf prometheus-2.23.0.linux-amd64.tar.gz
test@ceph-mgr1:/apps$ sudo ln -sv /apps/prometheus-2.23.0.linux-amd64 /apps/prometheus # symlink makes future upgrades easier
root@ceph-mgr1:~# cat /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Server
Documentation=https:
After=network.target
[Service]
Restart=on-failure
WorkingDirectory=/apps/prometheus/
ExecStart=/apps/prometheus/prometheus --config.file=/apps/prometheus/prometheus.yml
[Install]
WantedBy=multi-user.target
root@ceph-mgr1:~# systemctl daemon-reload
root@ceph-mgr1:~# sudo systemctl start prometheus.service
root@ceph-mgr1:~# sudo systemctl enable prometheus.service
Created symlink /etc/systemd/system/multi-user.target.wants/prometheus.service → /etc/systemd/system/prometheus.service.
3.9.2 Access Prometheus
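The original shows the web UI at http://10.0.0.104:9090; from the command line, Prometheus' built-in health endpoint can confirm the server is up (an assumed extra check, not in the original):
root@ceph-mgr1:~# curl -s http://10.0.0.104:9090/-/healthy
# should report that Prometheus is healthy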
3.9.3 Deploy node_exporter
# node hosts
# Perform this on all three node hosts
root@ceph-node1:~# mkdir /apps/
root@ceph-node1:~# cd /apps/
root@ceph-node1:/apps# tar xf node_exporter-1.0.1.linux-amd64.tar.gz
root@ceph-node1:/apps# ln -sv /apps/node_exporter-1.0.1.linux-amd64 /apps/node_exporter
root@ceph-node1:/apps# cat /etc/systemd/system/node-exporter.service
[Unit]
Description=Prometheus Node Exporter
After=network.target
[Service]
ExecStart=/apps/node_exporter/node_exporter
[Install]
WantedBy=multi-user.target
root@ceph-node1:/apps# systemctl daemon-reload
root@ceph-node1:/apps# systemctl restart node-exporter
root@ceph-node1:/apps# systemctl enable node-exporter
Created symlink /etc/systemd/system/multi-user.target.wants/node-exporter.service → /etc/systemd/system/node-exporter.service.
Verify the node_exporter data on each node host.
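This was verified in a browser in the original; an equivalent command-line spot check against one of the exporters might be:
root@ceph-node1:/apps# curl -s http://10.0.0.106:9100/metrics | head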
3.9.4 Configure the Prometheus server targets and verify
# mgr node
root@ceph-mgr1:~# cd /apps/prometheus
root@ceph-mgr1:/apps/prometheus# cat prometheus.yml
- job_name: 'ceph-node-data'
static_configs:
- targets: ['10.0.0.106:9100','10.0.0.107:9100','10.0.0.108:9100']
root@ceph-mgr1:/apps/prometheus# systemctl restart prometheus.service
Verify the targets.
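The target status shown in the web UI can also be queried through Prometheus' HTTP API (an assumed command-line alternative to the screenshot):
root@ceph-mgr1:~# curl -s http://10.0.0.104:9090/api/v1/targets | grep -o '"health":"[a-z]*"'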
3.10 Monitor the Ceph Services with Prometheus
The Ceph manager ships with a built-in Prometheus monitoring module that listens on port 9283 of every manager node; this port exposes the collected metrics to Prometheus over an HTTP interface.
3.10.1 Enable the Prometheus monitoring module
# deploy node
test@ceph-deploy:~/ceph-cluster$ ceph mgr module enable prometheus
# mgr node
root@ceph-mgr1:/apps/prometheus# ss -tnl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 127.0.0.1:6010 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9443 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9900 0.0.0.0:*
LISTEN 0 128 127.0.0.1:42447 0.0.0.0:*
LISTEN 0 128 0.0.0.0:111 0.0.0.0:*
LISTEN 0 128 10.0.0.104:6800 0.0.0.0:*
LISTEN 0 5 10.0.0.104:9009 0.0.0.0:*
LISTEN 0 128 10.0.0.104:6801 0.0.0.0:*
LISTEN 0 128 0.0.0.0:52241 0.0.0.0:*
LISTEN 0 128 127.0.0.53%lo:53 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 128 [::1]:6010 [::]:*
LISTEN 0 128 [::ffff:0.0.0.0]:2049 *:*
LISTEN 0 128 *:9090 *:*
LISTEN 0 5 *:9283 *:*
LISTEN 0 128 [::]:36107 [::]:*
LISTEN 0 128 [::]:111 [::]:*
3.10.2 Verify the manager metrics
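The mgr prometheus module's metrics endpoint can be fetched directly to confirm it is serving data (an assumed command-line equivalent of the screenshot):
root@ceph-mgr1:~# curl -s http://10.0.0.104:9283/metrics | head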
3.10.3 Configure Prometheus to scrape the data
# mgr node
root@ceph-mgr1:/apps/prometheus# cat prometheus.yml
# my global config
global:
scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
# - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ['localhost:9090']
- job_name: 'ceph-node-data'
static_configs:
- targets: ['10.0.0.106:9100','10.0.0.107:9100','10.0.0.108:9100']
- job_name: 'ceph-cluster-data'
static_configs:
- targets: ['10.0.0.104:9283']
root@ceph-mgr1:/apps/prometheus# systemctl restart prometheus.service
3.10.4 Verify the data
3.11 Display Monitoring Data with Grafana
Use Grafana to display the Ceph cluster monitoring data and the node data.
3.11.1 Install Grafana
Grafana download URL: https://grafana.com/grafana/download/7.5.10?pg=get&plcmt=selfmanaged-box1-cta1&edition=oss
# deploy node
root@ceph-deploy:~# sudo apt-get install -y adduser libfontconfig1
root@ceph-deploy:~# wget https:
root@ceph-deploy:~# sudo dpkg -i grafana_7.5.10_amd64.deb
root@ceph-deploy:~# systemctl enable grafana-server
Synchronizing state of grafana-server.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable grafana-server
Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.
root@ceph-deploy:~# systemctl start grafana-server
3.11.2 Log in to Grafana
The default username and password are both admin.
3.11.3 Configure the data source
Add the Prometheus data source (http://10.0.0.104:9090) in Grafana.
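This is normally done in the web UI (Configuration -> Data Sources -> Add data source -> Prometheus). As a scripted alternative, Grafana's HTTP API can create the same data source; the example below assumes Grafana's default port 3000 on the deploy node and the default admin credentials:
root@ceph-deploy:~# curl -s -u admin:admin -H 'Content-Type: application/json' -X POST http://10.0.0.100:3000/api/datasources -d '{"name":"prometheus","type":"prometheus","url":"http://10.0.0.104:9090","access":"proxy"}'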
3.11.4 Import a dashboard template
Template download URL: https://grafana.com/grafana/dashboards?search=ceph
3.11.5 Resulting dashboards