etcd error: mvcc: database space exceeded
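Background: While restoring a production environment, I found that the apiserver reported errors when connecting to etcd, and etcd kept restarting.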
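Root cause: The etcd service had no auto-compaction parameter set (auto-compact). By default etcd does not compact its key history automatically; you have to set a startup parameter (--auto-compaction-retention) or run compact by hand. If keys change frequently, setting it is recommended; otherwise disk space and memory are wasted and errors eventually follow. etcd v3 has a default backend quota of 2 GB. Without compaction, once the boltdb file grows past that limit, etcd raises a NOSPACE alarm and returns "Error: etcdserver: mvcc: database space exceeded", and no more data can be written.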
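As a minimal sketch of the startup setting mentioned above: etcd also reads each flag from an environment variable with an ETCD_ prefix, so auto-compaction and the quota can be set as shown here. The values (periodic mode, 1 hour of retained history, 8 GiB quota) are illustrative assumptions, not taken from this incident; adjust them to your own cluster.

```shell
# Illustrative values only (assumptions, not from the original incident).
# Keep roughly one hour of key history and compact the rest automatically.
# Equivalent flags: --auto-compaction-mode, --auto-compaction-retention.
export ETCD_AUTO_COMPACTION_MODE=periodic
export ETCD_AUTO_COMPACTION_RETENTION=1

# Raise the backend quota from the 2 GB default to 8 GiB, expressed in bytes.
# Equivalent flag: --quota-backend-bytes.
export ETCD_QUOTA_BACKEND_BYTES=$((8 * 1024 * 1024 * 1024))

echo "quota bytes: $ETCD_QUOTA_BACKEND_BYTES"
```

These variables only take effect if they are present in the environment etcd starts with, for example via the service's environment file; a running etcd has to be restarted to pick them up.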
Procedure: My master nodes are 192.168.10.203, 192.168.10.204 and 192.168.10.205.
1 Check the alarms
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.203:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem alarm list
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.204:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem alarm list
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.205:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem alarm list
2 Get the current revision number
[root@ ~]# rev=$(/opt/k8s/bin/etcdctl --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem --endpoints="https://127.0.0.1:2379" endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
[root@ ~]# echo $rev
846418475
3 Compact the old revisions
Compaction is replicated across the cluster, so one successful run is enough; if you repeat it against the other endpoints with the same revision, the later runs may simply report that the revision has already been compacted.
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.203:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem compact $rev
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.204:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem compact $rev
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.205:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem compact $rev
4 Defragment
Defragmentation works on one member at a time, so run it against every endpoint.
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.203:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem defrag
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.204:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem defrag
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.205:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem defrag
5 Disarm the alarm
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.203:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem alarm disarm
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.204:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem alarm disarm
[root@ ~]# /opt/k8s/bin/etcdctl --endpoints=https://192.168.10.205:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem alarm disarm
6 Take a backup and inspect the snapshot
ETCDCTL_API=3 etcdctl snapshot save backup.db
ETCDCTL_API=3 etcdctl snapshot status backup.db
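The revision lookup in step 2 depends on a small text filter, which can be tried without a cluster. The sketch below wraps that filter in a helper named extract_revision (a hypothetical name) and feeds it a hand-written JSON sample; the sample is an assumption that only mimics the shape of the endpoint status output and is not real cluster data, apart from the revision value shown in step 2.

```shell
# extract_revision: read "endpoint status --write-out=json" output on stdin
# and print the first revision number found in it.
extract_revision() {
  grep -E -o '"revision":[0-9]+' | grep -E -o '[0-9]+' | head -n 1
}

# Hand-written sample (assumption) that mimics the shape of the JSON output.
sample='[{"Endpoint":"https://127.0.0.1:2379","Status":{"header":{"revision":846418475,"raft_term":5},"dbSize":2147483648}}]'

rev=$(printf '%s' "$sample" | extract_revision)
echo "$rev"   # prints 846418475
```

Matching "[0-9]+" rather than a bare digit keeps the whole number, and "head -n 1" guards against output that lists more than one endpoint.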
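Steps 3 to 5 repeat the same long command line for each node, so they can be sketched as one loop. This is only a sketch under stated assumptions: run_on, DRY_RUN and REV are hypothetical names introduced here, the endpoints and certificate paths are the ones used above, and DRY_RUN defaults to 1 so the script only prints the commands it would run.

```shell
ETCDCTL=${ETCDCTL:-/opt/k8s/bin/etcdctl}
ENDPOINTS="https://192.168.10.203:2379 https://192.168.10.204:2379 https://192.168.10.205:2379"
TLS_FLAGS="--cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem"
DRY_RUN=${DRY_RUN:-1}      # 1 = print commands only, 0 = really execute them
REV=${REV:-846418475}      # revision obtained in step 2

# run_on ENDPOINT ARGS...: run (or print) one etcdctl command for one endpoint.
run_on() {
  ep=$1
  shift
  # $TLS_FLAGS is left unquoted on purpose so it splits into separate flags.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$ETCDCTL" "--endpoints=$ep" $TLS_FLAGS "$@"
  else
    "$ETCDCTL" "--endpoints=$ep" $TLS_FLAGS "$@"
  fi
}

first=${ENDPOINTS%% *}     # first endpoint in the list

output=$(
  run_on "$first" compact "$REV"      # compact once; it applies cluster-wide
  for ep in $ENDPOINTS; do
    run_on "$ep" defrag               # defragment every member
  done
  run_on "$first" alarm disarm        # clear the alarm
)
printf '%s\n' "$output"
```

Review the printed commands first, then rerun with DRY_RUN=0 on a host that has etcdctl and the certificates in place.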