Kubernetes binary deployment: etcd backup and restore

杰森wang 2022-02-09

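etcd is a critical component of a Kubernetes cluster: it stores the cluster's network configuration and the state of all objects.

Two services in a Kubernetes system use etcd for coordination and configuration storage:


  • The flannel network plugin (other network plugins may also store their network configuration in etcd)
  • Kubernetes itself, including the state and metadata of its objects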


Set the environment variable

When working with the Kubernetes data in etcd, set the environment variable ETCDCTL_API=3.

export ETCDCTL_API=3


# Or add the environment variable to `/etc/profile`
vi /etc/profile
...
export ETCDCTL_API=3
...
source /etc/profile


# Or prefix the command with ETCDCTL_API=3
ETCDCTL_API=3 etcdctl --endpoints=$ENDPOINTS member list
# Check the version
etcdctl version
etcdctl version: 3.4.9
API version: 3.


Set an alias so the certificates do not have to be passed every time


$ vim /etc/profile
alias etcdctl='ETCDCTL_API=3 /data/local/etcd/bin/etcdctl --cacert=/data/local/etcd/ssl/ca.pem --cert=/data/local/etcd/ssl/server.pem --key=/data/local/etcd/ssl/server-key.pem --endpoints="https://192.168.253.227:2379,https://192.168.253.228:2379,https://192.168.253.229:2379"'
# Single-endpoint variant:
#alias etcdctl='ETCDCTL_API=3 /data/local/etcd/bin/etcdctl --cacert=/data/local/etcd/ssl/ca.pem --cert=/data/local/etcd/ssl/server.pem --key=/data/local/etcd/ssl/server-key.pem --endpoints="https://192.168.253.227:2379"'
$ source /etc/profile
$ etcdctl member list -w table
+------------------+---------+--------+------------------------------+------------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------+------------------------------+------------------------------+------------+
| c0142b9a19898be | started | etcd-1 | https://192.168.253.227:2380 | https://192.168.253.227:2379 | false |
| 78d50c713efbd616 | started | etcd-2 | https://192.168.253.228:2380 | https://192.168.253.228:2379 | false |
| bcb81ad99e84629a | started | etcd-3 | https://192.168.253.229:2380 | https://192.168.253.229:2379 | false |
+------------------+---------+--------+------------------------------+------------------------------+------------+
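
By default bash does not expand aliases in non-interactive scripts, and cron jobs do not read /etc/profile, so a backup script cannot rely on the alias above. A minimal sketch of an equivalent shell function follows. The function name `etcdctl_k8s` and the variables `ETCDCTL_BIN`, `ETCD_SSL_DIR`, and `ETCD_ENDPOINTS` are hypothetical names introduced here; their defaults are copied from the alias.

```shell
# Defaults mirror the alias above; override them through the environment if needed.
# ETCDCTL_BIN, ETCD_SSL_DIR, ETCD_ENDPOINTS and etcdctl_k8s are illustrative names.
ETCDCTL_BIN="${ETCDCTL_BIN:-/data/local/etcd/bin/etcdctl}"
ETCD_SSL_DIR="${ETCD_SSL_DIR:-/data/local/etcd/ssl}"
ETCD_ENDPOINTS="${ETCD_ENDPOINTS:-https://192.168.253.227:2379,https://192.168.253.228:2379,https://192.168.253.229:2379}"

# Run etcdctl with the v3 API, the TLS files and the endpoints filled in.
# Any arguments are passed through unchanged.
etcdctl_k8s() {
  ETCDCTL_API=3 "$ETCDCTL_BIN" \
    --cacert="$ETCD_SSL_DIR/ca.pem" \
    --cert="$ETCD_SSL_DIR/server.pem" \
    --key="$ETCD_SSL_DIR/server-key.pem" \
    --endpoints="$ETCD_ENDPOINTS" \
    "$@"
}
```

After sourcing this file, `etcdctl_k8s member list -w table` should behave like the aliased command, and it also works inside scripts.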


List pods



etcdctl get /registry/pods --prefix --keys-only

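  • --keys-only defaults to false; when given, only the keys are printed. Without it, each key is printed together with its value.
  • --prefix defaults to false; when given, every key that starts with the given prefix is returned, so all entries under /registry/pods are listed.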


List services


etcdctl get /registry/services --prefix --keys-only
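
To get an overview of what is stored rather than a long key list, the keys-only output can be grouped by the path segment that follows /registry (usually the resource type or API group). The sketch below assumes keys of the form /registry/<segment>/... and skips the blank lines that --keys-only prints between keys; the function name `count_keys_by_resource` is hypothetical.

```shell
# Read "etcdctl get ... --prefix --keys-only" output on stdin and print
# "<segment> <count>" for the path segment right after /registry.
count_keys_by_resource() {
  awk -F/ 'NF >= 3 && $2 == "registry" { n[$3]++ }
           END { for (r in n) print r, n[r] }' | sort
}
```

A typical call would be `etcdctl get /registry --prefix --keys-only | count_keys_by_resource`.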

Backup


  1. The snapshot command accepts only a single etcd endpoint, not several.
  2. In the /etc/profile alias, set endpoints="https://192.168.253.227:2379" so that it points at one etcd node.

etcdctl snapshot save   /data/backup/etcd/snapshot.db
{"level":"info","ts":1637575884.2848496,"caller":"snapshot/v3_snapshot.go:119","msg":"created temporary db file","path":"/data/backup/etcd/snapshot.db.part"}
{"level":"info","ts":"2021-11-22T18:11:24.297+0800","caller":"clientv3/maintenance.go:200","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":1637575884.2974415,"caller":"snapshot/v3_snapshot.go:127","msg":"fetching snapshot","endpoint":"https://192.168.253.227:2379"}
{"level":"info","ts":"2021-11-22T18:11:24.340+0800","caller":"clientv3/maintenance.go:208","msg":"completed snapshot read; closing"}
{"level":"info","ts":1637575884.3451302,"caller":"snapshot/v3_snapshot.go:142","msg":"fetched snapshot","endpoint":"https://192.168.253.227:2379","size":"3.2 MB","took":0.060068191}
{"level":"info","ts":1637575884.34525,"caller":"snapshot/v3_snapshot.go:152","msg":"saved","path":"/data/backup/etcd/snapshot.db"}
Snapshot saved at /data/backup/etcd/snapshot.db
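
For regular backups, a fixed file name would be overwritten on every run. The following is a minimal sketch, not a finished tool, that writes timestamped snapshots, prints `etcdctl snapshot status` as a basic integrity check, and keeps only the newest files. `KEEP`, `ETCD_ENDPOINT`, `run_etcdctl`, `prune_snapshots`, and `backup_etcd` are names assumed here; paths are taken from the commands above. On etcd 3.5 and later, `snapshot status` is deprecated in etcdctl in favour of etcdutl.

```shell
# Illustrative defaults; BACKUP_DIR matches the path used above, KEEP is an assumption.
BACKUP_DIR="${BACKUP_DIR:-/data/backup/etcd}"
KEEP="${KEEP:-7}"
ETCDCTL_BIN="${ETCDCTL_BIN:-/data/local/etcd/bin/etcdctl}"
ETCD_SSL_DIR="${ETCD_SSL_DIR:-/data/local/etcd/ssl}"
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://192.168.253.227:2379}"  # exactly one endpoint

run_etcdctl() {
  ETCDCTL_API=3 "$ETCDCTL_BIN" \
    --cacert="$ETCD_SSL_DIR/ca.pem" --cert="$ETCD_SSL_DIR/server.pem" \
    --key="$ETCD_SSL_DIR/server-key.pem" --endpoints="$ETCD_ENDPOINT" "$@"
}

# Delete the oldest snapshot-*.db files in directory $1 so that at most $2 remain.
# Timestamped names sort chronologically, so a plain sort puts the oldest first.
prune_snapshots() {
  dir=$1
  keep=$2
  total=$(ls -1 "$dir" | grep -c '^snapshot-.*\.db$' || true)
  excess=$((total - keep))
  [ "$excess" -gt 0 ] || return 0
  ls -1 "$dir" | grep '^snapshot-.*\.db$' | sort | head -n "$excess" |
    while read -r f; do rm -f -- "$dir/$f"; done
}

# Take one snapshot, show its status, then prune old files.
backup_etcd() {
  target="$BACKUP_DIR/snapshot-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$BACKUP_DIR" || return 1
  run_etcdctl snapshot save "$target" || return 1
  run_etcdctl snapshot status "$target" -w table || return 1
  prune_snapshots "$BACKUP_DIR" "$KEEP"
}
```

Calling `backup_etcd` from a cron job on the chosen node would produce one snapshot per run.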


Delete a pod (to verify the restore later)

kubectl  delete pod nginx
pod "nginx" deleted

Prepare for the restore

Stop the kube-apiserver service on all master nodes

systemctl stop kube-apiserver
# Confirm that kube-apiserver has stopped
ps -ef | grep kube-apiserver


Stop the etcd service on every node in the cluster

systemctl stop etcd

Move the data out of every etcd data directory (preferably keep it as a backup by renaming it rather than deleting it)


mv /var/lib/etcd/default.etcd /var/lib/etcd/default.etcd.bak
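
If default.etcd.bak already exists from an earlier restore, a plain mv moves the data directory inside it instead of renaming it. A small sketch that avoids this by adding a timestamp to the backup name is shown here; the function name `set_aside_data_dir` is hypothetical.

```shell
# Rename an etcd data directory to <dir>.bak-<timestamp> and print the new path.
# Fails if the directory is missing or the target name is already taken.
set_aside_data_dir() {
  dir=$1
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
  dest="$dir.bak-$(date +%Y%m%d-%H%M%S)"
  [ ! -e "$dest" ] || { echo "already exists: $dest" >&2; return 1; }
  mv -- "$dir" "$dest" && printf '%s\n' "$dest"
}
```

On each etcd node this would be called as `set_aside_data_dir /var/lib/etcd/default.etcd` after etcd has been stopped.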

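Restore on a single-node etcd

The alias in /etc/profile is already configured on this cluster, so there is no need to specify the endpoints here.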

etcdctl snapshot restore /data/backup/etcd/snapshot.db --data-dir=/var/lib/etcd/default.etcd


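Restore on a multi-node etcd cluster

Notes:


  • Port 2380 is used for peer communication between cluster members, and port 2379 serves the client API. Do not mix them up.
  • The backup procedure is the same as above: the whole cluster is restored from one snapshot file taken on a single etcd node.


Copy the etcd snapshot to the other nodes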


scp /data/backup/etcd/snapshot.db 192.168.253.228:/data/backup/etcd/
scp /data/backup/etcd/snapshot.db 192.168.253.229:/data/backup/etcd/


On test-k8s-master-227

etcdctl snapshot restore /data/backup/etcd/snapshot.db \
--name etcd-1 \
--initial-cluster "etcd-1=https://192.168.253.227:2380,etcd-2=https://192.168.253.228:2380,etcd-3=https://192.168.253.229:2380" \
--initial-cluster-token etcd-cluster \
--initial-advertise-peer-urls https://192.168.253.227:2380 \
--data-dir=/var/lib/etcd/default.etcd


On test-k8s-node-228

etcdctl snapshot restore /data/backup/etcd/snapshot.db \
--name etcd-2 \
--initial-cluster "etcd-1=https://192.168.253.227:2380,etcd-2=https://192.168.253.228:2380,etcd-3=https://192.168.253.229:2380" \
--initial-cluster-token etcd-cluster \
--initial-advertise-peer-urls https://192.168.253.228:2380 \
--data-dir=/var/lib/etcd/default.etcd


On test-k8s-node-229

etcdctl snapshot restore /data/backup/etcd/snapshot.db \
--name etcd-3 \
--initial-cluster "etcd-1=https://192.168.253.227:2380,etcd-2=https://192.168.253.228:2380,etcd-3=https://192.168.253.229:2380" \
--initial-cluster-token etcd-cluster \
--initial-advertise-peer-urls https://192.168.253.229:2380 \
--data-dir=/var/lib/etcd/default.etcd
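
The three restore commands differ only in the member name and its peer URL, which makes copy-and-paste mistakes easy. As a sketch, the commands can be generated from a single member list. The `MEMBERS` format and the function names `initial_cluster` and `restore_command` are assumptions made for this example; the command is only printed, not executed, so it can be reviewed first.

```shell
# name=ip pairs for the three members shown above.
MEMBERS="etcd-1=192.168.253.227 etcd-2=192.168.253.228 etcd-3=192.168.253.229"
SNAPSHOT="${SNAPSHOT:-/data/backup/etcd/snapshot.db}"
DATA_DIR="${DATA_DIR:-/var/lib/etcd/default.etcd}"

# Build the --initial-cluster value: name=https://ip:2380 joined by commas.
initial_cluster() {
  out=""
  for m in $MEMBERS; do
    out="${out:+$out,}${m%%=*}=https://${m#*=}:2380"
  done
  printf '%s\n' "$out"
}

# Print (not run) the restore command for the member named $1.
restore_command() {
  for m in $MEMBERS; do
    [ "${m%%=*}" = "$1" ] || continue
    printf 'etcdctl snapshot restore %s --name %s --initial-cluster "%s" --initial-cluster-token etcd-cluster --initial-advertise-peer-urls https://%s:2380 --data-dir=%s\n' \
      "$SNAPSHOT" "$1" "$(initial_cluster)" "${m#*=}" "$DATA_DIR"
    return 0
  done
  echo "unknown member: $1" >&2
  return 1
}
```

For example, `restore_command etcd-2` prints the command to run on test-k8s-node-228.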


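Parameters:

name  the name of this member
data-dir  the directory where this member stores its data
initial-cluster  all members of the cluster
initial-advertise-peer-urls  the peer URL of this member, advertised to the other members of the cluster
initial-cluster-token  the token used when bootstrapping the cluster; it must be unique per cluster


Once all three etcd nodes have been restored, log in to each machine in turn and start etcd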

systemctl start etcd


After all three etcd nodes have started, check the health of the etcd cluster


ETCDCTL_API=3 etcdctl --cacert=/data/local/etcd/ssl/ca.pem --cert=/data/local/etcd/ssl/server.pem --key=/data/local/etcd/ssl/server-key.pem --endpoints=https://192.168.253.227:2379,https://192.168.253.228:2379,https://192.168.253.229:2379 endpoint health
# List the etcd members
etcdctl member list -w table
+------------------+---------+--------+------------------------------+------------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------+------------------------------+------------------------------+------------+
| c0142b9a19898be | started | etcd-1 | https://192.168.253.227:2380 | https://192.168.253.227:2379 | false |
| 78d50c713efbd616 | started | etcd-2 | https://192.168.253.228:2380 | https://192.168.253.228:2379 | false |
| bcb81ad99e84629a | started | etcd-3 | https://192.168.253.229:2380 | https://192.168.253.229:2379 | false |
+------------------+---------+--------+------------------------------+------------------------------+------------+
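
Right after startup the members may need a few seconds to elect a leader, so a single health check can fail even though the restore worked. The sketch below retries a command a fixed number of times; `retry` and `etcd_healthy` are hypothetical helper names, and it assumes that `etcdctl endpoint health` exits with a non-zero status while any endpoint is unhealthy.

```shell
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up (at least once).
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Health check with the same flags as the command above.
etcd_healthy() {
  ETCDCTL_API=3 /data/local/etcd/bin/etcdctl \
    --cacert=/data/local/etcd/ssl/ca.pem \
    --cert=/data/local/etcd/ssl/server.pem \
    --key=/data/local/etcd/ssl/server-key.pem \
    --endpoints=https://192.168.253.227:2379,https://192.168.253.228:2379,https://192.168.253.229:2379 \
    endpoint health
}
```

A call such as `retry 30 2 etcd_healthy` would wait up to about a minute before kube-apiserver is started.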


Start kube-apiserver on each master in turn

systemctl start kube-apiserver
kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}


Check whether the pod has been restored



kubectl  get po
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 18h


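Notes


  • flannel accesses etcd through the v2 API, while Kubernetes uses the v3 API.
  • If v2 data exists when you take a backup with v3, the restore is not affected.
  • If v3 data exists when you take a backup with v2, the restore fails.
  • Do not rely on a single node for backups. If that node turns out to have no backup when you need to restore, you are in trouble. Take backups on at least two nodes and monitor them.
  • When backing up an etcd cluster, a snapshot from one member is enough; restore every member from that same snapshot.
  • Restore order: stop kube-apiserver --> stop etcd --> restore the data --> start etcd --> start kube-apiserver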
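
The advice to monitor backups can be sketched as a simple freshness check: the newest snapshot must exist, be non-empty, and be recent. The function name `snapshot_is_fresh` and the age threshold are assumptions; the check relies on the `-mmin` option of find, which GNU and BSD find provide but POSIX does not require.

```shell
# Succeed if file $1 exists, is non-empty and was modified
# within the last $2 minutes; fail otherwise.
snapshot_is_fresh() {
  file=$1
  max_age_min=$2
  [ -s "$file" ] || return 1
  [ -n "$(find "$file" -mmin "-$max_age_min" 2>/dev/null)" ]
}
```

Run on every node that holds backups, for example `snapshot_is_fresh /data/backup/etcd/snapshot.db 1440 || echo "etcd backup is stale"`, and wire the non-zero exit status into whatever alerting is already in place.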


