
Deploying ZooKeeper and etcd as Stateful Services

Published: 2023/12/6


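I. Overview

Kubernetes supports stateful applications such as ZooKeeper and etcd through StatefulSet. A StatefulSet has the following properties:

  • A stable, unique network identity for each pod
  • Stable persistent storage, implemented through PV/PVC
  • Ordered pod startup and shutdown, which allows graceful deployment and scaling

This article describes how to deploy ZooKeeper and etcd as stateful services on a k8s cluster, using Ceph for data persistence.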
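The stable network identity is easiest to see in the pod DNS names. The sketch below prints the name each StatefulSet pod gets through its headless service. It assumes the default cluster domain cluster.local, which matches the zoo.cfg entries shown later in this article; the function name statefulset_fqdns is made up for illustration.

```shell
# Print the stable DNS name of each pod in a StatefulSet.
# Usage: statefulset_fqdns <set-name> <headless-service> <namespace> <replicas>
# Assumption: the cluster domain is cluster.local.
statefulset_fqdns() {
  set_name=$1; service=$2; namespace=$3; replicas=$4
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Pods are named <set-name>-<ordinal>; the headless service gives each one a DNS record.
    echo "${set_name}-${i}.${service}.${namespace}.svc.cluster.local"
    i=$((i + 1))
  done
}

statefulset_fqdns zk zk-hs default 3
```

These names stay the same when a pod is rescheduled to another node, even though its IP changes.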

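II. Summary

  • k8s StatefulSet, StorageClass, PV, and PVC combined with Ceph RBD work well for deploying stateful services such as ZooKeeper and etcd on a Kubernetes cluster.
  • k8s does not proactively delete PV and PVC objects that have already been created, which guards against accidental deletion.

If you do decide to delete the PV and PVC objects, you also need to delete the corresponding RBD image on the Ceph side by hand.

  • Pitfalls encountered

The Ceph client user referenced in the StorageClass must have mon rw and rbd rwx permissions. Without mon write permission, releasing the RBD lock fails and the RBD image cannot be mounted on another k8s worker node.

  • ZooKeeper uses probes to check the health of each ZooKeeper node. If a node is unhealthy, k8s restarts it automatically: a failing liveness probe causes the kubelet to restart the container, which restarts the ZooKeeper node.

Because ZooKeeper 3.4 configures cluster membership statically through the zoo.cfg file, all nodes in the ZooKeeper cluster must be restarted after a ZooKeeper node's pod IP changes.

  • The etcd deployment approach needs improvement

This experiment deploys the etcd cluster statically. When etcd members change, commands such as etcdctl member remove/add have to be run by hand to reconfigure the cluster, which severely limits automatic failure recovery and scaling. The deployment should therefore be changed to a dynamic approach based on DNS or etcd discovery, so that etcd runs better on k8s.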
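For the Ceph permission pitfall, the sketch below prints (it does not run) a ceph auth caps command that grants the capabilities described above. Several things here are assumptions: the user name client.k8s, the pool name k8s, and the reading of "rbd rwx" as osd rwx on the RBD pool. Adjust all of them to your own cluster before running the printed command on a Ceph admin node.

```shell
# Print (not run) a ceph command granting the capabilities described above.
# Assumptions: user client.k8s, pool k8s, and "rbd rwx" interpreted as
# osd rwx on the RBD pool.
ceph_caps_command() {
  user=$1; pool=$2
  echo "ceph auth caps ${user} mon 'allow rw' osd 'allow rwx pool=${pool}'"
}

ceph_caps_command client.k8s k8s
```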

III. ZooKeeper cluster deployment

1. Pull the image

docker pull gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10
docker tag gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10 172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10
docker push 172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10

2. Define the Ceph secret

cat << EOF | kubectl create -f -
apiVersion: v1
data:
  key: QVFBYy9ndGFRUno4QlJBQXMxTjR3WnlqN29PK3VrMzI1a05aZ3c9PQo=
kind: Secret
metadata:
  creationTimestamp: 2017-11-20T10:29:05Z
  name: ceph-secret
  namespace: default
  resourceVersion: "2954730"
  selfLink: /api/v1/namespaces/default/secrets/ceph-secret
  uid: a288ff74-cddd-11e7-81cc-000c29f99475
type: kubernetes.io/rbd
EOF

3. Define the RBD StorageClass

cat << EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph
parameters:
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: default
  fsType: ext4
  imageFormat: "2"
  imagefeatures: layering
  monitors: 172.16.13.223
  pool: k8s
  userId: admin
  userSecretName: ceph-secret
provisioner: kubernetes.io/rbd
reclaimPolicy: Delete
EOF

4. Create the ZooKeeper cluster

Use RBD to store the ZooKeeper node data.

cat << EOF | kubectl create -f -
---
apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - zk
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        image: "172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.beta.kubernetes.io/storage-class: ceph
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF
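Both probes in the manifest call zookeeper-ready 2181. That script ships with the image and is not shown in this article. A check of this kind is commonly built on ZooKeeper's ruok four-letter command, to which a running server replies imok. The sketch below is written under that assumption; the function names are hypothetical, and the network call is kept separate so the decision logic can be run without a server.

```shell
# Decide health from the reply to ZooKeeper's "ruok" four-letter command.
zk_reply_is_healthy() {
  [ "$1" = "imok" ]
}

# Hypothetical probe wrapper: ask the local server on the given client port.
# Needs nc inside the container; it is defined here but not called.
zk_ready() {
  reply=$(echo ruok | nc 127.0.0.1 "$1")
  zk_reply_is_healthy "$reply"
}

if zk_reply_is_healthy "imok"; then echo "healthy"; else echo "unhealthy"; fi
```

A probe command succeeds when it exits with status 0, so a wrapper like zk_ready can be used directly as an exec probe.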

Check the result:

[root@172 zookeeper]# kubectl get no
NAME           STATUS    ROLES     AGE       VERSION
172.16.20.10   Ready     <none>    50m       v1.8.2
172.16.20.11   Ready     <none>    2h        v1.8.2
172.16.20.12   Ready     <none>    1h        v1.8.2

[root@172 zookeeper]# kubectl get po -owide
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
zk-0      1/1       Running   0          8m        192.168.5.162   172.16.20.10
zk-1      1/1       Running   0          1h        192.168.2.146   172.16.20.11

[root@172 zookeeper]# kubectl get pv,pvc
NAME                                          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                  STORAGECLASS   REASON    AGE
pv/pvc-226cb8f0-d322-11e7-9581-000c29f99475   1Gi        RWO            Delete           Bound     default/datadir-zk-0   ceph                     1h
pv/pvc-22703ece-d322-11e7-9581-000c29f99475   1Gi        RWO            Delete           Bound     default/datadir-zk-1   ceph                     1h

NAME               STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc/datadir-zk-0   Bound     pvc-226cb8f0-d322-11e7-9581-000c29f99475   1Gi        RWO            ceph           1h
pvc/datadir-zk-1   Bound     pvc-22703ece-d322-11e7-9581-000c29f99475   1Gi        RWO            ceph           1h

The RBD lock information for the zk-0 pod is:

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker       ID                              Address
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350

5. Test pod migration

Cordon node 172.16.20.10 (mark it unschedulable) and delete the zk-0 pod, so that zk-0 is rescheduled onto 172.16.20.12.

kubectl cordon 172.16.20.10

[root@172 zookeeper]# kubectl get no
NAME           STATUS                     ROLES     AGE       VERSION
172.16.20.10   Ready,SchedulingDisabled   <none>    58m       v1.8.2
172.16.20.11   Ready                      <none>    2h        v1.8.2
172.16.20.12   Ready                      <none>    1h        v1.8.2

kubectl delete po zk-0

Watch zk-0 migrate:

[root@172 zookeeper]# kubectl get po -owide -w
NAME      READY     STATUS              RESTARTS   AGE       IP              NODE
zk-0      1/1       Running             0          14m       192.168.5.162   172.16.20.10
zk-1      1/1       Running             0          1h        192.168.2.146   172.16.20.11
zk-0      1/1       Terminating         0          16m       192.168.5.162   172.16.20.10
zk-0      0/1       Terminating         0          16m       <none>          172.16.20.10
zk-0      0/1       Terminating         0          16m       <none>          172.16.20.10
zk-0      0/1       Terminating         0          16m       <none>          172.16.20.10
zk-0      0/1       Terminating         0          16m       <none>          172.16.20.10
zk-0      0/1       Terminating         0          16m       <none>          172.16.20.10
zk-0      0/1       Pending             0          0s        <none>          <none>
zk-0      0/1       Pending             0          0s        <none>          172.16.20.12
zk-0      0/1       ContainerCreating   0          0s        <none>          172.16.20.12
zk-0      0/1       Running             0          3s        192.168.3.4     172.16.20.12

zk-0 has now migrated to 172.16.20.12 as expected.
Check the RBD lock information again (before and after the migration):

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker       ID                              Address
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker       ID                              Address
client.24154 kubelet_lock_magic_172.16.20.12 172.16.20.12:0/3715989358
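The lock ID encodes the node that holds the image, so the holder can be read straight out of the rbd lock list output. The helper below is an illustrative sketch (the name lock_holder_node is made up); it assumes the lock ID has the form kubelet_lock_magic_<node>, as in the output above.

```shell
# Read `rbd lock list` output on stdin and print the node name encoded in
# each kubelet lock ID (kubelet_lock_magic_<node>).
lock_holder_node() {
  awk -v p="kubelet_lock_magic_" '{
    for (i = 1; i <= NF; i++)
      if (index($i, p) == 1) print substr($i, length(p) + 1)
  }'
}

# Sample input copied from the second listing above.
printf '%s\n' \
  'There is 1 exclusive lock on this image.' \
  'Locker       ID                              Address' \
  'client.24154 kubelet_lock_magic_172.16.20.12 172.16.20.12:0/3715989358' \
  | lock_holder_node
```

On a Ceph node the same function can be fed from the real command, for example rbd lock list <image> -p k8s --user admin | lock_holder_node.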

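When this zk pod migration was tested earlier on a different Ceph cluster, it always failed because the lock could not be released. Analysis suggests that the Ceph account in use lacked the required permissions, so releasing the lock failed. The recorded error log is: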

Nov 27 10:45:55 172 kubelet: W1127 10:45:55.551768 11556 rbd_util.go:471] rbd: no watchers on kubernetes-dynamic-pvc-f35a411e-d317-11e7-90ab-000c29f99475
Nov 27 10:45:55 172 kubelet: I1127 10:45:55.694126 11556 rbd_util.go:181] remove orphaned locker kubelet_lock_magic_172.16.20.12 from client client.171490: err exit status 13, output: 2017-11-27 10:45:55.570483 7fbdbe922d40 -1 did not load config file, using default settings.
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600816 7fbdbe922d40 -1 Errors while parsing config file!
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600824 7fbdbe922d40 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600825 7fbdbe922d40 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600825 7fbdbe922d40 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602492 7fbdbe922d40 -1 Errors while parsing config file!
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602494 7fbdbe922d40 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602495 7fbdbe922d40 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602496 7fbdbe922d40 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.651594 7fbdbe922d40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.k8s.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: rbd: releasing lock failed: (13) Permission denied
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.682470 7fbdbe922d40 -1 librbd: unable to blacklist client: (13) Permission denied

The relevant excerpt from the k8s RBD volume implementation:

if lock {
    // check if lock is already held for this host by matching lock_id and rbd lock id
    if strings.Contains(output, lock_id) {
        // this host already holds the lock, exit
        glog.V(1).Infof("rbd: lock already held for %s", lock_id)
        return nil
    }
    // clean up orphaned lock if no watcher on the image
    used, statusErr := util.rbdStatus(&b)
    if statusErr == nil && !used {
        re := regexp.MustCompile("client.* " + kubeLockMagic + ".*")
        locks := re.FindAllStringSubmatch(output, -1)
        for _, v := range locks {
            if len(v) > 0 {
                lockInfo := strings.Split(v[0], " ")
                if len(lockInfo) > 2 {
                    args := []string{"lock", "remove", b.Image, lockInfo[1], lockInfo[0], "--pool", b.Pool, "--id", b.Id, "-m", mon}
                    args = append(args, secret_opt...)
                    cmd, err = b.exec.Run("rbd", args...)
                    // this is where the rbd lock remove command returned the error
                    glog.Infof("remove orphaned locker %s from client %s: err %v, output: %s", lockInfo[1], lockInfo[0], err, string(cmd))
                }
            }
        }
    }
    // hold a lock: rbd lock add
    args := []string{"lock", "add", b.Image, lock_id, "--pool", b.Pool, "--id", b.Id, "-m", mon}
    args = append(args, secret_opt...)
    cmd, err = b.exec.Run("rbd", args...)
}

As the log shows, the rbd lock remove operation was rejected for lack of permission: rbd: releasing lock failed: (13) Permission denied.
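To reproduce what the kubelet attempts, the same command can be assembled by hand from one line of rbd lock list output. The sketch below only prints the command. It mirrors the argument order in the excerpt above (image, lock ID, locker) and leaves out the -m monitor and secret options; the function name rbd_unlock_command is made up for illustration.

```shell
# Print the `rbd lock remove` command for one line of `rbd lock list` output.
# Line format (as in the listings above): <locker> <lock-id> <address>
rbd_unlock_command() {
  image=$1; pool=$2; user=$3; lock_line=$4
  # Split the lock line into fields: $1 = locker, $2 = lock ID.
  set -- $lock_line
  echo "rbd lock remove ${image} $2 $1 --pool ${pool} --id ${user}"
}

rbd_unlock_command \
  kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 k8s admin \
  "client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350"
```

Running the printed command as the same Ceph user is a quick way to confirm whether that user can release locks; with insufficient permissions it should fail with the same Permission denied error.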

6. Test scale-up

Scale the ZooKeeper cluster from 2 nodes to 3.
With 2 nodes, zoo.cfg defines two instances:

zookeeper@zk-0:/opt/zookeeper/conf$ cat zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888

Use kubectl edit statefulset zk to set replicas=3 and start-zookeeper --servers=3,
then watch the pods change:

[root@172 zookeeper]# kubectl get po -owide -w
NAME      READY     STATUS              RESTARTS   AGE       IP              NODE
zk-0      1/1       Running             0          1h        192.168.5.170   172.16.20.10
zk-1      1/1       Running             0          1h        192.168.3.12    172.16.20.12
zk-2      0/1       Pending             0          0s        <none>          <none>
zk-2      0/1       Pending             0          0s        <none>          172.16.20.11
zk-2      0/1       ContainerCreating   0          0s        <none>          172.16.20.11
zk-2      0/1       Running             0          1s        192.168.2.154   172.16.20.11
zk-2      1/1       Running             0          11s       192.168.2.154   172.16.20.11
zk-1      1/1       Terminating         0          1h        192.168.3.12    172.16.20.12
zk-1      0/1       Terminating         0          1h        <none>          172.16.20.12
zk-1      0/1       Terminating         0          1h        <none>          172.16.20.12
zk-1      0/1       Terminating         0          1h        <none>          172.16.20.12
zk-1      0/1       Terminating         0          1h        <none>          172.16.20.12
zk-1      0/1       Pending             0          0s        <none>          <none>
zk-1      0/1       Pending             0          0s        <none>          172.16.20.12
zk-1      0/1       ContainerCreating   0          0s        <none>          172.16.20.12
zk-1      0/1       Running             0          2s        192.168.3.13    172.16.20.12
zk-1      1/1       Running             0          20s       192.168.3.13    172.16.20.12
zk-0      1/1       Terminating         0          1h        192.168.5.170   172.16.20.10
zk-0      0/1       Terminating         0          1h        <none>          172.16.20.10
zk-0      0/1       Terminating         0          1h        <none>          172.16.20.10
zk-0      0/1       Terminating         0          1h        <none>          172.16.20.10
zk-0      0/1       Terminating         0          1h        <none>          172.16.20.10
zk-0      0/1       Pending             0          0s        <none>          <none>
zk-0      0/1       Pending             0          0s        <none>          172.16.20.10
zk-0      0/1       ContainerCreating   0          0s        <none>          172.16.20.10
zk-0      0/1       Running             0          2s        192.168.5.171   172.16.20.10
zk-0      1/1       Running             0          12s       192.168.5.171   172.16.20.10

Both zk-0 and zk-1 were restarted, so they load the new zoo.cfg and the cluster ends up configured correctly.
The new zoo.cfg lists 3 instances:

[root@172 ~]# kubectl exec zk-0 -- cat /opt/zookeeper/conf/zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888
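The server.N lines follow directly from the --servers value, which is why every node has to regenerate its file and restart when the count changes. The image's start-zookeeper script is not shown in this article, so the following is only a sketch of how such lines could be derived; the function name zk_server_lines is hypothetical, and ports 2888 and 3888 are taken from the manifest above.

```shell
# Print the zoo.cfg server entries for a given ensemble size.
# Usage: zk_server_lines <servers> <set-name> <headless-service> <namespace>
zk_server_lines() {
  servers=$1; set_name=$2; service=$3; namespace=$4
  i=1
  while [ "$i" -le "$servers" ]; do
    # server IDs start at 1, pod ordinals start at 0
    echo "server.${i}=${set_name}-$((i - 1)).${service}.${namespace}.svc.cluster.local:2888:3888"
    i=$((i + 1))
  done
}

zk_server_lines 3 zk zk-hs default
```

Because the list is written once at startup, a node started with --servers=2 never learns about zk-2 until it is restarted with the new value.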

7. Test scale-down

During scale-down, all zk nodes in the cluster were again restarted automatically. The scale-down process:

[root@172 ~]# kubectl get po -owide -w
NAME      READY     STATUS              RESTARTS   AGE       IP              NODE
zk-0      1/1       Running             0          5m        192.168.5.171   172.16.20.10
zk-1      1/1       Running             0          6m        192.168.3.13    172.16.20.12
zk-2      1/1       Running             0          7m        192.168.2.154   172.16.20.11
zk-2      1/1       Terminating         0          7m        192.168.2.154   172.16.20.11
zk-1      1/1       Terminating         0          7m        192.168.3.13    172.16.20.12
zk-2      0/1       Terminating         0          8m        <none>          172.16.20.11
zk-1      0/1       Terminating         0          7m        <none>          172.16.20.12
zk-2      0/1       Terminating         0          8m        <none>          172.16.20.11
zk-1      0/1       Terminating         0          7m        <none>          172.16.20.12
zk-1      0/1       Terminating         0          7m        <none>          172.16.20.12
zk-1      0/1       Terminating         0          7m        <none>          172.16.20.12
zk-1      0/1       Pending             0          0s        <none>          <none>
zk-1      0/1       Pending             0          0s        <none>          172.16.20.12
zk-1      0/1       ContainerCreating   0          0s        <none>          172.16.20.12
zk-1      0/1       Running             0          2s        192.168.3.14    172.16.20.12
zk-2      0/1       Terminating         0          8m        <none>          172.16.20.11
zk-2      0/1       Terminating         0          8m        <none>          172.16.20.11
zk-1      1/1       Running             0          19s       192.168.3.14    172.16.20.12
zk-0      1/1       Terminating         0          7m        192.168.5.171   172.16.20.10
zk-0      0/1       Terminating         0          7m        <none>          172.16.20.10
zk-0      0/1       Terminating         0          7m        <none>          172.16.20.10
zk-0      0/1       Terminating         0          7m        <none>          172.16.20.10
zk-0      0/1       Pending             0          0s        <none>          <none>
zk-0      0/1       Pending             0          0s        <none>          172.16.20.10
zk-0      0/1       ContainerCreating   0          0s        <none>          172.16.20.10
zk-0      0/1       Running             0          3s        192.168.5.172   172.16.20.10
zk-0      1/1       Running             0          13s       192.168.5.172   172.16.20.10

IV. etcd cluster deployment

1. Create the etcd cluster

# The heredoc delimiter is quoted ('EOF') so that the local shell leaves the
# $(...), ${...} and backticks in the startup script untouched.
cat << 'EOF' | kubectl create -f -
apiVersion: v1
kind: Service
metadata:
  name: "etcd"
  annotations:
    # Create endpoints also if the related pod isn't ready
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  ports:
  - port: 2379
    name: client
  - port: 2380
    name: peer
  clusterIP: None
  selector:
    component: "etcd"
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: "etcd"
  labels:
    component: "etcd"
spec:
  serviceName: "etcd"
  # changing replicas value will require a manual etcdctl member remove/add
  # command (remove before decreasing and add after increasing)
  replicas: 3
  template:
    metadata:
      name: "etcd"
      labels:
        component: "etcd"
    spec:
      containers:
      - name: "etcd"
        image: "172.16.18.100:5000/quay.io/coreos/etcd:v3.2.3"
        ports:
        - containerPort: 2379
          name: client
        - containerPort: 2380
          name: peer
        env:
        - name: CLUSTER_SIZE
          value: "3"
        - name: SET_NAME
          value: "etcd"
        volumeMounts:
        - name: data
          mountPath: /var/run/etcd
        command:
          - "/bin/sh"
          - "-ecx"
          - |
            IP=$(hostname -i)
            for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
              while true; do
                echo "Waiting for ${SET_NAME}-${i}.${SET_NAME} to come up"
                ping -W 1 -c 1 ${SET_NAME}-${i}.${SET_NAME}.default.svc.cluster.local > /dev/null && break
                sleep 1s
              done
            done
            PEERS=""
            for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
              PEERS="${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SET_NAME}.default.svc.cluster.local:2380"
            done
            # start etcd. If cluster is already initialized the `--initial-*` options will be ignored.
            exec etcd --name ${HOSTNAME} \
              --listen-peer-urls http://${IP}:2380 \
              --listen-client-urls http://${IP}:2379,http://127.0.0.1:2379 \
              --advertise-client-urls http://${HOSTNAME}.${SET_NAME}:2379 \
              --initial-advertise-peer-urls http://${HOSTNAME}.${SET_NAME}:2380 \
              --initial-cluster-token etcd-cluster-1 \
              --initial-cluster ${PEERS} \
              --initial-cluster-state new \
              --data-dir /var/run/etcd/default.etcd
  ## We are using dynamic pv provisioning using the "standard" storage class so
  ## this resource can be directly deployed without changes to minikube (since
  ## minikube defines this class for its minikube hostpath provisioner). In
  ## production define your own way to use pv claims.
  volumeClaimTemplates:
  - metadata:
      name: data
      annotations:
        volume.beta.kubernetes.io/storage-class: ceph
    spec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
EOF
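The PEERS loop in the startup script is the part that makes this a static deployment: the member list is fixed by CLUSTER_SIZE at start time. The standalone sketch below repeats that loop outside the container so the resulting --initial-cluster value can be inspected. The function name build_peers is made up; the default namespace and cluster.local domain are copied from the script.

```shell
# Build the --initial-cluster value the same way the startup script does.
# Usage: build_peers <set-name> <cluster-size>
build_peers() {
  set_name=$1; cluster_size=$2
  peers=""
  i=0
  while [ "$i" -lt "$cluster_size" ]; do
    # ${peers:+,} adds a comma only when peers is already non-empty
    peers="${peers}${peers:+,}${set_name}-${i}=http://${set_name}-${i}.${set_name}.default.svc.cluster.local:2380"
    i=$((i + 1))
  done
  echo "$peers"
}

build_peers etcd 3
```

A member that is not in this list, such as etcd-3 after scaling to four replicas, is never announced to the cluster by the script.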

The pods after creation:

[root@172 etcd]# kubectl get po -owide
NAME      READY     STATUS    RESTARTS   AGE       IP              NODE
etcd-0    1/1       Running   0          15m       192.168.5.174   172.16.20.10
etcd-1    1/1       Running   0          15m       192.168.3.16    172.16.20.12
etcd-2    1/1       Running   0          5s        192.168.5.176   172.16.20.10

2. Test scale-down

kubectl scale statefulset etcd --replicas=2

[root@172 ~]# kubectl get po -owide -w
NAME      READY     STATUS        RESTARTS   AGE       IP              NODE
etcd-0    1/1       Running       0          17m       192.168.5.174   172.16.20.10
etcd-1    1/1       Running       0          17m       192.168.3.16    172.16.20.12
etcd-2    1/1       Running       0          1m        192.168.5.176   172.16.20.10
etcd-2    1/1       Terminating   0          1m        192.168.5.176   172.16.20.10
etcd-2    0/1       Terminating   0          1m        <none>          172.16.20.10

Check cluster health:

kubectl exec etcd-0 -- etcdctl cluster-health
failed to check the health of member 42c8b94265b9b79a on http://etcd-2.etcd:2379: Get http://etcd-2.etcd:2379/health: dial tcp: lookup etcd-2.etcd on 10.96.0.10:53: no such host
member 42c8b94265b9b79a is unreachable: [http://etcd-2.etcd:2379] are all unreachable
member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379
member c799a6ef06bc8c14 is healthy: got healthy result from http://etcd-0.etcd:2379
cluster is healthy

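After scaling down, etcd-2 was not removed from the etcd cluster automatically, which shows that this etcd image does not support automatic scaling well.
Remove etcd-2 by hand: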

[root@172 etcd]# kubectl exec etcd-0 -- etcdctl member remove 42c8b94265b9b79a
Removed member 42c8b94265b9b79a from cluster

[root@172 etcd]# kubectl exec etcd-0 -- etcdctl cluster-health
member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379
member c799a6ef06bc8c14 is healthy: got healthy result from http://etcd-0.etcd:2379
cluster is healthy
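The manual cleanup can be partly scripted by reading the member ID from the cluster-health output. The sketch below prints a removal command for every member reported as unreachable; it does not run anything. It assumes the v2-style output format shown above, and the function name stale_member_commands is made up for illustration.

```shell
# Read `etcdctl cluster-health` output on stdin and print a removal command
# for every member reported as unreachable.
stale_member_commands() {
  awk '$1 == "member" && $3 == "is" && $4 == "unreachable:" {
    print "etcdctl member remove " $2
  }'
}

# Sample input copied from the health check above.
printf '%s\n' \
  'member 42c8b94265b9b79a is unreachable: [http://etcd-2.etcd:2379] are all unreachable' \
  'member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379' \
  'cluster is healthy' \
  | stale_member_commands
```

Review the printed commands before running them: a member can also be unreachable because of a temporary network problem, and removing it in that case shrinks the cluster unnecessarily.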

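3. Test scale-up

The startup script in etcd.yaml shows that a newly started etcd pod is always launched with --initial-cluster-state new, so this etcd image does not support dynamic scale-up. One option is to change the startup script to deploy the etcd cluster dynamically based on DNS, which is what it takes to support dynamic scaling of the etcd cluster.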
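One possible direction for the startup script is to treat pods beyond the bootstrap size as joiners: register them with etcdctl member add first, then start them with --initial-cluster-state existing. The sketch below shows only the decision and the command to print; it is a simplified illustration, not the DNS discovery method itself. Both function names are hypothetical, and the member add syntax shown is the v2-style form matching the etcdctl commands used in this article.

```shell
# Hypothetical helper: choose --initial-cluster-state from the pod ordinal.
# initial_size is the number of members the cluster was bootstrapped with.
initial_cluster_state() {
  ordinal=$1; initial_size=$2
  if [ "$ordinal" -lt "$initial_size" ]; then
    echo new
  else
    echo existing
  fi
}

# Print the command that registers a joining member before it starts.
# The peer URL follows the --initial-advertise-peer-urls format used above.
member_add_command() {
  set_name=$1; ordinal=$2
  echo "etcdctl member add ${set_name}-${ordinal} http://${set_name}-${ordinal}.${set_name}:2380"
}

initial_cluster_state 3 3
member_add_command etcd 3
```

A joining member also needs the current member list in --initial-cluster, so the PEERS loop would have to cover the new cluster size rather than the bootstrap size.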
