Migrating a Rancher Management Platform Deployed with RKE
The company's data center recently added SSD storage, so I applied for it right away; I could no longer stand etcd timing out and the web page refusing to open. After switching to SSD, the platform instantly feels silky smooth. This post records the migration process.

The prerequisite for the migration is that the old and new machines can reach each other over the network. The Kubernetes version, 1.16.3, is fairly old; I will upgrade it later.
I. Rancher management platform
Old platform

| IP | CPU | Memory (GB) | Disk |
| --- | --- | --- | --- |
| 172.26.179.146 (master) | 4 | 8 | 40G (Ceph disk) |
| 172.26.179.147 | 4 | 8 | 40G (Ceph disk) |
| 172.26.179.148 | 4 | 8 | 40G (Ceph disk) |
New platform

| IP | CPU | Memory (GB) | Disk |
| --- | --- | --- | --- |
| 172.25.149.111 (master) | 4 | 8 | 40G (SSD disk) |
| 172.25.149.112 | 4 | 8 | 40G (SSD disk) |
| 172.25.149.113 | 4 | 8 | 40G (SSD disk) |
II. Prepare the new environment (run on all three new machines at the same time)

1. Performance tuning
echo " net.ipv4.tcp_tw_reuse = 1 net.ipv4.tcp_tw_recycle = 0 net.bridge.bridge-nf-call-ip6tables=1 net.bridge.bridge-nf-call-iptables=1 net.ipv4.ip_forward=1 net.ipv4.conf.all.forwarding=1 net.ipv4.neigh.default.gc_thresh1=4096 net.ipv4.neigh.default.gc_thresh2=6144 net.ipv4.neigh.default.gc_thresh3=8192 net.ipv4.neigh.default.gc_interval=60 net.ipv4.neigh.default.gc_stale_time=120# 參考 https://github.com/prometheus/node_exporter#disabled-by-default kernel.perf_event_paranoid=-1#sysctls for k8s node config net.ipv4.tcp_slow_start_after_idle=0 net.core.rmem_max=16777216 fs.inotify.max_user_watches=524288 kernel.softlockup_all_cpu_backtrace=1 kernel.softlockup_panic=1 fs.file-max=2097152 fs.inotify.max_user_instances=8192 fs.inotify.max_queued_events=16384 vm.max_map_count=262144 fs.may_detach_mounts=1 net.core.netdev_max_backlog=16384 net.ipv4.tcp_wmem=4096 12582912 16777216 net.core.wmem_max=16777216 net.core.somaxconn=32768 net.ipv4.ip_forward=1 net.ipv4.tcp_max_syn_backlog=8096 net.ipv4.tcp_rmem=4096 12582912 16777216net.ipv6.conf.all.disable_ipv6=1 net.ipv6.conf.default.disable_ipv6=1 net.ipv6.conf.lo.disable_ipv6=1kernel.yama.ptrace_scope=0 vm.swappiness=0# 可以控制core文件的文件名中是否添加pid作為擴展。 kernel.core_uses_pid=1# Do not accept source routing net.ipv4.conf.default.accept_source_route=0 net.ipv4.conf.all.accept_source_route=0# Promote secondary addresses when the primary address is removed net.ipv4.conf.default.promote_secondaries=1 net.ipv4.conf.all.promote_secondaries=1# Enable hard and soft link protection fs.protected_hardlinks=1 fs.protected_symlinks=1# 源路由驗證 # see details in https://help.aliyun.com/knowledge_detail/39428.html net.ipv4.conf.all.rp_filter=0 net.ipv4.conf.default.rp_filter=0 net.ipv4.conf.default.arp_announce = 2 net.ipv4.conf.lo.arp_announce=2 net.ipv4.conf.all.arp_announce=2# see details in https://help.aliyun.com/knowledge_detail/41334.html net.ipv4.tcp_max_tw_buckets=5000 net.ipv4.tcp_syncookies=1 net.ipv4.tcp_fin_timeout=30 
net.ipv4.tcp_synack_retries=2 kernel.sysrq=1" >> /etc/sysctl.conf sysctl -p cat >> /etc/security/limits.conf <<EOF * soft nofile 65535 * hard nofile 65536 EOF2.安裝docker
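After `sysctl -p`, it is worth spot-checking that the kernel actually picked up a few of the values; this is a minimal sketch, and the three keys chosen here are arbitrary examples:

```shell
# Spot-check a few of the applied values straight from /proc/sys
# (equivalent to `sysctl -n <key>`); they should match sysctl.conf.
for key in net/ipv4/ip_forward net/core/somaxconn vm/max_map_count; do
    [ -r "/proc/sys/$key" ] && echo "$key = $(cat /proc/sys/$key)"
done
```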
```shell
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install -y docker-ce-20.10.7-3.el7.x86_64 docker-ce-cli-20.10.7-3.el7.x86_64
```

Start Docker:

```shell
systemctl start docker
systemctl enable docker
systemctl status docker
```

Tune the Docker configuration:

```shell
cat > /etc/docker/daemon.json <<EOF
{
  "oom-score-adjust": -1000,
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "500m",
    "max-file": "3"
  },
  "registry-mirrors": ["https://7bezldxe.mirror.aliyuncs.com"]
}
EOF
```

Restart Docker:

```shell
systemctl restart docker
```
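A malformed daemon.json is exactly what makes dockerd refuse to start, so a quick syntax check before the restart can save a debugging round-trip. A minimal sketch using the stdlib JSON parser:

```shell
# Validate daemon.json before (re)starting Docker; a JSON parse
# error here would prevent dockerd from coming back up.
python3 -m json.tool /etc/docker/daemon.json > /dev/null \
  && echo "daemon.json: valid JSON" \
  || echo "daemon.json: INVALID or missing"
```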
3. Timezone tuning

Synchronize the time:

```shell
timedatectl status
timedatectl set-timezone Asia/Shanghai
timedatectl set-ntp yes
date
```

4. Create the centos user
```shell
# create the centos user
useradd centos
# set its password
echo "YourNewPassword" | passwd --stdin centos
# add the centos user to the docker group
usermod -aG docker centos
```

5. Configure the private image registry
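Group membership is only picked up on a fresh login, so it is worth confirming the centos user can actually reach the Docker daemon (`su -` starts a new login session, so the docker group applies; this check is my addition, not from the original write-up):

```shell
# id -nG lists the user's groups; docker ps needs daemon socket access.
su - centos -c 'id -nG; docker ps' \
  || echo "check failed: was the docker group applied?"
```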
Configure this according to your own environment.
6. Configure hosts and passwordless SSH login
echo "172.25.149.122 k8s.example.com" >> /etc/hosts # 從172.26.179.146免密登錄3臺新機 在172.26.179.146上執行 ssh-copy-id centos@172.26.149.111 ssh-copy-id centos@172.26.149.112 ssh-copy-id centos@172.26.149.113根據提示輸入密碼即可完成免密配置 # 從172.26.149.111免密登錄6臺新機 在172.26.149.111上執行 ssh-keygen -t rsa #全部回車會生成密鑰 ssh-copy-id centos@172.26.179.146 ssh-copy-id centos@172.26.179.147 ssh-copy-id centos@172.26.179.148 ssh-copy-id centos@172.26.149.111 ssh-copy-id centos@172.26.149.112 ssh-copy-id centos@172.26.149.113根據提示輸入密碼即可完成免密配置三、在線熱遷移(172.26.179.146上執行)
1. Edit cluster.yml

Change the original configuration file to:
```yaml
nodes:
- address: 172.26.179.146
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.26.179.147
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.26.179.148
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.25.149.111
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.25.149.112
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.25.149.113
  user: centos
  role: [controlplane,worker,etcd]
services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h
private_registries:
- url: 10.15.128.38
  user: admin
  password: Abc123@#!ddd
  is_default: true
```

```shell
rke up --config ./cluster.yml
```

How long this takes depends on the network. On success you will see output like:

```
INFO[0339] [sync] Successfully synced nodes Labels and Taints
INFO[0339] [network] Setting up network plugin: canal
INFO[0339] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0339] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0339] [addons] Executing deploy job rke-network-plugin
INFO[0339] [addons] Setting up coredns
INFO[0339] [addons] Saving ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[0339] [addons] Successfully saved ConfigMap for addon rke-coredns-addon to Kubernetes
INFO[0339] [addons] Executing deploy job rke-coredns-addon
INFO[0339] [addons] CoreDNS deployed successfully..
INFO[0339] [dns] DNS provider coredns deployed successfully
INFO[0339] [addons] Setting up Metrics Server
INFO[0339] [addons] Saving ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[0339] [addons] Successfully saved ConfigMap for addon rke-metrics-addon to Kubernetes
INFO[0339] [addons] Executing deploy job rke-metrics-addon
INFO[0339] [addons] Metrics Server deployed successfully
INFO[0339] [ingress] Setting up nginx ingress controller
INFO[0339] [addons] Saving ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0339] [addons] Successfully saved ConfigMap for addon rke-ingress-controller to Kubernetes
INFO[0339] [addons] Executing deploy job rke-ingress-controller
INFO[0339] [ingress] ingress controller nginx deployed successfully
INFO[0339] [addons] Setting up user addons
INFO[0339] [addons] no user addons defined
INFO[0339] Finished building Kubernetes cluster successfully
```

Two files are produced in the same directory, kube_config_cluster.yml and cluster.rkestate; keep them safe.
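At this point the cluster should contain all six nodes. A quick check against the freshly written kubeconfig confirms that the three new machines joined and are Ready (a sanity check I've added, not part of the original write-up):

```shell
# All six nodes (old and new) should report STATUS Ready.
kubectl --kubeconfig ./kube_config_cluster.yml get nodes -o wide
```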
Copy those two files, along with cluster.yml and the rke and kubectl binaries, to /home/rancher on 172.25.149.111.

All of the following commands are run on 172.25.149.111:

```shell
sudo chmod +x rke kubectl
sudo mv rke kubectl /bin/
```

Edit cluster.yml (run on 172.25.149.111)
The modified content is as follows:
```yaml
nodes:
- address: 172.25.149.111
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.25.149.112
  user: centos
  role: [controlplane,worker,etcd]
- address: 172.25.149.113
  user: centos
  role: [controlplane,worker,etcd]
services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24h
private_registries:
- url: 10.15.128.38
  user: admin
  password: Abc123@#!ddd
  is_default: true
```

Run the upgrade; this removes the old platform's machines from the cluster.
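Shrinking etcd from six members down to three is the riskiest step of the migration, so before running rke up it is sensible to take a one-off etcd snapshot with RKE's `rke etcd snapshot-save` subcommand (the snapshot name here is just an example):

```shell
# Save a named etcd snapshot before removing the old nodes; if the
# shrink goes wrong it can be restored with `rke etcd snapshot-restore`.
rke etcd snapshot-save --config ./cluster.yml --name before-old-node-removal
```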
```shell
rke up --config ./cluster.yml
```

The run time again depends on the network; on success the output ends the same way as the previous run:

```
INFO[0339] [addons] Setting up user addons
INFO[0339] [addons] no user addons defined
INFO[0339] Finished building Kubernetes cluster successfully
```

Configure the environment
Append the kube_config_cluster.yml path to the end of the profile file and save it (the location can actually be customized):

```shell
# append to the profile, then enable kubectl completion
export KUBECONFIG=/home/rancher/kube_config_cluster.yml

echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc
su - centos
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc
```

Test the cluster
Test your connection with kubectl and check whether all of your nodes are in the Ready state:

```shell
kubectl get node
kubectl get pods --all-namespaces
```

IV. Switching nginx
On the nginx server, switch over nginx.conf; save a copy of nginx.conf before modifying the configuration.
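The post does not show the nginx configuration itself; presumably the switch amounts to re-pointing the upstream from the old node IPs to the new ones. A hedged sketch of what the relevant block might look like (the upstream name and port are assumptions, not taken from the original):

```nginx
# Hypothetical upstream for the Rancher web UI; swap the old
# 172.26.179.x backends for the new SSD-backed nodes.
upstream rancher_nodes {
    server 172.25.149.111:443;   # was 172.26.179.146
    server 172.25.149.112:443;   # was 172.26.179.147
    server 172.25.149.113:443;   # was 172.26.179.148
}
```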
After the switch we observed the platform for a week. Thanks to the SSDs, performance is now much better than before: the etcd timeouts no longer appear, and the problem of the management web page failing to open because of etcd timeouts is solved. Instantly silky smooth!