
Monitoring ZooKeeper Service Health with Zabbix

Published: 2025/5/22

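1 Use Case

Our company currently has few workloads that use ZooKeeper as a coordination service. However, we are going to deploy Codis as our Redis clustering solution, and Codis depends on ZooKeeper to store its configuration. Monitoring ZooKeeper properly therefore matters.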


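2 Key Points for Monitoring ZooKeeper

System monitoring

Memory usage    ZooKeeper should run entirely in memory and must never touch swap. The Java heap size must not exceed the available memory.

Swap usage    Swapping degrades ZooKeeper performance; set vm.swappiness = 0.

Network bandwidth    If ZooKeeper performance drops, check network bandwidth usage and packet loss. A typical ZooKeeper workload is about 20% writes and 80% reads.

Disk usage    Keep an eye on how full the ZooKeeper data directory is.

Disk I/O    ZooKeeper writes to disk asynchronously, so it should not generate heavy I/O. If ZooKeeper shares a host with other I/O-intensive services, watch disk I/O.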


ZooKeeper monitoring

zk_avg/min/max_latency    Time taken to respond to a client request. Alerting when it exceeds 10 ticks is recommended.

zk_outstanding_requests    Number of queued requests. The value grows when ZooKeeper is pushed beyond its processing capacity. A suggested alert threshold is 10.

zk_packets_received    Number of packets received from clients

zk_packets_sent    Number of packets sent to clients, mainly responses and notifications

zk_max_file_descriptor_count    Maximum number of open files allowed, controlled by ulimit

zk_open_file_descriptor_count    Number of open files. Alert when it exceeds 85% of the allowed maximum.

Mode    Role of the server: standalone if it has not joined an ensemble, follower or leader if it has

zk_followers    Reported only by the leader: the number of followers in the ensemble. The normal value is the number of ensemble members minus 1.

zk_pending_syncs    Reported only by the leader: the number of pending syncs

zk_znode_count    Number of znodes

zk_watch_count    Number of watches

Java Heap Size    Heap size of the ZooKeeper Java process


# echo ruok|nc 127.0.0.1 2181
imok

# echo mntr|nc 127.0.0.1 2181
zk_version	3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency	0
zk_max_latency	0
zk_min_latency	0
zk_packets_received	11
zk_packets_sent	10
zk_num_alive_connections	1
zk_outstanding_requests	0
zk_server_state	leader
zk_znode_count	17159
zk_watch_count	0
zk_ephemerals_count	1
zk_approximate_data_size	6666471
zk_open_file_descriptor_count	29
zk_max_file_descriptor_count	102400
zk_followers	2
zk_synced_followers	2
zk_pending_syncs	0

# echo srvr|nc 127.0.0.1 2181
Zookeeper version: 3.4.6-1569965, built on 02/20/2014 09:09 GMT
Latency min/avg/max: 0/0/0
Received: 26
Sent: 25
Connections: 1
Outstanding: 0
Zxid: 0x500000000
Mode: leader
Node count: 17159
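The alert thresholds suggested in the list above can be sketched as a small check over one server's mntr values. This is only an illustration in Python 3: the function name check_thresholds, the ensemble_size argument, and the tick_ms argument (an assumed tickTime of 2000 ms) are hypothetical and not part of ZooKeeper or Zabbix. In practice these conditions would normally live in Zabbix triggers.

```python
def check_thresholds(stats, ensemble_size, tick_ms=2000):
    """Return a list of warnings for one server's mntr stats (a dict)."""
    warnings = []

    # Latency is reported in milliseconds; tick_ms is an assumed tickTime.
    if stats.get("zk_avg_latency", 0) > 10 * tick_ms:
        warnings.append("average latency above 10 ticks")

    # Queued requests pile up when the server is overloaded.
    if stats.get("zk_outstanding_requests", 0) > 10:
        warnings.append("more than 10 outstanding requests")

    # Open file descriptors compared with the ulimit-controlled maximum.
    max_fd = stats.get("zk_max_file_descriptor_count", 0)
    open_fd = stats.get("zk_open_file_descriptor_count", 0)
    if max_fd and open_fd > 0.85 * max_fd:
        warnings.append("open file descriptors above 85% of the limit")

    # Only the leader reports zk_followers.
    if stats.get("zk_server_state") == "leader":
        if stats.get("zk_followers", 0) != ensemble_size - 1:
            warnings.append("leader does not see all followers")

    return warnings


# Values taken from the mntr output of the leader shown above.
leader = {
    "zk_avg_latency": 0,
    "zk_outstanding_requests": 0,
    "zk_open_file_descriptor_count": 29,
    "zk_max_file_descriptor_count": 102400,
    "zk_server_state": "leader",
    "zk_followers": 2,
}
print(check_thresholds(leader, ensemble_size=3))
```

For the healthy three-node ensemble shown above, the list of warnings is empty.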


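3 Writing the Script and Configuration File for Monitoring ZooKeeper with Zabbix

There are two ways to get this monitoring data into Zabbix. One is to have the zabbix agent fetch each item individually, which works with both active and passive checks. The other is to send all of the data to Zabbix in one pass with zabbix_sender. We choose the second approach here. A script that sends everything at once with zabbix_sender cannot be written the way an agent script is, where each item is fetched one at a time.

First collect the monitored items into a dictionary, then iterate over it and send each key:value pair with the -k and -o options of zabbix_sender.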


echo mntr|nc 127.0.0.1 2181

This command can be run from Python through the subprocess module, or the script can use the socket module to connect to port 2181 and send the command itself. Either way, the mntr output then has to be converted into a dictionary.
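The socket approach described above might look like the following in Python 3. This is a minimal sketch, not the script used later in this article: the function names exchange and four_letter_word are made up for illustration, and it assumes the server closes the connection once it has sent its reply, so the reply is read until end of stream rather than with a single fixed-size recv.

```python
import socket


def exchange(sock, cmd):
    """Send a four-letter command on a connected socket and read the full reply."""
    sock.sendall(cmd.encode("ascii"))
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # empty read: the peer has finished sending
            break
        chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def four_letter_word(cmd, host="127.0.0.1", port=2181, timeout=1.0):
    """Connect to a ZooKeeper server and run a command such as 'mntr' or 'ruok'."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, cmd)
```

Calling four_letter_word("ruok") against a healthy local server should return the same text as echo ruok|nc 127.0.0.1 2181.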

In other words, data in this form

zk_version	3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency	0
zk_max_latency	0
zk_min_latency	0
zk_packets_received	91
zk_packets_sent	90
zk_num_alive_connections	1
zk_outstanding_requests	0
zk_server_state	follower
zk_znode_count	17159
zk_watch_count	0
zk_ephemerals_count	1
zk_approximate_data_size	6666471
zk_open_file_descriptor_count	27
zk_max_file_descriptor_count	102400


has to be converted into data like this

{'zk_followers': 2, 'zk_outstanding_requests': 0, 'zk_approximate_data_size': 6666471, 'zk_packets_sent': 2089, 'zk_pending_syncs': 0, 'zk_avg_latency': 0, 'zk_version': '3.4.6-1569965, built on 02/20/2014 09:09 GMT', 'zk_watch_count': 2, 'zk_packets_received': 2090, 'zk_open_file_descriptor_count': 30, 'zk_server_ruok': 'imok', 'zk_server_state': 'leader', 'zk_synced_followers': 2, 'zk_max_latency': 28, 'zk_num_alive_connections': 2, 'zk_min_latency': 0, 'zk_ephemerals_count': 1, 'zk_znode_count': 17159, 'zk_max_file_descriptor_count': 102400}
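The conversion from tab-separated mntr text to a dictionary can be sketched in Python 3 as follows. The function name parse_mntr is an illustrative choice. Numeric values become integers and everything else, such as zk_version and zk_server_state, stays a string. The zk_server_ruok entry in the dictionary above comes from the separate ruok command and is not produced by this function.

```python
def parse_mntr(text):
    """Turn the tab-separated output of 'mntr' into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            continue  # skip blank or malformed lines
        try:
            stats[key] = int(value)
        except ValueError:
            stats[key] = value  # non-numeric values stay as strings
    return stats


sample = (
    "zk_version\t3.4.6-1569965, built on 02/20/2014 09:09 GMT\n"
    "zk_avg_latency\t0\n"
    "zk_server_state\tfollower\n"
    "zk_znode_count\t17159\n"
)
print(parse_mntr(sample))
```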



In the end, the data sent with zabbix_sender needs to look like this.

zookeeper.status[zk_version] is the name of the key.

zookeeper.status[zk_outstanding_requests]:0
zookeeper.status[zk_approximate_data_size]:6666471
zookeeper.status[zk_packets_sent]:48
zookeeper.status[zk_avg_latency]:0
zookeeper.status[zk_version]:3.4.6-1569965, built on 02/20/2014 09:09 GMT
zookeeper.status[zk_watch_count]:0
zookeeper.status[zk_packets_received]:49
zookeeper.status[zk_open_file_descriptor_count]:27
zookeeper.status[zk_server_ruok]:imok
zookeeper.status[zk_server_state]:follower
zookeeper.status[zk_max_latency]:0
zookeeper.status[zk_num_alive_connections]:1
zookeeper.status[zk_min_latency]:0
zookeeper.status[zk_ephemerals_count]:1
zookeeper.status[zk_znode_count]:17159
zookeeper.status[zk_max_file_descriptor_count]:102400
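Turning each dictionary entry into a zabbix_sender invocation with -k and -o can be sketched like this in Python 3. The function name sender_commands is hypothetical. The default paths are the ones used by the full script in this article and will differ on other installations. The function only builds the argument lists, so it is safe to run without Zabbix installed.

```python
def sender_commands(stats,
                    sender="/opt/app/zabbix/sbin/zabbix_sender",
                    conf="/opt/app/zabbix/conf/zabbix_agentd.conf"):
    """Build one zabbix_sender argument list per metric in stats."""
    commands = []
    for name, value in sorted(stats.items()):
        key = "zookeeper.status[%s]" % name  # item key expected by the template
        commands.append([sender, "-c", conf, "-k", key, "-o", str(value)])
    return commands


for argv in sender_commands({"zk_server_state": "follower", "zk_znode_count": 17159}):
    print(" ".join(argv))
```

Each list can be passed to subprocess.call as is. Because no shell is involved, values that contain spaces, such as zk_version, need no extra quoting.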



The condensed code is as follows:

#!/usr/bin/python
import socket
#from StringIO import StringIO
from cStringIO import StringIO

s=socket.socket()
s.connect(('localhost',2181))
s.send('mntr')
data_mntr=s.recv(2048)
s.close()
#print data_mntr

h=StringIO(data_mntr)
result={}
zresult={}
for line in  h.readlines():
    key,value=map(str.strip,line.split('\t'))
    zkey='zookeeper.status' + '[' + key + ']'
    zvalue=value
    result[key]=value
    zresult[zkey]=zvalue

print result
print '\n\n'
print zresult

# python test.py
{'zk_outstanding_requests': '0', 'zk_approximate_data_size': '6666471', 'zk_max_latency': '0', 'zk_avg_latency': '0', 'zk_version': '3.4.6-1569965, built on 02/20/2014 09:09 GMT', 'zk_watch_count': '0', 'zk_num_alive_connections': '1', 'zk_open_file_descriptor_count': '27', 'zk_server_state': 'follower', 'zk_packets_sent': '542', 'zk_packets_received': '543', 'zk_min_latency': '0', 'zk_ephemerals_count': '1', 'zk_znode_count': '17159', 'zk_max_file_descriptor_count': '102400'}

{'zookeeper.status[zk_watch_count]': '0', 'zookeeper.status[zk_avg_latency]': '0', 'zookeeper.status[zk_max_latency]': '0', 'zookeeper.status[zk_approximate_data_size]': '6666471', 'zookeeper.status[zk_server_state]': 'follower', 'zookeeper.status[zk_num_alive_connections]': '1', 'zookeeper.status[zk_min_latency]': '0', 'zookeeper.status[zk_outstanding_requests]': '0', 'zookeeper.status[zk_packets_received]': '543', 'zookeeper.status[zk_ephemerals_count]': '1', 'zookeeper.status[zk_znode_count]': '17159', 'zookeeper.status[zk_packets_sent]': '542', 'zookeeper.status[zk_open_file_descriptor_count]': '27', 'zookeeper.status[zk_max_file_descriptor_count]': '102400', 'zookeeper.status[zk_version]': '3.4.6-1569965, built on 02/20/2014 09:09 GMT'}



The full code is as follows:

#!/usr/bin/python
""" Check Zookeeper Cluster

zookeeper version should be newer than 3.4.x

# echo mntr|nc 127.0.0.1 2181
zk_version	3.4.6-1569965, built on 02/20/2014 09:09 GMT
zk_avg_latency	0
zk_max_latency	4
zk_min_latency	0
zk_packets_received	84467
zk_packets_sent	84466
zk_num_alive_connections	3
zk_outstanding_requests	0
zk_server_state	follower
zk_znode_count	17159
zk_watch_count	2
zk_ephemerals_count	1
zk_approximate_data_size	6666471
zk_open_file_descriptor_count	29
zk_max_file_descriptor_count	102400

# echo ruok|nc 127.0.0.1 2181
imok
"""

# Note: this script uses Python 2 syntax (print statements, has_key, StringIO module).
import sys
import socket
import re
import subprocess
from StringIO import StringIO
import os

zabbix_sender = '/opt/app/zabbix/sbin/zabbix_sender'
zabbix_conf = '/opt/app/zabbix/conf/zabbix_agentd.conf'
send_to_zabbix = 1   # set to 0 to print the commands instead of running them

############# get zookeeper server status
class ZooKeeperServer(object):

    def __init__(self, host='localhost', port='2181', timeout=1):
        self._address = (host, int(port))
        self._timeout = timeout
        self._result  = {}

    def _create_socket(self):
        return socket.socket()

    def _send_cmd(self, cmd):
        """ Send a 4letter word command to the server """
        s = self._create_socket()
        s.settimeout(self._timeout)
        s.connect(self._address)
        s.send(cmd)
        data = s.recv(2048)
        s.close()
        return data

    def get_stats(self):
        """ Get ZooKeeper server stats as a map """
        data_mntr = self._send_cmd('mntr')
        data_ruok = self._send_cmd('ruok')
        # Fall back to empty dicts so an empty reply does not raise a NameError below
        result_mntr = self._parse(data_mntr) if data_mntr else {}
        result_ruok = self._parse_ruok(data_ruok) if data_ruok else {}

        self._result = dict(result_mntr.items() + result_ruok.items())

        if not self._result.has_key('zk_followers') and not self._result.has_key('zk_synced_followers') and not self._result.has_key('zk_pending_syncs'):
            ##### these three metrics are only exposed by the leader, so on followers we just set them to 0
            leader_only = {'zk_followers':0,'zk_synced_followers':0,'zk_pending_syncs':0}
            self._result = dict(result_mntr.items() + result_ruok.items() + leader_only.items() )

        return self._result

    def _parse(self, data):
        """ Parse the output from the 'mntr' 4letter word command """
        h = StringIO(data)
        result = {}
        for line in h.readlines():
            try:
                key, value = self._parse_line(line)
                result[key] = value
            except ValueError:
                pass # ignore broken lines
        return result

    def _parse_ruok(self, data):
        """ Parse the output from the 'ruok' 4letter word command """
        h = StringIO(data)
        result = {}
        ruok = h.readline()
        if ruok:
            result['zk_server_ruok'] = ruok
        return result

    def _parse_line(self, line):
        try:
            key, value = map(str.strip, line.split('\t'))
        except ValueError:
            raise ValueError('Found invalid line: %s' % line)

        if not key:
            raise ValueError('The key is mandatory and should not be empty')

        # Convert numeric values to int; leave everything else as a string
        try:
            value = int(value)
        except (TypeError, ValueError):
            pass

        return key, value

    def get_pid(self):
        #  ps -ef|grep java|grep zookeeper|awk '{print $2}'
        pidarg = '''ps -ef|grep java|grep zookeeper|grep -v grep|awk '{print $2}' '''
        pidout = subprocess.Popen(pidarg,shell=True,stdout=subprocess.PIPE)
        pid = pidout.stdout.readline().strip('\n')
        return pid

    def send_to_zabbix(self, metric):
        key = "zookeeper.status[" +  metric + "]"
        if send_to_zabbix > 0:
            #print key + ":" + str(self._result[metric])
            try:
                subprocess.call([zabbix_sender, "-c", zabbix_conf, "-k", key, "-o", str(self._result[metric]) ], stdout=FNULL, stderr=FNULL, shell=False)
            except OSError, detail:
                print "Something went wrong while executing zabbix_sender : ", detail
        else:
            print "Simulation: the following command would be executed :\n", zabbix_sender, "-c", zabbix_conf, "-k", key, "-o", self._result[metric], "\n"


def usage():
    """Display program usage"""
    print "\nUsage : ", sys.argv[0], " alive|all"
    print "Modes : \n\talive : Return pid of running zookeeper\n\tall : Send zookeeper stats as well"
    sys.exit(1)


accepted_modes = ['alive', 'all']

if len(sys.argv) == 2 and sys.argv[1] in accepted_modes:
    mode = sys.argv[1]
else:
    usage()

zk = ZooKeeperServer()
#  print zk.get_stats()
pid = zk.get_pid()

if pid != "" and  mode == 'all':
    zk.get_stats()
    # print zk._result
    FNULL = open(os.devnull, 'w')
    for key in zk._result:
        zk.send_to_zabbix(key)
    FNULL.close()
    print pid

elif pid != "" and mode == "alive":
    print pid
else:
    print 0
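The script starts one zabbix_sender process per metric. A possible alternative, sketched here in Python 3, is to write every value into one input file and pass it to zabbix_sender with its -i option. The sketch rests on two assumptions about the input-file format: each line holds "<hostname> <key> <value>" with "-" as the hostname meaning the one from the agent configuration file, and values that contain spaces are wrapped in double quotes. Check both against the zabbix_sender manual for the installed version. The function name sender_input_lines is made up for illustration.

```python
def sender_input_lines(stats, host="-"):
    """Format stats as lines for a zabbix_sender input file (assumed format)."""
    lines = []
    for name, value in sorted(stats.items()):
        text = str(value)
        # Quote empty values and values containing whitespace or double quotes.
        if text == "" or '"' in text or any(ch.isspace() for ch in text):
            text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
        lines.append("%s zookeeper.status[%s] %s" % (host, name, text))
    return lines


stats = {
    "zk_avg_latency": 0,
    "zk_version": "3.4.6-1569965, built on 02/20/2014 09:09 GMT",
}
print("\n".join(sender_input_lines(stats)))
```

The resulting lines can be saved to a temporary file and sent with a single zabbix_sender -c <config> -i <file> call.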




Zabbix configuration file check_zookeeper.conf

UserParameter=zookeeper.status[*],/usr/bin/python /opt/app/zabbix/sbin/check_zookeeper.py $1


Restart the zabbix agent service.








4 Creating a Zabbix Template for ZooKeeper and Setting Alert Thresholds

See the attachment for the template.













References:

https://blog.serverdensity.com/how-to-monitor-zookeeper/

https://github.com/apache/zookeeper/tree/trunk/src/contrib/monitoring

http://john88wang.blog.51cto.com/2165294/1708302







Reposted from: https://blog.51cto.com/john88wang/1745339
