hadoop loadBalance source code analysis
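Our project's HBase database showed a very odd assignment: the src and dest of a region move were the same regionserver, only with different timestamps. Only one regionserver had been started, so it is unclear where the second timestamp came from.
Let's go through the source code to track it down.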
loadbalance has only one implementation, org.apache.hadoop.hbase.master.DefaultLoadBalancer. HMaster starts a thread, an org.apache.hadoop.hbase.Chore, that runs every hbase.balancer.period (default 300000 ms, i.e. five minutes). It iterates over all tables and balances each one by the number of its regions on each regionserver. A balance factor, hbase.regions.slop (default 0.2), sets the tolerance: the average region count per server is computed from the total number of regions, avg×0.8 rounded to an integer is the minimum, and avg×1.2 rounded to an integer is the maximum. A regionserver above the maximum must give regions away, and one below the minimum must receive regions. Otherwise the current balance state is simply logged.

assignmentManager then uses the RegionPlan objects produced by these steps to move each region from src to dest. Both src and dest are ServerName objects.

When HMaster starts, it waits for region servers to register with serverManager:

    // Wait for region servers to report in.
    this.serverManager.waitForRegionServers(status);
    // Check zk for regionservers that are up but didn't register
    for (ServerName sn: this.regionServerTracker.getOnlineServers()) {
      if (!this.serverManager.isServerOnline(sn)) {
        // Not registered; add it.
        LOG.info("Registering server found up in zk but who has not yet " +
          "reported in: " + sn);
        this.serverManager.recordNewServer(sn, HServerLoad.EMPTY_HSERVERLOAD);
      }
    }

The serverManager thread sleeps for a while, waiting for HRegionServer instances to register.
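Returning to the slop thresholds described above, the arithmetic can be sketched in a few lines. This is an illustrative helper, not the DefaultLoadBalancer code: the class name `BalanceBoundsSketch`, its methods, and the choice to round the minimum down and the maximum up are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not HBase code): classifies servers against slop-based bounds. */
class BalanceBoundsSketch {

    /** Lower bound: average * (1 - slop). Rounding down is an assumption of this sketch. */
    static int minRegions(double average, double slop) {
        return (int) Math.floor(average * (1 - slop));
    }

    /** Upper bound: average * (1 + slop). Rounding up is an assumption of this sketch. */
    static int maxRegions(double average, double slop) {
        return (int) Math.ceil(average * (1 + slop));
    }

    /** Labels each server as OVERLOADED, UNDERLOADED or BALANCED. */
    static Map<String, String> classify(Map<String, Integer> regionsPerServer, double slop) {
        int total = 0;
        for (int count : regionsPerServer.values()) {
            total += count;
        }
        double average = regionsPerServer.isEmpty()
                ? 0.0
                : (double) total / regionsPerServer.size();
        int min = minRegions(average, slop);
        int max = maxRegions(average, slop);

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : regionsPerServer.entrySet()) {
            int regions = entry.getValue();
            String state = regions > max ? "OVERLOADED"
                    : regions < min ? "UNDERLOADED"
                    : "BALANCED";
            result.put(entry.getKey(), state);
        }
        return result;
    }
}
```

For three servers holding 15, 10 and 5 regions, the average is 10, so with a slop of 0.2 the first server is over the upper bound, the last is under the lower bound, and the middle one is left alone.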
HRegionServer.java:
    // Try and register with the Master; tell it we are here. Break if
    // server is stopped or the clusterup flag is down or hdfs went wacky.
    while (keepLooping()) {
      MapWritable w = reportForDuty();
      if (w == null) {
        LOG.warn("reportForDuty failed; sleeping and then retrying.");
        this.sleeper.sleep();
      } else {
        handleReportForDutyResponse(w);
        break;
      }
    }

After registering, HRegionServer enters its main loop:

    // The main run loop.
    while (!this.stopped && isHealthy()) {
      long now = System.currentTimeMillis();
      if ((now - lastMsg) >= msgInterval) {
        doMetrics();
        tryRegionServerReport();
        lastMsg = System.currentTimeMillis();
      }
    }
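Every hbase.regionserver.msginterval (default 3 seconds) the region server reports to the master, which doubles as a registration attempt: if the reporting server's IP and port are not in the list of registered servers, its ServerName is added to the map.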
ServerManager.java
    void regionServerReport(ServerName sn, HServerLoad hsl)
    throws YouAreDeadException, PleaseHoldException {
      checkIsDead(sn, "REPORT");
      if (!this.onlineServers.containsKey(sn)) {
        // Already have this host+port combo and its just different start code?
        checkAlreadySameHostPort(sn);
        // Just let the server in. Presume master joining a running cluster.
        // recordNewServer is what happens at the end of reportServerStartup.
        // The only thing we are skipping is passing back to the regionserver
        // the ServerName to use. Here we presume a master has already done
        // that so we'll press on with whatever it gave us for ServerName.
        recordNewServer(sn, hsl);
      } else {
        this.onlineServers.put(sn, hsl);
      }
    }

recordNewServer logs the IP, port and timestamp of the ServerName object.
Every ServerName object registered by the same region server carries the same timestamp:

    this.startcode = System.currentTimeMillis();
    ...
    result = this.hbaseMaster.regionServerStartup(port, this.startcode, now);
    ...
    this.serverNameFromMasterPOV = new ServerName(hostnameFromMasterPOV,
      this.isa.getPort(), this.startcode);
    ...
    this.hbaseMaster.regionServerReport(this.serverNameFromMasterPOV.getVersionedBytes(), hsl);
The startCode is fixed once when the region server starts, so following this flow there should never be two region servers online with the same IP and port but different timestamps.
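To make the identity question concrete, here is a small sketch of a server name as a host, port and start code triple. `ServerNameSketch` is a hypothetical stand-in, not HBase's ServerName class, and the comma-separated `host,port,startcode` text form it parses is assumed for the example. It can be handy for checking whether the src and dest printed in a region plan differ only by start code.

```java
import java.util.Objects;

/** Hypothetical stand-in for a server name; models only host, port and start code. */
final class ServerNameSketch {
    final String host;
    final int port;
    final long startCode;

    ServerNameSketch(String host, int port, long startCode) {
        this.host = Objects.requireNonNull(host);
        this.port = port;
        this.startCode = startCode;
    }

    /** Parses "host,port,startcode"; this text layout is an assumption of the sketch. */
    static ServerNameSketch parse(String text) {
        String[] parts = text.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + text);
        }
        return new ServerNameSketch(parts[0].trim(),
                Integer.parseInt(parts[1].trim()),
                Long.parseLong(parts[2].trim()));
    }

    /** True when both names share host and port but carry different start codes. */
    boolean differsOnlyByStartCode(ServerNameSketch other) {
        return host.equals(other.host) && port == other.port && startCode != other.startCode;
    }

    /** Equality covers all three fields, so a different start code means a different key. */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ServerNameSketch)) {
            return false;
        }
        ServerNameSketch that = (ServerNameSketch) o;
        return host.equals(that.host) && port == that.port && startCode == that.startCode;
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, port, startCode);
    }
}
```

Because equality includes the start code, a map keyed by such objects can hold two entries for one host and port, which is the situation observed in this incident.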
If two region servers are started on one machine, the one with the smaller timestamp is removed, and the one with the larger timestamp is added on its next report.
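The replacement rule just described can be modelled roughly as follows. This is a deliberately simplified sketch, not the ServerManager implementation: `OnlineServersSketch`, its `host:port` keyed map, and the boolean return value standing in for a rejected report are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the master's online-server bookkeeping. */
class OnlineServersSketch {
    /** Registered start code per "host:port" (a simplification chosen for this sketch). */
    private final Map<String, Long> online = new HashMap<>();

    /**
     * Handles one report and returns true if that server is registered afterwards.
     * A newer start code on a known host:port expires the old entry but is itself
     * rejected for this round, so the new server only gets in on its next report.
     */
    boolean report(String host, int port, long startCode) {
        String key = host + ":" + port;
        Long existing = online.get(key);
        if (existing == null) {
            online.put(key, startCode);   // first time we see this host:port
            return true;
        }
        if (existing.longValue() == startCode) {
            return true;                  // regular report from the registered server
        }
        if (existing.longValue() < startCode) {
            online.remove(key);           // existing entry looks stale, drop it
        }
        return false;                     // rejected this round
    }

    /** Returns the registered start code, or null if nothing is registered. */
    Long registeredStartCode(String host, int port) {
        return online.get(host + ":" + port);
    }
}
```

In this model a report with an older start code never displaces a newer registration, and two start codes for one host and port can never be online at the same time, which is why the observed behaviour is surprising.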
The problem we ran into is that regionservers with different timestamps were both registered on the master, and regions were being moved between them.
Reposted from: https://www.cnblogs.com/shenguanpu/archive/2012/07/30/2615214.html