Hadoop + HBase Distributed Cluster Architecture: "The Complete Edition"

Published: 2025/3/15
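This article is part of the "Linux Operations: Enterprise Architecture in Practice" series.

Preface: this post collects my personal notes from building the cluster step by step, after falling into countless pitfalls and repeatedly consulting references. I am sharing them with everyone.

1. Getting to know Hadoop and HBase

1.1 A brief introduction to Hadoop

  Hadoop is an open-source Apache framework written in Java that allows distributed processing of large data sets across clusters of computers using simple programming models. Applications built on the Hadoop framework run in an environment that provides distributed storage and computation across those clusters. Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage.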

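1.2 Hadoop architecture

The Hadoop framework consists of the following four modules:

  • Hadoop Common: the Java libraries and utilities required by the other Hadoop modules. These libraries provide filesystem and OS-level abstractions and contain the Java files and scripts needed to start Hadoop.
  • Hadoop YARN: the framework for job scheduling and cluster resource management.
  • Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data.
  • Hadoop MapReduce: a YARN-based system for parallel processing of large data sets.

[Figure: the four components of the Hadoop framework]

  Since 2012, the term "Hadoop" has often referred not only to the base modules above but also to the additional software packages that can be installed on top of or alongside Hadoop, such as Apache Pig, Apache Hive, Apache HBase, and Apache Spark.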

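1.3 How does Hadoop work?

(1) Stage 1

  A user or application submits a job to Hadoop (through the Hadoop job client) by specifying the following items:

  • The locations of the input and output files in the distributed file system.
  • The Java classes, packaged as a jar file, that implement the map and reduce functions.
  • The job configuration, set through job-specific parameters.

(2) Stage 2

  The Hadoop job client then submits the job (jar, executable, etc.) and its configuration to the JobTracker. The JobTracker distributes the software and configuration to the worker nodes, schedules and monitors the tasks, and reports status and diagnostic information back to the job client.

(3) Stage 3

  The TaskTrackers on the different nodes execute the tasks according to the MapReduce implementation, and the output of the reduce function is stored in output files on the file system.

  Note: JobTracker and TaskTracker are the Hadoop 1.x names. On the YARN-based Hadoop 3.2.0 cluster built below, the ResourceManager and the NodeManagers take over cluster-level scheduling and per-node task execution.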
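The map, shuffle, and reduce steps can be mimicked on a single machine with an ordinary shell pipeline. The following is only a toy sketch of the programming model, not of Hadoop itself: the function names map_words, shuffle, reduce_count, and wordcount are made up for this illustration, and no Hadoop API is involved.

```shell
# map: emit one key (word) per line
map_words() {
    tr -s '[:space:]' '\n' | grep -v '^$'
}

# shuffle: sorting brings identical keys next to each other
shuffle() {
    sort
}

# reduce: count each run of identical keys in the sorted stream
reduce_count() {
    awk '
        NR > 1 && $0 != prev { print prev, n; n = 0 }
        { prev = $0; n++ }
        END { if (NR > 0) print prev, n }'
}

# the whole "job": map, then shuffle, then reduce
wordcount() {
    map_words | shuffle | reduce_count
}
```

A call such as `printf 'b a\na c\n' | wordcount` prints one "word count" pair per line. In a real cluster the map and reduce functions run in parallel on many nodes and the framework performs the shuffle between them.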

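1.4 Advantages of Hadoop

  • The Hadoop framework lets users write and test distributed systems quickly. It is efficient: it automatically distributes data and work across machines and, in turn, exploits the underlying parallelism of the CPU cores.
  • Hadoop does not rely on hardware for fault tolerance and high availability (FTHA); instead, the Hadoop library itself is designed to detect and handle failures at the application layer.
  • Servers can be added to or removed from the cluster dynamically, and Hadoop continues to run without interruption.
  • Another big advantage of Hadoop is that, besides being open source, it is compatible with all platforms because it is Java-based.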

1.5 Introduction to HBase

  HBase stands for Hadoop Database: it is Hadoop's database, a distributed storage system. HBase uses Hadoop HDFS as its file storage system, uses Hadoop MapReduce to process the massive data sets stored in HBase, and uses ZooKeeper as its coordination tool.

1.6 HBase architecture

Client

  • Contains the interfaces for accessing HBase and maintains a cache to speed up access to HBase.

ZooKeeper

  • Guarantees that there is only one active master in the cluster at any time.
  • Stores the addressing entry point for all regions.
  • Monitors region servers coming online and going offline in real time, and notifies the Master.
  • Stores the HBase schema and table metadata.

Master

  • Assigns regions to region servers.
  • Is responsible for load balancing across region servers.
  • Detects failed region servers and reassigns their regions.
  • Manages users' create, delete, and alter operations on tables.

RegionServer

  • Maintains regions and handles the I/O requests to those regions.
  • Splits regions that grow too large at run time.

HLog (WAL log)

  • An HLog file is an ordinary Hadoop SequenceFile. The key of the SequenceFile is an HLogKey object, which records where the written data belongs: besides the table and region names, it also carries a sequence number and a timestamp. The timestamp is the write time; the sequence number starts at 0, or at the last sequence number persisted to the file system.
  • The value of the HLog SequenceFile is an HBase KeyValue object, which corresponds to the KeyValue in an HFile.

Region

  • HBase automatically partitions a table horizontally into regions, each holding one contiguous range of the table's rows. A table starts with a single region; as data is inserted the region grows, and once it reaches a threshold it splits into two new regions of roughly equal size.
  • As the number of rows in a table grows, so does the number of regions, and a complete table ends up stored across multiple region servers.

MemStore and StoreFile

  • A region consists of multiple stores; one store corresponds to one column family (CF).
  • A store consists of an in-memory MemStore and on-disk StoreFiles. Writes go to the MemStore first; when the data in the MemStore reaches a threshold, the HRegionServer flushes it to a StoreFile, and each flush produces a separate StoreFile.
  • When the number of StoreFiles grows past a threshold, the system compacts them (minor or major compaction). Version merging and deletion happen during a major compaction, which produces a larger StoreFile.
  • When the total size of all StoreFiles in a region exceeds a threshold, the region is split in two and the HMaster assigns the halves to region servers to balance the load.
  • When a client reads data, it looks in the MemStore first and falls back to the StoreFiles if the data is not found there.
  • The HRegion is the smallest unit of distributed storage and load balancing in HBase; that is, different HRegions can live on different HRegionServers.
  • An HRegion consists of one or more Stores, and each Store holds one column family.
  • Each Store in turn consists of one MemStore and zero or more StoreFiles.

2. Installing Hadoop

2.1 Cluster layout

The cluster is built from three machines:

Hostname   IP               Roles
hadoop01   192.168.10.101   DataNode, NodeManager, ResourceManager, NameNode
hadoop02   192.168.10.102   DataNode, NodeManager, SecondaryNameNode
hadoop03   192.168.10.103   DataNode, NodeManager

2.2 Preparation before installation

2.2.1 Machine configuration

$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
$ uname -r
3.10.0-514.el7.x86_64

Note: every process in this cluster is started by the along user, and the steps must be carried out on all servers in the cluster.

2.2.2 Disable SELinux and the firewall

[along@hadoop01 ~]$ sestatus
SELinux status:                 disabled
[root@hadoop01 ~]# iptables -F
[along@hadoop01 ~]$ systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

  

2.2.3 Prepare the user

$ id along
uid=1000(along) gid=1000(along) groups=1000(along)

  

2.2.4 Edit the hosts file for name resolution

$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.101 hadoop01
192.168.10.102 hadoop02
192.168.10.103 hadoop03
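Because every later step depends on the three hostnames resolving identically on all machines, it can help to check the hosts file mechanically. The helper below is a small sketch; the name hosts_ip is made up for this example, and it only reads a hosts-style file (it does not query DNS).

```shell
# hosts_ip FILE NAME: print the address that FILE maps NAME to (first match)
hosts_ip() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        {
            # field 1 is the address, the remaining fields are names/aliases
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }' "$1"
}
```

For example, `for h in hadoop01 hadoop02 hadoop03; do hosts_ip /etc/hosts "$h"; done` should print the three cluster addresses on every node; an empty line means the name is missing from that node's file.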

  

2.2.5 Synchronize the clocks

$ yum -y install ntpdate
$ sudo ntpdate cn.pool.ntp.org

  

2.2.6 Set up SSH trust between the nodes

1) Generate a key pair; just press Enter at every prompt

[along@hadoop01 ~]$ ssh-keygen

2) Make sure every server holds the public keys of all the others

--- as the along user
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub 127.0.0.1
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop01
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop02
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop03
--- as the root user
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub 127.0.0.1
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop01
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop02
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop03

Note: run these steps on every server in the cluster.

3) Verify passwordless login

[along@hadoop02 ~]$ ssh along@hadoop01
[along@hadoop02 ~]$ ssh along@hadoop02
[along@hadoop02 ~]$ ssh along@hadoop03
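Logging in to each host by hand works, but the same check can be scripted so that it fails instead of prompting when a key is missing. The following is a sketch under a few assumptions: check_ssh_trust and the SSH_BIN override are names invented here (the override exists only so the function can be exercised without a network), while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# check_ssh_trust HOST...: report whether key-based login works for each host.
# BatchMode=yes makes ssh fail rather than ask for a password.
check_ssh_trust() {
    failed=0
    for host in "$@"; do
        if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Running `check_ssh_trust hadoop01 hadoop02 hadoop03` as the along user on each node should print three OK lines; any FAIL line points at a host whose authorized_keys file still lacks this node's public key.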

  

2.3 Configure the JDK

Perform these steps on all three machines.

[root@hadoop01 ~]# tar -xvf jdk-8u201-linux-x64.tar.gz -C /usr/local
[root@hadoop01 ~]# chown along.along -R /usr/local/jdk1.8.0_201/
[root@hadoop01 ~]# ln -s /usr/local/jdk1.8.0_201/ /usr/local/jdk
[root@hadoop01 ~]# cat /etc/profile.d/jdk.sh
export JAVA_HOME=/usr/local/jdk
PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
[root@hadoop01 ~]# source /etc/profile.d/jdk.sh
[along@hadoop01 ~]$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
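Since the same JDK has to be present on all three machines, a quick way to compare them is to pull just the version string out of the `java -version` banner. This is a minimal sketch; java_version_from_banner is a name made up here, and it assumes the banner contains a line of the form `... version "X"` as in the output above.

```shell
# Read a `java -version` banner on stdin and print only the quoted version string.
java_version_from_banner() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

Note that `java -version` writes its banner to standard error, so the call is `java -version 2>&1 | java_version_from_banner`. Combined with ssh it gives a one-line check per node, for example `ssh hadoop02 'java -version' 2>&1 | java_version_from_banner`.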

  

2.4 Install Hadoop

[root@hadoop01 ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
[root@hadoop01 ~]# tar -xvf hadoop-3.2.0.tar.gz -C /usr/local/
[root@hadoop01 ~]# chown along.along -R /usr/local/hadoop-3.2.0/
[root@hadoop01 ~]# ln -s /usr/local/hadoop-3.2.0/ /usr/local/hadoop

  

3. Configuring and starting Hadoop

3.1 hadoop-env.sh: configure the Hadoop environment variables

[along@hadoop01 ~]$ cd /usr/local/hadoop/etc/hadoop/
[along@hadoop01 hadoop]$ vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

  

3.2 core-site.xml: configure HDFS

[along@hadoop01 hadoop]$ vim core-site.xml
<configuration>
    <!-- Default HDFS (NameNode) communication address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop01:9000</value>
    </property>
    <!-- Directory for the files Hadoop generates at run time -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/tmp</value>
    </property>
</configuration>
[root@hadoop01 ~]# mkdir -p /data/hadoop
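A typo in a property name is silently ignored by Hadoop, so it is worth reading a value back after editing a *-site.xml file. The helper below is a rough sketch, not a real XML parser: site_property is a name invented here, and it assumes that only whitespace separates the `<name>` element from its `<value>` element.

```shell
# site_property FILE NAME: print the <value> that follows <name>NAME</name>
site_property() {
    # join the file into one line so name and value may sit on separate lines
    tr -d '\n' < "$1" |
        sed -n "s|.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}
```

For example, `site_property /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS` should print `hdfs://hadoop01:9000` for the file above. On a node where the Hadoop commands are already on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves and is the more reliable check.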

  

3.3 hdfs-site.xml: configure the NameNode

[along@hadoop01 hadoop]$ vim hdfs-site.xml
<configuration>
    <!-- HTTP address of the NameNode -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop01:50070</value>
    </property>
    <!-- HTTP address of the SecondaryNameNode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop02:50090</value>
    </property>
    <!-- Directory where the NameNode stores its data -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/hadoop/name</value>
    </property>
    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Directory where the DataNode stores its data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/hadoop/datanode</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
[root@hadoop01 ~]# mkdir /data/hadoop/name -p
[root@hadoop01 ~]# mkdir /data/hadoop/datanode -p

  

3.4 mapred-site.xml: configure the MapReduce framework

[along@hadoop01 hadoop]$ vim mapred-site.xml
<configuration>
    <!-- Tell the MapReduce framework to run on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>/usr/local/hadoop/etc/hadoop,/usr/local/hadoop/share/hadoop/common/*,/usr/local/hadoop/share/hadoop/common/lib/*,/usr/local/hadoop/share/hadoop/hdfs/*,/usr/local/hadoop/share/hadoop/hdfs/lib/*,/usr/local/hadoop/share/hadoop/mapreduce/*,/usr/local/hadoop/share/hadoop/mapreduce/lib/*,/usr/local/hadoop/share/hadoop/yarn/*,/usr/local/hadoop/share/hadoop/yarn/lib/*</value>
    </property>
</configuration>

  

3.5 yarn-site.xml: configure the ResourceManager

[along@hadoop01 hadoop]$ vim yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>
    <property>
        <description>The http address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>${yarn.resourcemanager.hostname}:8088</value>
    </property>
    <property>
        <description>The address of the applications manager interface in the RM.</description>
        <name>yarn.resourcemanager.address</name>
        <value>${yarn.resourcemanager.hostname}:8032</value>
    </property>
    <property>
        <description>The address of the scheduler interface.</description>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>${yarn.resourcemanager.hostname}:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>${yarn.resourcemanager.hostname}:8031</value>
    </property>
    <property>
        <description>The address of the RM admin interface.</description>
        <name>yarn.resourcemanager.admin.address</name>
        <value>${yarn.resourcemanager.hostname}:8033</value>
    </property>
</configuration>

  

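3.6 Configure masters and slaves

[along@hadoop01 hadoop]$ echo 'hadoop02' >> /usr/local/hadoop/etc/hadoop/masters
[along@hadoop01 hadoop]$ echo 'hadoop03
hadoop01' >> /usr/local/hadoop/etc/hadoop/slaves

Note: Hadoop 3.x reads its worker list from etc/hadoop/workers; slaves is the file name used by Hadoop 2.x. With only a slaves file in place, the start scripts on hadoop01 would likely start worker daemons only on the hosts listed in workers, which may be why the start scripts are run on every node in section 3.8.2.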

  

3.7 Preparation before startup

3.7.1 Prepare the startup scripts

All startup scripts live in /usr/local/hadoop/sbin:

1) Add the following to start-dfs.sh and stop-dfs.sh:

[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/start-dfs.sh
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/stop-dfs.sh
HDFS_DATANODE_USER=along
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=along
HDFS_SECONDARYNAMENODE_USER=along

2) Add the following to start-yarn.sh and stop-yarn.sh:

[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/start-yarn.sh
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/stop-yarn.sh
YARN_RESOURCEMANAGER_USER=along
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=along

  

3.7.2 Set ownership

[root@hadoop01 ~]# chown -R along.along /usr/local/hadoop-3.2.0/
[root@hadoop01 ~]# chown -R along.along /data/hadoop/

  

3.7.3 Add the Hadoop commands to the environment

[root@hadoop01 ~]# vim /etc/profile.d/hadoop.sh
[root@hadoop01 ~]# cat /etc/profile.d/hadoop.sh
export HADOOP_HOME=/usr/local/hadoop
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

  

3.7.4 Initialize the cluster

[root@hadoop01 ~]# vim /data/hadoop/rsync.sh
# Create the required directories on every machine in the cluster
for i in hadoop02 hadoop03
do
    sudo rsync -a /data/hadoop $i:/data/
done
# Copy the Hadoop configuration to the other machines
for i in hadoop02 hadoop03
do
    sudo rsync -a /usr/local/hadoop-3.2.0/etc/hadoop $i:/usr/local/hadoop-3.2.0/etc/
done
[root@hadoop01 ~]# /data/hadoop/rsync.sh
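The two loops in the script differ only in their source and destination, so they can be folded into one function with a dry-run switch that shows what would be copied before anything is touched. This is a sketch of one possible refactoring, not the author's script: sync_to_nodes and the DRY_RUN variable are names invented for this example.

```shell
# sync_to_nodes SRC DEST_PARENT HOST...
# Copy SRC into DEST_PARENT on every HOST. With DRY_RUN=1, only print the commands.
sync_to_nodes() {
    src=$1
    dest=$2
    shift 2
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rsync -a $src $host:$dest"
        else
            rsync -a "$src" "$host:$dest" || return 1
        fi
    done
}
```

For example, `DRY_RUN=1 sync_to_nodes /data/hadoop /data/ hadoop02 hadoop03` prints the two rsync commands of the first loop; dropping DRY_RUN=1 runs them and stops at the first host that fails.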

  

3.8 Start the Hadoop cluster

3.8.1 Format before the first start (run here on every server in the cluster)

Note: strictly, only the NameNode host (hadoop01) needs to be formatted; DataNodes do not require this step.

[along@hadoop01 ~]$ hdfs namenode -format
... ...
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop01/192.168.10.101
************************************************************/
[along@hadoop02 ~]$ hdfs namenode -format
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop02/192.168.10.102
************************************************************/
[along@hadoop03 ~]$ hdfs namenode -format
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop03/192.168.10.103
************************************************************/

  

3.8.2 Start and verify the cluster

1) Start the NameNode and the DataNodes

[along@hadoop01 ~]$ start-dfs.sh
[along@hadoop02 ~]$ start-dfs.sh
[along@hadoop03 ~]$ start-dfs.sh
[along@hadoop01 ~]$ jps
4480 DataNode
4727 Jps
4367 NameNode
[along@hadoop02 ~]$ jps
4082 Jps
3958 SecondaryNameNode
3789 DataNode
[along@hadoop03 ~]$ jps
2689 Jps
2475 DataNode

2) Start YARN

[along@hadoop01 ~]$ start-yarn.sh
[along@hadoop02 ~]$ start-yarn.sh
[along@hadoop03 ~]$ start-yarn.sh
[along@hadoop01 ~]$ jps
4480 DataNode
4950 NodeManager
5447 NameNode
5561 Jps
4842 ResourceManager
[along@hadoop02 ~]$ jps
3958 SecondaryNameNode
4503 Jps
3789 DataNode
4367 NodeManager
[along@hadoop03 ~]$ jps
12353 Jps
12226 NodeManager
2475 DataNode
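Reading jps output by eye is error-prone once several daemons share a node, so the comparison against the expected roles can be automated. The function below is a sketch; missing_daemons is a name made up for this example, and it assumes the two-column `PID Name` format that jps prints above.

```shell
# missing_daemons "Name1 Name2 ..." < jps-output
# Succeeds silently when every expected daemon appears; otherwise lists the missing ones.
missing_daemons() {
    jps_out=$(cat)
    missing=""
    for name in $1; do
        # exact match on column 2, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$jps_out" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
            missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    return 0
}
```

On hadoop01 the check would be `jps | missing_daemons "NameNode DataNode ResourceManager NodeManager"`; on hadoop02, `jps | missing_daemons "SecondaryNameNode DataNode NodeManager"`.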

  

3.9 Cluster started successfully

1) Open http://hadoop01:8088 in a browser

This is the ResourceManager management UI, which lists the three Active Nodes in the cluster.

2) Open http://hadoop01:50070/dfshealth.html#tab-datanode in a browser

This is the NameNode management page.

The Hadoop cluster is now fully set up!

4. Installing and configuring HBase

4.1 Install HBase

[root@hadoop01 ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.4.9/hbase-1.4.9-bin.tar.gz
[root@hadoop01 ~]# tar -xvf hbase-1.4.9-bin.tar.gz -C /usr/local/
[root@hadoop01 ~]# chown -R along.along /usr/local/hbase-1.4.9/
[root@hadoop01 ~]# ln -s /usr/local/hbase-1.4.9/ /usr/local/hbase

Note: as of 2019.03.08, hbase-2.1 fails to start for me. This may be a problem with that version or with my configuration; either way, I downgraded to hbase-1.4.9.

4.2 Configure HBase

4.2.1 hbase-env.sh: configure the HBase environment variables

[root@hadoop01 ~]# cd /usr/local/hbase/conf/
[root@hadoop01 conf]# vim hbase-env.sh
export JAVA_HOME=/usr/local/jdk
export HBASE_CLASSPATH=/usr/local/hbase/conf

  

4.2.2 hbase-site.xml: configure HBase

[root@hadoop01 conf]# vim hbase-site.xml
<configuration>
    <!-- Directory where HBase stores its data; the port must match fs.defaultFS in Hadoop -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop01:9000/hbase/hbase_db</value>
    </property>
    <!-- Whether this is a distributed deployment -->
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <!-- Nodes that run the ZooKeeper service; use an odd number of nodes -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hadoop01,hadoop02,hadoop03</value>
    </property>
    <!-- Where ZooKeeper keeps its configuration, logs, etc.; the directory must already exist -->
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/data/hbase/zookeeper</value>
    </property>
    <!-- HBase master -->
    <property>
        <name>hbase.master</name>
        <value>hadoop01</value>
    </property>
    <!-- HBase web UI port -->
    <property>
        <name>hbase.master.info.port</name>
        <value>16666</value>
    </property>
</configuration>

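Note: ZooKeeper has the following property:

  • As long as more than half of the machines in the ensemble are working, the ensemble as a whole remains available.
  • So with 2 ZooKeeper nodes, the ensemble becomes unusable as soon as 1 dies, because the 1 survivor is not more than half; the fault tolerance of 2 nodes is 0.
  • Likewise, with 3 ZooKeeper nodes, if one dies the remaining 2 are still more than half, so the fault tolerance of 3 nodes is 1.
  • A few more cases: 2->0; 3->1; 4->1; 5->2; 6->2. The pattern is that 2n and 2n-1 nodes have the same fault tolerance, n-1, so there is no point in adding that one extra ZooKeeper node.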
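The majority rule above reduces to two lines of integer arithmetic: an ensemble of N nodes needs floor(N/2)+1 live nodes and therefore tolerates floor((N-1)/2) failures. A small sketch, with function names chosen only for this example:

```shell
# Number of nodes that may fail while a strict majority of N is still alive.
zk_fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

# Smallest number of live nodes that still forms a majority of N.
zk_quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Print the series for ensembles of 2 to 6 nodes.
for n in 2 3 4 5 6; do
    echo "$n nodes: quorum $(zk_quorum_size "$n"), tolerates $(zk_fault_tolerance "$n") failure(s)"
done
```

The loop reproduces the 2->0, 3->1, 4->1, 5->2, 6->2 series, which is why ensembles are normally sized at 3, 5, or 7 nodes.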

4.2.3 Specify the cluster nodes

[root@hadoop01 conf]# vim regionservers
hadoop01
hadoop02
hadoop03

  

5. Starting the HBase cluster

5.1 Add the HBase commands to the environment

[root@hadoop01 ~]# vim /etc/profile.d/hbase.sh
export HBASE_HOME=/usr/local/hbase
PATH=$HBASE_HOME/bin:$PATH

  

5.2 Preparation before startup

[root@hadoop01 ~]# mkdir -p /data/hbase/zookeeper
[root@hadoop01 ~]# vim /data/hbase/rsync.sh
# Create the required directories on every machine in the cluster
for i in hadoop02 hadoop03
do
    sudo rsync -a /data/hbase $i:/data/
    sudo scp -p /etc/profile.d/hbase.sh $i:/etc/profile.d/
done
# Copy the HBase installation and configuration to the other machines
for i in hadoop02 hadoop03
do
    sudo rsync -a /usr/local/hbase-1.4.9 $i:/usr/local/
done
[root@hadoop01 conf]# chown -R along.along /data/hbase
[root@hadoop01 ~]# /data/hbase/rsync.sh
hbase.sh                100%   62     0.1KB/s   00:00
hbase.sh                100%   62     0.1KB/s   00:00

  

5.3 Start HBase

Note: this only needs to be done on the hadoop01 server.

1) Start

[along@hadoop01 ~]$ start-hbase.sh
hadoop03: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop03.out
hadoop01: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop01.out
hadoop02: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop02.out
... ...

2) Verify

--- HBase master node
[along@hadoop01 ~]$ jps
4480 DataNode
23411 HQuorumPeer        # ZooKeeper process
4950 NodeManager
24102 Jps
5447 NameNode
23544 HMaster            # HBase master process
4842 ResourceManager
23711 HRegionServer
--- the 2 worker nodes
[along@hadoop02 ~]$ jps
12948 HRegionServer      # HBase region server process
3958 SecondaryNameNode
13209 Jps
12794 HQuorumPeer        # ZooKeeper process
3789 DataNode
4367 NodeManager
[along@hadoop03 ~]$ jps
12226 NodeManager
19559 Jps
19336 HRegionServer      # HBase region server process
19178 HQuorumPeer        # ZooKeeper process
2475 DataNode

  

5.4 Check the HBase status in the web UI

Open http://hadoop01:16666 in a browser.

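6. Basic HBase operations

6.1 Basic hbase shell commands

Operation                                 Command
Create a table                            create 'table', 'column_family1', 'column_family2', ...
Add a record                              put 'table', 'row', 'column_family:column', 'value'
View a record                             get 'table', 'row'
Count the records in a table              count 'table'
Delete a record                           delete 'table', 'row', 'column_family:column'
Delete a table                            (1) disable 'table'  (2) drop 'table'
View all records                          scan 'table'
View all data in one column of a table    scan 'table', {COLUMNS => ['column_family:column']}
Update a record                           put again with the same row and column to overwrite the value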

6.2 General operations

1) Start the HBase client

[along@hadoop01 ~]$ hbase shell    # this takes a little while
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-1.4.9/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.4.9, rd625b212e46d01cb17db9ac2e9e927fdb201afa1, Wed Dec 5 11:54:10 PST 2018

hbase(main):001:0>

  

2) Query the cluster status

hbase(main):001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load
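For a monitoring script, the one figure in that summary that usually matters is the number of dead region servers. The helper below is a sketch that assumes the exact one-line summary format shown above; dead_region_servers is a name invented for this example.

```shell
# Read `status` output on stdin and print the number of dead region servers.
dead_region_servers() {
    sed -n 's/.*, \([0-9][0-9]*\) dead,.*/\1/p' | head -n 1
}
```

One way to feed it, assuming the shell's non-interactive mode is available in the installed HBase version, is `echo status | hbase shell -n | dead_region_servers`; a result other than 0 means at least one HRegionServer has dropped out.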

  

3) Query the HBase version

hbase(main):002:0> version
1.4.9, rd625b212e46d01cb17db9ac2e9e927fdb201afa1, Wed Dec 5 11:54:10 PST 2018

  

6.3 DDL operations

1) Create a demo table with two column families, id and info

hbase(main):001:0> create 'demo','id','info'
0 row(s) in 23.2010 seconds

=> Hbase::Table - demo

  

2) Get the table description

hbase(main):002:0> list
TABLE
demo
1 row(s) in 0.6380 seconds

=> ["demo"]
--- get the detailed description
hbase(main):003:0> describe 'demo'
Table demo is ENABLED
demo
COLUMN FAMILIES DESCRIPTION
{NAME => 'id', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.3500 seconds

  

3) Delete a column family

Note: the table must be disabled before any delete operation on it.

hbase(main):004:0> disable 'demo'
0 row(s) in 2.5930 seconds

hbase(main):006:0> alter 'demo',{NAME=>'info',METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 4.3410 seconds

hbase(main):007:0> describe 'demo'
Table demo is DISABLED
demo
COLUMN FAMILIES DESCRIPTION
{NAME => 'id', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.1510 seconds

  

4) Delete a table

Disable the table first, then drop it.

hbase(main):008:0> list
TABLE
demo
1 row(s) in 0.1010 seconds

=> ["demo"]
hbase(main):009:0> disable 'demo'
0 row(s) in 0.0480 seconds

hbase(main):010:0> is_disabled 'demo'    # check whether the table is disabled
true
0 row(s) in 0.0210 seconds

hbase(main):013:0> drop 'demo'
0 row(s) in 2.3270 seconds

hbase(main):014:0> list    # the table has been deleted
TABLE
0 row(s) in 0.0250 seconds

=> []
hbase(main):015:0> is_enabled 'demo'    # check whether the demo table still exists

ERROR: Unknown table demo!

  

6.4 DML operations

1) Insert data

hbase(main):024:0> create 'demo','id','info'
0 row(s) in 10.0720 seconds

=> Hbase::Table - demo
hbase(main):025:0> is_enabled 'demo'
true
0 row(s) in 0.1930 seconds

hbase(main):030:0> put 'demo','example','id:name','along'
0 row(s) in 0.0180 seconds

hbase(main):039:0> put 'demo','example','id:sex','male'
0 row(s) in 0.0860 seconds

hbase(main):040:0> put 'demo','example','id:age','24'
0 row(s) in 0.0120 seconds

hbase(main):041:0> put 'demo','example','id:company','taobao'
0 row(s) in 0.3840 seconds

hbase(main):042:0> put 'demo','taobao','info:addres','china'
0 row(s) in 0.1910 seconds

hbase(main):043:0> put 'demo','taobao','info:company','alibaba'
0 row(s) in 0.0300 seconds

hbase(main):044:0> put 'demo','taobao','info:boss','mayun'
0 row(s) in 0.1260 seconds
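Typing put commands one by one gets tedious for more than a handful of cells. One option is to generate them from a simple text file and hand the result to the HBase shell. The sketch below makes several assumptions: rows_to_puts is a made-up name, the input is one `rowkey,family:qualifier,value` triple per line, and no field contains a comma or a single quote.

```shell
# rows_to_puts TABLE < lines of "rowkey,family:qualifier,value"
# Prints one hbase shell `put` command per well-formed input line.
rows_to_puts() {
    awk -F, -v table="$1" -v q="'" '
        NF == 3 {
            printf "put %s%s%s,%s%s%s,%s%s%s,%s%s%s\n",
                q, table, q,  q, $1, q,  q, $2, q,  q, $3, q
        }'
}
```

For example, `printf 'example,id:name,along\n' | rows_to_puts demo` prints `put 'demo','example','id:name','along'`. The generated lines can be saved to a file and passed to `hbase shell` as a script argument.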

  

2) Read data from the demo table

hbase(main):045:0> get 'demo','example'
COLUMN                CELL
 id:age               timestamp=1552030411620, value=24
 id:company           timestamp=1552030467196, value=taobao
 id:name              timestamp=1552030380723, value=along
 id:sex               timestamp=1552030392249, value=male
1 row(s) in 0.8850 seconds

hbase(main):046:0> get 'demo','taobao'
COLUMN                CELL
 info:addres          timestamp=1552030496973, value=china
 info:boss            timestamp=1552030532254, value=mayun
 info:company         timestamp=1552030520028, value=alibaba
1 row(s) in 0.2500 seconds

hbase(main):047:0> get 'demo','example','id'
COLUMN                CELL
 id:age               timestamp=1552030411620, value=24
 id:company           timestamp=1552030467196, value=taobao
 id:name              timestamp=1552030380723, value=along
 id:sex               timestamp=1552030392249, value=male
1 row(s) in 0.3150 seconds

hbase(main):048:0> get 'demo','example','info'
COLUMN                CELL
0 row(s) in 0.0200 seconds

hbase(main):049:0> get 'demo','taobao','id'
COLUMN                CELL
0 row(s) in 0.0410 seconds

hbase(main):053:0> get 'demo','taobao','info'
COLUMN                CELL
 info:addres          timestamp=1552030496973, value=china
 info:boss            timestamp=1552030532254, value=mayun
 info:company         timestamp=1552030520028, value=alibaba
1 row(s) in 0.0240 seconds

hbase(main):055:0> get 'demo','taobao','info:boss'
COLUMN                CELL
 info:boss            timestamp=1552030532254, value=mayun
1 row(s) in 0.1810 seconds

  

3) Update a record

hbase(main):056:0> put 'demo','example','id:age','88'
0 row(s) in 0.1730 seconds

hbase(main):057:0> get 'demo','example','id:age'
COLUMN                CELL
 id:age               timestamp=1552030841823, value=88
1 row(s) in 0.1430 seconds

  

4) Read data by timestamp

You will have noticed the timestamp field in the output above.

hbase(main):059:0> get 'demo','example',{COLUMN=>'id:age',TIMESTAMP=>1552030841823}
COLUMN                CELL
 id:age               timestamp=1552030841823, value=88
1 row(s) in 0.0200 seconds

hbase(main):060:0> get 'demo','example',{COLUMN=>'id:age',TIMESTAMP=>1552030411620}
COLUMN                CELL
 id:age               timestamp=1552030411620, value=24
1 row(s) in 0.0930 seconds

  

5) Show the whole table

hbase(main):061:0> scan 'demo'
ROW                   COLUMN+CELL
 example              column=id:age, timestamp=1552030841823, value=88
 example              column=id:company, timestamp=1552030467196, value=taobao
 example              column=id:name, timestamp=1552030380723, value=along
 example              column=id:sex, timestamp=1552030392249, value=male
 taobao               column=info:addres, timestamp=1552030496973, value=china
 taobao               column=info:boss, timestamp=1552030532254, value=mayun
 taobao               column=info:company, timestamp=1552030520028, value=alibaba
2 row(s) in 0.3880 seconds

  

6) Delete the 'id:age' field of the row whose id is example

hbase(main):062:0> delete 'demo','example','id:age'
0 row(s) in 1.1360 seconds

hbase(main):063:0> get 'demo','example'
COLUMN                CELL
 id:company           timestamp=1552030467196, value=taobao
 id:name              timestamp=1552030380723, value=along
 id:sex               timestamp=1552030392249, value=male

  

7) Delete a whole row

hbase(main):070:0> deleteall 'demo','taobao'
0 row(s) in 1.8140 seconds

hbase(main):071:0> get 'demo','taobao'
COLUMN                CELL
0 row(s) in 0.2200 seconds

  

8) Add an 'id:age' field to the example row and increment it with a counter

hbase(main):072:0> incr 'demo','example','id:age'
COUNTER VALUE = 1
0 row(s) in 3.2200 seconds

hbase(main):073:0> get 'demo','example','id:age'
COLUMN                CELL
 id:age               timestamp=1552031388997, value=\x00\x00\x00\x00\x00\x00\x00\x01
1 row(s) in 0.0280 seconds

hbase(main):074:0> incr 'demo','example','id:age'
COUNTER VALUE = 2
0 row(s) in 0.0340 seconds

hbase(main):075:0> incr 'demo','example','id:age'
COUNTER VALUE = 3
0 row(s) in 0.0420 seconds

hbase(main):076:0> get 'demo','example','id:age'
COLUMN                CELL
 id:age               timestamp=1552031429912, value=\x00\x00\x00\x00\x00\x00\x00\x03
1 row(s) in 0.0690 seconds

hbase(main):077:0> get_counter 'demo','example','id:age'    # get the current counter value
COUNTER VALUE = 3
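The value that get prints for a counter cell is the raw 8-byte big-endian integer, which is why it appears as a run of \x escapes while get_counter shows a plain number. The conversion can be sketched in a few lines of shell. Assumptions: decode_counter is a name invented here, and the input must consist only of \xNN escapes, as in the output above; the shell may print some byte values as plain characters instead, which this sketch does not handle.

```shell
# decode_counter '\x00\x00\x00\x00\x00\x00\x00\x03': print the integer value
decode_counter() {
    # drop every "\x" prefix, leaving the hex digits in big-endian order
    hex=$(printf '%s' "$1" | sed 's/\\x//g')
    case $hex in
        ''|*[!0-9a-fA-F]*)
            echo "not a fully escaped byte string: $1" >&2
            return 1
            ;;
    esac
    echo $(( 0x$hex ))
}
```

For example, `decode_counter '\x00\x00\x00\x00\x00\x00\x00\x03'` prints 3, matching the COUNTER VALUE reported by get_counter.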

  

9) Truncate the whole table

hbase(main):078:0> truncate 'demo'
Truncating 'demo' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 33.0820 seconds

As the output shows, HBase clears a table by first disabling it, then dropping it, and finally recreating it.

Reposted from: https://www.cnblogs.com/along21/p/10496468.html
