Hadoop Study Notes 15: Learning the HBase Framework (Basic Practice)

Published: 2025/4/5


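1. HBase Installation and Configuration

1.1 Pseudo-Distributed Installation

In pseudo-distributed mode, every HBase role (HMaster, HRegionServer, and ZooKeeper) runs on a single machine that simulates a cluster.

First, get the HBase installation package. I use HBase-0.94.7 here, which I have uploaded to Baidu Netdisk (URL: http://pan.baidu.com/s/1pJ3HTY7).

(1) Copy the HBase package to the virtual machine hadoop-master over FTP, then extract it, rename the directory, and set the environment variables.

① Extract: tar -zvxf hbase-0.94.7-security.tar.gz

② Rename: mv hbase-0.94.7-security hbase

③ Set the environment variables: run vim /etc/profile, add the lines below, then reload the file with source /etc/profile

export HBASE_HOME=/usr/local/hbase
export PATH=.:$HADOOP_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$PATH

(2) In the hbase/conf directory, edit hbase-env.sh:

export JAVA_HOME=/usr/local/jdk
export HBASE_MANAGES_ZK=true  # let HBase manage its own ZooKeeper instance; set it to false when using an external ZooKeeper, as in section 1.2

(3) Still in hbase/conf, edit hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop-master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop-master</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <!-- the value of this property is missing in the source -->
  </property>
</configuration>

(4) [Optional step] Edit the regionservers file and change localhost to the host name hadoop-master.

(5) Start HBase: start-hbase.sh

Note: as the previous note explained, HBase is built on top of Hadoop HDFS, so make sure Hadoop is running before you start HBase. The command to start Hadoop is start-all.sh.

(6) Verify that HBase started: jps

The output should list three additional Java processes: HMaster, HRegionServer, and HQuorumPeer.

You can also check through the HBase web UI: http://hadoop-master:60010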
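The jps check in step (6) can also be scripted. The sketch below is a minimal, stdlib-only illustration that takes the text printed by jps and reports which of the three daemons named above are absent. The class name JpsCheck, the method missingDaemons, and the sample output with its PIDs are all hypothetical; it assumes each jps line has the form "<pid> <name>".

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Checks jps output for the three daemons named in step (6). Hypothetical helper. */
final class JpsCheck {
    // Process names taken from the text above.
    static final List<String> EXPECTED = List.of("HMaster", "HRegionServer", "HQuorumPeer");

    /** Returns the expected daemon names that do not appear in the given jps output. */
    static Set<String> missingDaemons(String jpsOutput) {
        Set<String> missing = new LinkedHashSet<>(EXPECTED);
        for (String line : jpsOutput.split("\\R")) {
            // Assumption: each jps line looks like "<pid> <main class name>".
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                missing.remove(parts[1]);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up jps output; the PIDs are placeholders.
        String sample = "2101 NameNode\n2345 HMaster\n2410 HQuorumPeer\n2502 Jps\n";
        System.out.println("Missing: " + missingDaemons(sample));
    }
}
```

With the made-up sample above, only HRegionServer is reported as missing, which would point at the regionservers file or the region server log as the next place to look.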

1.2 Distributed Installation

This installation builds the distributed mode by modifying the pseudo-distributed setup from section 1.1. The cluster used for this experiment is laid out as follows.

HMaster runs on 192.168.80.100 (host name: hadoop-master), and the two HRegionServers run on 192.168.80.101 (host name: hadoop-slave1) and 192.168.80.102 (host name: hadoop-slave2).

(1) Edit several key configuration files on the hadoop-master server:

① In hbase/conf/hbase-env.sh, change the last line to:

export HBASE_MANAGES_ZK=false  # do not use the ZooKeeper instance bundled with HBase

② In hbase/conf/regionservers, replace hadoop-master with:

hadoop-slave1
hadoop-slave2

(2) Copy the hbase directory and the /etc/profile file from hadoop-master to hadoop-slave1 and hadoop-slave2:

scp -r /usr/local/hbase hadoop-slave1:/usr/local/
scp -r /usr/local/hbase hadoop-slave2:/usr/local/
scp /etc/profile hadoop-slave1:/etc/
scp /etc/profile hadoop-slave2:/etc/

(3) On hadoop-slave1 and hadoop-slave2, reload the profile:

source /etc/profile

(4) On hadoop-master, start Hadoop, ZooKeeper, and HBase, in that order:

start-all.sh
zkServer.sh start
start-hbase.sh

(5) Check the state of the HBase cluster in the HBase web UI.

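2. Basic HBase Shell Commands

2.1 DDL: Creating and Dropping Tables

(1) Create a table:

>create 'users','user_id','address','info'

# This creates a table named users with three column families: user_id, address, and info.

Get the description of the users table:

>describe 'users'

(2) List all tables:

>list

(3) Drop a table. Dropping a table in HBase takes two steps: first disable, then drop.

>disable 'users'
>drop 'users'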

2.2 DML: Insert, Delete, Query, Update

(1) Add records: put

>put 'users','xiaoming','info:age','24'
>put 'users','xiaoming','info:birthday','1987-06-17'
>put 'users','xiaoming','info:company','alibaba'
>put 'users','xiaoming','address:country','china'
>put 'users','xiaoming','address:province','zhejiang'
>put 'users','xiaoming','address:city','hangzhou'

(2) Scan all records in the users table: scan

>scan 'users'

(3) Get a single record

① Get all data for one id (row key)

>get 'users','xiaoming'

② Get all data in one column family for one id

>get 'users','xiaoming','info'

③ Get the data in one column of one column family for one id

>get 'users','xiaoming','info:age'

(4) Update a record: this is still put

For example, update xiaoming's age in the users table to 29:

>put 'users','xiaoming','info:age','29'
>get 'users','xiaoming','info:age'

(5) Delete records: delete and deleteall

① Delete xiaoming's 'info:age' field

>delete 'users','xiaoming','info:age'

② Delete xiaoming's entire row

>deleteall 'users','xiaoming'
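Why an update is "still put" becomes clearer with a toy model of the data layout. The sketch below is not the HBase API; it is a small in-memory stand-in, assuming a table can be pictured as row key -> "family:qualifier" -> version -> value, where a read returns the newest version. The class ToyTable, its methods, and the integer clock standing in for a write timestamp are all hypothetical, and delete here simply drops the whole cell, which simplifies how real deletes and version limits work.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model: row key -> "family:qualifier" -> write sequence -> value. Not the HBase API. */
final class ToyTable {
    private final Map<String, Map<String, TreeMap<Long, String>>> rows = new TreeMap<>();
    private long clock = 0; // hypothetical stand-in for the write timestamp

    /** Insert and update are the same operation: write a newer version of the cell. */
    void put(String row, String column, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column, c -> new TreeMap<>())
            .put(++clock, value);
    }

    /** Returns the newest version of the cell, or null if the cell does not exist. */
    String get(String row, String column) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    /** Number of versions currently held for the cell. */
    int versionCount(String row, String column) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        return versions == null ? 0 : versions.size();
    }

    /** Simplification: removes the cell with all of its versions. */
    void delete(String row, String column) {
        Map<String, TreeMap<Long, String>> columns = rows.get(row);
        if (columns != null) {
            columns.remove(column);
        }
    }

    /** Removes the entire row, like deleteall in the shell. */
    void deleteAll(String row) {
        rows.remove(row);
    }

    private TreeMap<Long, String> versionsOf(String row, String column) {
        Map<String, TreeMap<Long, String>> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        ToyTable users = new ToyTable();
        users.put("xiaoming", "info:age", "24");
        users.put("xiaoming", "info:age", "29"); // the "update" is just a second put
        System.out.println(users.get("xiaoming", "info:age"));          // prints 29
        System.out.println(users.versionCount("xiaoming", "info:age")); // prints 2
    }
}
```

In this model the second put does not overwrite anything; it adds a newer version, and get picks the newest one, which mirrors the shell session in step (4).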

2.3 Other Useful Commands

(1) count: count the rows in a table

>count 'users'

(2) truncate: empty the specified table

>truncate 'users'

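3. HBase Java API Operations

3.1 Preparation

(1) Import the HBase project jar.

(2) Import all the dependency jars under HBase/lib.

3.2 Essential for HBase Java Development: Getting the Configuration

/** Get the HBase configuration. */
private static Configuration getConfiguration() {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://hadoop-master:9000/hbase");
    // Required when running from Eclipse; otherwise the cluster cannot be located.
    conf.set("hbase.zookeeper.quorum", "hadoop-master");
    return conf;
}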

3.3 DDL Operations with HBaseAdmin

(1) Create a table

/** Create a table. TABLE_NAME and FAMILY_NAME are constants of the enclosing class. */
private static void createTable() throws IOException {
    HBaseAdmin admin = new HBaseAdmin(getConfiguration());
    if (admin.tableExists(TABLE_NAME)) {
        System.out.println("The table already exists!");
    } else {
        HTableDescriptor tableDesc = new HTableDescriptor(TABLE_NAME);
        tableDesc.addFamily(new HColumnDescriptor(FAMILY_NAME));
        admin.createTable(tableDesc);
        System.out.println("Create table success!");
    }
}

(2) Drop a table

/** Drop a table: disable it first, then delete it. */
private static void dropTable(String tableName) throws IOException {
    HBaseAdmin admin = new HBaseAdmin(getConfiguration());
    if (admin.tableExists(tableName)) {
        try {
            admin.disableTable(tableName);
            admin.deleteTable(tableName);
            // Report success only when both calls completed.
            System.out.println("Delete " + tableName + " success!");
        } catch (IOException e) {
            e.printStackTrace();
            System.out.println("Delete " + tableName + " failed!");
        }
    }
}

3.4 DML Operations with HTable

(1) Add a record

public static void putRecord(String tableName, String row,
        String columnFamily, String column, String data) throws IOException {
    HTable table = new HTable(getConfiguration(), tableName);
    Put p1 = new Put(Bytes.toBytes(row));
    p1.add(Bytes.toBytes(columnFamily), Bytes.toBytes(column), Bytes.toBytes(data));
    table.put(p1);
    System.out.println("put '" + row + "','" + columnFamily + ":" + column + "','" + data + "'");
}

(2) Read a record

public static void getRecord(String tableName, String row) throws IOException {
    HTable table = new HTable(getConfiguration(), tableName);
    Get get = new Get(Bytes.toBytes(row));
    Result result = table.get(get);
    System.out.println("Get: " + result);
}

(3) Full table scan

public static void scan(String tableName) throws IOException {
    HTable table = new HTable(getConfiguration(), tableName);
    Scan scan = new Scan();
    ResultScanner scanner = table.getScanner(scan);
    for (Result result : scanner) {
        System.out.println("Scan: " + result);
    }
}
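Every row key, column name, and value in the methods above passes through Bytes.toBytes, because HBase stores plain byte arrays. The sketch below uses only the JDK to show what those bytes look like, on the understanding that Bytes.toBytes(String) produces the UTF-8 encoding and Bytes.toBytes(int) produces a 4-byte big-endian value. The class CellBytes and its method names are hypothetical stand-ins, not part of HBase.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** JDK-only stand-ins for the byte conversions used by the DML examples. Hypothetical helper. */
final class CellBytes {
    /** UTF-8 bytes of a string, the form in which string values are stored. */
    static byte[] ofString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** 4-byte big-endian form of an int (ByteBuffer is big-endian by default). */
    static byte[] ofInt(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    /** Decodes stored bytes back into a string. */
    static String asString(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The age "24" written as a string, as in putRecord above and in the shell.
        System.out.println(Arrays.toString(ofString("24"))); // prints [50, 52]
        // The same age written as an int would be a different cell value.
        System.out.println(Arrays.toString(ofInt(24)));      // prints [0, 0, 0, 24]
    }
}
```

The two encodings are not interchangeable: a reader has to decode a cell the same way the writer encoded it, which is why the examples in this note consistently use strings.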

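3.5 API in Practice: Importing Detailed Records

This exercise builds on the mobile internet access log from the fifth note in this series, "Processing Mobile Internet Logs with Custom Types" (《自定義類型處理手機上網日志》). The task is to import that log into HBase with MapReduce. (The file can be downloaded from http://pan.baidu.com/s/1dDzqHWX)

[Figure: data structure of the log]

(1) Create a table named wlan_log from the HBase shell:

> create 'wlan_log','cf'

To keep things simple, only one column family, cf, is defined.

(2) In Eclipse, create a class named BatchImportJob with the following code: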

package hbase;

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class BatchImportJob {

    static class BatchImportMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        SimpleDateFormat dateformat1 = new SimpleDateFormat("yyyyMMddHHmmss");
        Text v2 = new Text();

        protected void map(LongWritable key, Text value, Context context)
                throws java.io.IOException, InterruptedException {
            final String[] splited = value.toString().split("\t");
            try {
                final Date date = new Date(Long.parseLong(splited[0].trim()));
                final String dateFormat = dateformat1.format(date);
                // Row key: phone number + ":" + formatted timestamp
                String rowKey = splited[1] + ":" + dateFormat;
                // Prepend the row key to the original line for the reducer
                v2.set(rowKey + "\t" + value.toString());
                context.write(key, v2);
            } catch (NumberFormatException e) {
                // Count lines whose timestamp cannot be parsed instead of failing the job
                final Counter counter = context.getCounter("BatchImportJob", "ErrorFormat");
                counter.increment(1L);
                System.out.println("Error: " + splited[0] + " " + e.getMessage());
            }
        }
    }

    static class BatchImportReducer extends TableReducer<LongWritable, Text, NullWritable> {
        protected void reduce(LongWritable key, java.lang.Iterable<Text> values, Context context)
                throws java.io.IOException, InterruptedException {
            for (Text text : values) {
                // splited[0] is the row key added by the mapper
                final String[] splited = text.toString().split("\t");
                final Put put = new Put(Bytes.toBytes(splited[0]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("date"),
                        Bytes.toBytes(splited[1]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("msisdn"),
                        Bytes.toBytes(splited[2]));
                // Other fields are omitted; add them with further put.add(....) calls
                context.write(NullWritable.get(), put);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final Configuration configuration = new Configuration();
        // Set the ZooKeeper quorum
        configuration.set("hbase.zookeeper.quorum", "hadoop-master");
        // Set the HBase table name
        configuration.set(TableOutputFormat.OUTPUT_TABLE, "wlan_log");
        // Raise this value so that HBase does not time out and exit
        configuration.set("dfs.socket.timeout", "180000");

        final Job job = new Job(configuration, "HBaseBatchImportJob");
        job.setMapperClass(BatchImportMapper.class);
        job.setReducerClass(BatchImportReducer.class);
        // Set the map output types; the reduce output types are not set
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setInputFormatClass(TextInputFormat.class);
        // No output path is set; the output format class is set instead
        job.setOutputFormatClass(TableOutputFormat.class);
        FileInputFormat.setInputPaths(job,
                "hdfs://hadoop-master:9000/testdir/input/HTTP_20130313143750.dat");

        boolean success = job.waitForCompletion(true);
        if (success) {
            System.out.println("Batch import to HBase success!");
            System.exit(0);
        } else {
            System.out.println("Batch import to HBase failed!");
            System.exit(1);
        }
    }
}
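The core of the mapper is the row key: the phone number, a colon, and the timestamp formatted as yyyyMMddHHmmss. The sketch below isolates that step with the JDK only so it can be tried without a cluster. The class RowKeySketch, its field names, and the sample lines are hypothetical. It pins the time zone to UTC to make the output reproducible, whereas the job above uses the machine's default zone, and it also counts lines with fewer than two fields as errors, which the job itself does not guard against.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** JDK-only sketch of the row key built in BatchImportMapper. Hypothetical helper. */
final class RowKeySketch {
    /** Stand-in for the "ErrorFormat" counter of the job. */
    int errorCount = 0;

    private final SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");

    RowKeySketch() {
        // Assumption: UTC is used here only so that results do not depend on the machine.
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    /** Returns "<phone number>:<yyyyMMddHHmmss>", or null if the line cannot be parsed. */
    String rowKey(String line) {
        String[] fields = line.split("\t");
        if (fields.length < 2) {
            errorCount++; // extra guard, not present in the original mapper
            return null;
        }
        try {
            // Field 0 is the report time in epoch milliseconds, field 1 the phone number.
            Date date = new Date(Long.parseLong(fields[0].trim()));
            return fields[1] + ":" + format.format(date);
        } catch (NumberFormatException e) {
            errorCount++;
            return null;
        }
    }

    public static void main(String[] args) {
        RowKeySketch sketch = new RowKeySketch();
        // Made-up log lines for illustration.
        System.out.println(sketch.rowKey("1363157985066\t13600217502\tother-fields"));
        System.out.println(sketch.rowKey("not-a-number\t13600217502"));
        System.out.println("errors: " + sketch.errorCount);
    }
}
```

Because the phone number comes first and the timestamp is fixed-width, all records of one phone sort next to each other in time order, which is what the range scans in the query class rely on.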

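After the job finishes, check the import result from the HBase shell (list confirms that the table exists; scan 'wlan_log' shows the imported rows).

(3) In Eclipse, create a class named MobileLogQueryApp that queries the stored wlan_log table from Java. Its code is as follows: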

package hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class MobileLogQueryApp {

    private static final String TABLE_NAME = "wlan_log";
    private static final String FAMILY_NAME = "cf";

    /**
     * Basic usage example of the HBase Java API.
     *
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        scan(TABLE_NAME, "13600217502");
        System.out.println();
        scanPeriod(TABLE_NAME, "136");
    }

    /** Query all access records of one phone number, for example 13600217502. */
    public static void scan(String tableName, String mobileNum) throws IOException {
        HTable table = new HTable(getConfiguration(), tableName);
        Scan scan = new Scan();
        // '/' sorts just before '0' and ':' just after '9', so this range
        // covers every row key of the form mobileNum + ":" + digits.
        scan.setStartRow(Bytes.toBytes(mobileNum + ":/"));
        scan.setStopRow(Bytes.toBytes(mobileNum + "::"));
        ResultScanner scanner = table.getScanner(scan);
        int i = 0;
        for (Result result : scanner) {
            System.out.println("Scan: " + i + " " + result);
            i++;
        }
    }

    /** Query all access records of one number prefix, for example 136. */
    public static void scanPeriod(String tableName, String period) throws IOException {
        HTable table = new HTable(getConfiguration(), tableName);
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(period + "/"));
        scan.setStopRow(Bytes.toBytes(period + ":"));
        scan.setMaxVersions(1);
        ResultScanner scanner = table.getScanner(scan);
        int i = 0;
        for (Result result : scanner) {
            System.out.println("Scan: " + i + " " + result);
            i++;
        }
    }

    /** Get the HBase configuration. */
    private static Configuration getConfiguration() {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.rootdir", "hdfs://hadoop-master:9000/hbase");
        // Required when running from Eclipse; otherwise the cluster cannot be located.
        conf.set("hbase.zookeeper.quorum", "hadoop-master");
        return conf;
    }
}

Two queries are performed here: a query by a specific phone number, and a query by phone-number prefix.

[Figure: execution results]
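The start and stop rows in the two queries work because row keys are ordered byte by byte and a scan covers the range from the start row (inclusive) to the stop row (exclusive). The sketch below reproduces that with a sorted set of strings; for ASCII keys like these, Java's string ordering matches the byte ordering. The class ScanRangeSketch, its methods, and the sample keys are hypothetical, and a TreeSet is only a stand-in for the table.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Sorted-set stand-in for range scans over row keys "<phone>:<yyyyMMddHHmmss>". Hypothetical. */
final class ScanRangeSketch {
    /** Keys k with start <= k < stop: start row inclusive, stop row exclusive. */
    static SortedSet<String> scan(SortedSet<String> keys, String start, String stop) {
        return keys.subSet(start, stop);
    }

    /** All keys of one phone number: '/' precedes '0' and ':' follows '9' in ASCII. */
    static SortedSet<String> byNumber(SortedSet<String> keys, String mobileNum) {
        return scan(keys, mobileNum + ":/", mobileNum + "::");
    }

    /** All keys whose phone number starts with the given prefix. */
    static SortedSet<String> byPrefix(SortedSet<String> keys, String prefix) {
        return scan(keys, prefix + "/", prefix + ":");
    }

    public static void main(String[] args) {
        // Made-up row keys for illustration.
        SortedSet<String> keys = new TreeSet<>(List.of(
                "13500000000:20130313000000",
                "13600217502:20130313143750",
                "13600217502:20130314000000",
                "13600217503:20130313000000",
                "13700000000:20130313000000"));
        System.out.println(byNumber(keys, "13600217502").size()); // prints 2
        System.out.println(byPrefix(keys, "136").size());         // prints 3
    }
}
```

The same bracketing idea applies to any fixed character class: pick a start suffix that sorts just below the smallest allowed character and a stop suffix that sorts just above the largest.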


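Author: Zhou Xulong (周旭龍)

Copyright of this article is shared by the author and cnblogs (博客園). Reposting is welcome, but unless the author agrees otherwise, this notice must be kept and a link to the original article must be placed in a prominent position on the page.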
