
Using Hive with HBase

Published: 2025/4/9

Hive needs ZooKeeper to run, so you either deploy your own ZooKeeper or share one with another service. Here I share the ZooKeeper bundled with HBase: HBase starts its embedded ZooKeeper when it starts, and Hive connects to port 2181 on the local machine by default, which is why I run Hive on slaver3.
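If Hive must run on a node where no local ZooKeeper listens on 2181, it can be pointed at the quorum explicitly instead of relying on the localhost default. A minimal sketch; the host list here is an assumption, adjust it to wherever your quorum actually runs:

```shell
# Pass the ZooKeeper quorum and client port on the command line
# (hypothetical hosts; these are not from the article's cluster layout)
bin/hive -hiveconf hbase.zookeeper.quorum=master1,slaver1,slaver2 \
         -hiveconf hbase.zookeeper.property.clientPort=2181
```

The same two properties can be set permanently in conf/hive-site.xml instead of on every invocation.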


The cluster setup and machine allocation are covered in the Hadoop guide: http://phey.cc/multinode_hadoop20.html
and the HBase cluster guide: http://phey.cc/Install_hbase_cluster.html


解壓hive包后拷貝環境變量模板到指定文件
[cc@slaver3 ~]$ cp hive-0.12.0-cdh5.0.1/conf/hive-env.sh.template hive-0.12.0-cdh5.0.1/conf/hive-env.sh
[cc@slaver3 ~]$




Edit the environment file. One variable points at the Hadoop install directory, the other at the directory holding the HBase jars; if the HBase jars don't match the version Hive expects, you will get errors.
[cc@slaver3 ~]$ vim hive-0.12.0-cdh5.0.1/conf/hive-env.sh
export HADOOP_HOME=/home/cc/hadoop-2.3.0-cdh5.0.0
export HIVE_AUX_JARS_PATH=/home/cc/hbase-0.96.1.1-cdh5.0.1/lib
[cc@slaver3 ~]$




Start Hive, specifying the HBase master's RPC port:
[cc@slaver3 hive-0.12.0-cdh5.0.1]$ bin/hive -hiveconf hbase.master=master1:60000
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/08/21 21:34:07 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative


Logging initialized using configuration in jar:file:/home/cc/hive-0.12.0-cdh5.0.1/lib/hive-common-0.12.0-cdh5.0.1.jar!/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/cc/hadoop-2.3.0-cdh5.0.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/cc/hive-0.12.0-cdh5.0.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/cc/hbase-0.96.1.1-cdh5.0.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
hive>




Create a table in Hive. This creates a Hive table named cctable that maps to the HBase table cc: cctable's int column key corresponds to cc's row key, and its string column value corresponds to cc's cf:val column.
hive> CREATE TABLE cctable (key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val") TBLPROPERTIES ("hbase.table.name" = "cc");
OK
Time taken: 9.302 seconds
hive>
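The entries in hbase.columns.mapping pair positionally with the Hive columns, and :key marks the row key. A hedged sketch mapping several qualifiers from one column family; the table and column names here are illustrative, not from this cluster:

```sql
-- Hypothetical table: three Hive columns map to the row key
-- plus two qualifiers in column family cf
CREATE TABLE user_info (key int, name string, age int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name,cf:age")
TBLPROPERTIES ("hbase.table.name" = "users");
```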




The result is visible in HBase: a table has been created.
hbase(main):011:0> list
TABLE
cc
1 row(s) in 0.0250 seconds


=> ["cc"]
hbase(main):012:0> describe 'cc'
DESCRIPTION                                                                ENABLED
 'cc', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', true
 REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE',
 MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false',
 BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0390 seconds


hbase(main):013:0>




If you don't want Hive to create the table but rather to use an HBase table that already exists, add the EXTERNAL keyword when creating it; dropping an EXTERNAL table later removes only the Hive metadata and leaves the HBase table intact. For example:
hive> CREATE EXTERNAL TABLE cctable (key int, value string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val") TBLPROPERTIES ("hbase.table.name" = "cc");
OK
Time taken: 9.302 seconds
hive>




Thanks to the mapping, data inserted on the HBase side can be queried from Hive. However, this Hive version does not support MySQL-style single-row INSERT statements, so the convenient way to add data is to insert it through HBase.
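That said, Hive 0.12 can still load the mapped table from the Hive side with INSERT ... SELECT from another Hive table (single-row INSERT ... VALUES only arrived in later Hive releases). A sketch, assuming a staging table named src with matching columns already exists; src is hypothetical:

```sql
-- src is an assumed staging table with (key int, value string)
INSERT OVERWRITE TABLE cctable
SELECT key, value FROM src;
```

For an HBase-backed table, the "overwrite" behaves as an upsert on matching row keys rather than truncating the table, since the storage handler only issues puts.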


Insert a row on the HBase side:
hbase(main):013:0> put 'cc', '1', 'cf:val', 'hello cc!'
0 row(s) in 0.0120 seconds


hbase(main):014:0>




Query it from Hive:
hive> select * from cctable;
OK
1       hello cc!
Time taken: 30.838 seconds, Fetched: 1 row(s)
hive>




Although HBase offers no efficient built-in way to count rows, Hive can run count(*); note that the query is compiled into a MapReduce job and executed there.
hive> select count(*) from cctable;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1408532552242_0005, Tracking URL = http://master1:8088/proxy/application_1408532552242_0005/
Kill Command = /home/cc/hadoop-2.3.0-cdh5.0.0/bin/hadoop job  -kill job_1408532552242_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-08-22 10:15:00,861 Stage-1 map = 0%,  reduce = 0%
2014-08-22 10:15:14,479 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.73 sec
...
2014-08-22 10:15:28,094 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.89 sec
2014-08-22 10:15:29,143 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.89 sec
MapReduce Total cumulative CPU time: 4 seconds 890 msec
Ended Job = job_1408532552242_0005
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 4.89 sec   HDFS Read: 236 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 890 msec
OK
1
Time taken: 77.571 seconds, Fetched: 1 row(s)
hive>

Reposted from: https://www.cnblogs.com/jamesf/p/4751467.html
