Use Cases for Hive Bucket Tables


Source: https://www.cnblogs.com/duanxingxing/p/5156951.html


Preface

A bucket table hashes rows on a chosen column and distributes them into separate files according to the hash value; a small illustration of the mapping is sketched below.
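As a rough illustration (not part of the original post), the target bucket for a row is the column's hash value modulo the number of buckets. Assuming a table bucketed INTO 32 BUCKETS on id, Hive's built-in hash() and pmod() UDFs can approximate which bucket each id lands in:

-- Rough sketch: for an int column the bucketing hash is essentially the
-- value itself, so pmod(hash(id), 32) approximates the target bucket file.
SELECT id, pmod(hash(id), 32) AS bucket_no
FROM test
LIMIT 10;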


Use Cases

When the data volume is large and the job needs to finish faster, the only way to speed it up is to run multiple map and reduce tasks in parallel.
However, if the input is a single file, only one map task can be started.
A bucket table is a good fit here: by specifying a CLUSTERED BY column, the data is hash-distributed into multiple smaller files.


create table test (id int, name string)
CLUSTERED BY (id) SORTED BY (name) INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
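Note that a bucketed table should be populated with INSERT ... SELECT from a staging table rather than with LOAD DATA, since LOAD DATA only copies files and performs no hashing. The source table test09 used in the log below is assumed to be a plain (unbucketed) table; a hypothetical definition, with an illustrative local path, might look like this:

-- Hypothetical staging table; the file path is illustrative only.
-- Rows land here unbucketed and are redistributed into the 32 buckets
-- by the INSERT OVERWRITE shown further down.
create table test09 (id int, name string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
LOAD DATA LOCAL INPATH '/tmp/test09.txt' INTO TABLE test09;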


Do not forget to set the following before running the INSERT:

set hive.enforce.bucketing = true;

This forces the output to be written by multiple reducers (one per bucket), so that each bucket ends up in its own file.
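On very old Hive releases that lack this flag, a commonly cited alternative is to pin the reducer count to the bucket count and redistribute the rows explicitly. This is only a hedged sketch, mirroring the property names that appear in the job log below:

-- Alternative on old Hive versions without hive.enforce.bucketing:
-- one reducer per bucket, rows redistributed on the bucketing column.
set mapred.reduce.tasks = 32;
INSERT OVERWRITE TABLE test
SELECT * FROM test09
CLUSTER BY id;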


hive> INSERT OVERWRITE TABLE test select * from test09;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 32
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201103070826_0018, Tracking URL = http://hadoop00:50030/jobdetails.jsp?jobid=job_201103070826_0018
Kill Command = /home/hjl/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=hadoop00:9001 -kill job_201103070826_0018
2011-03-08 11:34:23,055 Stage-1 map = 0%, reduce = 0%
2011-03-08 11:34:27,084 Stage-1 map = 6%, reduce = 0%
*************************************************
Ended Job = job_201103070826_0018
Loading data to table test
5 Rows loaded to test
OK
Time taken: 175.036 seconds


Under the table's HDFS directory (here /ticketdev/test) there are now 32 files instead of a single one:


[hadoop@hadoop00 ~]$ hadoop fs -ls /ticketdev/test
Found 32 items
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000000_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000001_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000002_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000003_0
-rw-r--r--   3 ticketdev hadoop   8 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000004_0
-rw-r--r--   3 ticketdev hadoop   9 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000005_0
-rw-r--r--   3 ticketdev hadoop   8 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000006_0
-rw-r--r--   3 ticketdev hadoop   9 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000007_0
-rw-r--r--   3 ticketdev hadoop   9 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000008_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000009_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000010_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000011_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000012_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:20 /ticketdev/test/attempt_201103070826_0018_r_000013_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000014_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000015_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000016_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000017_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000018_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000019_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000020_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000021_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000022_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000023_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000024_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000025_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000026_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000027_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000028_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000029_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000030_0
-rw-r--r--   3 ticketdev hadoop   0 2011-03-08 11:21 /ticketdev/test/attempt_201103070826_0018_r_000031_0


Because the data has been split across multiple files, multiple MapReduce tasks can be launched.
When you run queries against the table, you will see the system start 32 map tasks; a sketch of such a query follows.
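For example, a hypothetical aggregation over the bucketed table (not taken from the original post) can be scanned in parallel, since each of the 32 bucket files yields at least one input split:

-- Each bucket file becomes its own input split, so the scan below can be
-- parallelised across up to 32 map tasks.
SELECT name, count(*) AS cnt
FROM test
GROUP BY name;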
