Hive Study Notes: Modifying Hive Partitions

Published: 2025/3/19

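Table of Contents

What Is a Hive Partition
- Why Partition
- Partition Syntax
- Partitioning Method and Essence
- Creating a Single-Level Partitioned Table
- Creating a Two-Level Partitioned Table
How to Modify Hive Partitions
- Viewing Partitions
- Adding Partitions
- Renaming a Partition
- Changing a Partition's Path
- Dropping Partitions
- Partition Types
- Hive Strict Mode
  - Cartesian Product
  - Querying a Partitioned Table Without a Partition Filter
  - order by Without limit
  - Comparing bigint and string
  - Comparing bigint and double
- Hive Read/Write Mode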

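What Is a Hive Partition

Why Partition

The purpose of partitioning in Hive is to avoid full table scans and so improve query efficiency. Without partitions, Hive scans the whole table by default.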

Partition Syntax

[PARTITIONED BY (COLUMNNAME COLUMNTYPE [COMMENT 'COLUMN COMMENT'],...)]

1. Partition values are case-sensitive (partition column names, like other Hive column names, are not).
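2. A partition column is a pseudo-column, but it can be used in queries like an ordinary column.
3. A table can have one or more partitions, and a partition can in turn contain one or more sub-partitions.
4. Partition columns must be columns that are not already defined in the table itself.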

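Partitioning Method and Essence

How to partition: spread the data out by date, region, or a similar attribute.
What a partition really is: a directory created under the table's directory (or under another partition's directory). The directory is named column=value, for example dt=2019-09-09.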
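The directory naming rule can be illustrated with a small Python sketch. It is not Hive code; the helper name partition_dir is hypothetical, and placing part2 in the same database directory as part1 is an assumption made only for the example.

```python
from pathlib import PurePosixPath


def partition_dir(table_dir, partition_spec):
    """Return the directory of a partition under table_dir.

    partition_spec is an ordered list of (column, value) pairs; each pair
    adds one directory level named column=value.
    """
    path = PurePosixPath(table_dir)
    for column, value in partition_spec:
        path = path / f"{column}={value}"
    return str(path)


# Single-level partition: one directory under the table directory.
print(partition_dir("/user/hive/warehouse/qf1704.db/part1", [("dt", "2019-09-09")]))
# prints /user/hive/warehouse/qf1704.db/part1/dt=2019-09-09

# Two-level partition: the second column is nested inside the first.
# (The database directory for part2 is assumed for illustration.)
print(partition_dir("/user/hive/warehouse/qf1704.db/part2", [("year", 2019), ("month", 10)]))
# prints /user/hive/warehouse/qf1704.db/part2/year=2019/month=10
```

The order of the pairs matters: it decides which column becomes the outer directory level.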

Creating a Single-Level Partitioned Table

create table if not exists part1(
  id int,
  name string
)
partitioned by (dt string)
row format delimited fields terminated by ' ';

Load data

load data local inpath '/home/hivedata/t1' overwrite into table part1 partition(dt='2019-09-09');
load data local inpath '/hivedata/user.txt' into table part1 partition(dt='2018-03-20');

Query

select * from part1 where dt='2018-03-20';

Creating a Two-Level Partitioned Table

create table if not exists part2(
  id int,
  name string
)
partitioned by (year int, month int)
row format delimited fields terminated by ' ';

Load data

load data local inpath '/home/hivedata/t1' overwrite into table part2 partition(year=2019,month=9);
load data local inpath '/home/hivedata/t' overwrite into table part2 partition(year=2019,month=10);

Query

select * from part2 where year=2019 and month=10;
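A filter on partition columns lets the engine skip whole directories instead of scanning the entire table. The Python sketch below only imitates that selection step. The function prune_partitions is hypothetical, and the year=2018 partition is invented to show a directory being skipped.

```python
def prune_partitions(partitions, predicate):
    """Keep only the partitions whose column values satisfy predicate."""
    return [p for p in partitions if predicate(p)]


# Partitions of a table like part2; the 2018 entry is made up for the example.
partitions = [
    {"year": 2019, "month": 9},
    {"year": 2019, "month": 10},
    {"year": 2018, "month": 10},
]

# Equivalent of: where year=2019 and month=10
selected = prune_partitions(
    partitions, lambda p: p["year"] == 2019 and p["month"] == 10
)

# Only these directories would need to be read.
dirs = [f"year={p['year']}/month={p['month']}" for p in selected]
print(dirs)  # prints ['year=2019/month=10']
```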

How to Modify Hive Partitions

Viewing Partitions

show partitions table_name;

Adding Partitions

alter table part1 add partition(dt='2019-09-10');
alter table part1 add partition(dt='2019-09-13') partition(dt='2019-09-12');
alter table part1 add partition(dt='2019-09-11') location '/user/hive/warehouse/qf1704.db/part1/dt=2019-09-10';

Renaming a Partition

alter table part1 partition(dt='2019-09-10') rename to partition(dt='2019-09-14');

Changing a Partition's Path

-- Incorrect usage
alter table part1 partition(dt='2019-09-14') set location '/user/hive/warehouse/qf24.db/part1/dt=2019-09-09';
-- Correct usage: absolute path (full HDFS URI)
alter table part1 partition(dt='2019-09-14') set location 'hdfs://hadoo01:9000/user/hive/warehouse/qf24.db/part1/dt=2019-09-09';
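The two statements differ only in whether the location carries a scheme and a NameNode address. As a rough pre-check before running set location, a script could test for that. The sketch below uses Python's standard urllib.parse. The helper name is_fully_qualified is hypothetical, and the check says nothing about whether the path actually exists on HDFS.

```python
from urllib.parse import urlparse


def is_fully_qualified(location):
    """Return True if location has a scheme, a host part, and an absolute path."""
    parts = urlparse(location)
    return bool(parts.scheme) and bool(parts.netloc) and parts.path.startswith("/")


# Path only: no scheme or host, so it is flagged as not fully qualified.
print(is_fully_qualified("/user/hive/warehouse/qf24.db/part1/dt=2019-09-09"))
# prints False

# Full URI with scheme and NameNode address.
print(is_fully_qualified("hdfs://hadoo01:9000/user/hive/warehouse/qf24.db/part1/dt=2019-09-09"))
# prints True
```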

Dropping Partitions

alter table part1 drop partition(dt='2019-09-14');
alter table part1 drop partition(dt='2019-09-12'),partition(dt='2019-09-13');

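Partition Types

Static partitioning: data is loaded into a partition whose value is specified explicitly.
Dynamic partitioning: the partition values are not known in advance; the partitions to create are determined by the partition column values in the data.
Mixed partitioning: static and dynamic partition columns are used together.

set hive.exec.dynamic.partition=true;
-- the mode is either strict or nonstrict
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions=1000;
set hive.exec.max.dynamic.partitions.pernode=100;

strict: at least one partition column must be static.
nonstrict: all partition columns may be dynamic, but it is advisable to estimate the number of dynamic partitions beforehand.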
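The difference between the two modes can be imitated in Python. This is only a sketch of the rule described above, not Hive's implementation. The function route_rows and the sample rows are hypothetical; the mode names mirror the values of hive.exec.dynamic.partition.mode.

```python
from collections import defaultdict


def route_rows(rows, static_spec, dynamic_columns, mode="strict"):
    """Group rows by target partition.

    static_spec: partition column -> fixed value (static partition columns)
    dynamic_columns: partition columns whose value is taken from each row
    """
    if mode == "strict" and not static_spec:
        raise ValueError("strict mode requires at least one static partition column")
    partitions = defaultdict(list)
    for row in rows:
        spec = dict(static_spec)          # static columns come first
        for column in dynamic_columns:    # dynamic values come from the row
            spec[column] = row[column]
        key = "/".join(f"{c}={v}" for c, v in spec.items())
        data = {k: v for k, v in row.items() if k not in spec}
        partitions[key].append(data)
    return dict(partitions)


rows = [
    {"id": 1, "name": "a", "month": 9},
    {"id": 2, "name": "b", "month": 10},
    {"id": 3, "name": "c", "month": 10},
]

# Mixed: year is static, month is dynamic, so strict mode accepts it.
mixed = route_rows(rows, {"year": 2019}, ["month"])
print(sorted(mixed))  # prints ['year=2019/month=10', 'year=2019/month=9']

# All dynamic: only accepted in nonstrict mode.
all_dynamic = route_rows(rows, {}, ["month"], mode="nonstrict")
print(sorted(all_dynamic))  # prints ['month=10', 'month=9']
```

Each distinct value in the data produces one partition, which is why the number of dynamic partitions is worth estimating before running an all-dynamic insert.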

Example:

create table dy_part1(
  id int,
  name string
)
partitioned by (dt string)
row format delimited fields terminated by ' ';

load data local inpath '/home/hivedata/t1' overwrite into table dy_part1 partition(dt='2019-09-09');

set hive.exec.mode.local.auto=true;
-- all partition columns are dynamic here, so hive.exec.dynamic.partition.mode must be nonstrict
insert into table dy_part1 partition(dt)
select id, name, dt from part1;

Mixed partitioning:

create table if not exists dy_part2(
  id int,
  name string
)
partitioned by (year int, month int)
row format delimited fields terminated by ' ';

set hive.exec.mode.local.auto=true;
set hive.exec.dynamic.partition.mode=strict;
insert into table dy_part2 partition(year=2019,month)
select id, name, month from part2 where year=2019;

Hive Strict Mode

<property>
  <name>hive.mapred.mode</name>
  <value>nonstrict</value>
  <description>
    The mode in which the Hive operations are being performed.
    In strict mode, some risky queries are not allowed to run. They include:
      Cartesian Product.
      No partition being picked up for a query.
      Comparing bigints and strings.
      Comparing bigints and doubles.
      Orderby without limit.
  </description>
</property>

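Cartesian Product

set hive.mapred.mode=strict;
-- not allowed: a join without an on condition is a Cartesian product
select * from dy_part1 d1 join dy_part2 d2;

Querying a Partitioned Table Without a Partition Filter

set hive.mapred.mode=strict;
-- allowed: filters on the partition column dt
select * from dy_part1 d1 where d1.dt='2019-09-09';
-- not allowed: no filter on a partition column
select * from dy_part1 d1 where d1.id > 2;
-- allowed: filters on the partition column year
select * from dy_part2 d2 where d2.year >= 2019;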

order by Without limit

select * from log3 order by id desc ;

Comparing bigint and string

In strict mode, comparing bigints and strings is not allowed.

Comparing bigint and double

In strict mode, comparing bigints and doubles is not allowed.

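Hive Read/Write Mode

Hive is strictly schema-on-read. When data is written, Hive does not check whether it is correct; when data is read, any value that does not match the schema is replaced with NULL.
MySQL is schema-on-write. Data is checked when it is written, and an error is raised if it is not valid.

-- Hive: the load does not validate the file's contents
load data local inpath '/home/hivedata/t' into table t_user;
-- MySQL: rejected at write time
insert into stu(id,sex) value(1,abc);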
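The read-time behaviour can be imitated in a few lines of Python. This is a simplified sketch, not Hive's parser. The function read_row is hypothetical, and the schema (id int, name string) with a space delimiter is borrowed from the part1 table as an assumption, since the layout of t_user is not given.

```python
def read_row(line, delimiter=" "):
    """Parse one stored line as (id int, name string).

    Nothing was validated when the line was written, so any field that is
    missing or does not fit its type is returned as None (Hive's NULL).
    """
    fields = line.rstrip("\n").split(delimiter)
    raw_id = fields[0] if len(fields) > 0 else None
    name = fields[1] if len(fields) > 1 else None
    try:
        row_id = int(raw_id)
    except (TypeError, ValueError):
        row_id = None
    return (row_id, name)


print(read_row("1 tom"))    # prints (1, 'tom')
print(read_row("abc tom"))  # prints (None, 'tom')
print(read_row("2"))        # prints (2, None)
```

A schema-on-write system would instead reject the second and third lines at insert time.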
