
HiveQL: 数据操作

Published: 2024/7/5

Contents

    • 1. Loading Data into Managed Tables
    • 2. Inserting Data into Tables from Queries
    • 3. Dynamic Partition Inserts
    • 4. Creating a Table and Loading Data from a Single Query
    • 5. Exporting Data

Study notes from the book "Programming Hive".

1. Loading Data into Managed Tables

hive (default)> load data local inpath "/home/hadoop/workspace/student.txt"
             > overwrite into table student1;

For a partitioned table, the statement can be followed by a partition (key1 = v1, key2 = v2, …) clause.

With local: copies a file from the local filesystem to HDFS.
Without local: moves a file from its current HDFS path to the new HDFS path.

overwrite: any existing data in the target directory is deleted first.
Without overwrite: the new files are added to the target directory and the existing data is kept.

The path after inpath must not contain any subdirectories.
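Putting the pieces above together, a load into a specific partition might look like the following sketch; the logs table, its columns, and its partition keys are hypothetical and only illustrate the syntax:

```sql
-- Hypothetical partitioned table; names are illustrative.
create table logs (line string)
  partitioned by (year int, month int);

-- With local: the file is copied from the local filesystem into the
-- partition's directory under the table's HDFS location; overwrite
-- first clears any existing data in that partition.
load data local inpath '/home/hadoop/workspace/logs-2021-04.txt'
  overwrite into table logs
  partition (year = 2021, month = 4);
```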

2. Inserting Data into Tables from Queries

hadoop@dblab-VirtualBox:~/workspace$ cat stu.txt
1	michael	male	china
2	ming	male	china1
3	haha	female	china
4	huahua	female	china1
  • Create the table and load the data
hive (default)> create table stu(
             >   id int,
             >   name string,
             >   sex string,
             >   country string)
             > row format delimited fields terminated by '\t';
hive (default)> load data local inpath '/home/hadoop/workspace/stu.txt'
             > into table stu;
  • Fill another table from a select statement
hive (default)> create table employee(
             >   name string,
             >   country string)
             > row format delimited fields terminated by '\t';
hive (default)> from stu s
             > insert overwrite table employee
             > select s.name, s.country where s.id%2=1;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20210408224138_1df23614-7945-40c0-9a4d-df88e4f58ea1
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2021-04-08 22:41:40,081 Stage-1 map = 100%, reduce = 0%
Ended Job = job_local1437521177_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/employee/.hive-staging_hive_2021-04-08_22-41-38_345_1863326332876590299-1/-ext-10000
Loading data to table default.employee
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 83 HDFS Write: 180 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
hive (default)> select * from employee;
OK
michael	china
haha	china
  • Insert into multiple tables at once
hive (default)> from stu s
             > insert into table employee
             > select s.name, s.country where s.sex='female'
             > insert into table employee1
             > select s.name, s.country where s.sex='male';
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20210408230623_bc69bccf-348e-467d-b88e-498664f27017
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks is set to 0 since there's no reduce operator
Job running in-process (local Hadoop)
2021-04-08 23:06:24,405 Stage-2 map = 100%, reduce = 0%
Ended Job = job_local2065691620_0003
Stage-5 is selected by condition resolver.
Stage-4 is filtered out by condition resolver.
Stage-6 is filtered out by condition resolver.
Stage-11 is selected by condition resolver.
Stage-10 is filtered out by condition resolver.
Stage-12 is filtered out by condition resolver.
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/employee/.hive-staging_hive_2021-04-08_23-06-23_001_7974131043339100692-1/-ext-10000
Moving data to directory hdfs://localhost:9000/user/hive/warehouse/employee1/.hive-staging_hive_2021-04-08_23-06-23_001_7974131043339100692-1/-ext-10002
Loading data to table default.employee
Loading data to table default.employee1
MapReduce Jobs Launched:
Stage-Stage-2: HDFS Read: 470 HDFS Write: 474 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
hive (default)> select * from employee;
ming	china1
huahua	china1
haha	china
huahua	china1
hive (default)> select * from employee1;
michael	china
ming	china1

3. Dynamic Partition Inserts

hive (default)> from stu s
             > insert overwrite table employee2
             > partition (country, sex)
             > select s.id, s.name, s.country, s.sex;
hive (default)> select * from employee2;
OK
3	haha	china	female
1	michael	china	male
4	huahua	china1	female
2	ming	china1	male
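The definition of employee2 is not shown above. A plausible sketch follows, together with the settings that fully dynamic partition inserts typically require (by default hive.exec.dynamic.partition.mode is strict, which rejects inserts where every partition key is dynamic); the DDL here is an assumption, not taken from the original session:

```sql
-- Assumed definition of employee2: the partition columns must come
-- last in the select list, in the same order they are declared here.
create table employee2 (id int, name string)
  partitioned by (country string, sex string);

-- Allow fully dynamic partition inserts (disabled in strict mode).
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
```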

4. Creating a Table and Loading Data from a Single Query

The table's schema is derived from the select statement.

hive (default)> create table employee3
             > as select id, name from stu
             > where country='china';
hive (default)> select * from employee3;
1	michael
3	haha

This feature (create table … as select) cannot be used with external tables, since an external table's data is not loaded by Hive — it stays at its external location.
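For data that must stay outside the warehouse, the usual pattern is to declare the schema yourself and point the table at the existing directory instead of using CTAS. A minimal sketch, with an illustrative HDFS path:

```sql
-- Hypothetical path; Hive records the location but does not move
-- or copy the files there.
create external table stu_ext (
  id int,
  name string,
  sex string,
  country string)
row format delimited fields terminated by '\t'
location '/data/external/stu';
```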

5. Exporting Data

hive (default)> from stu s
             > insert overwrite local directory '/tmp/employee'
             > select s.id, s.name, s.sex
             > where country='china';

You can write to several directories at once by repeating the insert clause within the same statement.
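A sketch of such a multi-directory export, where a single scan of stu feeds two output directories; the paths and the split by country are illustrative:

```sql
-- One pass over stu, two export directories (paths are illustrative).
from stu s
insert overwrite local directory '/tmp/employee_cn'
  select s.id, s.name where s.country = 'china'
insert overwrite local directory '/tmp/employee_cn1'
  select s.id, s.name where s.country = 'china1';
```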

hive (default)> ! ls /tmp/employee -r;
000000_0
hive (default)> ! cat /tmp/employee/000000_0;
1michaelmale
3hahafemale

(The fields look run together because exported files use Hive's default ^A field delimiter, which is invisible in the terminal.)
