HiveQL: Queries

Published: 2024/7/5

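Table of contents

    • 1. select from
      • 1.1 Selecting columns with a regular expression
      • 1.2 Computing with column values
      • 1.3 Using functions
      • 1.4 limit: restricting the number of returned rows
      • 1.5 Column aliases: as name
      • 1.6 case when then statements
    • 2. where clause
    • 3. JOIN optimization
    • 4. Sampling queries
    • 5. union all

Notes from the book Programming Hive (《Hive编程指南》).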

1. select from

hive (default)> create table employees(
              > name string,
              > salary float,
              > subordinates array<string>,
              > deductions map<string, float>,
              > address struct<street:string, city:string, state:string, zip:int>)
              > partitioned by(country string, state string);
hive (default)> load data local inpath "/home/hadoop/workspace/employees.txt"
              > overwrite into table employees
              > partition(country='US', state='CA');
Loading data to table default.employees partition (country=US, state=CA)
hive (default)> select * from employees;
John Doe    100000.0    ["Mary Smith","Todd Jones"]    {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}    {"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}    US    CA
Mary Smith    80000.0    ["Bill King"]    {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1}    {"street":"100 Ontario St.","city":"Chicago","state":"IL","zip":60601}    US    CA
Todd Jones    70000.0    []    {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}    {"street":"200 Chicago Ave.","city":"Oak Park","state":"IL","zip":60700}    US    CA
Bill King    60000.0    []    {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}    {"street":"300 Obscure Dr.","city":"Obscuria","state":"IL","zip":60100}    US    CA
Boss Man    200000.0    ["John Doe","Fred Finance"]    {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05}    {"street":"1 Pretentious Drive.","city":"Chicago","state":"IL","zip":60500}    US    CA
Fred Finance    150000.0    ["Stacy Accountant"]    {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05}    {"street":"2 Pretentious Drive.","city":"Chicago","state":"IL","zip":60500}    US    CA
Stacy Accountant    60000.0    []    {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1}    {"street":"300 Main St.","city":"Naperville","state":"IL","zip":60563}    US    CA
  • A table can be given an alias; the two queries below are equivalent
hive (default)> select name, salary from employees;
hive (default)> select e.name, e.salary from employees e;
John Doe    100000.0
Mary Smith    80000.0
Todd Jones    70000.0
Bill King    60000.0
Boss Man    200000.0
Fred Finance    150000.0
Stacy Accountant    60000.0
  • Extract an array element with [idx]; an index that does not exist yields NULL, and the extracted string is printed without quotes
hive (default)> select e.name, e.subordinates[0] from employees e;
John Doe    Mary Smith
Mary Smith    Bill King
Todd Jones    NULL
Bill King    NULL
Boss Man    John Doe
Fred Finance    Stacy Accountant
Stacy Accountant    NULL
  • Extract a map element with [key]
hive (default)> select e.name, e.deductions['State Taxes'] from employees e;
John Doe    0.05
Mary Smith    0.05
Todd Jones    0.03
Bill King    0.03
Boss Man    0.07
Fred Finance    0.07
Stacy Accountant    0.03
  • Extract a field of a struct with the dot operator
hive (default)> select e.name, e.address.city from employees e;
John Doe    Chicago
Mary Smith    Chicago
Todd Jones    Oak Park
Bill King    Obscuria
Boss Man    Chicago
Fred Finance    Chicago
Stacy Accountant    Naperville

1.1 Selecting columns with a regular expression

select `price.*` from stocks;

This selects every column whose name starts with price.
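
One caveat: on Hive 0.13.0 and later, a backtick-quoted string is treated as a literal column name by default, so the regex form only works once hive.support.quoted.identifiers is set to none. A minimal sketch, assuming stocks has a symbol column plus several price_* columns (the schema is not shown in these notes, so those column names are assumptions):

```sql
-- Assumption: stocks(symbol, price_open, price_close, ...); the real schema is not given here.
-- Needed on Hive 0.13.0+ so that backticks are read as a regex rather than a quoted identifier.
set hive.support.quoted.identifiers=none;

-- Returns symbol plus every column whose name matches the regex price.*
select symbol, `price.*` from stocks;
```

On releases before 0.13.0 the set statement should not be needed.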

1.2 Computing with column values

  • Compute the salary after federal taxes
hive (default)> select upper(name), salary, deductions['Federal Taxes'],
              > round(salary*(1-deductions['Federal Taxes'])) from employees;
JOHN DOE    100000.0    0.2    80000.0
MARY SMITH    80000.0    0.2    64000.0
TODD JONES    70000.0    0.15    59500.0
BILL KING    60000.0    0.15    51000.0
BOSS MAN    200000.0    0.3    140000.0
FRED FINANCE    150000.0    0.3    105000.0
STACY ACCOUNTANT    60000.0    0.15    51000.0

1.3 Using functions

  • Aggregate functions
-- hive.map.aggr=true can improve aggregation performance, but needs more memory
set hive.map.aggr=true;
select count(*), avg(salary) from employees;
-- distinct removes duplicate values
select distinct address.city from employees;
  • Table-generating functions expand a single column into multiple rows or multiple columns
hive (default)> select explode(subordinates) as sub from employees;
Mary Smith
Todd Jones
Bill King
John Doe
Fred Finance
Stacy Accountant
  • Built-in functions
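
The built-in functions available in a session can be listed and described from the Hive CLI. A short sketch; upper is picked only because it already appears in the queries above:

```sql
-- List every function known to the current Hive session.
show functions;

-- Print a short description of one function.
describe function upper;

-- Print the description plus a usage example, when the function provides one.
describe function extended upper;
```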

1.4 limit: restricting the number of returned rows

limit n returns at most n rows.
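
As a small illustration on the employees table from section 1 (a sketch, not taken from the original notes):

```sql
-- Returns at most two rows. Without an order by clause,
-- which two rows come back is not guaranteed.
select upper(name), salary from employees limit 2;
```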

1.5 Column aliases: as name
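
A sketch of how a column alias might be used with the employees table from section 1. The alias names and the 70000 threshold are illustrative choices, not from the original notes. Because Hive does not let a where clause refer to a column alias defined in the same select, the second query wraps the aliased expression in a nested select:

```sql
-- Name the computed columns so the result has readable headers.
select upper(name) as name_upper,
       round(salary * (1 - deductions['Federal Taxes'])) as salary_minus_fed_taxes
from employees;

-- Filter on the alias by defining it in a subquery first.
select e.name, e.salary_minus_fed_taxes
from (
  select name,
         round(salary * (1 - deductions['Federal Taxes'])) as salary_minus_fed_taxes
  from employees
) e
where e.salary_minus_fed_taxes > 70000;
```

Judging by the after-tax figures printed in section 1.2, the second query should keep John Doe, Boss Man and Fred Finance.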

1.6 case when then statements

hive (default)> select name, salary,
              > case when salary < 50000 then 'low'
              > else 'high'
              > end as bracket from employees;
John Doe    100000.0    high
Mary Smith    80000.0    high
Todd Jones    70000.0    high
Bill King    60000.0    high
Boss Man    200000.0    high
Fred Finance    150000.0    high
Stacy Accountant    60000.0    high
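
Since every salary in this data set is at least 60000, the single threshold above puts all rows in the same bracket. A sketch with several when branches shows the construct more clearly; the thresholds and labels are arbitrary illustrative values:

```sql
-- Branches are tested top to bottom; the first one that matches wins.
select name, salary,
       case
         when salary < 50000 then 'low'
         when salary >= 50000 and salary < 70000 then 'middle'
         when salary >= 70000 and salary < 100000 then 'high'
         else 'very high'
       end as bracket
from employees;
```

With the rows loaded in section 1, this should split the employees into middle (60000), high (70000 and 80000) and very high (100000 and above).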

2. where clause

  • Filter conditions
  • like, and rlike (regular expressions)
hive (default)> select name, address.street from employees where address.street like "%Ave.";
OK
John Doe    1 Michigan Ave.
Todd Jones    200 Chicago Ave.
hive (default)> select name, address.street from employees where address.street like "%Chi%";
OK
Todd Jones    200 Chicago Ave.
hive (default)> select name, address.street from employees where address.street rlike ".*(Chicago|Ontario).*";
OK
Mary Smith    100 Ontario St.
Todd Jones    200 Chicago Ave.

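3. JOIN optimization

When joining several tables, put the smaller tables on the left. Hive buffers the earlier tables of a join and streams the last one, so the largest table should come last.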
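
A sketch of the table ordering, using two hypothetical tables that are not defined in these notes: a large stocks table and a much smaller dividends table, both assumed to have symbol and ymd columns:

```sql
-- Assumption: dividends is far smaller than stocks; table and column names are hypothetical.
-- The small table comes first, the large table last, so the large one is streamed.
select s.ymd, s.symbol, s.price_close, d.dividend
from dividends d
join stocks s on s.ymd = d.ymd and s.symbol = d.symbol
where s.symbol = 'AAPL';

-- If reordering the tables is inconvenient, the streamtable hint
-- names the table Hive should stream instead.
select /*+ streamtable(s) */ s.ymd, s.symbol, s.price_close, d.dividend
from stocks s
join dividends d on s.ymd = d.ymd and s.symbol = d.symbol
where s.symbol = 'AAPL';
```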

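4. Sampling queries

  • Bucket sampling: tablesample(bucket x out of y on expr) hashes expr into y buckets and returns the rows that fall into bucket x. With rand(), the result changes from run to run
hive> select name from employees tablesample(bucket 3 out of 4 on rand());
John Doe
hive> select name from employees tablesample(bucket 3 out of 4 on rand());
Boss Man
Fred Finance
  • Without rand() (bucketing on a column instead), the result is the same every time
hive> select name from employees tablesample(bucket 3 out of 4 on name);
Mary Smith
Todd Jones
hive> select name from employees tablesample(bucket 3 out of 4 on name);
Mary Smith
Todd Jones
  • Percentage sampling; the percentage applies to the input data size rather than the row count, so the share of rows returned is only approximate
hive> select name from employees tablesample(70 percent);
John Doe
Mary Smith
Todd Jones
Bill King
Boss Man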

5. union all

union all merges the results of several queries. Each query must return the same columns, and the corresponding column types must match.

hive> select name from(
    > select e1.name from employees e1 where e1.name like "Mary%"
    > union all
    > select e2.name from employees e2 where e2.name like "Bill%"
    > ) name_tab
    > sort by name;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20210411221203_b3dde291-8596-4b91-95e0-707eeaa873f6
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
2021-04-11 22:12:04,856 Stage-1 map = 100%, reduce = 100%
Ended Job = job_local1468526053_0003
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 31360 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
Bill King
Mary Smith
