生活随笔
收集整理的這篇文章主要介紹了
Hive分析窗口函数(一) SUM,AVG,MIN,MAX
小編覺得挺不錯(cuò)的,現(xiàn)在分享給大家,幫大家做個(gè)參考.
Hive分析窗口函數(shù)(一) SUM,AVG,MIN,MAX
Hive分析窗口函數(shù)(一) SUM,AVG,MIN,MAX
Hive中提供了越來越多的分析函數(shù),用于完成負(fù)責(zé)的統(tǒng)計(jì)分析。抽時(shí)間將所有的分析窗口函數(shù)理一遍,將陸續(xù)發(fā)布。
今天先看幾個(gè)基礎(chǔ)的,SUM、AVG、MIN、MAX。
用于實(shí)現(xiàn)分組內(nèi)所有和連續(xù)累積的統(tǒng)計(jì)。
數(shù)據(jù)準(zhǔn)備
CREATE EXTERNAL TABLE lxw1234?(cookieid?string,createtime?string,?--daypv INT)?ROW FORMAT DELIMITEDFIELDS TERMINATED BY?','stored?as?textfile location?'/tmp/lxw11/';?DESC lxw1234;cookieid STRINGcreatetime STRINGpv INT?hive>?select?*?from?lxw1234;OKcookie1?2015-04-10?1cookie1?2015-04-11?5cookie1?2015-04-12?7cookie1?2015-04-13?3cookie1?2015-04-14?2cookie1?2015-04-15?4cookie1?2015-04-16?4SUM — 注意,結(jié)果和ORDER BY相關(guān),默認(rèn)為升序
SELECT cookieid,createtime,pv,SUM(pv)?OVER(PARTITION BY cookieid ORDER BY createtime)?AS pv1,?--?默認(rèn)為從起點(diǎn)到當(dāng)前行SUM(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)?AS pv2,?--從起點(diǎn)到當(dāng)前行,結(jié)果同pv1SUM(pv)?OVER(PARTITION BY cookieid)?AS pv3,?--分組內(nèi)所有行SUM(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND CURRENT ROW)?AS pv4,?--當(dāng)前行+往前3行SUM(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND?1?FOLLOWING)?AS pv5,?--當(dāng)前行+往前3行+往后1行SUM(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)?AS pv6?---當(dāng)前行+往后所有行FROM lxw1234;?cookieid createtime pv pv1 pv2 pv3 pv4 pv5 pv6-----------------------------------------------------------------------------cookie1?2015-04-10?1?1?1?26?1?6?26cookie1?2015-04-11?5?6?6?26?6?13?25cookie1?2015-04-12?7?13?13?26?13?16?20cookie1?2015-04-13?3?16?16?26?16?18?13cookie1?2015-04-14?2?18?18?26?17?21?10cookie1?2015-04-15?4?22?22?26?16?20?8cookie1?2015-04-16?4?26?26?26?13?13?4?
pv1: 分組內(nèi)從起點(diǎn)到當(dāng)前行的pv累積,如,11號(hào)的pv1=10號(hào)的pv+11號(hào)的pv, 12號(hào)=10號(hào)+11號(hào)+12號(hào)
pv2: 同pv1
pv3: 分組內(nèi)(cookie1)所有的pv累加
pv4: 分組內(nèi)當(dāng)前行+往前3行,如,11號(hào)=10號(hào)+11號(hào), 12號(hào)=10號(hào)+11號(hào)+12號(hào), 13號(hào)=10號(hào)+11號(hào)+12號(hào)+13號(hào), 14號(hào)=11號(hào)+12號(hào)+13號(hào)+14號(hào)
pv5: 分組內(nèi)當(dāng)前行+往前3行+往后1行,如,14號(hào)=11號(hào)+12號(hào)+13號(hào)+14號(hào)+15號(hào)=5+7+3+2+4=21
pv6: 分組內(nèi)當(dāng)前行+往后所有行,如,13號(hào)=13號(hào)+14號(hào)+15號(hào)+16號(hào)=3+2+4+4=13,14號(hào)=14號(hào)+15號(hào)+16號(hào)=2+4+4=10
?
如果不指定ROWS BETWEEN,默認(rèn)為從起點(diǎn)到當(dāng)前行;
如果不指定ORDER BY,則將分組內(nèi)所有值累加;
關(guān)鍵是理解ROWS BETWEEN含義,也叫做WINDOW子句:
PRECEDING:往前
FOLLOWING:往后
CURRENT ROW:當(dāng)前行
UNBOUNDED:起點(diǎn),UNBOUNDED PRECEDING 表示從前面的起點(diǎn), UNBOUNDED FOLLOWING:表示到后面的終點(diǎn)
–其他AVG,MIN,MAX,和SUM用法一樣。
--AVGSELECT cookieid,createtime,pv,AVG(pv)?OVER(PARTITION BY cookieid ORDER BY createtime)?AS pv1,?--?默認(rèn)為從起點(diǎn)到當(dāng)前行AVG(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)?AS pv2,?--從起點(diǎn)到當(dāng)前行,結(jié)果同pv1AVG(pv)?OVER(PARTITION BY cookieid)?AS pv3, --分組內(nèi)所有行AVG(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND CURRENT ROW)?AS pv4,?--當(dāng)前行+往前3行AVG(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND?1?FOLLOWING)?AS pv5,?--當(dāng)前行+往前3行+往后1行AVG(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)?AS pv6?---當(dāng)前行+往后所有行FROM lxw1234;cookieid createtime pv pv1 pv2 pv3 pv4 pv5 pv6-----------------------------------------------------------------------------cookie1?2015-04-10?1?1.0?1.0?3.7142857142857144?1.0?3.0?3.7142857142857144cookie1?2015-04-11?5?3.0?3.0?3.7142857142857144?3.0?4.333333333333333?4.166666666666667cookie1?2015-04-12?7?4.333333333333333?4.333333333333333?3.7142857142857144?4.333333333333333?4.0?4.0cookie1?2015-04-13?3?4.0?4.0?3.7142857142857144?4.0?3.6?3.25cookie1?2015-04-14?2?3.6?3.6?3.7142857142857144?4.25?4.2?3.3333333333333335cookie1?2015-04-15?4?3.6666666666666665?3.6666666666666665?3.7142857142857144?4.0?4.0?4.0cookie1?2015-04-16?4?3.7142857142857144?3.7142857142857144?3.7142857142857144?3.25?3.25?4.0--MINSELECT cookieid,createtime,pv,MIN(pv)?OVER(PARTITION BY cookieid ORDER BY createtime)?AS pv1,?--?默認(rèn)為從起點(diǎn)到當(dāng)前行MIN(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)?AS pv2,?--從起點(diǎn)到當(dāng)前行,結(jié)果同pv1MIN(pv)?OVER(PARTITION BY cookieid)?AS pv3, --分組內(nèi)所有行MIN(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND CURRENT ROW)?AS pv4,?--當(dāng)前行+往前3行MIN(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND?1?FOLLOWING)?AS pv5,?--當(dāng)前行+往前3行+往后1行MIN(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)?AS pv6?---當(dāng)前行+往后所有行FROM lxw1234;?cookieid createtime pv pv1 pv2 pv3 pv4 pv5 pv6-----------------------------------------------------------------------------cookie1?2015-04-10?1?1?1?1?1?1?1cookie1?2015-04-11?5?1?1?1?1?1?2cookie1?2015-04-12?7?1?1?1?1?1?2cookie1?2015-04-13?3?1?1?1?1?1?2cookie1?2015-04-14?2?1?1?1?2?2?2cookie1?2015-04-15?4?1?1?1?2?2?4cookie1?2015-04-16?4?1?1?1?2?2?4----MAXSELECT cookieid,createtime,pv,MAX(pv)?OVER(PARTITION BY cookieid ORDER BY createtime)?AS pv1,?--?默認(rèn)為從起點(diǎn)到當(dāng)前行MAX(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)?AS pv2,?--從起點(diǎn)到當(dāng)前行,結(jié)果同pv1MAX(pv)?OVER(PARTITION BY cookieid)?AS pv3, --分組內(nèi)所有行MAX(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND CURRENT ROW)?AS pv4,?--當(dāng)前行+往前3行MAX(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN?3?PRECEDING AND?1?FOLLOWING)?AS pv5,?--當(dāng)前行+往前3行+往后1行MAX(pv)?OVER(PARTITION BY cookieid ORDER BY createtime ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)?AS pv6?---當(dāng)前行+往后所有行FROM lxw1234;?cookieid createtime pv pv1 pv2 pv3 pv4 pv5 pv6-----------------------------------------------------------------------------cookie1?2015-04-10?1?1?1?7?1?5?7cookie1?2015-04-11?5?5?5?7?5?7?7cookie1?2015-04-12?7?7?7?7?7?7?7cookie1?2015-04-13?3?7?7?7?7?7?4cookie1?2015-04-14?2?7?7?7?7?7?4cookie1?2015-04-15?4?7?7?7?7?7?4cookie1?2015-04-16?4?7?7?7?4?4?4其他函數(shù)的介紹將陸續(xù)整理發(fā)布。。
轉(zhuǎn)載于:https://www.cnblogs.com/mobiwangyue/p/8087031.html
總結(jié)
以上是生活随笔為你收集整理的Hive分析窗口函数(一) SUM,AVG,MIN,MAX的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問題。
如果覺得生活随笔網(wǎng)站內(nèi)容還不錯(cuò),歡迎將生活随笔推薦給好友。