Ordering in Hive: ordered concat, ordered collections, Hive function commands, and the built-in function list
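Preface
I remember using this function before, but during this round of development I could not find it anywhere. I rarely use it, and I had not kept good notes either.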
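Method 1
- GROUP_CONCAT(DISTINCT id ORDER BY id DESC SEPARATOR '_') (MySQL syntax; GROUP_CONCAT does not appear in the Hive function list below)
Others
- CONCAT('My', NULL, 'QL') (returns NULL, because one of the arguments is NULL)
- CONCAT_WS(',', 'First name', NULL, 'Last Name') (NULL arguments after the separator are skipped, giving 'First name,Last Name')
- CONCAT_WS(separator, collect_set(column))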
Method 2
- concat_ws(',', sort_array(collect_set(concat(content_id, '#&', SCORE))))
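Note that sort_array here orders the concatenated strings, not the scores, and strings compare character by character. The following is a minimal shell emulation of that behaviour, not Hive itself: `sort -u` stands in for collect_set plus sort_array, `paste` stands in for concat_ws, and the sample rows and the `ordered_concat` helper name are assumptions made for illustration.

```shell
#!/bin/sh
# Emulates concat_ws(',', sort_array(collect_set(...))) on plain text lines.
# sort -u removes duplicates and sorts byte-wise; paste joins lines with commas.
ordered_concat() {
    LC_ALL=C sort -u | paste -sd, -
}

# Hypothetical "content_id#&score" items, including one duplicate.
result=$(printf '%s\n' '9#&0.8' '10#&0.5' '9#&0.8' '2#&0.9' | ordered_concat)
echo "$result"
```

The output is `10#&0.5,2#&0.9,9#&0.8`: the duplicate is dropped, and "10" sorts before "2" and "9" because the comparison is textual. The result is therefore ordered by the leading content_id text rather than by score.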
Method 3
- A custom UDF
Method 4: a workaround
- concat_ws(',', sort_array(collect_set(concat(1-score, '#&', content_id, '#&', SCORE)))) item_score
- Use a sequence number generated with row_number
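The workaround prefixes each item with 1-score, so that an ascending string sort yields descending scores. Below is a small shell emulation of the idea, again not Hive itself. It assumes scores lie between 0 and 1; the sample rows and the fixed two-decimal formatting of the prefix are my additions, chosen so that the textual order matches the numeric order.

```shell
#!/bin/sh
# Hypothetical sample rows: content_id and score, separated by a space.
rows='a 0.9
b 0.75
c 0.95'

# Build "(1 - score)#&content_id#&score", then sort ascending and join.
# The fixed-width %.2f prefix keeps string order aligned with numeric order.
item_score=$(printf '%s\n' "$rows" |
    LC_ALL=C awk '{ printf "%.2f#&%s#&%s\n", 1 - $2, $1, $2 }' |
    LC_ALL=C sort -u | paste -sd, -)
echo "$item_score"
```

This prints `0.05#&c#&0.95,0.10#&a#&0.9,0.25#&b#&0.75`, with the highest score first. In Hive the prefix comes from floating-point arithmetic on the column, so a value rendered with a long fraction or in scientific notation could sort differently. A zero-padded sequence number from row_number avoids that dependence on number formatting.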
Follow-up
Hive functions: viewing, adding, and deleting
- Complete list of Hive functions in the current version
- Found a problem: a registered UDF can no longer be deleted
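As a sketch of the statements involved, the script below writes the usual Hive commands for viewing, adding, and deleting functions to a file that can be run with `hive -f manage_functions.hql`. The function name `ordered_concat`, the class `com.example.udf.OrderedConcat`, and both jar paths are hypothetical placeholders, not taken from the original notes.

```shell
#!/bin/sh
# Write the statements to a file so they can be reviewed before running.
# All names and paths below are placeholders for illustration only.
cat > manage_functions.hql <<'EOF'
-- View: list every function, then describe one of them.
SHOW FUNCTIONS;
DESCRIBE FUNCTION EXTENDED concat_ws;

-- Add: a session-scoped function, and a permanent one kept in the metastore.
ADD JAR /tmp/my-udfs.jar;
CREATE TEMPORARY FUNCTION ordered_concat AS 'com.example.udf.OrderedConcat';
CREATE FUNCTION default.ordered_concat AS 'com.example.udf.OrderedConcat'
  USING JAR 'hdfs:///tmp/my-udfs.jar';

-- Delete: temporary and permanent functions use different statements.
DROP TEMPORARY FUNCTION IF EXISTS ordered_concat;
DROP FUNCTION IF EXISTS default.ordered_concat;
EOF
echo "wrote manage_functions.hql"
```

When a UDF seems impossible to drop, one thing worth checking is whether it was registered as temporary or permanent, and under which database name. The notes do not say which case applied here, so this is only a possible explanation.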
Three ways to read a file
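Whichever loop form is used, a function list that contains operator names such as `*` needs care, because the shell expands an unquoted `*` to file names. The sketch below reproduces the effect in a scratch directory and shows a quoted `while read` loop that avoids it. The sample file contents and the variable names are assumptions made for the demonstration.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch data test.sh                       # files an unquoted * would expand to
printf '%s\n' '!' '*' 'abs' > functions.txt

# Unsafe: the unquoted command substitution is word-split and glob-expanded,
# so the * line turns into the names of the files in the directory.
unsafe=$(for line in $(cat functions.txt); do
    echo "desc function '${line}';"
done)

# Safe: read one line at a time and keep every expansion quoted.
safe=$(while IFS= read -r line; do
    printf "desc function '%s';\n" "$line"
done < functions.txt)

echo "$unsafe"
echo "---"
echo "$safe"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

In the unsafe output the `*` line is replaced by statements for `data`, `functions.txt`, and `test.sh`. The safe loop emits exactly one statement per input line, including `desc function '*';`.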
1. for line in `cat functions.txt`; do echo "desc function '${line}';" >> asdf.txt; done
2. for line in `cat functions.txt`
   do
       echo ${line}
   done
3. cat functions.txt | while read line
   do
       echo $line
   done
4. while read line
   do
       echo $line
   done < functions.txt
The `*` entry causes a problem: the shell expands it to the names of the files in the current directory, because the backquoted `cat` output in forms 1 and 2 and the unquoted `$line` in forms 3 and 4 are subject to filename expansion. The generated file then contains:
desc function 'app3.0.log'
desc function 'application_1577181410627_109940.log'
desc function 'asdf.txt'
desc function 'data'
desc function 'dealer.sh'
desc function 'flume'
desc function 'functions.txt'
desc function 'qwer.txt'
desc function 'showallfuncs.sh'
desc function 'sqoop'
desc function 'test'
desc function 'test.sh'
desc function 'wxapp_kafka.log'
desc function 'wxapp_sqlserver_open.sh'
Hive built-in functions
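- The current version lists 250 functions. Three of them (corr, count, parse_url_tuple) have descriptions that span multiple lines in the `desc function` output.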
| ! | ! a-邏輯非 | ! a - Logical not |
| != | a!=b-如果a不等于b,則返回TRUE | a != b - Returns TRUE if a is not equal to b |
| $sum0 | $sum0(x)-返回一組數(shù)字的總和,如果為空,則返回零 | $sum0(x) - Returns the sum of a set of numbers, zero if empty |
| % | a%b-返回a除以b時(shí)的余數(shù) | a % b - Returns the remainder when dividing a by b |
| & | a&b-按位與 | a & b - Bitwise and |
| * | a*b-將a乘以b | a * b - Multiplies a by b |
| + | a+b-返回a+b | a + b - Returns a+b |
| - | a-b-返回差分a-b | a - b - Returns the difference a-b |
| / | a/b-將a除以b | a / b - Divide a by b |
| < | a<b-如果a小于b,則返回TRUE | a < b - Returns TRUE if a is less than b |
| <= | a<=b-如果a不大于b,則返回TRUE | a <= b - Returns TRUE if a is not greater than b |
| <=> | 對于非空操作數(shù),a<=>b-返回相同的結(jié)果,如果兩個(gè)操作數(shù)都為null,則返回TRUE;如果其中一個(gè)操作數(shù)為null,則返回FALSE | a <=> b - Returns same result with EQUAL(=) operator for non-null operands, but returns TRUE if both are NULL, FALSE if one of the them is NULL |
| <> | a<>b-如果a不等于b,則返回TRUE | a <> b - Returns TRUE if a is not equal to b |
| = | a=b-如果a等于b,則返回TRUE,否則返回false | a = b - Returns TRUE if a equals b and false otherwise |
| == | a==b-如果a等于b,則返回TRUE,否則返回false | a == b - Returns TRUE if a equals b and false otherwise |
| > | a>b-如果a大于b,則返回TRUE | a > b - Returns TRUE if a is greater than b |
| >= | a>=b-如果a不小于b,則返回TRUE | a >= b - Returns TRUE if a is not smaller than b |
| ^ | a^b—按位異或 | a ^ b - Bitwise exclusive or |
| abs | abs(x)-返回x的絕對值 | abs(x) - returns the absolute value of x |
| acos | acos(x)-如果-1<=x<=1,則返回x的反余弦;否則返回NULL | acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise |
| add_months | add_months(start_date,num_months,output_date_format)-返回開始日期后num_months的日期。 | add_months(start_date, num_months, output_date_format) - Returns the date that is num_months after start_date. |
| and | a1和a2還有。。。和-邏輯and | a1 and a2 and … and an - Logical and |
| array | array(n0,n1…)-用給定的元素創(chuàng)建一個(gè)數(shù)組 | array(n0, n1…) - Creates an array with the given elements |
| array_contains | array_contains(array,value)-如果數(shù)組包含值,則返回TRUE。 | array_contains(array, value) - Returns TRUE if the array contains value. |
| ascii | ascii(str)-返回str的第一個(gè)字符的數(shù)值 | ascii(str) - returns the numeric value of the first character of str |
| asin | asin(x)-如果-1<=x<=1,則返回x的弧正弦;否則返回NULL | asin(x) - returns the arc sine of x if -1<=x<=1 or NULL otherwise |
| assert_true | assert\u true(condition)-如果“condition”不為true,則引發(fā)異常。 | assert_true(condition) - Throw an exception if ‘condition’ is not true. |
| atan | 返回x的atan(arctan)(x以弧度表示) | atan(x) - returns the atan (arctan) of x (x is in radians) |
| avg | 平均數(shù)(a)的返回?cái)?shù)集 | avg(x) - Returns the mean of a set of numbers |
| base64 | base64(bin)-將參數(shù)從二進(jìn)制轉(zhuǎn)換為base64字符串 | base64(bin) - Convert the argument from binary to a base 64 string |
| between | 在a之間[不是]在b和c之間-評估a是否在b和c之間 | between a [NOT] BETWEEN b AND c - evaluate if a is [not] in between b and c |
| bin | bin(n)-以二進(jìn)制形式返回n | bin(n) - returns n in binary |
| bround | bround(x[,d])—使用半偶數(shù)舍入模式將x舍入到d個(gè)小數(shù)位。 | bround(x[, d]) - round x to d decimal places using HALF_EVEN rounding mode. |
| case | CASE a WHEN b THEN c[WHEN d THEN e]*[ELSE f]END-當(dāng)a=b時(shí),返回c;當(dāng)a=d時(shí),返回e;否則返回f | CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END - When a = b, returns c; when a = d, return e; else return f |
| cbrt | cbrt(double)-返回double值的立方根。 | cbrt(double) - Returns the cube root of a double value. |
| ceil | ceil(x)-找到不小于x的最小整數(shù) | ceil(x) - Find the smallest integer not smaller than x |
| ceiling | 天花板(x)-找到不小于x的最小整數(shù) | ceiling(x) - Find the smallest integer not smaller than x |
| chr | chr(str)-將n(其中n:[0,256)轉(zhuǎn)換為ascii等效值,作為varchar如果n小于0返回空字符串。如果n>256,返回chr(n%256)。 | chr(str) - convert n where n : [0, 256) into the ascii equivalent as a varchar.If n is less than 0 return the empty string. If n > 256, return chr(n % 256). |
| coalesce | coalesce(a1,a2,…)—返回第一個(gè)非空參數(shù) | coalesce(a1, a2, …) - Returns the first non-null argument |
| collect_list | collect\u list(x)-返回具有重復(fù)項(xiàng)的對象列表 | collect_list(x) - Returns a list of objects with duplicates |
| collect_set | collect_set(x)-返回一組消除了重復(fù)元素的對象 | collect_set(x) - Returns a set of objects with duplicate elements eliminated |
| compute_stats | compute_stats(x)-返回一組基元類型值的統(tǒng)計(jì)摘要。 | compute_stats(x) - Returns the statistical summary of a set of primitive type values. |
| concat | 混凝土(str1,str2。。。strN)-返回str1、str2、。。。strN或concat(bin1,bin2。。。binN)-返回二進(jìn)制數(shù)據(jù)bin1,bin2,…中字節(jié)的串聯(lián)。。。賓恩 | concat(str1, str2, … strN) - returns the concatenation of str1, str2, … strN or concat(bin1, bin2, … binN) - returns the concatenation of bytes in binary data bin1, bin2, … binN |
| concat_ws | concat_ws(separator,[string | array(string)]+)—返回由分隔符分隔的字符串的串聯(lián)。 |
| context_ngrams | 上下文語法(expr,array<string1,string2,…>,k,pf)估計(jì)符合指定上下文的前k個(gè)最頻繁的n-gram。第二個(gè)參數(shù)指定一個(gè)字符串,指定n個(gè)gram元素的位置,空值代表必須由n-gram元素填充的“blank”。 | context_ngrams(expr, array<string1, string2, …>, k, pf) estimates the top-k most frequent n-grams that fit into the specified context. The second parameter specifies a string of words that specify the positions of the n-gram elements, with a null value standing in for a ‘blank’ that must be filled by an n-gram element. |
| conv | conv(num,from_base,to_base)-將num from_base轉(zhuǎn)換為_base | conv(num, from_base, to_base) - convert num from from_base to to_base |
| corr | corr(x,y)-返回皮爾遜相關(guān)系數(shù) | corr(x,y) - Returns the Pearson coefficient of correlation |
| 在一組數(shù)對之間 | between a set of number pairs | |
| cos | cos(x)-返回x的余弦(x以弧度表示) | cos(x) - returns the cosine of x (x is in radians) |
| count | count(*)-返回已檢索行的總數(shù),包括包含空值的行。 | count(*) - Returns the total number of retrieved rows, including rows containing NULL values. |
| count(expr)-返回提供的表達(dá)式為非NULL的行數(shù)。 | count(expr) - Returns the number of rows for which the supplied expression is non-NULL. | |
| count(DISTINCT expr[,expr…])—返回所提供表達(dá)式唯一且非空的行數(shù)。 | count(DISTINCT expr[, expr…]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL. | |
| covar_pop | covar_pop(x,y)-返回一組數(shù)對的總體協(xié)方差 | covar_pop(x,y) - Returns the population covariance of a set of number pairs |
| covar_samp | covar_samp(x,y)-返回一組數(shù)對的樣本協(xié)方差 | covar_samp(x,y) - Returns the sample covariance of a set of number pairs |
| crc32 | crc32(str或bin)-計(jì)算字符串或二進(jìn)制參數(shù)的循環(huán)冗余校驗(yàn)值,并返回bigint值。 | crc32(str or bin) - Computes a cyclic redundancy check value for string or binary argument and returns bigint value. |
| create_union | create_union(tag,obj1,obj2,obj3,…)—為給定的標(biāo)記創(chuàng)建一個(gè)與對象的聯(lián)合 | create_union(tag, obj1, obj2, obj3, …) - Creates a union with the object for given tag |
| cume_dist | 函數(shù)“cume_dist”沒有文檔 | There is no documentation for function ‘cume_dist’ |
| current_database | current_database()-返回當(dāng)前使用的數(shù)據(jù)庫名稱 | current_database() - returns currently using database name |
| current_date | current_date()—返回查詢計(jì)算開始時(shí)的當(dāng)前日期。同一查詢中所有當(dāng)前日期的調(diào)用都返回相同的值。 | current_date() - Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value. |
| current_timestamp | current_timestamp()—返回查詢計(jì)算開始時(shí)的當(dāng)前時(shí)間戳。在同一個(gè)查詢中對當(dāng)前時(shí)間戳的所有調(diào)用都返回相同的值。 | current_timestamp() - Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value. |
| current_user | current_user()—返回當(dāng)前用戶名 | current_user() - Returns current user name |
| date_add | date_add(start_date,num_days)-返回開始日期后num_days的日期。 | date_add(start_date, num_days) - Returns the date that is num_days after start_date. |
| date_format | date_format(date/timestamp/string,fmt)-將日期/時(shí)間戳/string轉(zhuǎn)換為日期格式fmt指定格式的字符串值。 | date_format(date/timestamp/string, fmt) - converts a date/timestamp/string to a value of string in the format specified by the date format fmt. |
| date_sub | date_sub(start_date,num_days)-返回開始日期之前num_days的日期。 | date_sub(start_date, num_days) - Returns the date that is num_days before start_date. |
| datediff | datediff(date1,date2)-返回date1和date2之間的天數(shù) | datediff(date1, date2) - Returns the number of days between date1 and date2 |
| day | day(param)-返回日期/時(shí)間戳所在月份的日期,或interval的day組件 | day(param) - Returns the day of the month of date/timestamp, or day component of interval |
| dayofmonth | dayofmonth(param)-返回日期/時(shí)間戳所在月份的日期,或間隔的日組件 | dayofmonth(param) - Returns the day of the month of date/timestamp, or day component of interval |
| dayofweek | dayofweek(param)-返回日期/時(shí)間戳的星期幾(1=星期日,2=星期一,…,7=星期六) | dayofweek(param) - Returns the day of the week of date/timestamp (1 = Sunday, 2 = Monday, …, 7 = Saturday) |
| decode | decode(bin,str)-使用第二個(gè)參數(shù)字符集解碼第一個(gè)參數(shù) | decode(bin, str) - Decode the first argument using the second argument character set |
| default.produdfone | 功能’默認(rèn)值.produdfone’不存在。 | Function ‘default.produdfone’ does not exist. |
| degrees | 度(x)-將弧度轉(zhuǎn)換為度 | degrees(x) - Converts radians to degrees |
| dense_rank | 沒有關(guān)于函數(shù)“稠密等級”的文檔 | There is no documentation for function ‘dense_rank’ |
| div | a div b-將a除以b四舍五入到長整數(shù) | a div b - Divide a by b rounded to the long integer |
| e | e()—返回e | e() - returns E |
| elt | elt(n,str1,str2,…)—返回第n個(gè)字符串 | elt(n, str1, str2, …) - returns the n-th string |
| encode | encode(str,str)-使用第二個(gè)參數(shù)字符集對第一個(gè)參數(shù)進(jìn)行編碼 | encode(str, str) - Encode the first argument using the second argument character set |
| ewah_bitmap | ewah_bitmap(expr)-返回列的ewah壓縮位圖表示。 | ewah_bitmap(expr) - Returns an EWAH-compressed bitmap representation of a column. |
| ewah_bitmap_and | ewah_bitmap_and(b1,b2)-返回兩個(gè)位圖中按位“與”的ewah壓縮位圖。 | ewah_bitmap_and(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise AND of two bitmaps. |
| ewah_bitmap_empty | ewah_bitmap_empty(bitmap)-測試ewah壓縮位圖是否全為零的謂詞 | ewah_bitmap_empty(bitmap) - Predicate that tests whether an EWAH-compressed bitmap is all zeros |
| ewah_bitmap_or | ewah_bitmap_or(b1,b2)-返回兩個(gè)位圖中按位或的ewah壓縮位圖。 | ewah_bitmap_or(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise OR of two bitmaps. |
| exp | 返回x的冪 | exp(x) - Returns e to the power of x |
| explode | 分解(a)-將數(shù)組a的元素拆分為多行,或?qū)⒂成涞脑夭鸱譃槎嘈泻投嗔?/td> | explode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns |
| factorial | factorial(int)-返回n個(gè)factorial。有效n為[0…20]。 | factorial(int) - Returns n factorial. Valid n is [0…20]. |
| field | 字段(str,str1,str2,…)—返回str1,str2,….中str的索引,。。。列表或0(如果未找到) | field(str, str1, str2, …) - returns the index of str in the str1,str2,… list or 0 if not found |
| find_in_set | find_in_set(str,str_array)-返回str_數(shù)組中第一個(gè)出現(xiàn)的str,其中str_array是逗號分隔的字符串。如果任一參數(shù)為null,則返回null。如果第一個(gè)參數(shù)有逗號,則返回0。 | find_in_set(str,str_array) - Returns the first occurrence of str in str_array where str_array is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument has any commas. |
| first_value | 函數(shù)“first_value”沒有文檔 | There is no documentation for function ‘first_value’ |
| floor | floor(x)-查找不大于x的最大整數(shù) | floor(x) - Find the largest integer not greater than x |
| floor_day | floor_day(param)-返回一天粒度的時(shí)間戳 | floor_day(param) - Returns the timestamp at a day granularity |
| floor_hour | floor_hour(param)-返回小時(shí)粒度的時(shí)間戳 | floor_hour(param) - Returns the timestamp at a hour granularity |
| floor_minute | floor_minute(param)-以分鐘粒度返回時(shí)間戳 | floor_minute(param) - Returns the timestamp at a minute granularity |
| floor_month | floor_month(param)-返回月份粒度的時(shí)間戳 | floor_month(param) - Returns the timestamp at a month granularity |
| floor_quarter | floor_quarter(param)-返回四分之一粒度的時(shí)間戳 | floor_quarter(param) - Returns the timestamp at a quarter granularity |
| floor_second | floor\u second(param)-返回秒粒度的時(shí)間戳 | floor_second(param) - Returns the timestamp at a second granularity |
| floor_week | floor\u week(param)-以周粒度返回時(shí)間戳 | floor_week(param) - Returns the timestamp at a week granularity |
| floor_year | floor_year(param)-返回以年為單位的時(shí)間戳 | floor_year(param) - Returns the timestamp at a year granularity |
| format_number | format_number(X,D或F)-將數(shù)字X格式化為“#,###,############################。如果D為0,則結(jié)果沒有小數(shù)點(diǎn)或小數(shù)部分。它的功能應(yīng)該類似于MySQL的格式 | format_number(X, D or F) - Formats the number X to a format like ‘#,###,###.##’, rounded to D decimal places, Or Uses the format specified F to format, and returns the result as a string. If D is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL’s FORMAT |
| from_unixtime | from\u unixtime(unix_time,format)-返回指定格式的unix時(shí)間 | from_unixtime(unix_time, format) - returns unix_time in the specified format |
| from_utc_timestamp | from_utc_timestamp(timestamp,string timezone)-假定給定的時(shí)間戳為utc并轉(zhuǎn)換為給定的時(shí)區(qū)(從配置單元0.8.0開始) | from_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is UTC and converts to given timezone (as of Hive 0.8.0) |
| get_json_object | get_json_object(json_txt,path)-從path中提取一個(gè)json對象 | get_json_object(json_txt, path) - Extract a json object from path |
| get_splits | get_splits(string,int)-返回被引用表string的長度為int的序列化splits數(shù)組。 | get_splits(string,int) - Returns an array of length int serialized splits for the referenced tables string. |
| greatest | 最大值(v1,v2,…)—返回值列表中的最大值 | greatest(v1, v2, …) - Returns the greatest value in a list of values |
| grouping | 分組(a,b)-指示中的指定列表達(dá)式是否聚合。返回1表示聚合,返回0表示未聚合。 | grouping(a, b) - Indicates whether a specified column expression in is aggregated or not. Returns 1 for aggregated or 0 for not aggregated. |
| hash | hash(a1,a2,…)—返回參數(shù)的哈希值 | hash(a1, a2, …) - Returns a hash value of the arguments |
| hex | 十六進(jìn)制(n、bin或str)-將參數(shù)轉(zhuǎn)換為十六進(jìn)制 | hex(n, bin, or str) - Convert the argument to hexadecimal |
| histogram_numeric | histogram_numeric(expr,nb)-使用nb bin計(jì)算數(shù)值“expr”的直方圖。 | histogram_numeric(expr, nb) - Computes a histogram on numeric ‘expr’ using nb bins. |
| hour | hour(param)-返回字符串/timestamp/interval的小時(shí)組成 | hour(param) - Returns the hour componemnt of the string/timestamp/interval |
| if | IF(expr1,expr2,expr3)-如果expr1為真(expr1<>0和expr1<>NULL),則IF()返回expr2;否則返回expr3。IF()返回?cái)?shù)值或字符串值,具體取決于使用它的上下文。 | IF(expr1,expr2,expr3) - If expr1 is TRUE (expr1 <> 0 and expr1 <> NULL) then IF() returns expr2; otherwise it returns expr3. IF() returns a numeric or string value, depending on the context in which it is used. |
| in | test in(val1,val2…)-如果test等于任何valN,則返回true | test in(val1, val2…) - returns true if test equals any valN |
| in_file | in_file(str,filename)-如果str出現(xiàn)在文件中,則返回true | in_file(str, filename) - Returns true if str appears in the file |
| index | index(a,n)-返回 | index(a, n) - Returns the n-th element of a |
| initcap | initcap(str)-返回str,每個(gè)單詞的第一個(gè)字母都是大寫,所有其他字母都是小寫。單詞用空格分隔。 | initcap(str) - Returns str, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space. |
| inline | inline(ARRAY(STRUCT()[,STRUCT()])-將數(shù)組和結(jié)構(gòu)分解為表 | inline( ARRAY( STRUCT()[,STRUCT()] - explodes and array and struct into a table |
| instr | instr(str,substr)-返回str中第一次出現(xiàn)substr的索引 | instr(str, substr) - Returns the index of the first occurance of substr in str |
| internal_interval | 內(nèi)部間隔(intervalType,intervalArg) | internal_interval(intervalType,intervalArg) |
| isnotnull | isnotnull a-如果a不為NULL,則返回true,否則返回false | isnotnull a - Returns true if a is not NULL and false otherwise |
| isnull | isnull a-如果a為NULL,則返回true,否則返回false | isnull a - Returns true if a is NULL and false otherwise |
| java_method | java_方法(class,method[,arg1[,arg2…]])使用反射調(diào)用方法 | java_method(class,method[,arg1[,arg2…]]) calls method with reflection |
| json_tuple | json元組(jsonStr,p1,p2,…,pn)類似get_json_對象,但它使用多個(gè)名稱并返回一個(gè)元組。所有的輸入?yún)?shù)和輸出列類型都是字符串。 | json_tuple(jsonStr, p1, p2, …, pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string. |
| lable.produdfone | 功能’produdfone標(biāo)簽’不存在。 | Function ‘lable.produdfone’ does not exist. |
| lag | LAG(標(biāo)量表達(dá)式[,offset][,default])OVER([query_partition_clause]order_by_子句);LAG函數(shù)用于訪問前一行的數(shù)據(jù)。 | LAG (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LAG function is used to access data from a previous row. |
| last_day | last_day(date)-返回日期所屬月份的最后一天。 | last_day(date) - Returns the last day of the month which the date belongs to. |
| last_value | 函數(shù)“l(fā)ast_value”沒有文檔 | There is no documentation for function ‘last_value’ |
| lcase | lcase(str)-返回所有字符都改為小寫的str | lcase(str) - Returns str with all characters changed to lowercase |
| lead | LEAD(標(biāo)量_expression[,offset][,default])OVER([query_partition_clause]order_by_子句);LEAD函數(shù)用于從下一行返回?cái)?shù)據(jù)。 | LEAD (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LEAD function is used to return data from the next row. |
| least | least(v1,v2,…)—返回值列表中的最小值 | least(v1, v2, …) - Returns the least value in a list of values |
| length | length(str | binary)-返回str的長度或二進(jìn)制數(shù)據(jù)中的字節(jié)數(shù) |
| levenshtein | levenshtein(str1,str2)-此函數(shù)計(jì)算兩個(gè)字符串之間的levenshtein距離。 | levenshtein(str1, str2) - This function calculates the Levenshtein distance between two strings. |
| like | like(str,pattern)-檢查str是否與pattern匹配 | like(str, pattern) - Checks if str matches pattern |
| ln | ln(x)-返回x的自然對數(shù) | ln(x) - Returns the natural logarithm of x |
| locate | locate(substr,str[,pos])—返回str中第一個(gè)在pos位置之后出現(xiàn)的substr的位置 | locate(substr, str[, pos]) - Returns the position of the first occurance of substr in str after position pos |
| log | log([b],x)-返回以b為底的x的對數(shù) | log([b], x) - Returns the logarithm of x with base b |
| log10 | log10(x)-返回以10為底的x的對數(shù) | log10(x) - Returns the logarithm of x with base 10 |
| log2 | log2(x)-返回以2為底的x的對數(shù) | log2(x) - Returns the logarithm of x with base 2 |
| logged_in_user | logged_in_user()-返回登錄用戶名 | logged_in_user() - Returns logged in user name |
| lower | lower(str)-返回所有字符都改為小寫的str | lower(str) - Returns str with all characters changed to lowercase |
| lpad | lpad(str,len,pad)-返回str,left padded with pad的長度為len | lpad(str, len, pad) - Returns str, left-padded with pad to a length of len |
| ltrim | ltrim(str)-刪除str中的前導(dǎo)空格字符 | ltrim(str) - Removes the leading space characters from str |
| map | map(key0,value0,key1,value1…)-使用給定的鍵/值對創(chuàng)建映射 | map(key0, value0, key1, value1…) - Creates a map with the given key/value pairs |
| map_keys | map_keys(map)-返回包含輸入映射鍵的無序數(shù)組。 | map_keys(map) - Returns an unordered array containing the keys of the input map. |
| map_values | map_values(map)-返回包含輸入映射值的無序數(shù)組。 | map_values(map) - Returns an unordered array containing the values of the input map. |
| mask | 屏蔽給定值 | masks the given value |
| mask_first_n | 屏蔽值的前n個(gè)字符 | masks the first n characters of the value |
| mask_hash | 返回給定值的哈希值 | returns hash of the given value |
| mask_last_n | 遮罩值的最后n個(gè)字符 | masks the last n characters of the value |
| mask_show_first_n | 屏蔽值的前n個(gè)字符之外的所有字符 | masks all but first n characters of the value |
| mask_show_last_n | 屏蔽值的最后n個(gè)字符 | masks all but last n characters of the value |
| matchpath | 沒有函數(shù)“matchpath”的文檔 | There is no documentation for function ‘matchpath’ |
| max | max(expr)-返回expr的最大值 | max(expr) - Returns the maximum value of expr |
| md5 | md5(str或bin)-計(jì)算字符串或二進(jìn)制文件的md5128位校驗(yàn)和。 | md5(str or bin) - Calculates an MD5 128-bit checksum for the string or binary. |
| min | min(expr)-返回expr的最小值 | min(expr) - Returns the minimum value of expr |
| minute | minute(param)-返回字符串/timestamp/interval的分鐘組件 | minute(param) - Returns the minute component of the string/timestamp/interval |
| month | month(param)-返回日期/時(shí)間戳/間隔的月份組件 | month(param) - Returns the month component of the date/timestamp/interval |
| months_between | months\u between(date1,date2)-返回日期date1和date2之間的月數(shù) | months_between(date1, date2) - returns number of months between dates date1 and date2 |
| named_struct | named_struct(name1,val1,name2,val2,…)—使用給定的字段名和值創(chuàng)建一個(gè)結(jié)構(gòu) | named_struct(name1, val1, name2, val2, …) - Creates a struct with the given field names and values |
| negative | 負(fù)a-返回-a | negative a - Returns -a |
| next_day | next_day(start_date,week的day)-返回晚于start_date并按指示命名的第一個(gè)日期。 | next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated. |
| ngrams | 由串的數(shù)組組成的數(shù)組pf’是控制內(nèi)存使用的可選精度因子。 | ngrams(expr, n, k, pf) - Estimates the top-k n-grams in rows that consist of sequences of strings, represented as arrays of strings, or arrays of arrays of strings. ‘pf’ is an optional precision factor that controls memory usage. |
| noop | 沒有函數(shù)“noop”的文檔 | There is no documentation for function ‘noop’ |
| noopstreaming | 沒有“noopstreaming”函數(shù)的文檔 | There is no documentation for function ‘noopstreaming’ |
| noopwithmap | 沒有函數(shù)“noopwithmap”的文檔 | There is no documentation for function ‘noopwithmap’ |
| noopwithmapstreaming | 函數(shù)“noopwithmapstreaming”沒有文檔 | There is no documentation for function ‘noopwithmapstreaming’ |
| not | 不是-邏輯上不是 | not a - Logical not |
| ntile | 沒有函數(shù)“ntile”的文檔 | There is no documentation for function ‘ntile’ |
| nvl | nvl(value,default_value)-如果value為null,則返回默認(rèn)值,否則返回value | nvl(value,default_value) - Returns default value if value is null else returns value |
| or | a1或a2或。。。或-邏輯or | a1 or a2 or … or an - Logical or |
| parse_url | parse_url(url,partToExtract[,key])-從url中提取部分 | parse_url(url, partToExtract[, key]) - extracts a part from a URL |
| parse_url_tuple | parse-url元組(url,partname1,partname2,…,partnameN)-從url中提取N(N>=1)個(gè)部分。 | parse_url_tuple(url, partname1, partname2, …, partnameN) - extracts N (N>=1) parts from a URL. |
| 它接受一個(gè)URL和一個(gè)或多個(gè)partname,并返回一個(gè)tuple。所有的輸入?yún)?shù)和輸出列類型都是字符串。 | It takes a URL and one or multiple partnames, and returns a tuple. All the input parameters and output column types are string. | |
| percent_rank | 沒有“percent_rank”函數(shù)的文檔 | There is no documentation for function ‘percent_rank’ |
| percentile | percentile(expr,pc)-返回pc(范圍:[0,1])處expr的百分比。pc可以是雙數(shù)組或雙數(shù)組 | percentile(expr, pc) - Returns the percentile(s) of expr at pc (range: [0,1]).pc can be a double or double array |
| percentile_approx | percentile_approach(expr,pc,[nb])-對于非常大的數(shù)據(jù),使用可選參數(shù)[nb]作為要使用的直方圖箱數(shù),從直方圖計(jì)算近似百分位值。nb值越大,近似值就越精確,代價(jià)是內(nèi)存使用率越高。 | percentile_approx(expr, pc, [nb]) - For very large data, computes an approximate percentile value from a histogram, using the optional argument [nb] as the number of histogram bins to use. A higher value of nb results in a more accurate approximation, at the cost of higher memory usage. |
| pi | pi()—返回pi | pi() - returns pi |
| pmod | pmodb-計(jì)算正模 | a pmod b - Compute the positive modulo |
| posexplode | posexplode(a)-行為類似于數(shù)組的explode,但包含了原始數(shù)組中項(xiàng)的位置 | posexplode(a) - behaves like explode for arrays, but includes the position of items in the original array |
| positive | 正a-返回a | positive a - Returns a |
| pow | pow(x1,x2)-將x1提升到x2的冪 | pow(x1, x2) - raise x1 to the power of x2 |
| power | 功率(x1,x2)-將x1提升到x2的冪 | power(x1, x2) - raise x1 to the power of x2 |
| printf | printf(字符串格式,對象。。。args)-可以根據(jù)printf樣式格式字符串格式化字符串的函數(shù) | printf(String format, Obj… args) - function that can format strings according to printf-style format strings |
| quarter | quarter(date/timestamp/string)-返回日期所在的季度,范圍為1到4。 | quarter(date/timestamp/string) - Returns the quarter of the year for date, in the range 1 to 4. |
| radians | 弧度(x)-將度數(shù)轉(zhuǎn)換為弧度 | radians(x) - Converts degrees to radians |
| rand | rand([seed])—返回介于0和1之間的偽隨機(jī)數(shù) | rand([seed]) - Returns a pseudorandom number between 0 and 1 |
| rank | 沒有函數(shù)“rank”的文檔 | There is no documentation for function ‘rank’ |
| reflect | reflect(class,method[,arg1[,arg2…]])使用反射調(diào)用方法 | reflect(class,method[,arg1[,arg2…]]) calls method with reflection |
| reflect2 | reflect2(arg0,method[,arg1[,arg2…]])使用反射調(diào)用arg0的方法 | reflect2(arg0,method[,arg1[,arg2…]]) calls method of arg0 with reflection |
| regexp | str regexp regexp-如果str與regexp匹配,則返回true,否則返回false | str regexp regexp - Returns true if str matches regexp and false otherwise |
| regexp_extract | regexp_extract(str,regexp[,idx])-提取與regexp匹配的組 | regexp_extract(str, regexp[, idx]) - extracts a group that matches regexp |
| regexp_replace | regexp_replace(str,regexp,rep)-用rep替換匹配regexp的str的所有子字符串 | regexp_replace(str, regexp, rep) - replace all substrings of str that match regexp with rep |
| repeat | 重復(fù)(str,n)-重復(fù)str n次 | repeat(str, n) - repeat str n times |
| replace | replace(str,search,rep)-將“search”與“rep”匹配的所有子字符串替換為“str” | replace(str, search, rep) - replace all substrings of ‘str’ that match ‘search’ with ‘rep’ |
| reverse | 反向(str)-反向str | reverse(str) - reverse str |
| rlike | str rlike regexp-如果str與regexp匹配,則返回true,否則返回false | str rlike regexp - Returns true if str matches regexp and false otherwise |
| round | 舍入(x[,d])—將x舍入到d個(gè)小數(shù)位 | round(x[, d]) - round x to d decimal places |
| row_number | 沒有“row_number”函數(shù)的文檔 | There is no documentation for function ‘row_number’ |
| rpad | rpad(str,len,pad)-返回str,右填充pad到len的長度 | rpad(str, len, pad) - Returns str, right-padded with pad to a length of len |
| rtrim | rtrim(str)-刪除str中的尾隨空格字符 | rtrim(str) - Removes the trailing space characters from str |
| second | second(date)-返回字符串/timestamp/interval的第二個(gè)組件 | second(date) - Returns the second component of the string/timestamp/interval |
| sentences | 句子(str,lang,country)-將str拆分為句子數(shù)組,其中每個(gè)句子都是一個(gè)單詞數(shù)組。“l(fā)ang”和“country”參數(shù)是可選的,如果省略,則使用默認(rèn)的區(qū)域設(shè)置。 | sentences(str, lang, country) - Splits str into arrays of sentences, where each sentence is an array of words. The ‘lang’ and’country’ arguments are optional, and if omitted, the default locale is used. |
| sha | sha(str或bin)-計(jì)算字符串或二進(jìn)制的sha-1摘要,并以十六進(jìn)制字符串的形式返回值。 | sha(str or bin) - Calculates the SHA-1 digest for string or binary and returns the value as a hex string. |
| sha1 | sha1(str或bin)-計(jì)算字符串或二進(jìn)制的SHA-1摘要,并以十六進(jìn)制字符串的形式返回值。 | sha1(str or bin) - Calculates the SHA-1 digest for string or binary and returns the value as a hex string. |
| sha2 | sha2(string/binary,len)-計(jì)算SHA-2哈希函數(shù)族(SHA-224、SHA-256、SHA-384和SHA-512)。 | sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). |
| shiftleft | 左移(a,b)-按位左移 | shiftleft(a, b) - Bitwise left shift |
| shiftright | shiftright(a,b)-按位右移 | shiftright(a, b) - Bitwise right shift |
| shiftrightunsigned | shiftrightunsigned(a,b)-位無符號右移 | shiftrightunsigned(a, b) - Bitwise unsigned right shift |
| sign | sign(x)-返回x的符號 | sign(x) - returns the sign of x ) |
| sin | sin(x)-返回x的正弦值(x以弧度為單位) | sin(x) - returns the sine of x (x is in radians) |
| size | size(a)-返回a的大小 | size(a) - Returns the size of a |
| sort_array | sort_array(array(obj1,obj2,…)—根據(jù)數(shù)組元素的自然順序?qū)斎霐?shù)組進(jìn)行升序排序。 | sort_array(array(obj1, obj2,…)) - Sorts the input array in ascending order according to the natural ordering of the array elements. |
| soundex | soundex(string)-返回字符串的soundex代碼。 | soundex(string) - Returns soundex code of the string. |
| space | space(n)-返回n個(gè)空格 | space(n) - returns n spaces |
| split | split(str,regex)-圍繞匹配regex的事件拆分str | split(str, regex) - Splits str around occurances that match regex |
| sqrt | sqrt(x)-返回x的平方根 | sqrt(x) - returns the square root of x |
| stack | 堆棧(n,cols…)-將k列轉(zhuǎn)換為n行,每行大小為k/n | stack(n, cols…) - turns k columns into n rows of size k/n each |
| std | std(x)-返回一組數(shù)字的標(biāo)準(zhǔn)偏差 | std(x) - Returns the standard deviation of a set of numbers |
| stddev | stddev(x)-返回一組數(shù)字的標(biāo)準(zhǔn)偏差 | stddev(x) - Returns the standard deviation of a set of numbers |
| stddev_pop | stddev_pop(x)-返回一組數(shù)字的標(biāo)準(zhǔn)偏差 | stddev_pop(x) - Returns the standard deviation of a set of numbers |
| stddev_samp | stddev_samp(x)-返回一組數(shù)字的樣本標(biāo)準(zhǔn)偏差 | stddev_samp(x) - Returns the sample standard deviation of a set of numbers |
| str_to_map | str_to_map(text,delimiter1,delimiter2)-通過解析文本創(chuàng)建映射 | str_to_map(text, delimiter1, delimiter2) - Creates a map by parsing text |
| struct | struct(col1,col2,col3,…)—用給定的字段值創(chuàng)建一個(gè)結(jié)構(gòu) | struct(col1, col2, col3, …) - Creates a struct with the given field values |
| substr | substr(str,pos[,len])-返回從pos開始的長度為len的str的子字符串或substr(bin,pos[,len])-返回從pos開始,長度為len的字節(jié)數(shù)組的片段 | substr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len |
| substring | substring(str,pos[,len])-返回從pos開始的長度為len或substring(bin,pos[,len])的str子字符串-返回從pos開始、長度為len的字節(jié)數(shù)組的片段 | substring(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstring(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len |
| substring_index | substring_index(str,delim,count)-返回string str中分隔符delim出現(xiàn)count次之前的子字符串。 | substring_index(str, delim, count) - Returns the substring from string str before count occurrences of the delimiter delim. |
| sum | sum(x)-返回一組數(shù)字的總和 | sum(x) - Returns the sum of a set of numbers |
| tan | tan(x)-返回x的正切值(x以弧度表示) | tan(x) - returns the tangent of x (x is in radians) |
| to_date | to_date(expr) - extracts the date part of the date or datetime expression expr | to_date(expr) - Extracts the date part of the date or datetime expression expr |
| to_unix_timestamp | to_unix_timestamp(date[, pattern]) - returns the UNIX timestamp | to_unix_timestamp(date[, pattern]) - Returns the UNIX timestamp |
| to_utc_timestamp | to_utc_timestamp(timestamp, string timezone) - assumes the given timestamp is in the given time zone and converts it to UTC (as of Hive 0.8.0) | to_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is in given timezone and converts to UTC (as of Hive 0.8.0) |
| translate | translate(input, from, to) - transforms the input string by replacing each character found in the from string with the corresponding character in the to string | translate(input, from, to) - translates the input string by replacing the characters present in the from string with the corresponding characters in the to string |
| trim | trim(str) - removes leading and trailing space characters from str | trim(str) - Removes the leading and trailing space characters from str |
| trunc | trunc(date, fmt) - returns date with the time portion of the day truncated to the unit given by the format model fmt. If fmt is omitted, date is truncated to the nearest day. Only 'MONTH'/'MON'/'MM' and 'YEAR'/'YYYY'/'YY' are currently supported as formats. | trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt. If you omit fmt, then date is truncated to the nearest day. It now only supports 'MONTH'/'MON'/'MM' and 'YEAR'/'YYYY'/'YY' as format. |
| ucase | ucase(str) - returns str with all characters changed to uppercase | ucase(str) - Returns str with all characters changed to uppercase |
| unbase64 | unbase64(str) - converts the argument from a base64 string to binary | unbase64(str) - Convert the argument from a base 64 string to binary |
| unhex | unhex(str) - converts a hexadecimal argument to binary | unhex(str) - Converts hexadecimal argument to binary |
| unix_timestamp | unix_timestamp(date[, pattern]) - converts the time to a number | unix_timestamp(date[, pattern]) - Converts the time to a number |
| upper | upper(str) - returns str with all characters changed to uppercase | upper(str) - Returns str with all characters changed to uppercase |
| uuid | uuid() - returns a universally unique identifier (UUID) string | uuid() - Returns a universally unique identifier (UUID) string. |
| var_pop | var_pop(x) - returns the variance of a set of numbers | var_pop(x) - Returns the variance of a set of numbers |
| var_samp | var_samp(x) - returns the sample variance of a set of numbers | var_samp(x) - Returns the sample variance of a set of numbers |
| variance | variance(x) - returns the variance of a set of numbers | variance(x) - Returns the variance of a set of numbers |
| version | version() - returns the Hive build version string, including the base version and revision | version() - Returns the Hive build version string - includes base version and revision. |
| weekofyear | weekofyear(date) - returns the week of the year that contains the given date. Weeks start on Monday, and week 1 is the first week with more than 3 days. | weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. |
| when | CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - returns b when a is true, d when c is true, and e otherwise | CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e |
| windowingtablefunction | There is no documentation for function 'windowingtablefunction' | There is no documentation for function 'windowingtablefunction' |
| xpath | xpath(xml, xpath) - returns a string array of the values in the xml nodes that match the xpath expression | xpath(xml, xpath) - Returns a string array of values within xml nodes that match the xpath expression |
| xpath_boolean | xpath_boolean(xml, xpath) - evaluates a boolean xpath expression | xpath_boolean(xml, xpath) - Evaluates a boolean xpath expression |
| xpath_double | xpath_double(xml, xpath) - returns the double value that matches the xpath expression | xpath_double(xml, xpath) - Returns a double value that matches the xpath expression |
| xpath_float | xpath_float(xml, xpath) - returns the float value that matches the xpath expression | xpath_float(xml, xpath) - Returns a float value that matches the xpath expression |
| xpath_int | xpath_int(xml, xpath) - returns the integer value that matches the xpath expression | xpath_int(xml, xpath) - Returns an integer value that matches the xpath expression |
| xpath_long | xpath_long(xml, xpath) - returns the long value that matches the xpath expression | xpath_long(xml, xpath) - Returns a long value that matches the xpath expression |
| xpath_number | xpath_number(xml, xpath) - returns the double value that matches the xpath expression | xpath_number(xml, xpath) - Returns a double value that matches the xpath expression |
| xpath_short | xpath_short(xml, xpath) - returns the short value that matches the xpath expression | xpath_short(xml, xpath) - Returns a short value that matches the xpath expression |
| xpath_string | xpath_string(xml, xpath) - returns the text content of the first xml node that matches the xpath expression | xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression |
| year | year(param) - returns the year component of the date/timestamp/interval | year(param) - Returns the year component of the date/timestamp/interval |
| \| | a \| b - bitwise OR | a \| b - Bitwise or |
| ~ | ~ n - bitwise NOT | ~ n - Bitwise not |
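
Several of the descriptions above are easier to read next to a concrete call. The query below is a minimal sketch rather than output captured from the author's cluster: the literal inputs and column aliases are made up for illustration, and it assumes a Hive release recent enough to include `substring_index` and `trunc` and to allow `SELECT` without a `FROM` clause.

```sql
-- Illustrative literals only; the aliases are hypothetical names.
SELECT
  substring_index('www.apache.org', '.', 2)  AS first_two_labels,  -- 'www.apache'
  str_to_map('a:1,b:2', ',', ':')            AS kv_map,            -- {"a":"1","b":"2"}
  translate('abcdef', 'abc', '123')          AS translated,        -- '123def'
  to_date('2020-01-15 10:30:00')             AS date_part,         -- 2020-01-15
  trunc('2020-01-15', 'MM')                  AS month_start,       -- 2020-01-01
  weekofyear('2020-01-15')                   AS week_no,           -- 3
  upper(trim('  hive  '))                    AS cleaned;           -- 'HIVE'
```

To check any entry against your own installation, `desc function <name>;` prints the short description shown in the table, and `desc function extended <name>;` usually adds an example.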