Elasticsearch (6): Nested Aggregations, Drill-Down Analysis, and Aggregation Analysis in Kibana

Published: 2023/12/20
Two core concepts: bucket and metric

city      name
Beijing   小李
Beijing   小王
Shanghai  小張
Shanghai  小麗
Shanghai  小陳

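Partitioning buckets by city
This produces two buckets: a Beijing bucket and a Shanghai bucket.

Beijing bucket: contains 2 people, 小李 and 小王
Shanghai bucket: contains 3 people, 小張, 小麗, and 小陳

When data is bucketed by a field, all records that share the same value in that field land in the same bucket.
If you know some MySQL, the first step of an aggregation in SQL is grouping, followed by aggregate analysis over the rows in each group. That group is our bucket.

metric: a statistic computed over one group of data
Once we have a set of buckets, we can run aggregate analysis on the data in each one, for example counting the records in a bucket, or computing the average, maximum, or minimum of a field across the bucket.

bucket: group by user_id --> records with the same user_id are placed in the same bucket
metric: an aggregate operation run on one bucket, such as the average, the maximum, or the minimum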
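The two steps above can be sketched in plain Python. This is only an illustration of the bucket and metric concepts, not how Elasticsearch implements them; the helper names `bucket_by` and `count_metric` are made up for this example, and the city names are written in English.

```python
from collections import defaultdict

# Sample rows from the table above: (city, name).
ROWS = [
    ("Beijing", "小李"),
    ("Beijing", "小王"),
    ("Shanghai", "小張"),
    ("Shanghai", "小麗"),
    ("Shanghai", "小陳"),
]


def bucket_by(rows, key_index):
    """Bucket step: rows sharing the same field value go into the same bucket."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key_index]].append(row)
    return dict(buckets)


def count_metric(buckets):
    """Metric step: compute one statistic (here, a count) for each bucket."""
    return {key: len(members) for key, members in buckets.items()}


city_buckets = bucket_by(ROWS, 0)
print(count_metric(city_buckets))
```

Swapping `count_metric` for a function that computes an average or a maximum changes the metric without touching the bucketing, which is the same separation the aggregation syntax below relies on.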
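Counting: the number of products under each tag

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}

size: set to 0 to return only the aggregation results, without the original documents the aggregation ran over
aggs: fixed syntax, declaring that a grouping and aggregation operation is to be run on the data
group_by_tags: every aggregation needs a name; the name is arbitrary, so anything you pick is fine
terms: group by the values of a field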
field: the field whose values define the groups

Set the fielddata property of the text field to true (the forward index, used for nested aggregation queries; described in detail later):

PUT /ecommerce/_mapping/product
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}

Response:

{
  "took": 20,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 4, "max_score": 0, "hits": [] },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "fangzhu", "doc_count": 2 },
        { "key": "meibai", "doc_count": 2 },
        { "key": "qingxin", "doc_count": 1 }
      ]
    }
  }
}

hits.hits: because we set size to 0, hits.hits is empty; otherwise it would contain the original documents the aggregation ran over
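aggregations: the aggregation results
group_by_tags: the name we gave to the aggregation
buckets: the buckets produced from the field we specified
key: the field value that each bucket corresponds to
doc_count: the number of documents in that bucket
Default ordering of the per-tag buckets: descending by doc_count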
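As a small illustration of the response fields just described, the snippet below pulls the key and doc_count out of each bucket. It works on the response body copied from above as a JSON string rather than calling a cluster; `bucket_counts` is a hypothetical helper name used only here.

```python
import json

# Response body copied from the terms aggregation above.
RESPONSE = json.loads("""
{
  "hits": {"total": 4, "max_score": 0, "hits": []},
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"key": "fangzhu", "doc_count": 2},
        {"key": "meibai", "doc_count": 2},
        {"key": "qingxin", "doc_count": 1}
      ]
    }
  }
}
""")


def bucket_counts(response, agg_name):
    """Return (key, doc_count) pairs for the named aggregation, in response order."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


counts = bucket_counts(RESPONSE, "group_by_tags")
print(counts)
```

The pairs come back in the order Elasticsearch returned them, so the doc_count values are already non-increasing.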
For products whose name contains yagao, count the products under each tag

GET /ecommerce/product/_search
{
  "size": 0,
  "query": {
    "match": { "name": "yagao" }
  },
  "aggs": {
    "all_tags": {
      "terms": { "field": "tags" }
    }
  }
}

Response:

{
  "took": 35,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 3, "max_score": 0, "hits": [] },
  "aggregations": {
    "all_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "fangzhu", "doc_count": 2 },
        { "key": "meibai", "doc_count": 1 },
        { "key": "qingxin", "doc_count": 1 }
      ]
    }
  }
}
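The effect of combining a query with an aggregation can be sketched in plain Python: filter first, then bucket only what matched. The four documents below are hypothetical (the article does not list the indexed products); they were chosen only so that the counts agree with the two responses shown above. The word-level `match_name` check is a rough stand-in for a match query, not its real analysis logic.

```python
from collections import Counter

# Hypothetical documents, picked only so the counts match the responses above.
PRODUCTS = [
    {"name": "brand-a yagao", "tags": ["meibai", "fangzhu"]},
    {"name": "brand-b yagao", "tags": ["fangzhu"]},
    {"name": "brand-c yagao", "tags": ["qingxin"]},
    {"name": "brand-d other", "tags": ["meibai"]},
]


def match_name(docs, term):
    """Rough stand-in for a match query: keep docs whose name contains the word."""
    return [doc for doc in docs if term in doc["name"].split()]


def terms_counts(docs, field):
    """Count docs per field value, ordered by count descending.

    Ties are broken by key here so the output is deterministic
    (a choice made for this sketch).
    """
    counts = Counter(value for doc in docs for value in doc[field])
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


all_docs = terms_counts(PRODUCTS, "tags")
only_yagao = terms_counts(match_name(PRODUCTS, "yagao"), "tags")
print(all_docs)
print(only_yagao)
```

The second result has one fewer meibai document because the aggregation only sees what the query matched.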
top_hits: fetch the top few docs in each bucket; _source: return only the specified fields

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" },
      "aggs": {
        "top_tags": {
          "top_hits": {
            "_source": { "include": "name" },
            "size": 1
          }
        }
      }
    }
  }
}

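Compute the average, minimum, maximum, and total price of the products under each tag
count: a terms bucket automatically carries a doc_count, which is equivalent to a count
avg: the avg aggregation, which computes the average
max: the largest value of the specified field within a bucket
min: the smallest value of the specified field within a bucket
sum: the total of the specified field within a bucket

Group first, then compute the metrics for each group: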
GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } },
        "min_price": { "min": { "field": "price" } },
        "max_price": { "max": { "field": "price" } },
        "sum_price": { "sum": { "field": "price" } }
      }
    }
  }
}

avg_price: the name we chose for the metric aggregation
value: the result of the metric, for example the average of the price field over the documents in each bucket

Response:

{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 3, "max_score": 0, "hits": [] },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "fangzhu", "doc_count": 2, "max_price": { "value": 30 }, "min_price": { "value": 25 }, "avg_price": { "value": 27.5 }, "sum_price": { "value": 55 } },
        { "key": "meibai", "doc_count": 1, "max_price": { "value": 30 }, "min_price": { "value": 30 }, "avg_price": { "value": 30 }, "sum_price": { "value": 30 } },
        { "key": "qingxin", "doc_count": 1, "max_price": { "value": 40 }, "min_price": { "value": 40 }, "avg_price": { "value": 40 }, "sum_price": { "value": 40 } }
      ]
    }
  }
}
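To make the "group first, then compute per group" idea concrete, here is a minimal Python sketch of a terms bucket with four metric sub-aggregations. The three documents are hypothetical: their prices and tags were picked only so that the results reproduce the response above, and `price_stats_by_tag` is a name invented for this example.

```python
from collections import defaultdict

# Hypothetical documents, picked only to reproduce the response above.
PRODUCTS = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
]


def price_stats_by_tag(docs):
    """Bucket docs by tag, then compute count/avg/min/max/sum of price per bucket."""
    prices = defaultdict(list)
    for doc in docs:
        # A doc with several tags lands in several buckets.
        for tag in doc["tags"]:
            prices[tag].append(doc["price"])
    return {
        tag: {
            "doc_count": len(values),
            "avg_price": sum(values) / len(values),
            "min_price": min(values),
            "max_price": max(values),
            "sum_price": sum(values),
        }
        for tag, values in prices.items()
    }


stats = price_stats_by_tag(PRODUCTS)
for tag, values in stats.items():
    print(tag, values)
```

Note that the first document counts toward both meibai and fangzhu, which is why the bucket doc_counts add up to more than the number of documents.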


collect_mode

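Sub-aggregations can be computed in one of two ways:

  • depth_first: compute the sub-aggregations directly
  • breadth_first: first compute the result of the current aggregation, then compute the sub-aggregations against that result
"order": { "avg_price": "desc" }

Compute the average price of the products under each tag, sorted by average price in descending order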
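Ordering buckets by a sub-aggregation value can be sketched as a plain sort. The buckets below reuse the per-tag averages reported in the metrics response earlier in this article; `order_buckets` is a hypothetical helper, and it says nothing about how collect_mode affects the computation inside Elasticsearch.

```python
# Per-tag averages as reported in the metrics response earlier in the article.
BUCKETS = [
    {"key": "fangzhu", "doc_count": 2, "avg_price": {"value": 27.5}},
    {"key": "meibai", "doc_count": 1, "avg_price": {"value": 30}},
    {"key": "qingxin", "doc_count": 1, "avg_price": {"value": 40}},
]


def order_buckets(buckets, metric, direction="desc"):
    """Sort buckets by the value of a named single-value metric."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return sorted(
        buckets,
        key=lambda bucket: bucket[metric]["value"],
        reverse=(direction == "desc"),
    )


ordered = order_buckets(BUCKETS, "avg_price", "desc")
print([bucket["key"] for bucket in ordered])
```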
GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags",
        "collect_mode": "breadth_first",
        "order": { "avg_price": "desc" }
      },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 3, "max_score": 0, "hits": [] },
  "aggregations": {
    "all_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "qingxin", "doc_count": 1, "avg_price": { "value": 40 } },
        { "key": "meibai", "doc_count": 1, "avg_price": { "value": 30 } },
        { "key": "fangzhu", "doc_count": 2, "avg_price": { "value": 27.5 } }
      ]
    }
  }
}

"ranges": [{}, {}]
Group by the specified price ranges, then group by tag within each range, and finally compute the average price of each group

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 20 },
          { "from": 20, "to": 40 },
          { "from": 40, "to": 50 }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": { "field": "tags" },
          "aggs": {
            "average_price": {
              "avg": { "field": "price" }
            }
          }
        }
      }
    }
  }
}
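The three nested levels (price range, then tag, then average) can be sketched in plain Python. The documents are hypothetical and the bucket labels such as "20-40" are a format chosen for this example, not necessarily the keys Elasticsearch returns. The sketch treats each range as including `from` and excluding `to`, which is how the range aggregation defines its buckets.

```python
# Hypothetical documents used only for illustration.
PRODUCTS = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
]

# (from, to) pairs taken from the request above.
RANGES = [(0, 20), (20, 40), (40, 50)]


def range_then_tag_avg(docs, ranges):
    """Bucket by price range, then by tag, then average the price per tag."""
    result = {}
    for low, high in ranges:
        # 'from' is inclusive, 'to' is exclusive.
        in_range = [doc for doc in docs if low <= doc["price"] < high]
        prices_by_tag = {}
        for doc in in_range:
            for tag in doc["tags"]:
                prices_by_tag.setdefault(tag, []).append(doc["price"])
        result[f"{low}-{high}"] = {
            "doc_count": len(in_range),
            "group_by_tags": {
                tag: sum(values) / len(values)
                for tag, values in prices_by_tag.items()
            },
        }
    return result


nested = range_then_tag_avg(PRODUCTS, RANGES)
for label, bucket in nested.items():
    print(label, bucket)
```

A price of exactly 40 falls into the 40-50 bucket rather than 20-40, which is the boundary behavior worth checking when choosing ranges.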
histogram
Similar to terms, histogram is also a bucketing operation: it takes a field and groups documents into buckets according to the fixed-width range that each field value falls into.

interval: 10 splits the values into the ranges 0~10, 10~20, 20~30, and so on

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "price": {
      "histogram": {
        "field": "price",
        "interval": 10
      },
      "aggs": {
        "revenue": {
          "sum": { "field": "price" }
        }
      }
    }
  }
}

Response:

{
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 3, "max_score": 0, "hits": [] },
  "aggregations": {
    "price": {
      "buckets": [
        { "key": 20, "doc_count": 1, "revenue": { "value": 25 } },
        { "key": 30, "doc_count": 1, "revenue": { "value": 30 } },
        { "key": 40, "doc_count": 1, "revenue": { "value": 40 } }
      ]
    }
  }
}
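The bucket key of a histogram is the value rounded down to a multiple of the interval. The sketch below applies that rule to the three prices that can be read off the response above (each bucket holds one document, so its revenue equals that document's price). The function names are invented for this example.

```python
import math


def histogram_key(value, interval, offset=0):
    """Round a value down to the start of its fixed-width bucket."""
    return math.floor((value - offset) / interval) * interval + offset


def histogram_sum(values, interval):
    """Bucket values by interval and sum them per bucket (the 'revenue' metric)."""
    buckets = {}
    for value in values:
        key = histogram_key(value, interval)
        entry = buckets.setdefault(key, {"doc_count": 0, "revenue": 0})
        entry["doc_count"] += 1
        entry["revenue"] += value
    # Histogram buckets are reported in ascending key order.
    return dict(sorted(buckets.items()))


# Prices inferred from the response above.
price_histogram = histogram_sum([25, 30, 40], interval=10)
print(price_histogram)
```

A price of 30 lands in the bucket keyed 30, not 20, because each bucket includes its lower bound.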
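date histogram
Buckets are split on a date-typed field that we specify, at a fixed date interval.

With a date interval of 1 month:
2017-01-01~2017-01-31 is one bucket
2017-02-01~2017-02-28 is one bucket
Each document's date field is then scanned, and the document is placed into whichever bucket its date falls in.

min_doc_count: set to 0 so that an interval such as 2017-01-01~2017-01-31 is still returned even when it contains no documents; otherwise an empty interval may be left out of the response
extended_bounds (min, max): forces the buckets to cover this start date through this end date, even where no documents exist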
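The monthly bucketing described above, including empty months and the forced start and end dates, can be sketched with the standard library. The single sale date used below is hypothetical, and `date_histogram_monthly` is a name made up for this example; keys are computed as UTC epoch milliseconds of the first day of each month.

```python
from datetime import date, datetime, timezone


def next_month(day):
    """Return the first day of the month after the given date."""
    if day.month == 12:
        return date(day.year + 1, 1, 1)
    return date(day.year, day.month + 1, 1)


def date_histogram_monthly(dates, bounds_min, bounds_max, min_doc_count=0):
    """Count dates per calendar month, covering at least bounds_min..bounds_max."""
    counts = {}
    for day in dates:
        key = day.replace(day=1)
        counts[key] = counts.get(key, 0) + 1

    # The bounds extend the covered range; they do not drop dates outside it.
    current = min([bounds_min] + list(dates)).replace(day=1)
    last = max([bounds_max] + list(dates)).replace(day=1)

    buckets = []
    while current <= last:
        doc_count = counts.get(current, 0)
        if doc_count >= min_doc_count:
            start = datetime(current.year, current.month, 1, tzinfo=timezone.utc)
            buckets.append({
                "key_as_string": current.isoformat(),
                "key": int(start.timestamp() * 1000),
                "doc_count": doc_count,
            })
        current = next_month(current)
    return buckets


# One hypothetical sale date; bounds as in the request that follows.
monthly = date_histogram_monthly(
    [date(2016, 5, 18)], date(2016, 1, 1), date(2017, 12, 31)
)
print(len(monthly), monthly[0], monthly[4])
```

Raising `min_doc_count` to 1 in this sketch leaves only the months that actually contain a sale.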

GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "sales": {
      "date_histogram": {
        "field": "sold_date",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2016-01-01",
          "max": "2017-12-31"
        }
      }
    }
  }
}

Response:

{
  "took": 11,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 8, "max_score": 0, "hits": [] },
  "aggregations": {
    "sales": {
      "buckets": [
        { "key_as_string": "2016-01-01", "key": 1451606400000, "doc_count": 0 },
        { "key_as_string": "2016-02-01", "key": 1454284800000, "doc_count": 0 },
        { "key_as_string": "2016-03-01", "key": 1456790400000, "doc_count": 0 },
        { "key_as_string": "2016-04-01", "key": 1459468800000, "doc_count": 0 },
        { "key_as_string": "2016-05-01", "key": 1462060800000, "doc_count": 1 },
        .....
      ]
    }
  }
}

Aggregation scope: by default, an aggregation runs within the scope of the query's search results.
Here we want two results: one aggregated over the query's search results, and one aggregated over all of the data.

global

The global bucket brings all of the data into the aggregation scope, regardless of the query that precedes it.

GET /tvs/sales/_search
{
  "size": 0,
  "query": {
    "term": {
      "brand": { "value": "長虹" }
    }
  },
  "aggs": {
    "single_brand_avg_price": {
      "avg": { "field": "price" }
    },
    "all": {
      "global": {},
      "aggs": {
        "all_brand_avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}

Response:

{
  "took": 4,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 3, "max_score": 0, "hits": [] },
  "aggregations": {
    "all": {
      "doc_count": 8,
      "all_brand_avg_price": { "value": 2650 }
    },
    "single_brand_avg_price": { "value": 1666.6666666666667 }
  }
}

single_brand_avg_price: computed over the query's search results, so it is the average price of the 長虹 brand
all.all_brand_avg_price: the average price across all brands
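The difference between the query scope and the global scope can be sketched in a few lines of Python. The eight sales records are hypothetical: the article does not list them, so the prices and the placeholder brand "other" were chosen only so that both averages match the response above.

```python
# Hypothetical (brand, price) records, picked so the averages match the response.
SALES = [
    ("長虹", 1000), ("長虹", 2000), ("長虹", 2000),
    ("other", 3000), ("other", 1500), ("other", 1200),
    ("other", 8000), ("other", 2500),
]


def avg_price(records):
    """Average of the price column."""
    return sum(price for _, price in records) / len(records)


def scoped_and_global_avg(records, brand):
    """Average over the query hits, and over everything (the global bucket)."""
    query_hits = [record for record in records if record[0] == brand]
    return {
        "single_brand_avg_price": avg_price(query_hits),  # query scope
        "all_brand_avg_price": avg_price(records),        # global scope
    }


averages = scoped_and_global_avg(SALES, "長虹")
print(averages)
```

Both numbers come from one pass over the same request, which is what makes the global bucket handy for comparing a filtered subset against the whole data set.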


