Implementing SQL Queries and Statistics with Solr (Repost)
Original article: http://shiyanjun.cn/archives/78.html
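Cloudera has released Impala, a query and analytics tool built on the Hadoop platform. Anyone familiar with SQL can readily use Impala to run queries and analyses, although Impala's SQL differs subtly from the SQL of relational databases.
Below we design a table and, using the data in it, map SQL query and statistics statements to equivalent Solr queries. The translation is an interesting exercise, and it shows off some of Solr's nicer features.
The structure of the example table is shown in the figure:
Next we give several SQL query and statistics statements, translate each one into a Solr query, and compare the results.
Query comparison
SQL query: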
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
WHERE prov_id = 1 AND net_type = 1 AND area_id = 10304 AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
ORDER BY log_id LIMIT 10;
The query result is shown in the figure.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1 AND net_type:1 AND area_id:10304 AND time_type:1 AND time_id:[20130801 TO 20130815]&sort=log_id asc&start=0&rows=10
The query result is as follows:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">4</int></lst>
<result name="response" numFound="77" start="0">
<doc><int name="log_id">6827</int><long name="start_time">1375072117</long><long name="end_time">1375081683</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">11002</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6827</int><long name="start_time">1375072117</long><long name="end_time">1375081683</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">11000</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">14001</int><int name="cnt">5</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">11002</int><int name="cnt">23</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">10200</int><int name="cnt">55</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">14000</int><int name="cnt">4</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">11000</int><int name="cnt">1</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">10201</int><int name="cnt">31</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">8002</int><int name="cnt">8</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6851</int><long name="start_time">1375142158</long><long name="end_time">1375146391</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10304</int><int name="idt_id">8000</int><int name="cnt">30</int><int name="net_type">1</int></doc>
</result>
</response>
Comparing the results: they are the same except for the order of idt_id among rows that share a log_id. Impala returned those rows in ascending idt_id order, while the Solr output above does not (for log_id 6851 the values are not even monotonic). Neither query sorts on idt_id, so the order of such ties is not guaranteed.
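The URL above contains spaces and square brackets, which have to be percent-encoded before the request is actually sent. The following is a minimal Python sketch of assembling such a URL from the parts of the SQL statement. The host, port and core name are copied from the example; the helper name `build_select_url` is made up for illustration and is not part of any Solr client library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL copied from the example above; adjust it for your own deployment.
BASE = "http://slave1:8888/solr-cloud/i_event/select"

def build_select_url(fields, filters, sort, start, rows, base=BASE):
    """Build a Solr select URL mirroring SELECT ... WHERE ... ORDER BY ... LIMIT."""
    params = [
        ("q", "*:*"),                    # match all documents; filtering happens in fq
        ("fl", ",".join(fields)),        # SELECT column list
        ("fq", " AND ".join(filters)),   # WHERE conditions joined with AND
        ("sort", sort),                  # ORDER BY
        ("start", str(start)),           # offset of the first row
        ("rows", str(rows)),             # LIMIT
    ]
    return base + "?" + urlencode(params)  # urlencode percent-encodes each value

url = build_select_url(
    fields=["log_id", "start_time", "end_time", "prov_id", "city_id",
            "area_id", "idt_id", "cnt", "net_type"],
    filters=["prov_id:1", "net_type:1", "area_id:10304", "time_type:1",
             "time_id:[20130801 TO 20130815]"],
    sort="log_id asc",
    start=0,
    rows=10,
)
print(url)

# Decode the query string again to confirm that the filter survives the round trip.
decoded = parse_qs(urlsplit(url).query)
print(decoded["fq"][0])
```

Keeping the SQL parts as separate arguments makes it easy to see which Solr parameter plays the role of which SQL clause.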
SQL query:
SELECT prov_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
The query result is shown in the figure.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&rows=0&indent=true
The query result is as follows:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst>
<result name="response" numFound="4088" start="0"></result>
<lst name="stats">
<lst name="stats_fields">
<lst name="cnt">
<double name="min">0.0</double>
<double name="max">1258.0</double>
<long name="count">4088</long>
<long name="missing">0</long>
<double name="sum">32587.0</double>
<double name="sumOfSquares">9170559.0</double>
<double name="mean">7.971379647749511</double>
<double name="stddev">46.69344567709268</double>
<lst name="facets"/>
</lst>
</lst>
</lst>
</response>
Comparing the results: Solr reports more statistics than the SQL query, such as the standard deviation (stddev), and the values the two have in common (sum, mean, max, min, count) agree with the SQL result.
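The derived statistics in the response can be cross-checked from count, sum and sumOfSquares alone. The sketch below assumes that stddev is the sample standard deviation (dividing by n - 1); that assumption is not stated in the article, but it reproduces the figures printed above. The function name `mean_and_stddev` is illustrative only.

```python
import math

def mean_and_stddev(count, total, sum_of_squares):
    """Derive the mean and the sample standard deviation from three running totals."""
    mean = total / count
    # Sample variance: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))
    variance = (count * sum_of_squares - total * total) / (count * (count - 1))
    return mean, math.sqrt(variance)

# count, sum and sumOfSquares taken from the stats output above.
mean, stddev = mean_and_stddev(4088, 32587.0, 9170559.0)
print(mean, stddev)
```

Because these three totals are additive, per-shard or per-group results can be merged by summing them before deriving the mean and stddev.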
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
WHERE prov_id = 1 AND net_type = 1 AND city_id IN(106,103) AND idt_id IN(12011,5004,6051,6056,8002) AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
ORDER BY log_id, start_time DESC LIMIT 10;
The query result is shown in the figure.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*
&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
&fq=prov_id:1 AND net_type:1 AND (city_id:106 OR city_id:103) AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND time_id:[20130801 TO 20130815]
&sort=log_id asc,start_time desc
&start=0
&rows=10
Or, with one fq parameter per condition:
http://slave1:8888/solr-cloud/i_event/select?q=*:*
&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
&fq=prov_id:1
&fq=net_type:1
&fq=(city_id:106 OR city_id:103)
&fq=(idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002)
&fq=time_type:1
&fq=time_id:[20130801 TO 20130815]
&sort=log_id asc,start_time desc
&start=0
&rows=10
The query result is as follows:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">6</int></lst>
<result name="response" numFound="63" start="0">
<doc><int name="log_id">6553</int><long name="start_time">1374054184</long><long name="end_time">1374054254</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">12011</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6553</int><long name="start_time">1374054184</long><long name="end_time">1374054254</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">5004</int><int name="cnt">2</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6555</int><long name="start_time">1374055060</long><long name="end_time">1374055158</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">70104</int><int name="idt_id">5004</int><int name="cnt">3</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6555</int><long name="start_time">1374055060</long><long name="end_time">1374055158</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">70104</int><int name="idt_id">12011</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6595</int><long name="start_time">1374292508</long><long name="end_time">1374292639</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">5004</int><int name="cnt">4</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6611</int><long name="start_time">1374461233</long><long name="end_time">1374461245</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">5004</int><int name="cnt">1</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6612</int><long name="start_time">1374461261</long><long name="end_time">1374461269</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">5004</int><int name="cnt">1</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6612</int><long name="start_time">1374461261</long><long name="end_time">1374461269</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">12011</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6613</int><long name="start_time">1374461422</long><long name="end_time">1374461489</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">6056</int><int name="cnt">1</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6613</int><long name="start_time">1374461422</long><long name="end_time">1374461489</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">6051</int><int name="cnt">1</int><int name="net_type">1</int></doc>
</result>
</response>
Comparing the query results: they are the same.
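The two URL forms above follow a mechanical pattern: each SQL IN list becomes a parenthesised OR group, and each WHERE condition can be sent as its own fq parameter. A minimal Python sketch of that pattern follows; the helper name `in_clause` is made up for illustration.

```python
from urllib.parse import urlencode, parse_qs

def in_clause(field, values):
    """Translate SQL `field IN (v1, v2, ...)` into a parenthesised Solr OR group."""
    return "(" + " OR ".join(f"{field}:{v}" for v in values) + ")"

# One entry per WHERE condition of the SQL statement above.
filters = [
    "prov_id:1",
    "net_type:1",
    in_clause("city_id", [106, 103]),
    in_clause("idt_id", [12011, 5004, 6051, 6056, 8002]),
    "time_type:1",
    "time_id:[20130801 TO 20130815]",
]

# Repeat the fq parameter once per condition, as in the second URL form.
params = [("q", "*:*")] + [("fq", f) for f in filters] + [
    ("sort", "log_id asc,start_time desc"),
    ("start", "0"),
    ("rows", "10"),
]
query_string = urlencode(params)
print(query_string)

# Decoding the query string shows the six separate filters again.
print(parse_qs(query_string)["fq"])
```

Solr intersects all fq parameters, so the two forms select the same documents. Separate fq parameters are cached individually in Solr's filter cache, which can help when the same condition recurs across queries.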
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
WHERE net_type = 1 AND idt_id IN(12011,5004,6051,6056,8002) AND time_type = 1 AND start_time >= 1373598465 AND end_time < 1374055254
ORDER BY log_id, start_time, idt_id DESC LIMIT 30;
The query result is shown in the figure.
Solr query URL (note that the upper bound is applied here to start_time, as an inclusive range combined with a negative filter that excludes the boundary value, whereas the SQL above applies it to end_time):
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254] AND -start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1&fq=idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002&fq=time_type:1&fq=start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
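All three URLs express a half-open interval (lower bound included, upper bound excluded) by combining an inclusive range with a negative clause on the boundary value. The sketch below generates that filter string. It also shows the mixed-bracket range form, which to my knowledge Solr 4.0 and later accept; treat that as an assumption and check it against your version. Both function names are illustrative.

```python
def half_open_range(field, low, high):
    """Express low <= field < high as an inclusive range plus a negative clause."""
    return f"{field}:[{low} TO {high}] AND -{field}:{high}"

def half_open_range_mixed(field, low, high):
    """Same condition with mixed brackets: '[' is inclusive, '}' is exclusive."""
    return f"{field}:[{low} TO {high}}}"

print(half_open_range("start_time", 1373598465, 1374055254))
print(half_open_range_mixed("start_time", 1373598465, 1374055254))
```

The first form matches what the URLs above send. The second is shorter and avoids a second clause, at the cost of depending on parser support for mixed brackets.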
The query result is as follows:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst>
<result name="response" numFound="4" start="0">
<doc><int name="log_id">6553</int><long name="start_time">1374054184</long><long name="end_time">1374054254</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">12011</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6553</int><long name="start_time">1374054184</long><long name="end_time">1374054254</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">10307</int><int name="idt_id">5004</int><int name="cnt">2</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6555</int><long name="start_time">1374055060</long><long name="end_time">1374055158</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">70104</int><int name="idt_id">12011</int><int name="cnt">0</int><int name="net_type">1</int></doc>
<doc><int name="log_id">6555</int><long name="start_time">1374055060</long><long name="end_time">1374055158</long><int name="prov_id">1</int><int name="city_id">103</int><int name="area_id">70104</int><int name="idt_id">5004</int><int name="cnt">3</int><int name="net_type">1</int></doc>
</result>
</response>
SQL query:
SELECT city_id, area_id, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id, area_id;
The query result is shown in the figure.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&facet=true&facet.pivot=city_id,area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true
The query result is as follows:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">72</int></lst>
<result name="response" numFound="1171" start="0"></result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields"/>
<lst name="facet_dates"/>
<lst name="facet_ranges"/>
<lst name="facet_pivot">
<arr name="city_id,area_id">
<lst><str name="field">city_id</str><int name="value">103</int><int name="count">678</int>
<arr name="pivot">
<lst><str name="field">area_id</str><int name="value">10307</int><int name="count">298</int></lst>
<lst><str name="field">area_id</str><int name="value">10315</int><int name="count">120</int></lst>
<lst><str name="field">area_id</str><int name="value">10317</int><int name="count">86</int></lst>
<lst><str name="field">area_id</str><int name="value">10304</int><int name="count">67</int></lst>
<lst><str name="field">area_id</str><int name="value">10310</int><int name="count">49</int></lst>
<lst><str name="field">area_id</str><int name="value">70104</int><int name="count">48</int></lst>
<lst><str name="field">area_id</str><int name="value">10308</int><int name="count">6</int></lst>
<lst><str name="field">area_id</str><int name="value">0</int><int name="count">2</int></lst>
<lst><str name="field">area_id</str><int name="value">10311</int><int name="count">2</int></lst>
</arr>
</lst>
<lst><str name="field">city_id</str><int name="value">0</int><int name="count">463</int>
<arr name="pivot">
<lst><str name="field">area_id</str><int name="value">0</int><int name="count">395</int></lst>
<lst><str name="field">area_id</str><int name="value">10307</int><int name="count">68</int></lst>
</arr>
</lst>
<lst><str name="field">city_id</str><int name="value">106</int><int name="count">10</int>
<arr name="pivot">
<lst><str name="field">area_id</str><int name="value">10304</int><int name="count">10</int></lst>
</arr>
</lst>
<lst><str name="field">city_id</str><int name="value">110</int><int name="count">8</int>
<arr name="pivot">
<lst><str name="field">area_id</str><int name="value">0</int><int name="count">8</int></lst>
</arr>
</lst>
<lst><str name="field">city_id</str><int name="value">118</int><int name="count">8</int>
<arr name="pivot">
<lst><str name="field">area_id</str><int name="value">10316</int><int name="count">8</int></lst>
</arr>
</lst>
<lst><str name="field">city_id</str><int name="value">105</int><int name="count">4</int>
<arr name="pivot">
<lst><str name="field">area_id</str><int name="value">0</int><int name="count">4</int></lst>
</arr>
</lst>
</arr>
</lst>
</lst>
</response>
Comparing the results: the Solr response is nested, so the counts for each (city_id, area_id) pair have to be collected from the pivot groups above to obtain the final statistics. Once merged, they agree with the SQL result.
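The merging step can be done with a few lines of client code. The sketch below uses Python's standard xml.etree.ElementTree to flatten a two-level pivot into rows shaped like the GROUP BY output. The embedded sample is a trimmed copy of two city_id groups from the response above, and the function name `flatten_pivot` is illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed sample of the facet_pivot section shown above (two city_id groups).
SAMPLE = """
<arr name="city_id,area_id">
  <lst>
    <str name="field">city_id</str><int name="value">0</int><int name="count">463</int>
    <arr name="pivot">
      <lst><str name="field">area_id</str><int name="value">0</int><int name="count">395</int></lst>
      <lst><str name="field">area_id</str><int name="value">10307</int><int name="count">68</int></lst>
    </arr>
  </lst>
  <lst>
    <str name="field">city_id</str><int name="value">106</int><int name="count">10</int>
    <arr name="pivot">
      <lst><str name="field">area_id</str><int name="value">10304</int><int name="count">10</int></lst>
    </arr>
  </lst>
</arr>
"""

def flatten_pivot(arr):
    """Turn a two-level pivot into (city_id, area_id, count) rows."""
    rows = []
    for city in arr.findall("lst"):
        city_id = int(city.find("int[@name='value']").text)
        # Each nested <lst> under the pivot array is one (city_id, area_id) group.
        for area in city.findall("arr[@name='pivot']/lst"):
            area_id = int(area.find("int[@name='value']").text)
            count = int(area.find("int[@name='count']").text)
            rows.append((city_id, area_id, count))
    return rows

rows = flatten_pivot(ET.fromstring(SAMPLE))
for row in rows:
    print(row)
```

Each resulting tuple corresponds to one row of the SQL GROUP BY city_id, area_id result.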
Independent grouping on multiple fields (count, sum, max, min and similar functions are supported)
Solr handles independent grouped statistics on several fields within a single request well. This is equivalent to executing two SQL statements with GROUP BY clauses, each of which aggregates on just one field.
SQL query:
SELECT city_id, area_id, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1

SELECT city_id, area_id, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1
The query results are not shown here.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&f.cnt.stats.facet=city_id&f.cnt.stats.facet=area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true
The query result is as follows:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">6</int></lst>
<result name="response" numFound="1171" start="0"></result>
<lst name="stats">
<lst name="stats_fields">
<lst name="cnt">
<double name="min">0.0</double><double name="max">167.0</double><long name="count">1171</long><long name="missing">0</long><double name="sum">3701.0</double><double name="sumOfSquares">249641.0</double><double name="mean">3.1605465414175917</double><double name="stddev">14.260812879164407</double>
<lst name="facets">
<lst name="city_id">
<lst name="0"><double name="min">0.0</double><double name="max">167.0</double><long name="count">463</long><long name="missing">0</long><double name="sum">2783.0</double><double name="sumOfSquares">238819.0</double><double name="mean">6.010799136069115</double><double name="stddev">21.92524420257807</double><lst name="facets"/></lst>
<lst name="110"><double name="min">0.0</double><double name="max">1.0</double><long name="count">8</long><long name="missing">0</long><double name="sum">3.0</double><double name="sumOfSquares">3.0</double><double name="mean">0.375</double><double name="stddev">0.5175491695067657</double><lst name="facets"/></lst>
<lst name="106"><double name="min">0.0</double><double name="max">0.0</double><long name="count">10</long><long name="missing">0</long><double name="sum">0.0</double><double name="sumOfSquares">0.0</double><double name="mean">0.0</double><double name="stddev">0.0</double><lst name="facets"/></lst>
<lst name="105"><double name="min">0.0</double><double name="max">0.0</double><long name="count">4</long><long name="missing">0</long><double name="sum">0.0</double><double name="sumOfSquares">0.0</double><double name="mean">0.0</double><double name="stddev">0.0</double><lst name="facets"/></lst>
<lst name="103"><double name="min">0.0</double><double name="max">55.0</double><long name="count">678</long><long name="missing">0</long><double name="sum">915.0</double><double name="sumOfSquares">10819.0</double><double name="mean">1.3495575221238938</double><double name="stddev">3.7625525739676986</double><lst name="facets"/></lst>
<lst name="118"><double name="min">0.0</double><double name="max">0.0</double><long name="count">8</long><long name="missing">0</long><double name="sum">0.0</double><double name="sumOfSquares">0.0</double><double name="mean">0.0</double><double name="stddev">0.0</double><lst name="facets"/></lst>
</lst>
<lst name="area_id">
<lst name="10308"><double name="min">0.0</double><double name="max">1.0</double><long name="count">6</long><long name="missing">0</long><double name="sum">1.0</double><double name="sumOfSquares">1.0</double><double name="mean">0.16666666666666666</double><double name="stddev">0.408248290463863</double><lst name="facets"/></lst>
<lst name="10310"><double name="min">0.0</double><double name="max">5.0</double><long name="count">49</long><long name="missing">0</long><double name="sum">40.0</double><double name="sumOfSquares">108.0</double><double name="mean">0.8163265306122449</double><double name="stddev">1.2528878206593208</double><lst name="facets"/></lst>
<lst name="0"><double name="min">0.0</double><double name="max">167.0</double><long name="count">409</long><long name="missing">0</long><double name="sum">2722.0</double><double name="sumOfSquares">238550.0</double><double name="mean">6.6552567237163816</double><double name="stddev">23.243931908854</double><lst name="facets"/></lst>
<lst name="10311"><double name="min">0.0</double><double name="max">0.0</double><long name="count">2</long><long name="missing">0</long><double name="sum">0.0</double><double name="sumOfSquares">0.0</double><double name="mean">0.0</double><double name="stddev">0.0</double><lst name="facets"/></lst>
<lst name="10304"><double name="min">0.0</double><double name="max">55.0</double><long name="count">77</long><long name="missing">0</long><double name="sum">370.0</double><double name="sumOfSquares">9476.0</double><double name="mean">4.805194805194805</double><double name="stddev">10.064318107786017</double><lst name="facets"/></lst>
<lst name="70104"><double name="min">0.0</double><double name="max">3.0</double><long name="count">48</long><long name="missing">0</long><double name="sum">51.0</double><double name="sumOfSquares">117.0</double><double name="mean">1.0625</double><double name="stddev">1.1560433254047038</double><lst name="facets"/></lst>
<lst name="10307"><double name="min">0.0</double><double name="max">12.0</double><long name="count">366</long><long name="missing">0</long><double name="sum">274.0</double><double name="sumOfSquares">768.0</double><double name="mean">0.7486338797814208</double><double name="stddev">1.2418218134151426</double><lst name="facets"/></lst>
<lst name="10315"><double name="min">0.0</double><double name="max">4.0</double><long name="count">120</long><long name="missing">0</long><double name="sum">143.0</double><double name="sumOfSquares">359.0</double><double name="mean">1.1916666666666667</double><double name="stddev">1.2588899560996694</double><lst name="facets"/></lst>
<lst name="10316"><double name="min">0.0</double><double name="max">0.0</double><long name="count">8</long><long name="missing">0</long><double name="sum">0.0</double><double name="sumOfSquares">0.0</double><double name="mean">0.0</double><double name="stddev">0.0</double><lst name="facets"/></lst>
<lst name="10317"><double name="min">0.0</double><double name="max">5.0</double><long name="count">86</long><long name="missing">0</long><double name="sum">100.0</double><double name="sumOfSquares">262.0</double><double name="mean">1.1627906976744187</double><double name="stddev">1.3093371930442208</double><lst name="facets"/></lst>
</lst>
</lst>
</lst>
</lst>
</lst>
</response>
Combined grouping on multiple fields (count, sum, max, min and similar functions are supported)
SQL query:
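SELECT city_id, area_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id, area_id;
The query result is shown in the figure.
Solr does not currently offer a simple way to run this kind of query. To support it, the schema has to be designed so that one field is multi-valued, and the grouped statistics are then computed over those values. If the query and analysis patterns of the application are fairly fixed, and the fields that will be used for combined grouping are known in advance, a multi-valued field can be planned for at design time to meet this need.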
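A related way to prepare for combined grouping at design time is to store a precomputed key that joins the grouping fields. This swaps in a single combined-key field for the multi-valued design described above, so it is a different technique and only a sketch. The field name `city_area` is hypothetical, and the aggregation below runs client-side in plain Python purely to show the shape of the per-group statistics; it is not a Solr API.

```python
from collections import defaultdict

def add_group_key(doc):
    """Return a copy of doc with a combined key field (hypothetical name: city_area)."""
    return {**doc, "city_area": f"{doc['city_id']}_{doc['area_id']}"}

def stats_by_key(docs, key="city_area", field="cnt"):
    """Client-side equivalent of SUM/AVG/MAX/MIN/COUNT(field) ... GROUP BY key."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[key]].append(doc[field])
    return {
        k: {"sum": sum(v), "avg": sum(v) / len(v), "max": max(v),
            "min": min(v), "count": len(v)}
        for k, v in groups.items()
    }

# A few documents taken from the query results earlier in the article.
docs = [
    {"city_id": 103, "area_id": 10304, "cnt": 5},
    {"city_id": 103, "area_id": 10304, "cnt": 23},
    {"city_id": 103, "area_id": 10307, "cnt": 2},
    {"city_id": 103, "area_id": 70104, "cnt": 3},
]
result = stats_by_key([add_group_key(d) for d in docs])
print(result)
```

If such a key were indexed as a regular field, a stats.facet on it, in the style of the previous section, should yield one set of statistics per (city_id, area_id) combination.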
References
http://wiki.apache.org/solr/SimpleFacetParameters
http://wiki.apache.org/solr/HierarchicalFaceting#Pivot_Facets
http://docs.lucidworks.com/display/solr/The+Stats+Component
http://docs.lucidworks.com/display/solr/Faceting
Reposted from: https://www.cnblogs.com/davidwang456/p/4818749.html