Implementing SQL Queries and Statistics with Solr (Repost)
Original article: http://shiyanjun.cn/archives/78.html
Cloudera has released Impala, a query and analytics tool built on the Hadoop platform. Anyone familiar with SQL can use Impala to run queries and analyses, although Impala's SQL differs subtly from the SQL of relational databases.
Below, we design a table and use its data to map SQL query and statistics statements onto equivalent Solr queries. The translation is an interesting exercise, and it shows off some of Solr's more useful features.
The table structure used for the examples is shown in the figure:
For each case below, we give a SQL query or statistics statement, translate it into a Solr query, and compare the results.
Query comparison
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
WHERE prov_id = 1 AND net_type = 1 AND area_id = 10304 AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
ORDER BY log_id LIMIT 10;
Query result, as shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1 AND net_type:1 AND area_id:10304 AND time_type:1 AND time_id:[20130801 TO 20130815]&sort=log_id asc&start=0&rows=10
Query result:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
  <result name="response" numFound="77" start="0">
    <doc>
      <int name="log_id">6827</int>
      <long name="start_time">1375072117</long>
      <long name="end_time">1375081683</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">11002</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6827</int>
      <long name="start_time">1375072117</long>
      <long name="end_time">1375081683</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">11000</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">14001</int>
      <int name="cnt">5</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">11002</int>
      <int name="cnt">23</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">10200</int>
      <int name="cnt">55</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">14000</int>
      <int name="cnt">4</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">11000</int>
      <int name="cnt">1</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">10201</int>
      <int name="cnt">31</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">8002</int>
      <int name="cnt">8</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6851</int>
      <long name="start_time">1375142158</long>
      <long name="end_time">1375146391</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10304</int>
      <int name="idt_id">8000</int>
      <int name="cnt">30</int>
      <int name="net_type">1</int>
    </doc>
  </result>
</response>
Comparing the two results, they are the same except for the ordering by idt_id (ascending in Impala, descending in Solr). Neither query sorts on idt_id, so the order of rows that share a log_id is not guaranteed by either system.
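The mapping used above (SELECT list to `fl`, WHERE to `fq`, ORDER BY to `sort`, LIMIT to `start`/`rows`) can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `build_select_url` is hypothetical, the base URL is simply the one from the example, and the standard-library `urlencode` is used so that spaces, colons, and brackets are percent-encoded rather than pasted raw into the URL.

```python
from urllib.parse import urlencode

# Base URL copied from the example above; adjust for your own deployment.
SOLR_SELECT = "http://slave1:8888/solr-cloud/i_event/select"


def build_select_url(fields, filters, sort, start, rows, base=SOLR_SELECT):
    """Hypothetical helper: map SELECT / WHERE / ORDER BY / LIMIT onto Solr parameters."""
    params = [
        ("q", "*:*"),                    # match all documents; filtering happens in fq
        ("fl", ",".join(fields)),        # SELECT column list
        ("fq", " AND ".join(filters)),   # WHERE conditions combined with AND
        ("sort", sort),                  # ORDER BY
        ("start", str(start)),           # offset of the first row
        ("rows", str(rows)),             # LIMIT
    ]
    return base + "?" + urlencode(params)


url = build_select_url(
    fields=["log_id", "start_time", "end_time", "prov_id", "city_id",
            "area_id", "idt_id", "cnt", "net_type"],
    filters=["prov_id:1", "net_type:1", "area_id:10304", "time_type:1",
             "time_id:[20130801 TO 20130815]"],
    sort="log_id asc",
    start=0,
    rows=10,
)
print(url)
```

The closed SQL range `time_id >= 20130801 AND time_id <= 20130815` becomes the inclusive Solr range `time_id:[20130801 TO 20130815]`, which is why it appears as a single filter string here.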
SQL query:
SELECT prov_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
Query result, as shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&rows=0&indent=true
Query result:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2</int>
  </lst>
  <result name="response" numFound="4088" start="0"></result>
  <lst name="stats">
    <lst name="stats_fields">
      <lst name="cnt">
        <double name="min">0.0</double>
        <double name="max">1258.0</double>
        <long name="count">4088</long>
        <long name="missing">0</long>
        <double name="sum">32587.0</double>
        <double name="sumOfSquares">9170559.0</double>
        <double name="mean">7.971379647749511</double>
        <double name="stddev">46.69344567709268</double>
        <lst name="facets"/>
      </lst>
    </lst>
  </lst>
</response>
Comparing the results, Solr reports more statistics than the SQL statement asks for, such as the standard deviation (stddev); the values the two have in common are identical.
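To line the Solr statistics up with the SQL aggregates, the stats block can be read back and renamed. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `SAMPLE` string is an abridged copy of the response above, and the function name `sql_aggregates` and the `SQL_TO_SOLR` mapping are illustrative names, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the stats response shown above.
SAMPLE = """<response>
  <lst name="stats">
    <lst name="stats_fields">
      <lst name="cnt">
        <double name="min">0.0</double>
        <double name="max">1258.0</double>
        <long name="count">4088</long>
        <long name="missing">0</long>
        <double name="sum">32587.0</double>
        <double name="mean">7.971379647749511</double>
        <lst name="facets"/>
      </lst>
    </lst>
  </lst>
</response>"""

# SQL aggregate name -> name of the matching entry in the Solr stats block.
SQL_TO_SOLR = {"SUM": "sum", "AVG": "mean", "MAX": "max", "MIN": "min", "COUNT": "count"}


def sql_aggregates(xml_text, field):
    """Return the SQL-style aggregates for one stats.field as floats."""
    root = ET.fromstring(xml_text)
    node = root.find(
        f"./lst[@name='stats']/lst[@name='stats_fields']/lst[@name='{field}']"
    )
    if node is None:
        raise KeyError(field)
    # Collect only the scalar entries; the empty <lst name="facets"/> has no text.
    values = {child.get("name"): child.text for child in node if child.text}
    return {sql: float(values[solr]) for sql, solr in SQL_TO_SOLR.items()}


aggregates = sql_aggregates(SAMPLE, "cnt")
print(aggregates)
```

Note that Solr's `mean` plays the role of `AVG`, and `count` excludes documents without a value in the field, which are reported separately as `missing`.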
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
WHERE prov_id = 1 AND net_type = 1 AND city_id IN(106,103) AND idt_id IN(12011,5004,6051,6056,8002) AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
ORDER BY log_id, start_time DESC LIMIT 10;
Query result, as shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*
&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
&fq=prov_id:1 AND net_type:1 AND (city_id:106 OR city_id:103) AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND time_id:[20130801 TO 20130815]
&sort=log_id asc,start_time desc
&start=0
&rows=10
Or, with one fq parameter per condition:
http://slave1:8888/solr-cloud/i_event/select?q=*:*
&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
&fq=prov_id:1
&fq=net_type:1
&fq=(city_id:106 OR city_id:103)
&fq=(idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002)
&fq=time_type:1
&fq=time_id:[20130801 TO 20130815]
&sort=log_id asc,start_time desc
&start=0
&rows=10
Query result:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">6</int>
  </lst>
  <result name="response" numFound="63" start="0">
    <doc>
      <int name="log_id">6553</int>
      <long name="start_time">1374054184</long>
      <long name="end_time">1374054254</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">12011</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6553</int>
      <long name="start_time">1374054184</long>
      <long name="end_time">1374054254</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">5004</int>
      <int name="cnt">2</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6555</int>
      <long name="start_time">1374055060</long>
      <long name="end_time">1374055158</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">70104</int>
      <int name="idt_id">5004</int>
      <int name="cnt">3</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6555</int>
      <long name="start_time">1374055060</long>
      <long name="end_time">1374055158</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">70104</int>
      <int name="idt_id">12011</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6595</int>
      <long name="start_time">1374292508</long>
      <long name="end_time">1374292639</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">5004</int>
      <int name="cnt">4</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6611</int>
      <long name="start_time">1374461233</long>
      <long name="end_time">1374461245</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">5004</int>
      <int name="cnt">1</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6612</int>
      <long name="start_time">1374461261</long>
      <long name="end_time">1374461269</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">5004</int>
      <int name="cnt">1</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6612</int>
      <long name="start_time">1374461261</long>
      <long name="end_time">1374461269</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">12011</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6613</int>
      <long name="start_time">1374461422</long>
      <long name="end_time">1374461489</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">6056</int>
      <int name="cnt">1</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6613</int>
      <long name="start_time">1374461422</long>
      <long name="end_time">1374461489</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">6051</int>
      <int name="cnt">1</int>
      <int name="net_type">1</int>
    </doc>
  </result>
</response>
Comparing the query results, they are consistent.
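The translation of an SQL `IN (...)` list into a parenthesised group of OR clauses, and the choice between one combined `fq` and several separate ones, can be sketched as follows. The helper names `in_clause` and `filter_params` are hypothetical; only the standard-library `urlencode` is assumed.

```python
from urllib.parse import urlencode


def in_clause(field, values):
    """Translate `field IN (v1, v2, ...)` into a parenthesised OR group."""
    if not values:
        raise ValueError("IN list must not be empty")
    return "(" + " OR ".join(f"{field}:{v}" for v in values) + ")"


def filter_params(conditions):
    """Emit one fq parameter per condition; Solr intersects all fq filters."""
    return [("fq", condition) for condition in conditions]


conditions = [
    "prov_id:1",
    "net_type:1",
    in_clause("city_id", [106, 103]),
    in_clause("idt_id", [12011, 5004, 6051, 6056, 8002]),
    "time_type:1",
    "time_id:[20130801 TO 20130815]",
]

# Passing a list of pairs lets the same key (fq) appear several times.
query = urlencode([("q", "*:*")] + filter_params(conditions))
print(query)
```

Both URL forms shown above return the same documents. Splitting the conditions into separate `fq` parameters lets Solr cache each filter on its own, which tends to help when individual conditions recur across queries.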
SQL query:
SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
WHERE net_type = 1 AND idt_id IN(12011,5004,6051,6056,8002) AND time_type = 1 AND start_time >= 1373598465 AND end_time < 1374055254
ORDER BY log_id, start_time, idt_id DESC LIMIT 30;
Query result, as shown in the figure:
Solr query URL (the exclusive upper bound is written as an inclusive range plus a negative clause that removes the boundary value; note that these URLs apply the upper bound to start_time, whereas the SQL above applies it to end_time):
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254] AND -start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
Or:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1&fq=idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002&fq=time_type:1&fq=start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30
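A half-open range such as `low <= field < high` can also be written directly with mixed brackets, where `[` and `]` are inclusive and `{` and `}` are exclusive. The sketch below assumes a Solr version whose query parser accepts mixed brackets (4.0 or later); on older versions the "inclusive range minus the boundary value" workaround used in the URLs above is the fallback. Both function names are hypothetical.

```python
def range_filter(field, low=None, high=None, include_low=True, include_high=True):
    """Build a Solr range clause; a bound of None becomes the open-ended `*`."""
    left = "[" if include_low else "{"
    right = "]" if include_high else "}"
    lo = "*" if low is None else str(low)
    hi = "*" if high is None else str(high)
    return f"{field}:{left}{lo} TO {hi}{right}"


def range_filter_legacy(field, low, high):
    """low <= field < high as an inclusive range minus the upper boundary value.

    This only matches the half-open range for discrete (integer) fields.
    """
    return f"{field}:[{low} TO {high}] AND -{field}:{high}"


# start_time >= 1373598465 (no upper bound)
print(range_filter("start_time", low=1373598465))
# end_time < 1374055254 (no lower bound, exclusive upper bound)
print(range_filter("end_time", high=1374055254, include_high=False))
# The workaround form used in the URLs above
print(range_filter_legacy("start_time", 1373598465, 1374055254))
```

With these helpers, the SQL condition `start_time >= 1373598465 AND end_time < 1374055254` would map to two filters, one per field.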
Query result:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">5</int>
  </lst>
  <result name="response" numFound="4" start="0">
    <doc>
      <int name="log_id">6553</int>
      <long name="start_time">1374054184</long>
      <long name="end_time">1374054254</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">12011</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6553</int>
      <long name="start_time">1374054184</long>
      <long name="end_time">1374054254</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">10307</int>
      <int name="idt_id">5004</int>
      <int name="cnt">2</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6555</int>
      <long name="start_time">1374055060</long>
      <long name="end_time">1374055158</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">70104</int>
      <int name="idt_id">12011</int>
      <int name="cnt">0</int>
      <int name="net_type">1</int>
    </doc>
    <doc>
      <int name="log_id">6555</int>
      <long name="start_time">1374055060</long>
      <long name="end_time">1374055158</long>
      <int name="prov_id">1</int>
      <int name="city_id">103</int>
      <int name="area_id">70104</int>
      <int name="idt_id">5004</int>
      <int name="cnt">3</int>
      <int name="net_type">1</int>
    </doc>
  </result>
</response>
Grouped statistics over multiple fields (count function only)
SQL query:
SELECT city_id, area_id, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id, area_id;
Query result, as shown in the figure:
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&facet=true&facet.pivot=city_id,area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true
Query result:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">72</int>
  </lst>
  <result name="response" numFound="1171" start="0"></result>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields"/>
    <lst name="facet_dates"/>
    <lst name="facet_ranges"/>
    <lst name="facet_pivot">
      <arr name="city_id,area_id">
        <lst>
          <str name="field">city_id</str>
          <int name="value">103</int>
          <int name="count">678</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">10307</int>
              <int name="count">298</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10315</int>
              <int name="count">120</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10317</int>
              <int name="count">86</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10304</int>
              <int name="count">67</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10310</int>
              <int name="count">49</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">70104</int>
              <int name="count">48</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10308</int>
              <int name="count">6</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">0</int>
              <int name="count">2</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10311</int>
              <int name="count">2</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">city_id</str>
          <int name="value">0</int>
          <int name="count">463</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">0</int>
              <int name="count">395</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10307</int>
              <int name="count">68</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">city_id</str>
          <int name="value">106</int>
          <int name="count">10</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">10304</int>
              <int name="count">10</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">city_id</str>
          <int name="value">110</int>
          <int name="count">8</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">0</int>
              <int name="count">8</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">city_id</str>
          <int name="value">118</int>
          <int name="count">8</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">10316</int>
              <int name="count">8</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">city_id</str>
          <int name="value">105</int>
          <int name="count">4</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">0</int>
              <int name="count">4</int>
            </lst>
          </arr>
        </lst>
      </arr>
    </lst>
  </lst>
</response>
Comparing the results: the Solr output is nested, so the final statistics have to be assembled by merging the groups above. Once merged, they match the SQL result.
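The merging step can be illustrated by flattening the nested pivot into one row per (city_id, area_id) pair, which is the shape a `GROUP BY city_id, area_id` result has. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; `SAMPLE` is an abridged excerpt of the pivot above and `flatten_pivot` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Abridged excerpt of the facet_pivot array shown above.
SAMPLE = """<arr name="city_id,area_id">
  <lst>
    <str name="field">city_id</str><int name="value">103</int><int name="count">678</int>
    <arr name="pivot">
      <lst><str name="field">area_id</str><int name="value">10304</int><int name="count">67</int></lst>
      <lst><str name="field">area_id</str><int name="value">10307</int><int name="count">298</int></lst>
    </arr>
  </lst>
  <lst>
    <str name="field">city_id</str><int name="value">106</int><int name="count">10</int>
    <arr name="pivot">
      <lst><str name="field">area_id</str><int name="value">10304</int><int name="count">10</int></lst>
    </arr>
  </lst>
</arr>"""


def flatten_pivot(arr, prefix=()):
    """Yield (value_1, ..., value_n, count) for every leaf entry of a pivot <arr>."""
    for entry in arr.findall("lst"):
        value = entry.find("*[@name='value']").text
        count = int(entry.find("*[@name='count']").text)
        child = entry.find("arr[@name='pivot']")
        if child is None:
            yield prefix + (value, count)          # innermost level: emit a row
        else:
            yield from flatten_pivot(child, prefix + (value,))


rows = list(flatten_pivot(ET.fromstring(SAMPLE)))
for row in rows:
    print(row)

# Rolling the rows up by area_id alone merges the groups across cities.
per_area = Counter()
for city_id, area_id, count in rows:
    per_area[area_id] += count
print(per_area["10304"])
```

In the full response, area 10304 appears under cities 103 and 106 with counts 67 and 10, so the per-area roll-up gives 77 for it.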
Grouped statistics over multiple fields (count, sum, max, min, and other functions supported)
Solr handles independent grouped statistics over several fields in a single request well. This is equivalent to running two SQL statements with GROUP BY clauses, each of which aggregates over only one field.
SQL queries:
SELECT city_id, area_id, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1

SELECT city_id, area_id, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1
The query results are not shown here.
Solr query URL:
http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&f.cnt.stats.facet=city_id&f.cnt.stats.facet=area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true
Query result:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">6</int>
  </lst>
  <result name="response" numFound="1171" start="0"></result>
  <lst name="stats">
    <lst name="stats_fields">
      <lst name="cnt">
        <double name="min">0.0</double>
        <double name="max">167.0</double>
        <long name="count">1171</long>
        <long name="missing">0</long>
        <double name="sum">3701.0</double>
        <double name="sumOfSquares">249641.0</double>
        <double name="mean">3.1605465414175917</double>
        <double name="stddev">14.260812879164407</double>
        <lst name="facets">
          <lst name="city_id">
            <lst name="0">
              <double name="min">0.0</double>
              <double name="max">167.0</double>
              <long name="count">463</long>
              <long name="missing">0</long>
              <double name="sum">2783.0</double>
              <double name="sumOfSquares">238819.0</double>
              <double name="mean">6.010799136069115</double>
              <double name="stddev">21.92524420257807</double>
              <lst name="facets"/>
            </lst>
            <lst name="110">
              <double name="min">0.0</double>
              <double name="max">1.0</double>
              <long name="count">8</long>
              <long name="missing">0</long>
              <double name="sum">3.0</double>
              <double name="sumOfSquares">3.0</double>
              <double name="mean">0.375</double>
              <double name="stddev">0.5175491695067657</double>
              <lst name="facets"/>
            </lst>
            <lst name="106">
              <double name="min">0.0</double>
              <double name="max">0.0</double>
              <long name="count">10</long>
              <long name="missing">0</long>
              <double name="sum">0.0</double>
              <double name="sumOfSquares">0.0</double>
              <double name="mean">0.0</double>
              <double name="stddev">0.0</double>
              <lst name="facets"/>
            </lst>
            <lst name="105">
              <double name="min">0.0</double>
              <double name="max">0.0</double>
              <long name="count">4</long>
              <long name="missing">0</long>
              <double name="sum">0.0</double>
              <double name="sumOfSquares">0.0</double>
              <double name="mean">0.0</double>
              <double name="stddev">0.0</double>
              <lst name="facets"/>
            </lst>
            <lst name="103">
              <double name="min">0.0</double>
              <double name="max">55.0</double>
              <long name="count">678</long>
              <long name="missing">0</long>
              <double name="sum">915.0</double>
              <double name="sumOfSquares">10819.0</double>
              <double name="mean">1.3495575221238938</double>
              <double name="stddev">3.7625525739676986</double>
              <lst name="facets"/>
            </lst>
            <lst name="118">
              <double name="min">0.0</double>
              <double name="max">0.0</double>
              <long name="count">8</long>
              <long name="missing">0</long>
              <double name="sum">0.0</double>
              <double name="sumOfSquares">0.0</double>
              <double name="mean">0.0</double>
              <double name="stddev">0.0</double>
              <lst name="facets"/>
            </lst>
          </lst>
          <lst name="area_id">
            <lst name="10308">
              <double name="min">0.0</double>
              <double name="max">1.0</double>
              <long name="count">6</long>
              <long name="missing">0</long>
              <double name="sum">1.0</double>
              <double name="sumOfSquares">1.0</double>
              <double name="mean">0.16666666666666666</double>
              <double name="stddev">0.408248290463863</double>
              <lst name="facets"/>
            </lst>
            <lst name="10310">
              <double name="min">0.0</double>
              <double name="max">5.0</double>
              <long name="count">49</long>
              <long name="missing">0</long>
              <double name="sum">40.0</double>
              <double name="sumOfSquares">108.0</double>
              <double name="mean">0.8163265306122449</double>
              <double name="stddev">1.2528878206593208</double>
              <lst name="facets"/>
            </lst>
            <lst name="0">
              <double name="min">0.0</double>
              <double name="max">167.0</double>
              <long name="count">409</long>
              <long name="missing">0</long>
              <double name="sum">2722.0</double>
              <double name="sumOfSquares">238550.0</double>
              <double name="mean">6.6552567237163816</double>
              <double name="stddev">23.243931908854</double>
              <lst name="facets"/>
            </lst>
            <lst name="10311">
              <double name="min">0.0</double>
              <double name="max">0.0</double>
              <long name="count">2</long>
              <long name="missing">0</long>
              <double name="sum">0.0</double>
              <double name="sumOfSquares">0.0</double>
              <double name="mean">0.0</double>
              <double name="stddev">0.0</double>
              <lst name="facets"/>
            </lst>
            <lst name="10304">
              <double name="min">0.0</double>
              <double name="max">55.0</double>
              <long name="count">77</long>
              <long name="missing">0</long>
              <double name="sum">370.0</double>
              <double name="sumOfSquares">9476.0</double>
              <double name="mean">4.805194805194805</double>
              <double name="stddev">10.064318107786017</double>
              <lst name="facets"/>
            </lst>
            <lst name="70104">
              <double name="min">0.0</double>
              <double name="max">3.0</double>
              <long name="count">48</long>
              <long name="missing">0</long>
              <double name="sum">51.0</double>
              <double name="sumOfSquares">117.0</double>
              <double name="mean">1.0625</double>
              <double name="stddev">1.1560433254047038</double>
              <lst name="facets"/>
            </lst>
            <lst name="10307">
              <double name="min">0.0</double>
              <double name="max">12.0</double>
              <long name="count">366</long>
              <long name="missing">0</long>
              <double name="sum">274.0</double>
              <double name="sumOfSquares">768.0</double>
              <double name="mean">0.7486338797814208</double>
              <double name="stddev">1.2418218134151426</double>
              <lst name="facets"/>
            </lst>
            <lst name="10315">
              <double name="min">0.0</double>
              <double name="max">4.0</double>
              <long name="count">120</long>
              <long name="missing">0</long>
              <double name="sum">143.0</double>
              <double name="sumOfSquares">359.0</double>
              <double name="mean">1.1916666666666667</double>
              <double name="stddev">1.2588899560996694</double>
              <lst name="facets"/>
            </lst>
            <lst name="10316">
              <double name="min">0.0</double>
              <double name="max">0.0</double>
              <long name="count">8</long>
              <long name="missing">0</long>
              <double name="sum">0.0</double>
              <double name="sumOfSquares">0.0</double>
              <double name="mean">0.0</double>
              <double name="stddev">0.0</double>
              <lst name="facets"/>
            </lst>
            <lst name="10317">
              <double name="min">0.0</double>
              <double name="max">5.0</double>
              <long name="count">86</long>
              <long name="missing">0</long>
              <double name="sum">100.0</double>
              <double name="sumOfSquares">262.0</double>
              <double name="mean">1.1627906976744187</double>
              <double name="stddev">1.3093371930442208</double>
              <lst name="facets"/>
            </lst>
          </lst>
        </lst>
      </lst>
    </lst>
  </lst>
</response>
Joint grouped statistics over multiple fields (count, sum, max, min, and other functions supported)
SQL query:
SELECT city_id, area_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
WHERE prov_id = 1 AND net_type = 1
GROUP BY city_id, area_id;
Query result, as shown in the figure:
Solr currently has no simple way to support this kind of query. To support it, the schema has to be designed so that one field is multi-valued, and the grouped statistics are then computed over those values. If the application's query and analysis patterns are fairly fixed, and you know in advance which fields will be grouped jointly, you can plan for multi-valued fields at schema design time to meet this need.
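One way to picture the schema-level workaround is sketched below. It swaps in a related and simpler technique than the multi-valued field described above: a single derived key field that combines the two grouping fields at indexing time, so that the existing `stats.facet` mechanism can group on the pair. The field name `city_area`, the `_` separator, and all helper names are assumptions made for this illustration; the field would have to be declared in the schema, and this is not the author's stated design.

```python
from urllib.parse import urlencode

# Hypothetical derived field; it would need to be declared in the Solr schema.
KEY_FIELD = "city_area"


def with_group_key(doc):
    """Return a copy of the document with the combined grouping key added."""
    enriched = dict(doc)
    enriched[KEY_FIELD] = f"{doc['city_id']}_{doc['area_id']}"
    return enriched


def joint_stats_query(stat_field="cnt", key_field=KEY_FIELD):
    """Query string asking for stats on stat_field, faceted by the combined key."""
    return urlencode([
        ("q", "*:*"),
        ("fq", "prov_id:1 AND net_type:1"),
        ("stats", "true"),
        ("stats.field", stat_field),
        (f"f.{stat_field}.stats.facet", key_field),
        ("rows", "0"),
    ])


def split_group_key(key):
    """Turn a facet name such as '103_10304' back into (city_id, area_id)."""
    city_id, area_id = key.split("_")
    return int(city_id), int(area_id)


print(with_group_key({"city_id": 103, "area_id": 10304, "cnt": 5}))
print(joint_stats_query())
print(split_group_key("103_10304"))
```

Each facet entry in the response would then carry min, max, sum, count, and mean for one (city_id, area_id) pair, at the cost of fixing the grouping combination when documents are indexed.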
Reposted from: https://www.cnblogs.com/davidwang456/p/4818749.html