當前位置：首頁 > 编程资源 > 编程问答 >内容正文

编程问答

by截取字段 group_深入理解 group by【思考点滴】

發布時間：2025/3/15 编程问答 19 豆豆

生活随笔收集整理的這篇文章主要介紹了 by截取字段 group_深入理解 group by【思考点滴】小編覺得挺不錯的,現在分享給大家,幫大家做個參考.

網上搜索 group by，結果一大堆，千篇一律

……

現在從淺入深的講一下 group by的應用

1、group by的作用

group by 簡單來講，就是分組，配合計算使用，計算方法如SUM、count、avg、max、min等…

應用舉例 : 從交易表中查詢出今天有交易的商戶賬戶。

2、使用 group by 完成一個簡單的應用

查詢order_amount表中，是否有指定的賬戶 ?：使用group by 就能做到 : select account_id from order_amount

group by account_id；

同樣distinct 也能做到：使用 distinct 也能做到 : select distinct account_id from

order_amount ;

如上使用 group by 和 distinct 的效率是一樣的，如果是查找是否存在，建議使用 order by + limit 1，掃描到第一條數據就完成，效率最高

select

account_id from

order_amount where

account_id =

xxx limit 1，

group by 和 distinct 使用上的差異

舉例1：select count(distinct account_id)

from order_amount ;

舉例2：select count(account_id) from order_amount

group by account_id;

感興趣的話，可以自己運行一下。

差異：

group by : 先分組，再計算

distinct ：先去重，再計算，distinct 只是去重，不太適合條件和計算類查找

復雜應用，通過兩個條件鎖定一條記錄

舉例，獲取一批商戶，符合指定條件的最后一個訂單詳情(如：12點到15點之間，參與秒殺商戶的最后一個訂單)

方法1：

1、select * from (select * from order_amount

where xxx yyy zzz) as sel group

by xy

方法2：

2、select * from? order_amount

group by xy

having id = (select max(id) from order_amount

where xxx yyy zzz)

注 distinct 無法完成本類需求，只能靜靜的看著 group

by 表演了。

方法1分析：涉及子查詢，子查詢空間和效率的問題都暴露出來了。

select * from (select * from order_amount

where xxx yyy zzz) as sel group

by xy

優點：

可以實現功能

缺點：

1、group by 中的，除了算法使用的字段和group

by 以外的字段，其它字段的值是隨機的，默認獲取的是選擇查詢索引(where或者group

by)的第一條符合分組的記錄填充的。

2、當子查詢的結果非常大的時候，數據庫服務器的臨時表空間會用完，因此多余的查詢數據會丟失

3、子查詢生成的臨時表，沒有索引可用，如果臨時表數據很大，則主select語句的效率也很低

4、子查詢結果很大的時候，生成臨時表的時間也很長

如果子查詢的數據超過1G【1G一般是mysql中默認的，子查詢配置的表大小，數量差不多是500萬條以上數據】，則后面的查詢結構就丟失了。造成隨機性數據丟失的問題。

所以一般數據量都不會踩到這個坑，踩到這個坑的都不是一般的數據量！

方法2分析：

select * from? order_amount

group by xy

having id = (select max(id) from order_amount

where xxx yyy zzz)

優點：

可以實現功能，合理的利用了having

語句，查詢結果集很小，無臨時表空間占滿的問題

缺點：

1、效率偏低。

方法3：

select * from order_amount

where id in (select max(id) from order_amount

where

xxx yyy

zzz

group

xy)

優點：

可以實現功能，查詢結果集很小，無臨時表空間占滿的問題，效率應該比網友指路要好很多

缺點：

不能說沒有缺點，暫時是最好的選擇。

更復雜的需求

step 1：先定位出唯一記錄的ID或者索引信息

需求1:12點到15點之間，最后完成，且最后創建的訂單select max(concat(complete_time,create_time)) from order_amount where xxx yyy zzz group by xy

需求2:12點到15點之間，最后完成，且最先創建的訂單：select max(concat(complete_time,2000000000-create_time)) from order_amount where xxx yyy zzz group by xy

需求3:12點到15點之間，最先完成，且最后創建的訂單：select min(concat(complete_time,2000000000-create_time)) from order_amount where xxx yyy zzz group by xy

需求4:12點到15點之間，最先完成，且最先創建的訂單：select min(concat(complete_time,create_time)) from order_amount where xxx yyy zzz group by xy

step 2：通過如上唯一信息，查詢唯一數據

如上只是個舉例，總的來講，還是通過 max/min(concat(xxx,yyy,bbb,...)) 等方式完成按需查找，找到符合條件的唯一記錄，其中?xxx,yyy,bbb 可以是字段，也可以是一種運算，如2000000000-create_time，總的原則來講，就是想通過max或者min搜索出想要的唯一信息。????????如果覺得數據量不是很大，則可以使用 select * from (select * from order_amount order

by complete_time desc,

create_time asc where

xxx yyy zzz) as sel

group by xy，zz，dd，通過內查詢按要求排序，通過group by篩選出第一條記錄。

group by 總結

1、group by 非計算列，非group by列，如何自行控制？

解決方法：子查詢，子查詢按需進行排序

select * from (select * from order_amount order

by complete_time desc,

create_time asc where

xxx yyy zzz) as sel

group by xy

2、group by 也是優先使用索引。

3、group by 一次可以完成多個函數，可以通過多個字段進行分組

select count(amount) as cnt, SUM(amount)

as total_amount, avg(amount)

as avg_amount, max(id)

as max_id, min(id)

as min_id, xy, za, hs from

order_amount

group by xy,

za,hs

4、同時可以使用 with rollup再獲取上級匯總

select count(amount) as cnt,

SUM(amount) as total_amount, avg(amount)

as avg_amount,

max(id) as max_id,

min(id) as min_id, xy, za, hs from

order_amount

group by xy, za,hs with

rollup

5、group by 之后的結果也可以排序,并非select的條件，且不影響select的結果。

select count(amount) as cnt,

SUM(amount) as total_amount, avg(amount)

as avg_amount,

max(id) as max_id,

min(id) as min_id, xy, za, hs from

order_amount

group by xy asc, za desc,hs asc

6、使用 group by 的時候，難免會用到子查詢，一定要嚴格審視子查詢結果的大小和性能

總結

以上是生活随笔為你收集整理的by截取字段 group_深入理解 group by【思考点滴】的全部內容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網站內容還不錯，歡迎將生活随笔推薦給好友。