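MySQL: Optimizing SUM and GROUP BY on a Table with Hundreds of Millions of Rows
Today the developers asked for some statistics, and only when I looked into the request did I learn that the table already holds more than a hundred million rows. The SQL in question is: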
SELECT id_province_code,gender,age,COUNT(1),SUM(zy_days),SUM(zf),SUM(ybnje) FROM medicare2017 WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59' GROUP BY id_province_code,age,gender;
Next, look at the execution plan for this statement:
mysql> explain SELECT id_province_code,gender,age,COUNT(1),SUM(zy_days),SUM(zf),SUM(ybnje) FROM medicare2017 WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59' GROUP BY id_province_code,age,gender;
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+--------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+--------------------------------------------------------+
| 1 | SIMPLE | medicare2017 | NULL | range | idx_zy_enter_date,idx_province_age_gender | idx_zy_enter_date | 4 | NULL | 4836248 | 100.00 | Using index condition; Using temporary; Using filesort |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+--------------------------------------------------------+
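The Extra column shows "Using temporary; Using filesort", which is a sign that the query needs tuning even though the type column reports "range". These flags appear because the statement uses GROUP BY (or ORDER BY): MySQL collects the groups in a temporary table and then sorts them with a filesort.
The next step would be to index the GROUP BY columns. MySQL composite indexes follow the leftmost-prefix rule, so we add one composite index covering the three columns id_province_code, age, gender: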
alter table medicare2017 add index idx_ipc_age_gender(id_province_code,age,gender);
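As a rough illustration of the leftmost-prefix rule, the sketch below uses a small hypothetical table (demo_visits and its columns are made up for this example and are not part of the real schema). The exact EXPLAIN output depends on the server version and on the data, so treat the comments as the typical outcome rather than a guarantee.

```sql
-- Hypothetical table, used only to illustrate the leftmost-prefix rule.
CREATE TABLE demo_visits (
  id            INT PRIMARY KEY AUTO_INCREMENT,
  province_code CHAR(2) NOT NULL,
  gender        TINYINT NOT NULL,
  age           TINYINT UNSIGNED NOT NULL,
  days          INT NOT NULL,
  KEY idx_province_gender_age (province_code, gender, age)
);

-- The conditions start at the leftmost index column, so the index can be used for lookups.
EXPLAIN SELECT * FROM demo_visits WHERE province_code = '11';
EXPLAIN SELECT * FROM demo_visits WHERE province_code = '11' AND gender = 1;

-- The leftmost column is missing, so the index cannot be used to seek on age;
-- the plan typically falls back to a full table scan.
EXPLAIN SELECT * FROM demo_visits WHERE age = 30;
```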
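However, the table definition shows that an index on these same three columns already exists:
KEY `idx_province_age_gender` (`id_province_code`,`gender`,`age`)
It was created some time ago, so why do "Using temporary; Using filesort" still appear? According to the official documentation, GROUP BY sorts its result by default (this applies to MySQL 5.7 and earlier; MySQL 8.0 removed the implicit sort), so the filesort happens even with the index in place. Our fix is to turn the sort off explicitly with ORDER BY NULL.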
Following the documentation's advice, the final version of the query is:
mysql> explain SELECT id_province_code,gender,age,COUNT(1),SUM(zy_days),SUM(zf),SUM(ybnje) FROM medicare2017 WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59' GROUP BY id_province_code,age,gender order by null;
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+----------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+----------------------------------------+
| 1 | SIMPLE | medicare2017 | NULL | range | idx_zy_enter_date,idx_province_age_gender | idx_zy_enter_date | 4 | NULL | 4836248 | 100.00 | Using index condition; Using temporary |
+----+-------------+--------------+------------+-------+-------------------------------------------+-------------------+---------+------+---------+----------+----------------------------------------+
The plan no longer shows a filesort. "Using temporary" is still there, but that is not a concern here: what matters is that the statement actually runs faster.
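One possible way to judge whether the remaining temporary table matters is to check whether it spills to disk. The sketch below compares the session status counters Created_tmp_tables and Created_tmp_disk_tables before and after running the optimized query. This check is an addition for illustration and was not part of the original tuning session.

```sql
-- Session counters before running the query.
SHOW SESSION STATUS LIKE 'Created_tmp%tables';

SELECT id_province_code, gender, age, COUNT(1), SUM(zy_days), SUM(zf), SUM(ybnje)
FROM medicare2017
WHERE zy_enter_date BETWEEN '2017-01-01 00:00:00' AND '2017-12-31 12:59:59'
GROUP BY id_province_code, age, gender
ORDER BY NULL;

-- Counters after the query. SHOW STATUS itself uses an internal temporary
-- table, so focus on whether Created_tmp_disk_tables increased.
SHOW SESSION STATUS LIKE 'Created_tmp%tables';
```

If Created_tmp_disk_tables does not grow, the grouping table most likely stayed in memory, which would be consistent with the short run time reported here.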
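In the end, the optimized statement saved almost four hours compared with the version that left the implicit sort on. The table holds close to 300 million rows; the original statement had been running for nearly four hours and still had not finished, whereas the optimized one completed in a few tens of seconds. Judging an optimization by its overall effect, rather than by a single flag in the plan, clearly matters.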