Optimization Case | Rewriting SQL with CASE WHEN

Published: 2024/9/19 · Database


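Introduction
Today I would like to share an SQL optimization case in which rewriting the query opened up an unconventional path.

Scenario to Optimize
The following record turned up in the SLOW QUERY LOG: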

...
# Query_time: 59.503827  Lock_time: 0.000198  Rows_sent: 641227  Rows_examined: 13442472  Rows_affected: 0
...
select uid, sum(power) powerup from t1
where date >= '2017-03-31'
  and UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H')) >= 1490965200
  and UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H')) < 1492174801
  and aType in (1,6,9)
group by uid;

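Honestly, seeing this SQL made me want to curse too: what kind of brain-dead design is this?

The date and time are split into two separate columns, date and hour, and then glued back together at query time to form a condition. I am at a loss for words.

Venting aside, the work still has to get done. We are DBAs after all, and SQL optimization is what we do best, right?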

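The Road to SQL Optimization
SQL Optimization Approach
At the risk of repeating myself, here is the SQL optimization approach once more.

To optimize an SQL statement, the usual first step is to read the execution plan and check whether indexes are used as much as possible, while also watching the estimated number of rows scanned and whether a temporary table (Using temporary) or an extra sort (Using filesort) is produced, and then to look for ways to eliminate those.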
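The checklist above can be turned into a small script. This is a minimal sketch, not part of the original case: `parse_explain` and `red_flags` are hypothetical helper names, and the sample text is an abridged copy of the vertical EXPLAIN output that appears later in this article.

```python
# Hypothetical helpers (not from the article): flag the warning signs from the
# checklist in the text of a MySQL `EXPLAIN ... \G` result.

def parse_explain(text):
    """Turn the 'name: value' lines of vertical EXPLAIN output into a dict."""
    row = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the row separator
            row[name.strip()] = value.strip()
    return row


def red_flags(row):
    """Return the checklist items that this plan row violates."""
    flags = []
    if row.get("type") == "ALL":
        flags.append("full table scan")
    if row.get("key") in (None, "", "NULL"):
        flags.append("no index used")
    extra = row.get("Extra", "")
    if "Using temporary" in extra:
        flags.append("temporary table")
    if "Using filesort" in extra:
        flags.append("extra sort")
    return flags


# Abridged from the first execution plan shown in this article.
sample = """\
type: ALL
possible_keys: did
key: NULL
rows: 25005577
Extra: Using where; Using temporary; Using filesort
"""

plan = parse_explain(sample)
print(red_flags(plan), "estimated rows:", plan["rows"])
```

The estimated row count is only printed rather than judged, because what counts as too many rows depends on the table.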

Locating the SQL Performance Bottleneck
No question about it: before optimizing, look at the table DDL and the execution plan:

CREATE TABLE `t1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `date` date NOT NULL DEFAULT '0000-00-00',
  `hour` char(2) NOT NULL DEFAULT '00',
  `kid` int(4) NOT NULL DEFAULT '0',
  `uid` int(11) NOT NULL DEFAULT '0',
  `aType` tinyint(2) NOT NULL DEFAULT '0',
  `src` tinyint(2) NOT NULL DEFAULT '1',
  `aid` int(11) NOT NULL DEFAULT '1',
  `acount` int(11) NOT NULL DEFAULT '1',
  `power` decimal(20,2) DEFAULT '0.00',
  PRIMARY KEY (`id`,`date`),
  UNIQUE KEY `did` (`date`,`hour`,`kid`,`uid`,`aType`,`src`,`aid`)
) ENGINE=InnoDB AUTO_INCREMENT=50486620 DEFAULT CHARSET=utf8mb4
/*!50500 PARTITION BY RANGE  COLUMNS(`date`)
(PARTITION p20170316 VALUES LESS THAN ('2017-03-17') ENGINE = InnoDB,
 PARTITION p20170317 VALUES LESS THAN ('2017-03-18') ENGINE = InnoDB
...

yejr@imysql.com[myDB]> EXPLAIN select uid, sum(power) powerup from t1
where date >= '2017-03-31'
  and UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H')) >= 1490965200
  and UNIX_TIMESTAMP(STR_TO_DATE(concat(date,' ',hour),'%Y-%m-%d %H')) < 1492174801
  and aType in (1,6,9)
group by uid\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: p20170324,p20170325,....all partition
         type: ALL
possible_keys: did
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 25005577
     filtered: 15.00
        Extra: Using where; Using temporary; Using filesort

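Clearly this SQL is very inefficient: a full table scan, no index used, a temporary table, and an extra sort. Every piece of bad luck arrived at once.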

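Optimization Thinking
This SQL is meant to total the power column for the rows that match the conditions. The date column is indexed, but the WHERE clause wraps date in functions, and does so in a condition that combines the date and hour columns, so that condition cannot use the index.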
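The point about wrapping an indexed column in an expression can be reproduced on a tiny scale. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for MySQL, purely so that it is self-contained; the two optimizers differ, so this only illustrates the general principle. The cut-down table and the `plan` helper are assumptions made for the demo.

```python
import sqlite3

# SQLite stand-in for the MySQL table; only the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (date TEXT, hour TEXT, uid INTEGER, power REAL)")
conn.execute("CREATE INDEX did ON t1 (date, hour)")


def plan(sql):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Column buried inside an expression: the index on date cannot be searched.
wrapped = plan(
    "SELECT uid, power FROM t1 "
    "WHERE date || ' ' || hour >= '2017-03-31 13'"
)

# Bare column compared with a constant: the index can be range-searched.
bare = plan("SELECT uid, power FROM t1 WHERE date >= '2017-03-31'")

print("wrapped:", wrapped)
print("bare:   ", bare)
```

The exact plan wording varies between SQLite versions, but the wrapped form should report a scan of the table while the bare form reports a search using the `did` index.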

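Fortunately, a sharp young woman on the team had a flash of inspiration (in fact she was already good at SQL optimization): the SQL could be reworked with CASE WHEN, like this: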

select uid, sum(powerup+powerup1) from (
  select uid,
    case when concat(date,' ',hour) >= '2017-03-24 13:00' then power else '0' end as powerup,
    case when concat(date,' ',hour) < '2017-03-25 13:00' then power else '0' end as powerup1
  from t1
  where date >= '2017-03-24' and date < '2017-03-25'
    and aType in (1,6,9)
) a group by uid;
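Before reusing this pattern, it is worth checking on a few rows that the totals still mean what the original query meant, because a row that satisfies both CASE conditions contributes its power to both columns. The sketch below again uses SQLite through Python's `sqlite3` as a stand-in for MySQL (an assumption made only for self-containment), with made-up rows and `||` in place of `concat`. The single-CASE query is one possible alternative, not the author's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (date TEXT, hour TEXT, uid INTEGER, power INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?)",
    [
        ("2017-03-24", "12", 1, 10),  # before the window
        ("2017-03-24", "14", 1, 20),  # inside the window
        ("2017-03-24", "23", 1, 30),  # inside the window
    ],
)

LOW, HIGH = "2017-03-24 13:00", "2017-03-25 13:00"

# Same shape as the rewrite above: one CASE per bound, then added together.
two_case = conn.execute(
    """
    SELECT sum(powerup + powerup1) FROM (
        SELECT CASE WHEN date || ' ' || hour >= ? THEN power ELSE 0 END AS powerup,
               CASE WHEN date || ' ' || hour <  ? THEN power ELSE 0 END AS powerup1
        FROM t1
    ) a
    """,
    (LOW, HIGH),
).fetchone()[0]

# Alternative: both bounds inside a single CASE, so each row counts once.
one_case = conn.execute(
    """
    SELECT sum(CASE WHEN date || ' ' || hour >= ? AND date || ' ' || hour < ?
                    THEN power ELSE 0 END)
    FROM t1
    """,
    (LOW, HIGH),
).fetchone()[0]

print("two CASE columns:", two_case)
print("single CASE:     ", one_case)
```

On these rows the two totals differ, so the result of the rewrite should be compared with the original query on real data before relying on it. Note also that with a two-character hour the string '2017-03-24 13' sorts before '2017-03-24 13:00', which affects rows that sit exactly on the lower bound.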

Pretty clever, isn't it? The condition that could not use an index has simply been recast with CASE WHEN. Here is the execution plan of the new SQL:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: p20170324
         type: range
possible_keys: did
          key: idx2_date_addRedType
      key_len: 4
          ref: NULL
         rows: 876375
     filtered: 30.00
        Extra: Using index condition; Using temporary; Using filesort

Here is the execution cost of this SQL:

+----------------------------+---------+
| Variable_name              | Value   |
+----------------------------+---------+
| Handler_read_first         | 1       |
| Handler_read_key           | 1834590 |
| Handler_read_last          | 0       |
| Handler_read_next          | 1834589 |
| Handler_read_prev          | 0       |
| Handler_read_rnd           | 232276  |
| Handler_read_rnd_next      | 232277  |
+----------------------------+---------+

And here is what the SLOW QUERY LOG recorded for it:

# Query_time: 6.381254  Lock_time: 0.000166  Rows_sent: 232276  Rows_examined: 2299141  Rows_affected: 0
# Bytes_sent: 4237347  Tmp_tables: 1  Tmp_disk_tables: 0  Tmp_table_sizes: 4187168
# InnoDB_trx_id: 0
# QC_Hit: No  Full_scan: No  Full_join: No  Tmp_table: Yes  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 9311

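It still does not look ideal. The full table scan is gone, but there is still a temporary table and an extra sort, so let's find a way to eliminate them and compare again.

One change you may have noticed: the new SLOW QUERY LOG record contains a lot more information. That is available only because a plugin from the Percona branch is in use. It is a genuinely useful feature that can even record detailed Profiling information, and I strongly recommend it.

Let's create an index on the uid column and see what the cost looks like once the temporary table and the sort are eliminated, and whether the overhead is lower.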

yejr@imysql.com[myDB]> ALTER TABLE t1 ADD INDEX idx_uid(uid);
yejr@imysql.com[myDB]> EXPLAIN select uid, sum(powerup+powerup1) from (
  select uid,
    case when concat(date,' ',hour) >= '2017-03-24 13:00' then power else '0' end as powerup,
    case when concat(date,' ',hour) < '2017-03-25 13:00' then power else '0' end as powerup1
  from t1
  where date >= '2017-03-24' and date < '2017-03-25'
    and aType in (1,6,9)
) a group by uid\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: if_date_hour_army_count
   partitions: p20170331,p20170401...
         type: index
possible_keys: did,idx_uid
          key: idx_uid
      key_len: 4
          ref: NULL
         rows: 12701520
     filtered: 15.00
        Extra: Using where

Here is the execution cost of the SQL after adding the index:

+----------------------------+---------+
| Variable_name              | Value   |
+----------------------------+---------+
| Handler_read_first         | 1       |
| Handler_read_key           | 1       |
| Handler_read_last          | 0       |
| Handler_read_next          | 1834589 |
| Handler_read_prev          | 0       |
| Handler_read_rnd           | 0       |
| Handler_read_rnd_next      | 0       |
+----------------------------+---------+

And here is what the SLOW QUERY LOG recorded for it:

# Query_time: 5.772286  Lock_time: 0.000330  Rows_sent: 232276  Rows_examined: 1834589  Rows_affected: 0
# Bytes_sent: 4215071  Tmp_tables: 0  Tmp_disk_tables: 0  Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# QC_Hit: No  Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: No  Filesort_on_disk: No  Merge_passes: 0
#   InnoDB_IO_r_ops: 0  InnoDB_IO_r_bytes: 0  InnoDB_IO_r_wait: 0.000000
#   InnoDB_rec_lock_wait: 0.000000  InnoDB_queue_wait: 0.000000
#   InnoDB_pages_distinct: 11470

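Notice that although the SQL scans more data pages after the uid index is added (11470 versus 9311), it actually runs faster (5.77 s versus 6.38 s) because the temporary table and the extra sort are gone. The Handler_read% results show this as well: there is clearly more sequential I/O and less random I/O, so even with more data pages to scan the query is more efficient.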
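The comparison of the two Handler_read% listings can be made explicit with a few lines of arithmetic. This is a rough sketch: grouping Handler_read_key and Handler_read_rnd as positioned reads, and Handler_read_next and Handler_read_rnd_next as sequential reads, is a simplification introduced here as a proxy for random versus sequential access. The variable and function names are hypothetical.

```python
# Counter values copied from the two Handler_read% listings in this article.
before_idx_uid = {
    "Handler_read_first": 1,
    "Handler_read_key": 1834590,
    "Handler_read_last": 0,
    "Handler_read_next": 1834589,
    "Handler_read_prev": 0,
    "Handler_read_rnd": 232276,
    "Handler_read_rnd_next": 232277,
}
after_idx_uid = {
    "Handler_read_first": 1,
    "Handler_read_key": 1,
    "Handler_read_last": 0,
    "Handler_read_next": 1834589,
    "Handler_read_prev": 0,
    "Handler_read_rnd": 0,
    "Handler_read_rnd_next": 0,
}

# Simplifying assumption: reads that jump to a key or a stored position stand
# in for random access; reads of the next row stand in for sequential access.
POSITIONED = ("Handler_read_key", "Handler_read_rnd")
SEQUENTIAL = ("Handler_read_next", "Handler_read_rnd_next")


def summarize(counters):
    """Collapse the handler counters into the two groups defined above."""
    return {
        "positioned": sum(counters[name] for name in POSITIONED),
        "sequential": sum(counters[name] for name in SEQUENTIAL),
    }


print("before idx_uid:", summarize(before_idx_uid))
print("after idx_uid: ", summarize(after_idx_uid))
```

Under this grouping the positioned reads drop from about two million to one, while the sequential reads stay in the same range, which matches the observation in the text.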

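Postscript
Is there still room to optimize this SQL? Clearly there is: redesign the table and merge the date and hour columns into one, so that the condition no longer has to be laboriously pieced together and an index can be used.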
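As a rough illustration of what the condition could look like after such a redesign, the sketch below converts the epoch bounds from the original slow query into datetime literals and builds a plain range condition on a bare column. Both `stat_time` (a name for the merged column) and the UTC+8 session time zone are assumptions; the article states neither.

```python
from datetime import datetime, timedelta, timezone


def to_literal(epoch_seconds, tz):
    """Format epoch seconds as a 'YYYY-MM-DD HH:MM:SS' literal in time zone tz."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


def range_predicate(column, ts_from, ts_to, tz):
    """Build a half-open range condition on a bare column (no function calls)."""
    return "{c} >= '{lo}' AND {c} < '{hi}'".format(
        c=column, lo=to_literal(ts_from, tz), hi=to_literal(ts_to, tz)
    )


# Assumption: the MySQL session time zone is UTC+8; adjust it to the real one.
SESSION_TZ = timezone(timedelta(hours=8))

# `stat_time` is a hypothetical name for the merged DATETIME column.
print(range_predicate("stat_time", 1490965200, 1492174801, SESSION_TZ))
```

Because the column appears bare on the left-hand side, an index on it can be used for a range scan, which is the benefit the redesign is after. In real code the bounds should be passed as bound parameters rather than formatted into the SQL string.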