MySQL Indexes: Prefix Indexes
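Sometimes you need to index very long character columns, which makes the index large and slow. You can usually index only the first few characters instead, which saves a great deal of index space and so makes the index more efficient. The trade-off is lower index selectivity. Index selectivity is the ratio of the number of distinct indexed values (also called the cardinality) to the total number of rows in the table (T), and it ranges from 1/T to 1. The higher the selectivity, the more efficient the query, because a highly selective index lets MySQL filter out more rows during a lookup. A unique index has a selectivity of 1, which is the best possible selectivity and gives the best performance.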
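As a quick illustration of the definition above, here is a minimal Python sketch of the selectivity ratio. The function name `selectivity` and the sample values are made up for this example; they are not part of MySQL or of the sakila data.

```python
def selectivity(values):
    """Ratio of distinct values to total rows; lies between 1/T and 1 for T rows."""
    values = list(values)
    if not values:
        raise ValueError("selectivity is undefined for an empty column")
    return len(set(values)) / len(values)


# Hypothetical column contents, for illustration only.
cities = ["Paris", "Paris", "Parma", "Perth"]
print(selectivity(cities))            # 3 distinct values / 4 rows = 0.75
print(selectivity(["a", "a", "a"]))   # worst case, 1/T with T = 3
print(selectivity(["a", "b", "c"]))   # all values unique: 1.0
```

The last two calls show the two ends of the range: a column where every row holds the same value, and a column that behaves like a unique index.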
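In most cases a prefix of the column is selective enough to give good query performance. For BLOB and TEXT columns, and for VARCHAR columns too long to fit within the storage engine's maximum index key length, you have to use a prefix index, because MySQL does not allow indexing the full length of these columns. The trick is to choose a prefix that is long enough to give good selectivity but short enough to save space. The prefix should be long enough that the selectivity of the prefix index is close to that of an index on the whole column. In other words, the cardinality of the prefix should be close to the cardinality of the full column.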
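To decide on a suitable prefix length, find the list of the most common values and compare it with the list of the most common prefixes. The examples below use the sakila sample database provided by MySQL, which can be downloaded here: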
http://downloads.mysql.com/docs/sakila-db.zip
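The sakila sample database has no suitable example as it stands, so we generate a demo table from the city table to get enough data to work with:
1. Unzip the downloaded sakila-db.zip file.
2. Use the source command with sakila-schema.sql and sakila-data.sql to initialize the sakila database and its tables.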
mysql> select database();
+------------+
| database() |
+------------+
| sakila |
+------------+
1 row in set (0.00 sec)
mysql> create table city_demo (city varchar(50) not null);
mysql> insert into city_demo (city) select city from city; -- run this statement twice
Query OK, 600 rows affected (0.02 sec)
Records: 600 Duplicates: 0 Warnings: 0
mysql> update city_demo set city = ( select city from city order by rand() limit 1);
Query OK, 1198 rows affected (0.42 sec)
Rows matched: 1200 Changed: 1198 Warnings: 0
Note: because the statement above uses the rand() function, your results will probably differ from the ones shown here.
First, find the most common cities:
mysql> select count(*) as cnt,city from city_demo group by city order by cnt desc limit 10;
+-----+---------------+
| cnt | city |
+-----+---------------+
| 8 | Dongying |
| 7 | Omdurman |
| 6 | Etawah |
| 6 | Okara |
| 6 | Tsuyama |
| 6 | Brindisi |
| 6 | Kuwana |
| 6 | Grand Prairie |
| 5 | Fuyu |
| 5 | Siegen |
+-----+---------------+
10 rows in set (0.00 sec)
Now find the most frequently occurring city prefixes. Start with 3-letter prefixes, then try 4, 5, and 6:
mysql> select count(*) as cnt,left(city,3) as pref from city_demo group by pref order by cnt desc limit 10;
+-----+------+
| cnt | pref |
+-----+------+
| 23 | San |
| 15 | Hal |
| 14 | Cha |
| 14 | al- |
| 12 | Bat |
| 12 | Kor |
| 11 | Don |
| 11 | Shi |
| 10 | La |
| 10 | El |
+-----+------+
10 rows in set (0.00 sec)
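With a 3-character prefix, each prefix occurs far more often than any full city name does, so the result is very different from the full-column list. Increase the prefix to 4 characters, then 5 and 6: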
mysql> select count(*) as cnt,left(city,4) as pref from city_demo group by pref order by cnt desc limit 10;
+-----+------+
| cnt | pref |
+-----+------+
| 14 | San |
| 8 | Dong |
| 7 | Iwak |
| 7 | al-Q |
| 7 | Omdu |
| 6 | Kuwa |
| 6 | Tsuy |
| 6 | Brin |
| 6 | Etaw |
| 6 | Okar |
+-----+------+
10 rows in set (0.00 sec)
mysql> select count(*) as cnt,left(city,5) as pref from city_demo group by pref order by cnt desc limit 10;
+-----+-------+
| cnt | pref |
+-----+-------+
| 8 | Dongy |
| 7 | al-Qa |
| 7 | Omdur |
| 6 | Okara |
| 6 | Valle |
| 6 | Grand |
| 6 | Tsuya |
| 6 | Etawa |
| 6 | South |
| 6 | Kuwan |
+-----+-------+
10 rows in set (0.00 sec)
mysql> select count(*) as cnt,left(city,6) as pref from city_demo group by pref order by cnt desc limit 10;
+-----+--------+
| cnt | pref |
+-----+--------+
| 8 | Dongyi |
| 7 | Omdurm |
| 6 | Okara |
| 6 | Tsuyam |
| 6 | Valle |
| 6 | Grand |
| 6 | Etawah |
| 6 | Brindi |
| 6 | Kuwana |
| 5 | Haldia |
+-----+--------+
10 rows in set (0.01 sec)
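Varying the prefix length this way shows that at a prefix length of 6, the counts for the most common prefixes are close to the counts for the most common full values, so the selectivity of the prefix is close to that of the full column.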
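The same prefix comparison can be tried outside the database. The following is a rough Python sketch of the `left(city, N) ... group by ... order by cnt desc` queries, using `collections.Counter`. The helper name `top_prefixes` and the sample list are assumptions made for this example and are not the sakila data.

```python
from collections import Counter


def top_prefixes(values, length, n=10):
    """Return the n most common prefixes of the given length, with their counts."""
    return Counter(v[:length] for v in values).most_common(n)


# Hypothetical sample, not taken from sakila.
sample = ["San Juan", "San Jose", "Santiago", "Sanaa", "Okara", "Okara"]

print(top_prefixes(sample, 3, n=2))  # [('San', 4), ('Oka', 2)]
print(top_prefixes(sample, 5))       # the 'San' group splits into smaller groups
```

A short prefix lumps many different cities into one group, which is exactly what the 3-character query shows with "San". As the prefix grows, the groups shrink toward the counts of the full values.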
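There is also a more convenient approach: compute the selectivity of the full column, then pick a prefix whose selectivity comes close to it. Here is how to compute the selectivity of the full column: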
mysql> select count(distinct city)/count(*) from city_demo;
+-------------------------------+
| count(distinct city)/count(*) |
+-------------------------------+
| 0.4333 |
+-------------------------------+
1 row in set (0.00 sec)
You can compute the selectivity for several prefix lengths in a single query, which is especially useful on large tables. The following query does this:
mysql> select count(distinct left(city,3))/count(*) as sel3,count(distinct left(city,4))/count(*) as sel4,count(distinct left(city,5))/count(*) as sel5, count(distinct left(city,6))/count(*) as sel6 from city_demo;
+--------+--------+--------+--------+
| sel3 | sel4 | sel5 | sel6 |
+--------+--------+--------+--------+
| 0.3408 | 0.4100 | 0.4225 | 0.4300 |
+--------+--------+--------+--------+
1 row in set (0.00 sec)
With a prefix length of 6 the selectivity is 0.4300, which is already close to the full-column selectivity of 0.4333.
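The "pick the shortest prefix that gets close enough" decision can also be written down as a small procedure. This is only a sketch under stated assumptions: the function names, the sample data, and especially the `tolerance` threshold are hypothetical choices for illustration, and MySQL has no built-in equivalent.

```python
def prefix_selectivity(values, length):
    """Selectivity of the first `length` characters of each value."""
    return len({v[:length] for v in values}) / len(values)


def shortest_good_prefix(values, tolerance=0.01):
    """Smallest prefix length whose selectivity is within `tolerance`
    of the full-column selectivity. `tolerance` is an arbitrary,
    assumed threshold; choose one that suits your data."""
    full = len(set(values)) / len(values)
    longest = max(len(v) for v in values)
    for length in range(1, longest + 1):
        if full - prefix_selectivity(values, length) <= tolerance:
            return length
    return longest


# Hypothetical sample, not taken from sakila.
sample = ["San Juan", "San Jose", "Santiago", "Sanaa", "Okara", "Okara"]
print(shortest_good_prefix(sample))                 # 6
print(shortest_good_prefix(sample, tolerance=0.2))  # 4
```

A looser tolerance accepts a shorter, smaller index at the cost of more rows sharing each prefix. Average selectivity can also hide a few very common prefixes, so it is still worth looking at the most-common-prefix list as well.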
Now create a prefix index using the prefix length we found:
mysql> alter table city_demo add key (city(6));
Query OK, 0 rows affected (0.19 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> explain select * from city_demo where city like 'Jin%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: city_demo
partitions: NULL
type: range
possible_keys: city
key: city
key_len: 8
ref: NULL
rows: 4
filtered: 100.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
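The plan shows that the newly created index is used (key: city, type: range).
Advantage: a prefix index is an effective way to make an index smaller and faster.
Disadvantage: MySQL cannot use a prefix index for ORDER BY or GROUP BY, and it cannot use a prefix index as a covering index.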
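One way to picture the covering-index limitation is with a toy model. The Python sketch below is not how InnoDB stores an index; it is a deliberately simplified dictionary from prefix to row positions, with hypothetical names and data, meant only to show that a prefix entry does not contain the full column value.

```python
from collections import defaultdict


def build_prefix_index(rows, length):
    """Map each prefix to the positions of the rows whose value starts with it."""
    index = defaultdict(list)
    for pos, value in enumerate(rows):
        index[value[:length]].append(pos)
    return index


def lookup(rows, index, length, wanted):
    """Equality lookup: the index narrows the candidates, the row read confirms them."""
    candidates = index.get(wanted[:length], [])
    # The index holds only the prefix, so every candidate row has to be
    # read and compared against the full value.
    return [pos for pos in candidates if rows[pos] == wanted]


# Hypothetical rows, for illustration only.
rows = ["Grand Prairie", "Grand Rapids", "Okara", "Grand Prairie"]
idx = build_prefix_index(rows, 6)

print(idx["Grand "])                          # [0, 1, 3]
print(lookup(rows, idx, 6, "Grand Prairie"))  # [0, 3]
```

Because the two "Grand" cities share one index entry, the index alone can neither return the full city name nor order or group the rows by it, which is consistent with the limitations listed above.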
Source: 郭慕榮, 博客園 (cnblogs)