Oracle:集群因子
什么是集群因子(Clustering Factor)呢?集群因子是通過(guò)一個(gè)索引掃描一張表時(shí)需要訪問(wèn)的表的數(shù)據(jù)塊的數(shù)量。
集群因子的計(jì)算方法如下:
(1) 掃描一個(gè)索引;
(2) 比較某行的ROWID和前一行的ROWID,如果這兩個(gè)ROWID不屬于同一個(gè)數(shù)據(jù)塊,那么集群因子增加1;
(3) 整個(gè)索引掃描完畢后,就得到了該索引的集群因子。
以上面集群因子的計(jì)算方式可以看出,集群因子反映了索引范圍掃描可能帶來(lái)的對(duì)整個(gè)表訪問(wèn)過(guò)程的IO開(kāi)銷情況,如果集群因子接近于表存儲(chǔ)的塊數(shù),說(shuō)明這張表是按照索引字段的順序存儲(chǔ)的。如果集群因子接近于行的數(shù)量,那說(shuō)明這張表不是按索引字段順序存儲(chǔ)的。在計(jì)算索引訪問(wèn)成本時(shí),集群因子十分有用。集群因子乘以選擇性參數(shù)就是訪問(wèn)索引的開(kāi)銷。(是優(yōu)化器決定是否用索引的關(guān)鍵因素)
官方說(shuō)明:
?----row存儲(chǔ)的越有序,clustering factor的值越低
----當(dāng)clustering factor很高時(shí),說(shuō)明index entry (rowid) 是????隨機(jī)指向一些block的,在一個(gè)大的index range scan時(shí),這樣為了讀取這些rowid指向的block,就需要一次又一次重復(fù)的去讀這些block。
? ? ? ?----當(dāng)clustering factor值低時(shí),說(shuō)明index keys (rowid) 是指向的記錄是存儲(chǔ)在相同的block里,這樣去讀row時(shí),只需要在同一個(gè)block里讀取就可以了,這樣減少重復(fù)讀取blocks的次數(shù)
Index Clustering Factor說(shuō)明
??? 簡(jiǎn)單的說(shuō),Index Clustering Factor是通過(guò)一個(gè)索引掃描一張表,需要訪問(wèn)的表的數(shù)據(jù)塊的數(shù)量,即對(duì)I/O的影響,也代表索引鍵存儲(chǔ)位置是否有序。
????(1)、如果越有序,即相鄰的鍵值存儲(chǔ)在相同的block,那么這時(shí)候Clustering Factor的值就越低;
??? (2)、如果不是很有序,即鍵值是隨機(jī)的存儲(chǔ)在block上,這樣在讀取鍵值時(shí),可能就需要一次又一次的去訪問(wèn)相同的block,從而增加了I/O。
??? Clustering Factor的計(jì)算方式如下:
???? (1)、掃描一個(gè)索引(large index range scan);
???? (2)、比較某行的rowid和前一行的rowid,如果這兩個(gè)rowid不屬于同一個(gè)數(shù)據(jù)塊,那么cluster factor增加1;
???? (3)、整個(gè)索引掃描完畢后,就得到了該索引的clustering factor。
??????????? 如果clustering factor接近于表存儲(chǔ)的塊數(shù),說(shuō)明這張表是按照索引字段的順序存儲(chǔ)的。
??????????? 如果clustering factor接近于行的數(shù)量,那說(shuō)明這張表不是按索引字段順序存儲(chǔ)的。
??????????? 在計(jì)算索引訪問(wèn)成本的時(shí)候,這個(gè)值十分有用。Clustering Factor乘以選擇性參數(shù)(selectivity)就是訪問(wèn)索引的開(kāi)銷。
??????????? 如果這個(gè)統(tǒng)計(jì)數(shù)據(jù)不能真實(shí)反映出索引的真實(shí)情況,那么可能會(huì)造成優(yōu)化器錯(cuò)誤的選擇執(zhí)行計(jì)劃。另外如果某張表上的大多數(shù)訪問(wèn)是按照某個(gè)索引做索引掃描,那么將該表的數(shù)據(jù)按照索引字段的順序重新組織,可以提高該表的訪問(wèn)性能。
測(cè)試
????產(chǎn)生問(wèn)題
----查看一下數(shù)據(jù)庫(kù)的版本---- SQL>?select?*?from?v$version?where?rownum=1;BANNER -------------------------------------------------------------------------------- Oracle?Database?11g?Enterprise?Edition?Release?11.2.0.1.0?-?Production----創(chuàng)建一張測(cè)試表jack---- SQL>?create?table?jack?as?select?*?from?dba_objects?where?1=2;Table?created.----將數(shù)據(jù)無(wú)序的插入jack表中---- SQL>?begin2??????for?i?in?1..10?loop3????????insert?/*+?append?*/?into?jack?select?*?from?dba_objects?order?by?i;4??????commit;5????end?loop;6??end;7??/PL/SQL?procedure?successfully?completed.SQL>?select?count(*)?from?jack;COUNT(*) ----------725460----查看一下表的大小----- SQL>?set?wrap?off SQL>?col?owner?for?a10; SQL>?col?segment_name?for?a15; SQL>?select?segment_name,blocks,extents,bytes/1024/1024||'M'?"size"?from?user_segments?where?segment_name='JACK';SEGMENT_NAME?????BLOCKS????EXTENTS???size -------------?----------?----------?-------- JACK?????????????11264???????82??????88M----在object_id上創(chuàng)建索引---- SQL>?create?index?jack_ind?on?jack(object_id);Index?created.----查看一下索引的大小---- SQL>?select?segment_name,segment_type,blocks,extents,bytes/1024/1024||'M'?"size"?from?user_segments?where?segment_name='JACK_IND';SEGMENT_NAME????SEGMENT_TYPE??????BLOCKS?????EXTENTS?????size ------------?------------------?----------?----------?--------- JACK_IND???????????INDEX???????????1664?????????28????????13M----在沒(méi)有收集相關(guān)的統(tǒng)計(jì)信息之前,查看一下index?clustering?factor---- SQL>?select?index_name,clustering_factor,num_rows?from?user_indexes?where?index_name='JACK_IND';INDEX_NAME??????CLUSTERING_FACTOR???NUM_ROWS ---------------?-----------------?---------- JACK_IND??????????????725460?????????725460----簡(jiǎn)單的收集一下統(tǒng)計(jì)信息---- SQL>?exec?dbms_stats.gather_table_stats(user,'jack',cascade=>true);PL/SQL?procedure?successfully?completed.----再次查看index?clustering?factor---- SQL>?select?index_name,clustering_factor,num_rows?from?user_indexes?where?index_name='JACK_IND';INDEX_NAME?????CLUSTERING_FACTOR???NUM_ROWS --------------?-----------------?---------- JACK_IND??????????????725460???????725460???????----顯然統(tǒng)計(jì)信息收集前和后,clustering?factor值不變,說(shuō)在創(chuàng)建索引的時(shí)候,會(huì)收集表中的數(shù)據(jù)真正的行數(shù)。并且這里的clustering?factor等num_rows,也說(shuō)明表的clustering?factor是無(wú)序的。----查看一個(gè)確定值,然后查看執(zhí)行計(jì)劃---- SQL>?explain?plan?for?select?*?from?jack?where?object_id=1501;Explained.SQL>?select?*?from?table(dbms_xplan.display);PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- Plan?hash?value:?2860868395-------------------------------------------------------------------------------- |?Id??|?Operation????????????|?Name?????|?Rows??|?Bytes?|?Cost?(%CPU)|?Ti -------------------------------------------------------------------------------- |???0?|?SELECT?STATEMENT????????|???????????|????10?|???970?|????13?????(0)|?00 |???1?|??TABLE?ACCESS?BY?INDEX?ROWID|?JACK?????|????10?|???970?|????13?????(0)|?00 |*??2?|???INDEX?RANGE?SCAN????????|?JACK_IND?|????10?|???????|?????3?????(0)|?00 --------------------------------------------------------------------------------Predicate?Information?(identified?by?operation?id):PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------2?-?access("OBJECT_ID"=1501)14?rows?selected.?????????????????----在這里走了索引,cost為13.SQL>?alter?system?flush?buffer_cache;System?altered.SQL>?set?autotrace?traceonly;----查詢一個(gè)范圍的執(zhí)行計(jì)劃---- SQL>?select?*?from?jack?where?object_id>1000?and?object_id<2000;9880?rows?selected.Execution?Plan ---------------------------------------------------------- Plan?hash?value:?949574992-------------------------------------------------------------------------- |?Id??|?Operation??????|?Name?|?Rows??|?Bytes?|?Cost?(%CPU)|?Time?????| -------------------------------------------------------------------------- |???0?|?SELECT?STATEMENT??|?????|??9657?|???914K|??1824???(1)|?00:00:22?| |*??1?|??TABLE?ACCESS?FULL|?JACK?|??9657?|???914K|??1824???(1)|?00:00:22?| --------------------------------------------------------------------------Predicate?Information?(identified?by?operation?id): ---------------------------------------------------1?-?filter("OBJECT_ID"<2000?AND?"OBJECT_ID">1000)Statistics ----------------------------------------------------------0???recursive?calls0???db?block?gets10993?consistent?gets10340?physical?reads0???redo?size471945?bytes?sent?via?SQL*Net?to?client7657??bytes?received?via?SQL*Net?from?client660???SQL*Net?roundtrips?to/from?client0???sorts?(memory)0???sorts?(disk)9880??rows?processed???????----注意,object_id上是有索引的,但是這里并沒(méi)有使用索引,而是使用了全表掃描。SQL>?alter?system?flush?buffer_cache;System?altered.----強(qiáng)制走索引,查看執(zhí)行計(jì)劃---- SQL>?select?/*+?index(jack?jack_ind)?*/?*?from?jack?where?object_id>1000?and?object_id<2000;9880?rows?selected.Execution?Plan ---------------------------------------------------------- Plan?hash?value:?2860868395---------------------------------------------------------------------------------------- |?Id??|?Operation????????????|?Name?????|?Rows??|?Bytes?|?Cost?(%CPU)|?Time?????| ---------------------------------------------------------------------------------------- |???0?|?SELECT?STATEMENT????????|???????????|??9657?|???914K|??9683?????(1)|?00:01:57?| |???1?|??TABLE?ACCESS?BY?INDEX?ROWID|?JACK?????|??9657?|???914K|??9683?????(1)|?00:01:57?| |*??2?|???INDEX?RANGE?SCAN????????|?JACK_IND?|??9657?|???????|????24?????(0)|?00:00:01?| ----------------------------------------------------------------------------------------Predicate?Information?(identified?by?operation?id): ---------------------------------------------------2?-?access("OBJECT_ID">1000?AND?"OBJECT_ID"<2000)Statistics ----------------------------------------------------------0????recursive?calls0????db?block?gets10561??consistent?gets164????physical?reads0????redo?size988947??bytes?sent?via?SQL*Net?to?client7657???bytes?received?via?SQL*Net?from?client660????SQL*Net?roundtrips?to/from?client0????sorts?(memory)0????sorts?(disk)9880???rows?processed???? ----強(qiáng)制走索引之后,使用了index?range?scan,但是cost變成了9683,而全表掃描時(shí)是1824. ----還有比較一下兩次查詢中物理讀的情況:全表掃描的物理讀明顯比索引的要高很多,但是Oracle卻沒(méi)有使用索引。 ----因此Oracle認(rèn)為走索引的Cost比走全表掃描大,而是大N倍,CBO是基于Cost來(lái)決定執(zhí)行計(jì)劃的。 ----由此得出,對(duì)于索引的Cost,Oracle是根據(jù)clustering?factor參數(shù)來(lái)計(jì)算的,而該實(shí)驗(yàn)中的clustering?factor參數(shù)是很高的,數(shù)據(jù)存儲(chǔ)無(wú)序。這就造成了Oracle認(rèn)為走索引的cost比全表掃描的大。解決問(wèn)題:
| ----通過(guò)上面的分析,可以看出,要降低clustering?factor才能解決問(wèn)題,而要解決clustering?factor,就需要重新對(duì)表的存儲(chǔ)位置進(jìn)行排序。--------重建jakc表---- SQL>?create?table?echo?as?select?*?from?jack?where?1=0;Table?created.SQL>?insert?/*+?append?*/?into?echo?select?*?from?jack?order?by?object_id;725460?rows?created.SQL>?commit;Commit?complete.SQL>?truncate?table?jack;Table?truncated.SQL>?insert?/*+?append?*/?into?jack?select?*?from?echo;725460?rows?created.SQL>?commit;Commit?complete.----查看表和索引的信息---- SQL>?select?segment_name,blocks,extents,bytes/1024/1024||'M'?"size"?from?user_segments?where?segment_name='JACK';SEGMENT_NAME????BLOCKS????EXTENTS?????size -------------?----------?----------?----------- JACK?????????????11264???????82????????88MSQL>?select?segment_name,segment_type,blocks,extents,bytes/1024/1024||'M'?"size"?from?user_segments?where?segment_name='JACK_IND';SEGMENT_NAME????SEGMENT_TYPE??????BLOCKS?????EXTENTS????size ------------?------------------?----------?----------?------------- JACK_IND????????????INDEX??????????1536??????????27?????12MSQL>?select?index_name,clustering_factor,num_rows?from?user_indexes?where?index_name='JACK_IND';INDEX_NAME?????CLUSTERING_FACTOR?NUM_ROWS -------------?-----------------?---------- JACK_IND?????????????725460??????725460----對(duì)索引進(jìn)行rebuild---- SQL>?alter?index?jack_ind?rebuild;Index?altered.----查看cluster?factor---- SQL>?select?index_name,clustering_factor,num_rows?from?user_indexes?where?index_name='JACK_IND';INDEX_NAME??????CLUSTERING_FACTOR??NUM_ROWS ---------------?-----------------?---------- JACK_IND???????????????10327???????725460????------注意這里的Factor,已經(jīng)變成10327,我們收集一下表的統(tǒng)計(jì)信息,然后與表的block進(jìn)行一次比較。SQL>?exec?dbms_stats.gather_table_stats(user,'jack',cascade=>true);PL/SQL?procedure?successfully?completed.SQL>?select?blocks?from?dba_tables?where?table_name='JACK';BLOCKS ----------10474???----表jack實(shí)際使用的block是10474,clustering?factor是10327基本還是比較接近了,這也說(shuō)明相鄰的row是存儲(chǔ)在相同的block里。SQL>?select?index_name,clustering_factor,num_rows?from?user_indexes?where?index_name='JACK_IND';INDEX_NAME???????????????CLUSTERING_FACTOR???NUM_ROWS ------------------------------?-----------------?---------- JACK_IND???????????????????10327?????725460SQL>?alter?system?flush?buffer_cache;System?altered.SQL>?set?autotrace?traceonly;----再次查看之前sql的執(zhí)行計(jì)劃---- SQL>?select?*?from?jack?where?object_id>1000?and?object_id<2000;9880?rows?selected.Execution?Plan ---------------------------------------------------------- Plan?hash?value:?2860868395---------------------------------------------------------------------------------------- |?Id??|?Operation????????????|?Name?????|?Rows??|?Bytes?|?Cost?(%CPU)|?Time?????| ---------------------------------------------------------------------------------------- |???0?|?SELECT?STATEMENT????????|???????????|??9657?|???914K|???162?????(0)|?00:00:02?| |???1?|??TABLE?ACCESS?BY?INDEX?ROWID|?JACK?????|??9657?|???914K|???162?????(0)|?00:00:02?| |*??2?|???INDEX?RANGE?SCAN????????|?JACK_IND?|??9657?|???????|????24?????(0)|?00:00:01?| ----------------------------------------------------------------------------------------Predicate?Information?(identified?by?operation?id): ---------------------------------------------------2?-?access("OBJECT_ID">1000?AND?"OBJECT_ID"<2000)Statistics ----------------------------------------------------------1???recursive?calls0???db?block?gets1457??consistent?gets151???physical?reads0???redo?size988947??bytes?sent?via?SQL*Net?to?client7657??bytes?received?via?SQL*Net?from?client660???SQL*Net?roundtrips?to/from?client0???sorts?(memory)0???sorts?(disk)9880??rows?processed----注意這里的cost已經(jīng)降到了162,性能提升還是非常明顯。 |
小結(jié)
????通過(guò)以上說(shuō)明和測(cè)試,可以看到clustering factor也是索引健康的一個(gè)重要判斷的標(biāo)準(zhǔn)。其值越低越好。它會(huì)影響CBO選擇正確的執(zhí)行計(jì)劃。但是注意一點(diǎn),clustering factor總是趨勢(shì)與不斷惡化的。
轉(zhuǎn)載于:https://blog.51cto.com/newcourage/1942656
總結(jié)
以上是生活随笔為你收集整理的Oracle:集群因子的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: 吉利银河官宣发布:将推7款新品 L7二季
- 下一篇: Derby的jar说明