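Oracle Statistics Analysis and Dynamic Sampling
In an earlier post on the CBO in the Oracle Optimizer, I mentioned that when a table has not been analyzed, Oracle falls back on dynamic sampling to collect statistics. Accurate statistics on segment objects (tables, table partitions, indexes and so on) are the foundation of the CBO: it gathers as much object and system information as it can, then computes, analyzes and evaluates that information to arrive at the execution plan with the lowest cost. For the CBO, analyzing data segments is therefore essential.

Oracle Optimizer CBO RBO
http://blog.csdn.net/tianlesoftware/archive/2010/08/19/5824886.aspx

I. A first example: what analysis does

1.1 Create the table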
SQL> create table t as select object_id,object_name from dba_objects where 1=2;
Table created.
SQL> create index index_t on t(object_id);
Index created.
SQL> insert into t select object_id,object_name from dba_objects;
72926 rows created.
SQL> commit;
Commit complete.

1.2 Check the table's statistics and the execution plan
SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='T';
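  NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED
---------- ----------- ---------- --------------

SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='T';
    BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED
---------- ----------- ------------- --------------
         0           0             0 25-AUG-10

The query shows that the table's row count, row length, block count and last-analyzed time are all empty, so the table has not been analyzed. The index does show a LAST_ANALYZED time, but only because index statistics are computed when the index is created (the default from Oracle 10g on). The index was built while the table was still empty, so its statistics are all 0 and say nothing about the 72926 rows loaded afterwards. If a SQL statement queries the table now, the CBO has no accurate information to work with and may well produce a wrong execution plan, for example:
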
SQL> set linesize 200
SQL> set autot trace exp;
SQL> select /*+dynamic_sampling(t 0) */ * from t where object_id>30;
Execution Plan
----------------------------------------------------------
Plan hash value: 80339723

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     4 |   316 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T       |     4 |   316 |     0   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | INDEX_T |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("OBJECT_ID">30)
SQL>

From Oracle 10g on, if a table has not been analyzed the database samples it dynamically, so the query above uses a hint to set the dynamic sampling level to 0, which disables dynamic sampling.

In this plan the CBO estimates that only 4 rows satisfy the predicate, so it chooses the index. Let's analyze the table and compare the results.

1.3 Analyze the table and check the plan again
There are two ways to analyze:
One is the analyze command, for example:
analyze table tablename compute statistics for all indexes;
The other is the DBMS_STATS package. Since 9i, Oracle has recommended DBMS_STATS for analyzing tables, because it offers more functionality and more flexible operation.

SQL> exec dbms_stats.gather_table_stats('SYS','T');
PL/SQL procedure successfully completed.
SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='T';
    BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED
---------- ----------- ------------- --------------
         1         263         72926 25-AUG-10
SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='T';
  NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED
---------- ----------- ---------- --------------
     72926          29        345 25-AUG-10

The results show that DBMS_STATS.gather_table_stats has analyzed both the table and the index. Now let's look at the execution plan again.

SQL> set autot trace exp;
SQL> select * from t where object_id>30;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 72899 |  2064K|    96   (2)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| T    | 72899 |  2064K|    96   (2)| 00:00:02 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("OBJECT_ID">30)

This time the CBO estimates 72899 rows, very close to the actual 72926, and for a predicate that matches almost every row a full table scan is the better choice. The example shows how much the execution plan depends on analysis.


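II. Histograms
DBMS_STATS analyzes a table at three levels:
(1) The table itself: row count, number of data blocks, row length and so on.
(2) Columns: how often values repeat, the nulls in the column, and how the data is distributed in the column.
(3) Indexes: number of leaf blocks, index depth, clustering factor and so on.

A histogram is the "data distribution in the column" part of column analysis.

When Oracle builds a histogram, it divides the data in the column into a number of parts that each hold the same number of rows; each part is called a bucket. This lets the CBO see easily how the values in the column are distributed, and that distribution becomes an important input to the cost calculation for the execution plan.

Histograms are most useful for columns whose data is heavily skewed. Take the values 1, 10, 20, 30, 40, 50: every value occurs once, so any value range of the same width holds roughly the same number of rows. Compare 1, 5, 5, 5, 5, 10, 10, 20, 50, 100: four of the ten rows are 5 and seven of the ten fall between 1 and 10, so the data is heavily skewed.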
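
One way to judge whether a column is skewed before deciding on a histogram is to count how often each value occurs. The query below is only a sketch: ORDERS and STATUS are hypothetical names that do not appear in this article.

```sql
-- ORDERS and STATUS are hypothetical names used only for illustration.
-- Count the rows for each distinct value and show each value's share of the table.
select status,
       count(*) as cnt,
       round(100 * ratio_to_report(count(*)) over (), 2) as pct
from   orders
group  by status
order  by cnt desc;
```

If a few values account for most of the rows, the column is a candidate for a histogram; if the percentages are all similar, a histogram is unlikely to add much.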

Histograms can matter a great deal to the CBO, especially for tables with heavily skewed columns. They are gathered with the dbms_stats package. With the default method_opt (FOR ALL COLUMNS SIZE AUTO), dbms_stats considers every column and decides for itself which ones get a histogram. For example:
SQL> exec dbms_stats.gather_table_stats('SYS','T',cascade=>true);
PL/SQL procedure successfully completed.

The related information can then be seen in the user_histograms view:

SQL> select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name='T';
TABLE_NAME                     COLUMN_NAME          ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ -------------------- --------------- --------------
T                              OBJECT_ID                          0              2
T                              OBJECT_NAME                        0     2.4504E+35
T                              OBJECT_ID                          1          76685
T                              OBJECT_NAME                        1     1.0886E+36

Each column here has only two endpoints (0 and 1), which hold just its lowest and highest value: Oracle did not build a multi-bucket histogram for either column.

If the data in a column is seriously skewed, a histogram on that column is necessary. But analysis consumes resources, and for very large segments it takes especially long; on an OLAP system it may take hours to complete.
So whether to analyze is a trade-off the DBA has to weigh. One caution: do not casually change the analysis scheme in production unless you are completely sure of what you are doing, or the consequences can be severe.
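
To request a histogram on one specific column rather than leaving the choice to Oracle, the size clause can be given explicitly. This is a sketch that reuses the article's table T and assumes OBJECT_ID is the column of interest; 254 is the bucket limit in 10g and 11g.

```sql
-- Request a histogram with up to 254 buckets on OBJECT_ID only.
exec dbms_stats.gather_table_stats('SYS','T',method_opt=>'FOR COLUMNS OBJECT_ID SIZE 254');

-- NUM_BUCKETS and HISTOGRAM show what was actually built for each column.
select column_name, num_distinct, num_nulls, num_buckets, histogram
from   user_tab_col_statistics
where  table_name = 'T';
```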


III. The DBMS_STATS package
DBMS_STATS can not only analyze tables, it can also manage the database's statistics. By function it falls into these groups:
(1) Gathering statistics
(2) Setting statistics
(3) Deleting statistics
(4) Backing up and restoring statistics

For more information see the Oracle online documentation:
11g DBMS_STATS
http://download.oracle.com/docs/cd/E11882_01/appdev.112/e10577/d_stats.htm#ARPLS68486

10g DBMS_STATS
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461


3.1 Common DBMS_STATS functions: gathering, setting and deleting statistics
The procedures that gather statistics are:
GATHER_DATABASE_STATS Procedures
GATHER_DICTIONARY_STATS Procedure
GATHER_FIXED_OBJECTS_STATS Procedure
GATHER_INDEX_STATS Procedure
GATHER_SCHEMA_STATS Procedures
GATHER_SYSTEM_STATS Procedure
GATHER_TABLE_STATS Procedure

As the names suggest, these procedures gather statistics for objects at different levels: the database, the data dictionary, tables, indexes, a schema and so on.

3.1.1 The GATHER_TABLE_STATS procedure

In 10g, GATHER_TABLE_STATS takes these parameters:
DBMS_STATS.GATHER_TABLE_STATS (
   ownname          VARCHAR2,
   tabname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT to_estimate_percent_type
                                                (get_param('ESTIMATE_PERCENT')),
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),
   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),
   granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),
   cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL,
   no_invalidate    BOOLEAN  DEFAULT  to_no_invalidate_type (
                                     get_param('NO_INVALIDATE')),
   force            BOOLEAN DEFAULT FALSE);

The 11g documentation lists the procedure with the same parameters (what changed in 11g is how the defaults are managed: the SET_*_PREFS procedures take over from SET_PARAM):
DBMS_STATS.GATHER_TABLE_STATS (
   ownname          VARCHAR2,
   tabname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT to_estimate_percent_type
                                                (get_param('ESTIMATE_PERCENT')),
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),
   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),
   granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),
   cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL,
   no_invalidate    BOOLEAN  DEFAULT  to_no_invalidate_type (
                                     get_param('NO_INVALIDATE')),
   force            BOOLEAN DEFAULT FALSE);

Parameter descriptions:
Parameter | Description |
ownname | Schema of table to analyze |
tabname | Name of table |
partname | Name of partition |
estimate_percent | Percentage of rows to estimate (NULL means compute). The valid range is [0.000001,100]. Use the constant DBMS_STATS.AUTO_SAMPLE_SIZE to have Oracle determine the appropriate sample size for good statistics. This is the default. The default value can be changed using the SET_PARAM Procedure. |
block_sample | Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics. |
method_opt | Accepts: FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]; FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]. size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}. The default is FOR ALL COLUMNS SIZE AUTO. The default value can be changed using the SET_PARAM Procedure. |
degree | Degree of parallelism. The default for degree is NULL. The default value can be changed using the SET_PARAM Procedure. NULL means use the table default value specified by the DEGREE clause in the CREATE TABLE or ALTER TABLE statement. Use the constant DBMS_STATS.DEFAULT_DEGREE to specify the default value based on the initialization parameters. The AUTO_DEGREE value determines the degree of parallelism automatically. This is either 1 (serial execution) or DEFAULT_DEGREE (the system default value based on number of CPUs and initialization parameters) according to size of the object. |
granularity | Granularity of statistics to collect (only pertinent if the table is partitioned). 'ALL' - gathers all (subpartition, partition, and global) statistics 'AUTO'- determines the granularity based on the partitioning type. This is the default value. 'DEFAULT' - gathers global and partition-level statistics. This option is obsolete, and while currently supported, it is included in the documentation for legacy reasons only. You should use the 'GLOBAL AND PARTITION' for this functionality. Note that the default value is now 'AUTO'. 'GLOBAL' - gathers global statistics 'GLOBAL AND PARTITION' - gathers the global and partition level statistics. No subpartition level statistics are gathered even if it is a composite partitioned object. 'PARTITION '- gathers partition-level statistics 'SUBPARTITION' - gathers subpartition-level statistics. |
cascade | Gather statistics on the indexes for this table. Index statistics gathering is not parallelized. Using this option is equivalent to running the GATHER_INDEX_STATS Procedure on each of the table's indexes. Use the constant DBMS_STATS.AUTO_CASCADE to have Oracle determine whether index statistics to be collected or not. This is the default. The default value can be changed using the SET_PARAM Procedure. |
stattab | User statistics table identifier describing where to save the current statistics |
statid | Identifier (optional) to associate with these statistics within stattab |
statown | Schema containing stattab (if different than ownname) |
no_invalidate | Does not invalidate the dependent cursors if set to TRUE. The procedure invalidates the dependent cursors immediately if set to FALSE. Use DBMS_STATS.AUTO_INVALIDATE to have Oracle decide when to invalidate dependent cursors. This is the default. The default can be changed using the SET_PARAM Procedure. |
force | Gather statistics of table even if it is locked |


All gather_table_stats parameters other than ownname and tabname have default values, so when the procedure is called with just those two, Oracle analyzes the table using the defaults. For example:
SQL> exec dbms_stats.gather_table_STATS('SYS','T');
PL/SQL procedure successfully completed.

To see a current default, use the dbms_stats.get_param function:

SQL> select dbms_stats.get_param('method_opt') from dual;

DBMS_STATS.GET_PARAM('METHOD_OPT')
------------------------------------------------------------
FOR ALL COLUMNS SIZE AUTO

Reading this against the parameter description above:
     - AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
the default considers every column, and Oracle itself decides which columns get a histogram and how many buckets each one has.
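
The same function can show the other defaults in one query. This sketch assumes a 10g-style environment where get_param is the way to read them, and uses the parameter names that appear in the GATHER_TABLE_STATS signature above; the column aliases are made up for readability.

```sql
-- Show the current defaults that the gather procedures pick up.
select dbms_stats.get_param('ESTIMATE_PERCENT') as dflt_estimate_percent,
       dbms_stats.get_param('METHOD_OPT')       as dflt_method_opt,
       dbms_stats.get_param('DEGREE')           as dflt_degree,
       dbms_stats.get_param('GRANULARITY')      as dflt_granularity,
       dbms_stats.get_param('CASCADE')          as dflt_cascade,
       dbms_stats.get_param('NO_INVALIDATE')    as dflt_no_invalidate
from   dual;
```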
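
3.1.1.1 The estimate_percent parameter
This parameter is a percentage: it tells the package what proportion of the table's data to use for the analysis.

In theory, the more data is sampled, the closer the statistics are to reality and the better the plans the CBO produces. But the more is sampled, the more system resources are consumed and the greater the impact on the system, so the value should be set according to the workload. If the data is fairly evenly distributed, the default AUTO_SAMPLE_SIZE can be used, which lets Oracle decide the sampling ratio. Sometimes, especially for bulk-loaded tables, we can estimate the amount of data and set a reasonable value by hand. For a partition with 10 million rows the value can be small, but remember that it is a percentage: the minimum the parameter accepts, 0.000001, would sample about 0.1 row out of 10 million, so even a very large partition needs a value well above the minimum.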
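
The two approaches can be compared on the article's table T. This is a sketch only; the 10 percent figure is an arbitrary illustrative value, not a recommendation.

```sql
-- Let Oracle choose the sample size (the documented default).
exec dbms_stats.gather_table_stats('SYS','T',estimate_percent=>dbms_stats.auto_sample_size);

-- Alternatively sample a fixed 10 percent of the rows; 10 is an arbitrary illustrative value.
exec dbms_stats.gather_table_stats('SYS','T',estimate_percent=>10);

-- SAMPLE_SIZE shows how many rows the last gather used.
select num_rows, sample_size, last_analyzed from user_tables where table_name='T';
```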

3.1.1.2 The method_opt parameter
This parameter controls which columns are analyzed and how their histograms are gathered.
FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]

It offers four ways to specify which columns to analyze:
(1) All columns: for all columns
(2) Indexed columns: analyze only columns that have an index, for all indexed columns
(3) Hidden columns: analyze only hidden columns, for all hidden columns
(4) Explicitly named columns: name the columns to analyze, for columns column_name

The default for this parameter is: for all columns size auto.
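
A few method_opt strings built from the grammar above, again as a sketch against the article's table T; the bucket counts are illustrative choices.

```sql
-- Basic column statistics only, no histograms (SIZE 1 means a single bucket).
exec dbms_stats.gather_table_stats('SYS','T',method_opt=>'FOR ALL COLUMNS SIZE 1');

-- Column statistics for indexed columns only, with histograms where the data is skewed.
exec dbms_stats.gather_table_stats('SYS','T',method_opt=>'FOR ALL INDEXED COLUMNS SIZE SKEWONLY');

-- Named columns, each with its own size clause.
exec dbms_stats.gather_table_stats('SYS','T',method_opt=>'FOR COLUMNS OBJECT_ID SIZE AUTO, OBJECT_NAME SIZE 1');
```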
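
3.1.1.3 The degree parameter
Specifies the degree of parallelism used for the analysis. The settings are:
(1) NULL: Oracle uses the degree of parallelism set as an attribute of the table being analyzed, either the one given when the table was created or one set later with alter table.
(2) A number: explicitly specifies the degree of parallelism for the analysis.
(3) DEFAULT_DEGREE: Oracle decides the degree of parallelism from the relevant initialization parameters.

The default is NULL, meaning the table's own parallel attribute decides the degree of parallelism used for the analysis. When the table or partition to be analyzed is very large and system resources are plentiful, a parallel analysis can speed things up considerably. If system resources are tight, on the other hand, enabling parallelism may backfire.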
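
The settings above might be used like this on the article's table T. It is a sketch; the degree of 4 is an arbitrary illustrative value.

```sql
-- Use a degree of parallelism of 4; 4 is an arbitrary illustrative value.
exec dbms_stats.gather_table_stats('SYS','T',degree=>4);

-- Let Oracle pick between serial and the system default based on the object's size.
exec dbms_stats.gather_table_stats('SYS','T',degree=>dbms_stats.auto_degree);

-- The table-level setting that degree=>NULL falls back on.
select table_name, degree from user_tables where table_name='T';
```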
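
3.1.1.4 The granularity parameter
The granularity of the analysis. The options are:
(1) ALL: analyzes the table at the global, partition and subpartition levels.
(2) AUTO: Oracle decides which granularity to use based on the partitioning type.
(3) GLOBAL: analyzes at the global level only.
(4) GLOBAL AND PARTITION: analyzes at the global and partition levels but not subpartitions; this is what distinguishes it from ALL.
(5) PARTITION: analyzes at the partition level only.
(6) SUBPARTITION: analyzes at the subpartition level only.

In production, especially in OLAP or data warehouse environments, this setting directly affects the plans the CBO chooses.

In OLAP or data warehouse systems it is common to create a new partition, load a batch of data into it (usually a very large batch), analyze the partition, and then run reports or data mining. Ideally both the global level and the partition level would be analyzed, since that gives the most complete statistics. But such tables are usually huge, and a global analysis every time a partition is added would consume enormous resources. On the other hand, if only the new partition is analyzed and no global analysis is done, Oracle's global-level information becomes inaccurate.

By default, DBMS_STATS analyzes both the table level (global) and the partition level (what the PARTITION option covers). With cascade set to true, the corresponding indexes are also analyzed at the global and partition levels. If only the partition level is analyzed and the global level is not, the global statistics are not updated, and the CBO can still produce a wrong plan.

So how to analyze newly inserted data is an important question. Some general guidelines:
(1) Look at how large a share of the whole table the newly inserted data is. If the share is small, consider skipping the global analysis; otherwise it needs to be considered, based on how the business actually runs.
(2) Sampling ratio. If the loaded data is very large, say tens of millions of rows or more, keep the sampling ratio as small as possible. The lower bound is that it must not stop the CBO from producing correct plans; the upper bound is that it must not consume so many resources that normal business operation suffers.
(3) Newly loaded data should get partition-level statistics. Whether histograms are needed, and how many buckets to use (the size clause), is for the DBA to decide from the data distribution; the key factor is how skewed the data is.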
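
Guideline (3) could look like the following for a freshly loaded partition. The table SALES and the partition P_201008 are hypothetical names, not objects from this article.

```sql
-- SALES and P_201008 are hypothetical names for a partitioned table and its newly loaded partition.
begin
  dbms_stats.gather_table_stats(
    ownname     => user,
    tabname     => 'SALES',
    partname    => 'P_201008',
    granularity => 'PARTITION',   -- partition-level statistics only, no global pass
    cascade     => true);         -- also gather the matching index statistics
end;
/

-- Check which partitions have statistics and when they were gathered.
select partition_name, num_rows, last_analyzed, global_stats
from   user_tab_partitions
where  table_name = 'SALES';
```

Because this skips the global level, the table-level statistics stay as they were, which is exactly the trade-off described above.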


3.1.2 The GATHER_SCHEMA_STATS procedure
This procedure analyzes all objects owned by a user. If a schema holds a great many objects, setting up analysis for each one individually is very inconvenient, and this procedure makes it easy. Its advantage is that it greatly reduces the DBA's workload when many objects need analyzing; its drawback is that every object is analyzed with the same strategy, which may not be optimal for all of them. Decide according to the actual situation.

Its parameters are:
DBMS_STATS.GATHER_SCHEMA_STATS (
   ownname          VARCHAR2,
   estimate_percent NUMBER   DEFAULT to_estimate_percent_type
                                                (get_param('ESTIMATE_PERCENT')),
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),
   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),
   granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),
   cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   options          VARCHAR2 DEFAULT 'GATHER',
   objlist          OUT      ObjectTab,
   statown          VARCHAR2 DEFAULT NULL,
   no_invalidate    BOOLEAN  DEFAULT to_no_invalidate_type (
                                     get_param('NO_INVALIDATE')),
   force            BOOLEAN DEFAULT FALSE,
   obj_filter_list  ObjectTab DEFAULT NULL);

Parameter descriptions:
Parameter | Description |
ownname | Schema to analyze (NULL means current schema) |
estimate_percent | Percentage of rows to estimate (NULL means compute). The valid range is [0.000001,100]. Use the constant DBMS_STATS.AUTO_SAMPLE_SIZE to have Oracle determine the appropriate sample size for good statistics. This is the default. The default value can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. |
block_sample | Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics. |
method_opt | Accepts: FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]. size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}. The default is FOR ALL COLUMNS SIZE AUTO. The default value can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. |
degree | Degree of parallelism. The default for degree is NULL. The default value can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. NULL means use the table default value specified by the DEGREE clause in the CREATE TABLE or ALTER TABLE statement. Use the constant DBMS_STATS.DEFAULT_DEGREE to specify the default value based on the initialization parameters. The AUTO_DEGREE value determines the degree of parallelism automatically. This is either 1 (serial execution) or DEFAULT_DEGREE (the system default value based on number of CPUs and initialization parameters) according to size of the object. |
granularity | Granularity of statistics to collect (only pertinent if the table is partitioned). 'ALL' - gathers all (subpartition, partition, and global) statistics 'AUTO'- determines the granularity based on the partitioning type. This is the default value. 'DEFAULT' - gathers global and partition-level statistics. This option is obsolete, and while currently supported, it is included in the documentation for legacy reasons only. You should use the 'GLOBAL AND PARTITION' for this functionality. Note that the default value is now 'AUTO'. 'GLOBAL' - gathers global statistics 'GLOBAL AND PARTITION' - gathers the global and partition level statistics. No subpartition level statistics are gathered even if it is a composite partitioned object. 'PARTITION '- gathers partition-level statistics 'SUBPARTITION' - gathers subpartition-level statistics. |
cascade | Gather statistics on the indexes as well. Using this option is equivalent to running the GATHER_INDEX_STATS Procedure on each of the indexes in the schema in addition to gathering table and column statistics. Use the constant DBMS_STATS.AUTO_CASCADE to have Oracle determine whether index statistics to be collected or not. This is the default. The default value can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. |
stattab | User statistics table identifier describing where to save the current statistics |
statid | Identifier (optional) to associate with these statistics within stattab |
options | Further specification of which objects to gather statistics for: GATHER: Gathers statistics on all objects in the schema. GATHER AUTO: Gathers all necessary statistics automatically. Oracle implicitly determines which objects need new statistics, and determines how to gather those statistics. When GATHER AUTO is specified, the only additional valid parameters are ownname, stattab, statid, objlist and statown; all other parameter settings are ignored. Returns a list of processed objects. GATHER STALE: Gathers statistics on stale objects as determined by looking at the *_tab_modifications views. Also, return a list of objects found to be stale. GATHER EMPTY: Gathers statistics on objects which currently have no statistics. also, return a list of objects found to have no statistics. LIST AUTO: Returns a list of objects to be processed with GATHER AUTO. LIST STALE: Returns list of stale objects as determined by looking at the *_tab_modifications views. LIST EMPTY: Returns list of objects which currently have no statistics. |
objlist | List of objects found to be stale or empty |
statown | Schema containing stattab (if different than ownname) |
no_invalidate | Does not invalidate the dependent cursors if set to TRUE. The procedure invalidates the dependent cursors immediately if set to FALSE. Use DBMS_STATS.AUTO_INVALIDATE to have Oracle decide when to invalidate dependent cursors. This is the default. The default can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. |
force | Gather statistics on objects even if they are locked |
obj_filter_list | A list of object filters. When provided, GATHER_SCHEMA_STATS will gather statistics only on objects which satisfy at least one object filter in the list as needed. In a single object filter, we can specify the constraints on the object attributes. The attribute values specified in the object filter are case-insensitive unless double-quoted. Wildcard is allowed in the attribute values. Suppose non-NULL values s1, s2, ... are specified for attributes a1, a2, ... in one object filter. An object o is said to satisfy this object filter if (o.a1 like s1) and (o.a2 like s2) and ... is true. See Applying an Object Filter List. |
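
The options and objlist parameters described above can be combined to report stale objects first and then gather only for them. This is a sketch; the schema name SCOTT is a placeholder, and the variable name l_objs is made up for the example.

```sql
set serveroutput on
declare
  l_objs dbms_stats.objecttab;   -- filled by the objlist OUT parameter
begin
  -- LIST STALE only reports the stale objects; nothing is gathered.
  dbms_stats.gather_schema_stats(ownname => 'SCOTT', options => 'LIST STALE', objlist => l_objs);
  for i in 1 .. l_objs.count loop
    dbms_output.put_line(l_objs(i).objtype || ' ' || l_objs(i).objname);
  end loop;

  -- Gather statistics only for the objects considered stale.
  dbms_stats.gather_schema_stats(ownname => 'SCOTT', options => 'GATHER STALE', cascade => true);
end;
/
```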


3.1.3 The DBMS_STATS.GATHER_INDEX_STATS procedure
This procedure analyzes indexes. If cascade=>true is set when analyzing with DBMS_STATS.GATHER_TABLE_STATS, Oracle also runs this procedure to analyze the table's indexes.

The procedure's parameters:
DBMS_STATS.GATHER_INDEX_STATS (
   ownname          VARCHAR2,
   indname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT to_estimate_percent_type
                                                (GET_PARAM('ESTIMATE_PERCENT')),
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL,
   degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),
   granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),
   no_invalidate    BOOLEAN  DEFAULT to_no_invalidate_type
                                               (GET_PARAM('NO_INVALIDATE')),
   force            BOOLEAN DEFAULT FALSE);

Parameter | Description |
ownname | Schema of index to analyze |
indname | Name of index |
partname | Name of partition |
estimate_percent | Percentage of rows to estimate (NULL means compute). The valid range is [0.000001,100]. Use the constant DBMS_STATS.AUTO_SAMPLE_SIZE to have Oracle determine the appropriate sample size for good statistics. This is the default. The default value can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. |
stattab | User statistics table identifier describing where to save the current statistics |
statid | Identifier (optional) to associate with these statistics within stattab |
statown | Schema containing stattab (if different than ownname) |
degree | Degree of parallelism. The default for degree is NULL. The default value can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. NULL means use of table default value that was specified by the DEGREE clause in the CREATE/ALTER INDEX statement. Use the constant DBMS_STATS.DEFAULT_DEGREE for the default value based on the initialization parameters. The AUTO_DEGREE value determines the degree of parallelism automatically. This is either 1 (serial execution) or DEFAULT_DEGREE (the system default value based on number of CPUs and initialization parameters) according to size of the object. |
granularity | Granularity of statistics to collect (only pertinent if the table is partitioned). 'ALL' - gathers all (subpartition, partition, and global) statistics 'AUTO'- determines the granularity based on the partitioning type. This is the default value. 'DEFAULT' - gathers global and partition-level statistics. This option is obsolete, and while currently supported, it is included in the documentation for legacy reasons only. You should use the 'GLOBAL AND PARTITION' for this functionality. Note that the default value is now 'AUTO'. 'GLOBAL' - gathers global statistics 'GLOBAL AND PARTITION' - gathers the global and partition level statistics. No subpartition level statistics are gathered even if it is a composite partitioned object. 'PARTITION '- gathers partition-level statistics 'SUBPARTITION' - gathers subpartition-level statistics. |
no_invalidate | Does not invalidate the dependent cursors if set to TRUE. The procedure invalidates the dependent cursors immediately if set to FALSE. Use DBMS_STATS.AUTO_INVALIDATE to have Oracle decide when to invalidate dependent cursors. This is the default. The default can be changed using the SET_DATABASE_PREFS Procedure, SET_GLOBAL_PREFS Procedure, SET_SCHEMA_PREFS Procedure and SET_TABLE_PREFS Procedure. |
force | Gather statistics on object even if it is locked |
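
Called directly on the index from section 1, the procedure might be used as follows. This is a sketch that assumes INDEX_T and T still exist as created earlier.

```sql
-- Gather statistics for one index only, using the default parameters.
exec dbms_stats.gather_index_stats('SYS','INDEX_T');

-- Index depth, leaf blocks, distinct keys and clustering factor after the gather.
select index_name, blevel, leaf_blocks, distinct_keys, clustering_factor, last_analyzed
from   user_indexes
where  table_name = 'T';
```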


The three procedures discussed above are the most commonly used. Analysis matters a great deal to the CBO; without an analysis scheme that fits your own system, problems like these can follow:
(1) Insufficient statistics lead the CBO to wrong execution plans and slow SQL.
(2) Excessive analysis work causes a serious drop in system performance.



3.2 DBMS_STATS management functions
3.2.1 Getting statistics
GET_COLUMN_STATS Procedures
GET_INDEX_STATS Procedures
GET_SYSTEM_STATS Procedure
GET_TABLE_STATS Procedure
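
These four procedures retrieve column, index, system and table statistics. To use them, first declare variables to hold the statistics, then call the procedure to assign values to those variables, and finally print the variables. For example:
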
SQL> set serveroutput on
SQL> declare
       dist number;               -- number of distinct values
       dens number;               -- density
       ncnt number;               -- number of nulls
       orec dbms_stats.statrec;   -- record holding min/max and histogram data
       avgc number;               -- average column length
     begin
       dbms_stats.get_column_stats('SYS','T','object_ID',distcnt=>dist,density=>dens,nullcnt=>ncnt,srec=>orec,avgclen=>avgc);
       dbms_output.put_line('the distcnt is:' ||to_char(dist));
       dbms_output.put_line('the density is:' ||to_char(dens));
       dbms_output.put_line('the nullcnt is:' ||to_char(ncnt));
       -- orec is a record and cannot be passed to to_char, so this line prints nullcnt again
       dbms_output.put_line('the srec is:' ||to_char(ncnt));
       dbms_output.put_line('the avgclen is:' ||to_char(avgc));
     end;
     /
the distcnt is:72926
the density is:.0000137125305103804
the nullcnt is:0
the srec is:0
the avgclen is:5

PL/SQL procedure successfully completed.

For more information see:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461

3.2.2 Setting statistics
SET_COLUMN_STATS Procedures
SET_INDEX_STATS Procedures
SET_SYSTEM_STATS Procedure
SET_TABLE_STATS Procedure

These procedures let us set column, index, system and table statistics by hand. One use is when inaccurate statistics lead to a bad execution plan: the values can then be set manually. In extreme cases this is a workable way to solve the problem.
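
For a table, setting the values by hand might look like this. It is a sketch that reuses the figures gathered for T earlier in this article (72926 rows, 345 blocks, average row length 29).

```sql
-- Tell the optimizer that T has 72926 rows in 345 blocks with an average row length of 29.
exec dbms_stats.set_table_stats('SYS','T',numrows=>72926,numblks=>345,avgrlen=>29);

-- USER_STATS = 'YES' marks statistics that were entered by hand rather than gathered.
select num_rows, blocks, avg_row_len, user_stats from user_tables where table_name='T';
```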

For the details of these four procedures see the Oracle online documentation:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461


3.2.3 Deleting statistics
DELETE_COLUMN_STATS Procedure
DELETE_DATABASE_STATS Procedure
DELETE_DICTIONARY_STATS Procedure
DELETE_FIXED_OBJECTS_STATS Procedure
DELETE_INDEX_STATS Procedure
DELETE_SCHEMA_STATS Procedure
DELETE_SYSTEM_STATS Procedure
DELETE_TABLE_STATS Procedure
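
When abnormal statistics cause the CBO to misjudge, deleting the statistics is one way to correct the problem immediately: for example, delete the table's statistics so that the CBO samples the table dynamically again and gets a correct result.
These procedures can delete statistics at the column, database, data dictionary, fixed object, index, schema, system and table levels.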
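
Deleting the statistics of the article's table T could look like this (a sketch; with the procedure's default arguments the column and index statistics are removed along with the table statistics).

```sql
-- Remove the statistics on T so that the optimizer falls back on dynamic sampling.
exec dbms_stats.delete_table_stats('SYS','T');

-- NUM_ROWS, BLOCKS and LAST_ANALYZED should now be empty again.
select num_rows, blocks, last_analyzed from user_tables where table_name='T';
```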

For details see the Oracle online documentation:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461

3.2.4 Saving statistics
CREATE_STAT_TABLE Procedure
DROP_STAT_TABLE Procedure

These two procedures create a table for storing statistics, which makes the statistics easier to manage, and drop that table again.

For details see the Oracle online documentation:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461


3.2.5 Exporting and importing statistics
EXPORT_COLUMN_STATS Procedure
EXPORT_DATABASE_STATS Procedure
EXPORT_DICTIONARY_STATS Procedure
EXPORT_FIXED_OBJECTS_STATS Procedure
EXPORT_INDEX_STATS Procedure
EXPORT_SCHEMA_STATS Procedure
EXPORT_SYSTEM_STATS Procedure
EXPORT_TABLE_STATS Procedure
IMPORT_COLUMN_STATS Procedure
IMPORT_DATABASE_STATS Procedure
IMPORT_DICTIONARY_STATS Procedure
IMPORT_FIXED_OBJECTS_STATS Procedure
IMPORT_INDEX_STATS Procedure
IMPORT_SCHEMA_STATS Procedure
IMPORT_SYSTEM_STATS Procedure
IMPORT_TABLE_STATS Procedure

These procedures export existing statistics into a table the user has created and, when needed, import them back from that table.
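
Together with CREATE_STAT_TABLE, a backup-and-restore round trip for one table might look like this. It is a sketch: STATS_BACKUP and BEFORE_REGATHER are hypothetical names chosen for the example, and the owner 'SYS' simply follows the article's other examples.

```sql
-- STATS_BACKUP and BEFORE_REGATHER are hypothetical names chosen for this sketch.
exec dbms_stats.create_stat_table('SYS','STATS_BACKUP');

-- Save the current statistics of T (and, with cascade, its columns and indexes).
exec dbms_stats.export_table_stats('SYS','T',stattab=>'STATS_BACKUP',statid=>'BEFORE_REGATHER',cascade=>true);

-- ... regather or change the statistics here ...

-- Put the saved statistics back into the data dictionary.
exec dbms_stats.import_table_stats('SYS','T',stattab=>'STATS_BACKUP',statid=>'BEFORE_REGATHER',cascade=>true);

-- Drop the statistics table once it is no longer needed.
exec dbms_stats.drop_stat_table('SYS','STATS_BACKUP');
```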

For details see the Oracle online documentation:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461


3.2.6 Locking statistics
LOCK_SCHEMA_STATS Procedure
LOCK_TABLE_STATS Procedure
UNLOCK_SCHEMA_STATS Procedure
UNLOCK_TABLE_STATS Procedure
The LOCK_* procedures either freeze the current set of statistics or keep the statistics empty (uncollected). When statistics on a table are locked, all the statistics depending on the table, including table statistics, column statistics, histograms and statistics on all dependent indexes, are considered to be locked.
Sometimes the current statistics are very good, the execution plans are accurate, and the data in the table hardly changes. In that case LOCK_TABLE_STATS can be used to lock the table's statistics so that the table can no longer be analyzed and its statistics can no longer be set. Once a table's statistics are locked, all related statistics, including table-level, column-level, histogram and index statistics, are locked and cannot be updated.
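
Locking and unlocking the article's table T might look like this (a sketch; the force parameter is the one described in the GATHER_TABLE_STATS table above).

```sql
-- Freeze the current statistics on T.
exec dbms_stats.lock_table_stats('SYS','T');

-- STATTYPE_LOCKED is filled in for locked tables and empty otherwise.
select table_name, stattype_locked from user_tab_statistics where table_name='T';

-- While the lock is in place, a gather only goes through when force=>true is passed.
exec dbms_stats.gather_table_stats('SYS','T',force=>true);

-- Release the lock again.
exec dbms_stats.unlock_table_stats('SYS','T');
```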

For details see the Oracle online documentation:
http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461


3.2.7 Restoring statistics
RESET_PARAM_DEFAULTS Procedure
RESTORE_DICTIONARY_STATS Procedure
RESTORE_FIXED_OBJECTS_STATS Procedure
RESTORE_SCHEMA_STATS Procedure
RESTORE_SYSTEM_STATS Procedure
RESTORE_TABLE_STATS Procedure
Whenever statistics in dictionary are modified, old versions of statistics are saved automatically for future restoring. The old statistics are purged automatically at regular intervals based on the statistics history retention setting and the time of recent statistics gathering performed in the system. Retention is configurable using the ALTER_STATS_HISTORY_RETENTION Procedure.
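For example, suppose we re-analyze a table and find that the new statistics lead the CBO to a wrong execution plan. To rescue the situation we can restore the statistics to an earlier point in time, one at which the CBO's plans were correct, fix the immediate problem first, and then investigate the cause.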
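
A restore for the article's table T might look like this. It is a sketch; the one-day interval is an arbitrary illustrative value and has to fall within the retained history.

```sql
-- How far back statistics can be restored, and the retention period in days.
select dbms_stats.get_stats_history_availability from dual;
select dbms_stats.get_stats_history_retention from dual;

-- When the statistics on T were changed.
select table_name, stats_update_time from user_tab_stats_history where table_name='T';

-- Put back the statistics as they were one day ago (the interval is arbitrary).
exec dbms_stats.restore_table_stats('SYS','T',as_of_timestamp=>systimestamp - interval '1' day);
```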
?
具體參考o(jì)racle 聯(lián)機(jī)文檔:
???????? http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_stats.htm#i1036461
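
A restore along those lines might look like the sketch below. It assumes the table T is in the current schema and that its statistics were still good one day ago; the one-day offset is an arbitrary choice for illustration, and the real timestamp should be picked from the history view.

```sql
-- How long statistics history is kept (in days) and the oldest
-- point in time that can still be restored.
SELECT DBMS_STATS.GET_STATS_HISTORY_RETENTION    AS retention_days,
       DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY AS oldest_available
  FROM dual;

-- When the statistics of T were changed.
SELECT table_name, stats_update_time
  FROM user_tab_stats_history
 WHERE table_name = 'T'
 ORDER BY stats_update_time;

-- Put back the statistics as they were one day ago
-- (the offset is an assumption made for this sketch).
BEGIN
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => USER,
    tabname         => 'T',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);
END;
/
```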


4. Dynamic Sampling

4.1 What Is Dynamic Sampling
         Dynamic sampling was first introduced in Oracle 9i R2. It was designed to give the CBO enough information to produce a correct execution plan when a segment (table, index, partition) has not been analyzed, and it can be regarded as a supplement to analysis.
         When a segment has no statistics (that is, it has not been analyzed), dynamic sampling obtains the statistics the CBO needs by reading a sample of data blocks directly from the object.

A simple example:

Create the table:
SQL> create table t
  2  as
  3  select owner,object_type from all_objects;
Table created.

Check the number of rows in the table:
SQL> select count(*) from t;
COUNT(*)
----------
72236  -- row count

This creates an ordinary table that has not been analyzed. In the hint we use level 0 to disable dynamic sampling, so the only information available to the CBO is what the data dictionary stores about the table, such as how many extents and how many blocks it has. That information is not enough.

SQL> set autot traceonly explain
SQL> select /*+dynamic_sampling(t 0) */ * from t;

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 15928 |   435K|    55   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T    | 15928 |   435K|    55   (0)| 00:00:01 |
--------------------------------------------------------------------------

Without dynamic sampling, the CBO estimates 15928 rows, which is far from the actual 72236.

Now run the same query with dynamic sampling:
SQL> select * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 80232 |  2193K|    56   (2)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T    | 80232 |  2193K|    56   (2)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

From Oracle 10g on, segments that have not been analyzed are dynamically sampled by default. The output above shows that level 2 dynamic sampling was used, and the CBO's estimate of 80232 rows is reasonably close to the actual 72236.

Note:
         Without dynamic sampling, the CBO can also overestimate the number of rows in a segment that has not been analyzed. For example:
SQL> delete from t;
72236 rows deleted.
SQL> commit;
Commit complete.
SQL> select /*+dynamic_sampling(t 0) */ * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 15928 |   435K|    55   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T    | 15928 |   435K|    55   (0)| 00:00:01 |
--------------------------------------------------------------------------

SQL> select * from t;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    28 |    55   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| T    |     1 |    28 |    55   (0)| 00:00:01 |
--------------------------------------------------------------------------
Note
-----
   - dynamic sampling used for this statement (level=2)

A careful look shows the difference between the two execution plans. Without dynamic sampling, the CBO still estimates 15928 rows for table t; with dynamic sampling it estimates 1 row (the optimizer does not report a cardinality below 1). Yet all the rows were deleted before either query ran. The cause is the high-water mark (HWM): although the rows have been deleted, the extents and blocks allocated to the table have not been released, so without sampling the CBO still assumes that much data is there.

         This shows how little information the CBO has in this situation: only how many extents and how many blocks the table has. With dynamic sampling, Oracle immediately finds that the data blocks are in fact empty.
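
To see the allocated space that misleads the CBO, the segment can be inspected directly. This is a small sketch, assuming the table T above is in the current schema; TRUNCATE is used only because the table has already been emptied in this example.

```sql
-- Space allocated to T. A DELETE does not lower these numbers,
-- because the high-water mark stays where it was.
SELECT segment_name, extents, blocks
  FROM user_segments
 WHERE segment_name = 'T';

-- TRUNCATE resets the high-water mark and, with the default
-- DROP STORAGE behavior, releases the space above the initial allocation.
TRUNCATE TABLE t;

-- Compare the allocation after the truncate.
SELECT segment_name, extents, blocks
  FROM user_segments
 WHERE segment_name = 'T';
```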

For more on the Oracle high-water mark, see my blog post: Oracle High-Water Mark (HWM)
http://blog.csdn.net/tianlesoftware/archive/2009/10/22/4707900.aspx

Dynamic sampling serves two purposes:
(1)       The CBO depends on complete statistics, but not every user analyzes every table carefully and on time. To keep execution plans as correct as possible, Oracle uses dynamic sampling to help the CBO obtain as much information as it can.
(2)       Global temporary tables. Temporary tables are usually not analyzed, because the data they hold is transient and may be released very quickly. When a query joins to such a temporary table, dynamic sampling is the only way for the CBO to obtain statistics on it.
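
The temporary table case in point (2) can be tried with a sketch like the one below. GTT_ORDERS and its columns are hypothetical names invented for this illustration, and the exact wording of the plan's Note section varies by release.

```sql
-- GTT_ORDERS is a hypothetical table used only for this sketch.
CREATE GLOBAL TEMPORARY TABLE gtt_orders (
  order_id NUMBER,
  status   VARCHAR2(10)
) ON COMMIT PRESERVE ROWS;

-- Load some session-private rows: one row in ten is 'OPEN'.
INSERT INTO gtt_orders
SELECT level,
       CASE WHEN MOD(level, 10) = 0 THEN 'OPEN' ELSE 'CLOSED' END
  FROM dual
CONNECT BY level <= 10000;
COMMIT;

SET AUTOTRACE TRACEONLY EXPLAIN

-- No statistics exist for the temporary table, so with the default
-- level the Note section of the plan should report dynamic sampling.
SELECT * FROM gtt_orders WHERE status = 'OPEN';
```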
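
Besides supplying statistics to the CBO when a segment has not been analyzed, dynamic sampling has one more distinctive ability: it can measure the correlation between different columns.

By contrast, the statistics produced by analyzing a table are independent of one another. For example:
(1)       The number of rows in the table and the average row length.
(2)       The maximum value, minimum value, and degree of repetition of each column, possibly with a histogram.
(3)       The clustering factor of an index, its number of leaf blocks, its height, and so on.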
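
The effect of column correlation can be explored with a sketch such as the following. T_CORR and its columns are hypothetical names; the table is built so that FLAG1 and FLAG2 always hold opposite values, which means no row satisfies both predicates. The estimates mentioned in the comments are what I would expect, not measured output.

```sql
-- T_CORR is a hypothetical table: FLAG1 and FLAG2 are always opposite,
-- so no row has FLAG1 = 'N' and FLAG2 = 'N' at the same time.
CREATE TABLE t_corr AS
SELECT DECODE(MOD(level, 2), 0, 'N', 'Y') AS flag1,
       DECODE(MOD(level, 2), 0, 'Y', 'N') AS flag2,
       level                              AS id
  FROM dual
CONNECT BY level <= 50000;

EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'T_CORR');

SET AUTOTRACE TRACEONLY EXPLAIN

-- With independent per-column statistics the optimizer multiplies the
-- two selectivities (about 1/2 * 1/2), so it is expected to estimate
-- roughly a quarter of the rows.
SELECT * FROM t_corr WHERE flag1 = 'N' AND flag2 = 'N';

-- At level 4 a single-table predicate on two or more columns qualifies
-- for sampling, which lets the optimizer see that the combination is rare.
SELECT /*+ dynamic_sampling(4) */ *
  FROM t_corr WHERE flag1 = 'N' AND flag2 = 'N';
```

Comparing the Rows column of the two plans shows whether sampling corrected the estimate on a given release.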

For all its advantages, dynamic sampling has obvious drawbacks; otherwise Oracle would simply use it everywhere in place of gathered statistics:
(1)       Only a limited number of blocks is sampled, so for very large tables the result is bound to deviate.
(2)       Sampling consumes system resources. It is especially discouraged in OLTP databases.


4.2 Dynamic Sampling Levels
         Oracle defines 11 levels of dynamic sampling (0 through 10), which are described in detail in the official documentation:
                   13.5.7.4 Dynamic Sampling Levels
              http://download.oracle.com/docs/cd/E11882_01/server.112/e10821/stats.htm#PFGRF94760

The sampling levels are as follows if the dynamic sampling level used is from a cursor hint or from the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter:
Level 0: Do not use dynamic sampling.
Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of dynamic sampling blocks (32).
Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is two times the default number of dynamic sampling blocks.
Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for a predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is twice the default number of dynamic sampling blocks.
Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.
Levels 5, 6, 7, 8, and 9: Apply dynamic sampling to all tables that meet the previous level criteria using 2, 4, 8, 32, or 128 times the default number of dynamic sampling blocks respectively.
Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.

The sampling levels are as follows if the dynamic sampling level for a table is set using the DYNAMIC_SAMPLING optimizer hint:
Level 0: Do not use dynamic sampling.
Level 1: The number of blocks sampled is the default number of dynamic sampling blocks (32).
Levels 2, 3, 4, 5, 6, 7, 8, and 9: The number of blocks sampled is 2, 4, 8, 16, 32, 64, 128, or 256 times the default number of dynamic sampling blocks respectively.
Level 10: Read all blocks in the table.
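
The arithmetic behind the table-level hint list above can be tabulated with a short query. This is only a worked illustration, assuming the default of 32 dynamic sampling blocks quoted above; the column aliases are names chosen for this sketch.

```sql
-- Blocks sampled for a table-level DYNAMIC_SAMPLING hint, levels 1 to 9,
-- assuming the default of 32 dynamic sampling blocks.
-- Level 1 uses 1x the default; levels 2..9 use 2x, 4x, ... 256x.
SELECT level                    AS hint_level,
       POWER(2, level - 1)      AS multiplier,
       32 * POWER(2, level - 1) AS blocks_sampled
  FROM dual
CONNECT BY level <= 9;
```

Under that assumption, level 2 comes out at 64 blocks and level 9 at 8192 blocks, while level 10 reads every block in the table.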
4.2.1 Level 0
         Dynamic sampling is not used.

4.2.2 Level 1
         Oracle samples tables that have not been analyzed, but only when all four of the following conditions are met:
(1)       The SQL statement contains at least one unanalyzed table.
(2)       The unanalyzed table is joined to another table or appears in a subquery or non-mergeable view.
(3)       The unanalyzed table has no indexes.
(4)       The unanalyzed table has more blocks than the default number of dynamic sampling blocks (32).

4.2.3 Level 2
         Dynamic sampling is applied to all unanalyzed tables. The number of blocks sampled is twice the default number.

4.2.4 Level 3
         Sampling covers all tables that meet the Level 2 criteria, plus tables for which the optimizer had to guess the selectivity of a predicate that is a potential dynamic sampling predicate. For those tables the number of blocks sampled is the default number; for unanalyzed tables it is twice the default number.

4.2.5 Level 4
         Sampling covers all tables that meet the Level 3 criteria, plus tables that have single-table predicates referencing two or more columns. The number of blocks sampled is the default number; for unanalyzed tables it is twice the default number.

4.2.6 Levels 5, 6, 7, 8, 9
         Sampling covers all tables that meet the Level 4 criteria, using 2, 4, 8, 32, and 128 times the default number of dynamic sampling blocks respectively.

4.2.7 Level 10
         Sampling covers all tables that meet the Level 9 criteria, and all blocks of each table are sampled.


The more blocks are sampled, the closer the resulting statistics are to reality, but the more resources the sampling consumes.
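
The level can be set per session through the OPTIMIZER_DYNAMIC_SAMPLING parameter or per statement through a hint. The sketch below assumes the table T from the earlier examples and uses level 4 purely as an example value.

```sql
-- Show the current setting (SQL*Plus command; requires access to the
-- parameter views).
SHOW PARAMETER optimizer_dynamic_sampling

-- Change the level for the current session only.
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Statement-level (cursor) hint: the first list of levels above applies.
SELECT /*+ dynamic_sampling(4) */ * FROM t;

-- Table-level hint: the second list of levels above applies to table t.
SELECT /*+ dynamic_sampling(t 4) */ * FROM t;
```

A hint keeps the extra parsing cost confined to the statements that need it, whereas the session parameter affects every statement parsed in that session.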


4.3 When to Use Dynamic Sampling
         Dynamic sampling consumes additional database resources. In an environment where SQL statements are executed repeatedly, bind variables are used, and hard parses are rare, as in an OLTP system, dynamic sampling is not a good fit. Sampling happens during a hard parse, so if hard parses rarely occur, dynamic sampling has little effect.

         In an OLAP or data warehouse environment, executing a SQL statement costs far more than parsing it. Spending a little more at parse time on dynamic sampling in order to arrive at an optimal execution plan is well worth it. In such environments the cost of a hard parse is almost negligible.

         So in OLAP or data warehouse environments it is generally a good idea to set the dynamic sampling level to 3 or 4. In OLTP systems, by contrast, dynamic sampling should not be used.


FROM:http://blog.csdn.net/tianlesoftware/article/details/5845028
Reposted from: https://www.cnblogs.com/zlja/archive/2011/07/11/2449092.html