Oracle全文索引
Oracle全文索引
一、設置詞法分析器
Oracle實現全文檢索,其機制其實很簡單。即通過Oracle專利的詞法分析器(lexer),將文章中所有的表意單元(Oracle?稱為?term)找出來,記錄在一組?以dr$開頭的表中,同時記下該term出現的位置、次數、hash?值等信息。檢索時,Oracle?從這組表中查找相應的term,并計算其出現頻率,根據某個算法來計算每個文檔的得分(score),即所謂的‘匹配率’。而lexer則是該機制的核心,它決定了全文檢索的效率。Oracle?針對不同的語言提供了不同的?lexer,?而我們通常能用到其中的三個:
?
n??????basic_lexer:?針對英語。它能根據空格和標點來將英語單詞從句子中分離,還能自動將一些出現頻率過高已經失去檢索意義的單詞作為‘垃圾’處理,如if , is?等,具有較高的處理效率。但該lexer應用于漢語則有很多問題,由于它只認空格和標點,而漢語的一句話中通常不會有空格,因此,它會把整句話作為一個term,事實上失去檢索能力。以‘中國人民站起來了’這句話為例,basic_lexer?分析的結果只有一個term ,就是‘中國人民站起來了’。此時若檢索‘中國’,將檢索不到內容。
n??????chinese_vgram_lexer:?專門的漢語分析器,支持所有漢字字符集(ZHS16CGB231280?ZHS16GBK?ZHT32EUC?ZHT16BIG5?ZHT32TRIS?ZHT16MSWIN950?ZHT16HKSCS?UTF8?)。該分析器按字為單元來分析漢語句子。‘中國人民站起來了’這句話,會被它分析成如下幾個term: ‘中’,‘中國’,‘國人’,‘人民’,‘民站’,‘站起’,起來’,‘來了’,‘了’??梢钥闯?#xff0c;這種分析方法,實現算法很簡單,并且能實現‘一網打盡’,但效率則是差強人意。
n??????chinese_lexer:?這是一個新的漢語分析器,只支持utf8字符集。上面已經看到,chinese vgram lexer這個分析器由于不認識常用的漢語詞匯,因此分析的單元非常機械,像上面的‘民站’,‘站起’在漢語中根本不會單獨出現,因此這種term是沒有意義的,反而影響效率。chinese_lexer的最大改進就是該分析器?能認識大部分常用漢語詞匯,因此能更有效率地分析句子,像以上兩個愚蠢的單元將不會再出現,極大?提高了效率。但是它只支持?utf8,?如果你的數據庫是zhs16gbk字符集,則只能使用笨笨的那個Chinese vgram lexer.
如果不做任何設置,Oracle?缺省使用basic_lexer這個分析器。要指定使用哪一個lexer,?可以這樣操作:
BEGIN
? ctx_ddl.create_preference ('my_lexer', 'chinese_vgram_lexer');
END;
/
其中my_lexer是分析器名。
?
二、建立全文索引
在建立intermedia索引時,指明所用的lexer:
CREATE INDEX? myindex ON mytable(mycolumn) indextype is ctxsys.context??parameters('lexer my_lexer');
※個人體會:全文索引建立后,用pl/sql developer工具view table,在index這一欄是看不到索引信息的。
而本人在刪除全文索引時遇到過一下報錯:
SQL> drop index searchkeytbl_key;
drop index searchkeytbl_key
ORA-29868: cannot issue DDL on a domain index marked as LOADING
解決方法:
ORA-29868: cannot issue DDL on a domain index marked as LOADING
說明:在創建索引的時候斷開、重啟等導致索引中斷沒有執行成功,之后再drop或者rebuild等操作的時候都會報此錯誤
解決:只能drop index ind_name force強行刪除,然后再重建
三、索引同步維護
用以下的兩個job來完成(該job要建在和表同一個用戶下) :
VARIABLE jobno number;
BEGIN
DBMS_JOB.SUBMIT(:jobno,'ctx_ddl.sync_index(''index_name'');',
SYSDATE, 'SYSDATE + (1/24/4)');
commit;
END;?????????? //同步
?
VARIABLE jobno number;
BEGIN
DBMS_JOB.SUBMIT(:jobno,'ctx_ddl.optimize_index(''myindex'',''FULL'');',
SYSDATE, 'SYSDATE + 1');
commit;?????????//優化
建完后手動運行下:
exec dbms_job.run(jobno);
※個人體會:運行job可能會有問題,此時可以單獨運行索引,嘗試一下
exec ctx_ddl.sync_index('index_name');
如果單獨運行沒有問題,則檢查job是否寫錯或者當前操作的oracle數據庫用戶有無運行存儲過程的權限
SQL> exec dbms_job.run(190);
begin dbms_job.run(190); end;
ORA-12011: execution of 1 jobs failed
ORA-06512: at "SYS.DBMS_IJOB", line 406
ORA-06512: at "SYS.DBMS_JOB", line 272
ORA-06512: at line 1
以上報錯就是用戶沒有運行任何存儲過程造成的,此時需要對用戶加上這個權限:
SQL> grant execute any procedure to oracle_username;?
再看一下job的情況
select * from user_jobs;
?
四、測試
關聯查詢: select * from?table_name where contains (column_name,'keyword') >0;
SQL> select * from searchkeytbl where type='城市' and contains (key,'楊浦') >0;
USERNAME???????????? TYPE???????????????????????????????????? KEY
-------------------- ---------------------------------------- --------------------------------------------------------------------------------
mujian80???????????? 城市???????????????????????????????????? 上海市楊浦區
?
五、問題
加全文索引遇到的問題(不斷更新)
SQL> create index gh_ghname_idx on gh(ghname) indextype is ctxsys.context parameters('lexer gh_ghname_lexer');
create index gh_ghname_idx on gh(ghname) indextype is ctxsys.context parameters('lexer gh_ghname_lexer')
ORA-24795: Illegal COMMIT attempt made
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-50857: oracle error in drvddl.IndexCreate
ORA-20000: Oracle Text error:
DRG-50857: oracle error in drvdml.MaintainKTab
ORA-24795: Illegal COMMIT attempt made
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.TEXTINDEXMETHODS", line 364
?
?
To avoid the error, please use one of the following solutions
1. Don't use a 32k-blocksized tablespace to store the internal index objects
- or -
2. Download Patch 5596325 from Metalink and apply it as described in the README file.
通過案例學調優之--Oracle 全文索引
全文檢索(oracle text)?
? ? ? Oracle Text使Oracle9i具備了強大的文本檢索能力和智能化的文本管理能力,Oracle Text 是 Oracle9i 采用的新名稱,在 oracle8/8i 中被稱為 oracle intermedia text,oracle8 以前是 oracle context cartridge。Oracle Text 的索引和查找功能并不局限于存儲在數據庫中的數據。 它可以對存儲于文件系統中的文檔進行檢索和查找,并可檢索超過 150 種文檔類型,包括 Microsoft Word、PDF和XML。Oracle Text查找功能包括模糊查找、詞干查找(搜索mice 和查找 mouse)、通配符、相近性等查找方式,以及結果分級和關鍵詞突出顯示等。你甚至 可以增加一個詞典,以查找搭配詞,并找出包含該搭配詞的文檔。?
? ? ? Oracle text 需要為可檢索的數據項建立索引,用戶才能夠通過搜索查找內容,索引進 程是根據管道建模的,在這個管道中,數據經過一系列的轉換后,將其關鍵字會添加到索引 中。該索引進程分為多個階段,如下圖?
??
1.數據檢索(Datastore):只是將數據從數據存儲(例如 web 頁面、數據庫大型對象或本 地文件系統)中取出,然后作為數據流傳送到下一個階段。?
2. 過濾(Filter):過濾器負責將各種文件格式的數據轉換為純文本格式,索引管道中的其 他組件只能處理純文本數據,不能識別 Ms word 或 excel 等文件格式。?
3. 分段(Sectioner):分段器添加關于原始數據項結構的元數據。?
4. 詞法分析(Lexer):根據數據項的語言將字符流分為幾個字詞。 5. 索引(Index):最后一個階段將關鍵字添加到實際索引中。?
全文檢索和普通檢索的區別
? ? ? 不使用Oracle text功能,當然也有很多方法可以在Oracle數據庫中搜索文本,比如INSTR函數和LIKE操作:
1 、SELECT *FROM mytext WHERE INSTR (thetext, 'Oracle') > 0;
2 、SELECT * FROM mytext WHERE thetext LIKE '%Oracle%';
有很多時候,使用instr和like是很理想的, 特別是搜索僅跨越很小的表的時候。然而通過這些文本定位的方法將導致全表掃描,對資源來說消耗比較昂貴,而且實現的搜索功能也非常有限,因此對海量的文本數據進行搜索時,建議使用oralce提供的全文檢索功能。
附:這里順帶記錄一下INSTR和LIKE:
Oracle中,可以使用 Instr 函數對某個字符串進行判斷,判斷其是否含有指定的字符。其語法為:Instr(string, substring, position, occurrence)。
string:代表源字符串(寫入字段則表示此字段的內容)。
substring:代表想從源字符串中查找的子串。
position:代表查找的開始位置,該參數可選的,默認為1。
occurrence:代表想從源字符中查找出第幾次出現的substring,該參數也是可選的,默認為1。
position 的值為負數,那么代表從右往左進行查找。
instr和like的性能比較
其實從效率角度來看,誰能用到索引,誰的查詢速度就會快。
like有時可以用到索引,例如:name like ‘李%’,而當下面的情況時索引會失效:name like ‘%李’。所以一般我們查找中文類似于‘%字符%’時,索引都會失效。與其他數據庫不同的是,oracle支持函數索引。例如在name字段上建個instr索引,查詢速度就比較快了,這也是為什么instr會比like效率高的原因。
注:instr(title,’手冊’)>0 相當于like‘%手冊%’
instr(title,’手冊’)=0 相當于not like‘%手冊%’
Oracle Text 索引原理?
? ? Oracle text 索引將文本中所有的字符轉化成記號(token),如 www.taobao.com 會轉化 成 www,taobao,com 這樣的記號。
Oracle10g 里面支持四種類型的索引:
context、ctxcat、ctxrule、ctxxpath??
CONTEXT?
? ? ? ?用于對含有大量連續文本數據進行檢索。支持 word、html、xml、text 等很多數據格式。支持范圍(range)分區,支持并行創建索引(Parallel indexing)的索引類型。支持類型:VARCHAR2, CLOB, BLOB, CHAR, BFILE, XMLType, and URIType.?
DML 操作后,需要 CTX_DDL.SYNC_INDEX 手工同步索引 如果有查詢包含多個詞語,直接用空格隔開(如 oracle itpub)?
案例分析:
設置全文檢索
步驟步驟一:檢查和設置數據庫角色
? ? ?首先檢查數據庫中是否有CTXSYS用戶和CTXAPP腳色。如果沒有這個用戶和角色,意味著你的數據庫創建時未安裝intermedia功能(10G默認安裝都有此用戶和角色)。你必須修改數據庫以安裝這項功能。默認安裝情況下,ctxsys用戶是被鎖定的,因此要先啟用ctxsys的用戶。
11:53:13?SYS@?prod?>select?username,account_status?from?dba_users?where?username?like?'CTX%'; USERNAME???????????????????????ACCOUNT_STATUS ------------------------------?-------------------------------- CTXSYS?????????????????????????EXPIRED?&?LOCKED11:54:17?SYS@?prod?>alter?user?ctxsys?identified?by?oracle?account?unlock; User?altered.11:55:07?SYS@?prod?>select?username,account_status?from?dba_users?where?username?like?'CTX%'; USERNAME???????????????????????ACCOUNT_STATUS ------------------------------?-------------------------------- CTXSYS?????????????????????????OPEN12:00:13?SYS@?prod?>select?role?from?dba_roles 12:00:23???2???where?role?like?'CTX%'; ROLE ------------------------------ CTXAPP步驟二:賦權
在ctxsys用戶下,授予測試用戶scott以下權限:
[oracle@RH6?~]$?cat?t.sql GRANT?resource,?CONNECT,?ctxapp?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_cls?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_ddl?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_doc?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_output?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_query?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_report?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_thes?TO?scott; GRANT?EXECUTE?ON?ctxsys.ctx_ulexer?TO?scott;11:58:04?SYS@?prod?>@/home/oracle/t.sql Grant?succeeded. Elapsed:?00:00:00.15 Grant?succeeded. Elapsed:?00:00:00.21 Grant?succeeded. Elapsed:?00:00:00.09 Grant?succeeded. Elapsed:?00:00:00.09 Grant?succeeded. Elapsed:?00:00:00.13 Grant?succeeded. Elapsed:?00:00:00.07 Grant?succeeded. Elapsed:?00:00:00.09 Grant?succeeded. Elapsed:?00:00:00.10 Grant?succeeded. Elapsed:?00:00:00.07步驟三:設置詞法分析器(lexer)
? ? ???Oracle實現全文檢索,其機制其實很簡單。即通過Oracle專利的詞法分析器(lexer),將文章中所有的表意單元(Oracle 稱為 term)找出來,記錄在一組以dr$開頭的表中,同時記下該term出現的位置、次數、hash值等信息。檢索時,Oracle從這組表中查找相應的term,并計算其出現頻率,根據某個算法來計算每個文檔的得分(score),即所謂的‘匹配率’。而lexer則是該機制的核心,它決定了全文檢索的效率。Oracle針對不同的語言提供了不同的lexer,而我們通常能用到其中的三個:
[cpp]?view plaincopy print?
步驟四:在book_abstract字段建立索引使用剛剛設置的ORATEXT_LEXER :chinese_vgram_lexer作為分析器。
12:16:15?SCOTT@?prod?>CREATE?INDEX?demo_abstract?ON?textdemo(book_abstract)?indextype?IS?ctxsys.context?parameters('lexer?my_LEXER');??之后如上所述多出很多dr$開頭的表和索引,系統會創建四個相關的表: DR$DEMO_ABSTRACT$I(分詞后的TOKEN表) DR$DEMO_ABSTRACT$K DR$DEMO_ABSTRACT$N? DR$DEMO_ABSTRACT$R 14:56:16?SCOTT@?prod?>select?*?from?tab;TNAME??????????????????????????TABTYPE??CLUSTERID ------------------------------?-------?---------- BONUS??????????????????????????TABLE DEPT???????????????????????????TABLE DR$DEMO_ABSTRACT$I?????????????TABLE DR$DEMO_ABSTRACT$K?????????????TABLE DR$DEMO_ABSTRACT$N?????????????TABLE DR$DEMO_ABSTRACT$R?????????????TABLE EMP????????????????????????????TABLE SALGRADE???????????????????????TABLE TEXTDEMO???????????????????????TABLE9?rows?selected. 14:56:36?SCOTT@?prod?>desc?DR$DEMO_ABSTRACT$IName??????????????????????????????????????????????????????????????Null?????Type-----------------------------------------------------------------?--------?--------------------------------------------TOKEN_TEXT????????????????????????????????????????????????????????NOT?NULL?VARCHAR2(64)TOKEN_TYPE????????????????????????????????????????????????????????NOT?NULL?NUMBER(3)TOKEN_FIRST???????????????????????????????????????????????????????NOT?NULL?NUMBER(10)TOKEN_LAST????????????????????????????????????????????????????????NOT?NULL?NUMBER(10)TOKEN_COUNT???????????????????????????????????????????????????????NOT?NULL?NUMBER(10)TOKEN_INFO?????????????????????????????????????????????????????????????????BLOB14:57:45?SCOTT@?prod?>desc?DR$DEMO_ABSTRACT$KName??????????????????????????????????????????????????????????????Null?????Type-----------------------------------------------------------------?--------?--------------------------------------------DOCID??????????????????????????????????????????????????????????????????????NUMBER(38)TEXTKEY???????????????????????????????????????????????????????????NOT?NULL?ROWID14:57:57?SCOTT@?prod?>desc?DR$DEMO_ABSTRACT$NName??????????????????????????????????????????????????????????????Null?????Type-----------------------------------------------------------------?--------?--------------------------------------------NLT_DOCID?????????????????????????????????????????????????????????NOT?NULL?NUMBER(38)NLT_MARK??????????????????????????????????????????????????????????NOT?NULL?CHAR(1)14:58:11?SCOTT@?prod?>desc?DR$DEMO_ABSTRACT$RName??????????????????????????????????????????????????????????????Null?????Type-----------------------------------------------------------------?--------?--------------------------------------------ROW_NO?????????????????????????????????????????????????????????????????????NUMBER(3)DATA???????????????????????????????????????????????????????????????????????BLOB14:58:26?SCOTT@?prod?>select?index_name,index_type,table_name?from?user_indexes;INDEX_NAME?????????????????????INDEX_TYPE??????????????????TABLE_NAME ------------------------------?---------------------------?------------------------------ DEMO_ABSTRACT??????????????????DOMAIN??????????????????????TEXTDEMO SYS_C0013418???????????????????NORMAL??????????????????????TEXTDEMO PK_EMP?????????????????????????NORMAL??????????????????????EMP SYS_IL0000076525C00002$$???????LOB?????????????????????????DR$DEMO_ABSTRACT$R SYS_IOT_TOP_76528??????????????IOT?-?TOP???????????????????DR$DEMO_ABSTRACT$N SYS_IOT_TOP_76523??????????????IOT?-?TOP???????????????????DR$DEMO_ABSTRACT$K SYS_IL0000076520C00006$$???????LOB?????????????????????????DR$DEMO_ABSTRACT$I DR$DEMO_ABSTRACT$X?????????????NORMAL??????????????????????DR$DEMO_ABSTRACT$I PK_DEPT????????????????????????NORMAL??????????????????????DEPT9?rows?selected.下面的語句可以查看索引創建過程中是否發生了錯誤:
[cpp]?view plaincopy print?步驟五:查詢測試
查看執行計劃 15:04:36?SCOTT@?prod?>r1*?select?*?from?textdemo?where?contains(book_abstract,'gushi')>0Elapsed:?00:00:00.02Execution?Plan ---------------------------------------------------------- Plan?hash?value:?2570915478--------------------------------------------------------------------------------------------- |?Id??|?Operation???????????????????|?Name??????????|?Rows??|?Bytes?|?Cost?(%CPU)|?Time?????| --------------------------------------------------------------------------------------------- |???0?|?SELECT?STATEMENT????????????|???????????????|?????1?|??1392?|?????4???(0)|?00:00:01?| |???1?|??TABLE?ACCESS?BY?INDEX?ROWID|?TEXTDEMO??????|?????1?|??1392?|?????4???(0)|?00:00:01?| |*??2?|???DOMAIN?INDEX??????????????|?DEMO_ABSTRACT?|???????|???????|?????4???(0)|?00:00:01?| ---------------------------------------------------------------------------------------------Predicate?Information?(identified?by?operation?id): ---------------------------------------------------2?-?access("CTXSYS"."CONTAINS"("BOOK_ABSTRACT",'gushi')>0)Note ------?dynamic?sampling?used?for?this?statement?(level=2)Statistics ----------------------------------------------------------29??recursive?calls0??db?block?gets33??consistent?gets0??physical?reads0??redo?size796??bytes?sent?via?SQL*Net?to?client415??bytes?received?via?SQL*Net?from?client2??SQL*Net?roundtrips?to/from?client0??sorts?(memory)0??sorts?(disk)1??rows?processed15:04:37?SCOTT@?prod?>通過sql?trace查看詳細計劃(部分內容)SQL?ID:?2rsr1z6zkp24p Plan?Hash:?2570915478 select?* fromtextdemo?where?contains(book_abstract,:"SYS_B_0")>:"SYS_B_1"call?????count???????cpu????elapsed???????disk??????query????current????????rows -------?------??--------?----------?----------?----------?----------??---------- Parse????????1??????0.00???????0.00??????????0??????????0??????????0???????????0 Execute??????1??????0.01???????0.01??????????0????????250??????????0???????????0 Fetch????????2??????0.00???????0.00??????????0??????????2??????????0???????????1 -------?------??--------?----------?----------?----------?----------??---------- total????????4??????0.01???????0.01??????????0????????252??????????0???????????1 Misses?in?library?cache?during?parse:?1 Optimizer?mode:?ALL_ROWS Parsing?user?id:?101Rows?????Row?Source?Operation -------??---------------------------------------------------1??TABLE?ACCESS?BY?INDEX?ROWID?TEXTDEMO?(cr=12?pr=0?pw=0?time=0?us?cost=4?size=1392?card=1)1???DOMAIN?INDEX??DEMO_ABSTRACT?(cr=11?pr=0?pw=0?time=0?us?cost=4?size=0?card=0)declarecost?sys.ODCICost?:=?sys.ODCICost(NULL,?NULL,?NULL,?NULL);arg0?VARCHAR2(1)?:=?null;begin:1?:=?"CTXSYS"."TEXTOPTSTATS".ODCIStatsFunctionCost(sys.ODCIFuncInfo('CTXSYS','CTX_CONTAINS','TEXTCONTAINS',2),cost,sys.ODCIARGDESCLIST(sys.ODCIARGDESC(2,?'TEXTDEMO',?'SCOTT',?'"BOOK_ABSTRACT"',?NULL,?NULL,?NULL),?sys.ODCIARG DESC(1,?NULL,?NULL,?NULL,?NULL,?NULL,?NULL)),?arg0,?:5,sys.ODCIENV(:6,:7,:8,:9));if?cost.CPUCost?IS?NULL?then:2?:=?-1.0;else:2?:=?cost.CPUCost;end?if;if?cost.IOCost?IS?NULL?then:3?:=?-1.0;else:3?:=?cost.IOCost;end?if;if?cost.NetworkCost?IS?NULL?then:4?:=?-1.0;else:4?:=?cost.NetworkCost;end?if;exceptionwhen?others?thenraise;end;call?????count???????cpu????elapsed???????disk??????query????current????????rows -------?------??--------?----------?----------?----------?----------??---------- Parse????????2??????0.00???????0.00??????????0??????????0??????????0???????????0 Execute??????2??????0.00???????0.00??????????0?????????18??????????0???????????2 Fetch????????0??????0.00???????0.00??????????0??????????0??????????0???????????0 -------?------??--------?----------?----------?----------?----------??---------- total????????4??????0.00???????0.00??????????0?????????18??????????0???????????2Misses?in?library?cache?during?parse:?0 Optimizer?mode:?ALL_ROWS Parsing?user?id:?101?????(recursive?depth:?1) 全文索引和DML操作
Insert 操作:
15:19:21?SCOTT@?prod?>insert?into?textdemo?values?(20,'huoda',sysdate,'musilin?de?zangli','meili?de?rensheng','/home/2.txt'); 1?row?created. 15:20:10?SCOTT@?prod?>commit;15:21:35?SCOTT@?prod?>select?id,BOOK_ABSTRACT?from?textdemo?where?BOOK_ABSTRACT?like?'%rensheng%'ID?BOOK_ABSTRACT ----------?--------------------------------------------------20?meili?de?rensheng15:23:12?SCOTT@?prod?>set?autotrace?on 15:23:38?SCOTT@?prod?>select?id,BOOK_ABSTRACT?from?textdemo?where?contains(BOOK_ABSTRACT,'rensheng')>0 no?rows?selected Execution?Plan ---------------------------------------------------------- Plan?hash?value:?2570915478 --------------------------------------------------------------------------------------------- |?Id??|?Operation???????????????????|?Name??????????|?Rows??|?Bytes?|?Cost?(%CPU)|?Time?????| --------------------------------------------------------------------------------------------- |???0?|?SELECT?STATEMENT????????????|???????????????|?????1?|??1027?|?????4???(0)|?00:00:01?| |???1?|??TABLE?ACCESS?BY?INDEX?ROWID|?TEXTDEMO??????|?????1?|??1027?|?????4???(0)|?00:00:01?| |*??2?|???DOMAIN?INDEX??????????????|?DEMO_ABSTRACT?|???????|???????|?????4???(0)|?00:00:01?| --------------------------------------------------------------------------------------------- Predicate?Information?(identified?by?operation?id): ---------------------------------------------------2?-?access("CTXSYS"."CONTAINS"("BOOK_ABSTRACT",'rensheng')>0) Note ------?dynamic?sampling?used?for?this?statement?(level=2) Statistics ----------------------------------------------------------23??recursive?calls0??db?block?gets33??consistent?gets0??physical?reads0??redo?size349??bytes?sent?via?SQL*Net?to?client404??bytes?received?via?SQL*Net?from?client1??SQL*Net?roundtrips?to/from?client0??sorts?(memory)0??sorts?(disk)0??rows?processed 15:26:40?SYS@?prod?>select?*?from?ctxsys.dr$pending;PND_CID????PND_PID?PND_ROWID??????????PND_TIMES?P ----------?----------?------------------?---------?-1082??????????0?AAASrlAAEAAAAI1AAB?21-NOV-14?N 15:26:26?SCOTT@?prod?>alter?index?demo_abstract?rebuild?parameters('sync'); Index?altered. 15:30:10?SCOTT@?prod?>select?id,BOOK_ABSTRACT?from?textdemo?where?contains(BOOK_ABSTRACT,'rensheng')>0;ID?BOOK_ABSTRACT ----------?--------------------------------------------------20?meili?de?rensheng?在做Insert操作時,Oracle會把一條信息放入到CTXSYS.DR$PENDING表里,必須手工進行同步才能更新全文索引。
Delete 操作:
? ?Delete 操作后,索引會立刻更新。
Update 操作:
15:38:14?SCOTT@?prod?>update?textdemo?set?BOOK_ABSTRACT='meili?de?gushi'?where?id=20; 1?row?updated.15:39:48?SYS@?prod?>select?*?from?ctxsys.dr$delete; no?rows?selected15:39:59?SYS@?prod?>select?*?from?ctxsys.dr$pending; no?rows?selected15:43:03?SCOTT@?prod?>select?id,BOOK_ABSTRACT?from?textdemo?where?contains(BOOK_ABSTRACT,'gushi')>0;ID?BOOK_ABSTRACT ----------?--------------------------------------------------10?zhen?shi?de?gushi15:43:14?SCOTT@?prod?>alter?index?demo_abstract?rebuild?parameters('sync'); Index?altered.15:43:39?SCOTT@?prod?>select?id,BOOK_ABSTRACT?from?textdemo?where?contains(BOOK_ABSTRACT,'gushi')>0;ID?BOOK_ABSTRACT ----------?--------------------------------------------------10?zhen?shi?de?gushi20?meili?de?gushi? ? ?對于update操作,應該是包含了Delete和Insert的操作,需要手工同步后才能更新索引。
對多字段建立全文索引
很多時候需要從多個文本字段中查詢滿足條件的記錄,這時就需要建立針對多個字段的全文索引,例如需要從pmhsubjects(專題表)的 subjectname(專題名稱)和briefintro(簡介)上進行全文檢索,則需要按以下步驟進行操作:
建立多字段索引的preference,以ctxsys登錄,并執行:
BEGIN
ctx_ddl.create_preference('ctx_demo_abstract_title','MULTI_COLUMN_DATASTORE');
END;
建立preference對應的字段值(以ctxsys登錄) 對應title path book_abstract三個字段建立索引:
BEGIN
ctx_ddl.set_attribute('ctx_demo_abstract_title ','columns','title,path');
END;
建立全文索引:
CREATE INDEX demo_abstract_title ON textdemo(book_abstract) indextype IS ctxsys.context parameters(' DATASTORE ctxsys. ctx_demo_ abstract_title lexer ORATEXT_LEXER'); ?
commit;
測試
SELECT score(20),t.* FROM textdemo t WHERE contains(book_abstract,'移動城堡 or 俄羅斯',20)>0;
對大字段進行檢索測試
CREATE TABLE mytable(id NUMBER PRIMARY KEY, docs CLOB);
INSERT INTO mytable VALUES(111555,'this text will be indexed');
INSERT INTO mytable VALUES(111556,'this is a direct_datastore example');
Commit;
?
CREATE INDEX myindex ON mytable(docs)
indextype IS ctxsys.context
parameters ('datastore ctxsys.default_datastore');
?
SELECT * FROM mytable WHERE contains(docs, 'text') > 0;
1.?創建數據存儲定義(Datastore),使用多列數據存儲在多列上創建全文索引
BEGIN
?????CTX_DDL.CREATE_PREFERENCE('INFOGRID_COM_DATASTORE','MULTI_COLUMN_DATASTORE');
CTX_DDL.SET_ATTRIBUTE('INFOGRID_COM_DATASTORE','columns','NAME,ADDRESS,BUSINESS_ZONE,FAREN,FUND,INTRODUCTION');
END;
2.創建詞法分析器(Lexer)
BEGIN
????CTX_DDL.CREATE_PREFERENCE('INFOGRID_LEXER',?'CHINESE_LEXER');
END;
?
3.創建全文索引(索引在DML提交后自動同步更新)
CREATE?INDEX?INFOGRID_COM_FULL_IDX?ON?g2b_com(NAME)
INDEXTYPE??IS??CTXSYS.CONTEXT
PARAMETERS?(
???'LEXER INFOGRID_LEXER
????DATASTORE INFOGRID_COM_DATASTORE
????SYNC (ON COMMIT)'?--使用實時同步(DML提交時同步全文索引)
)
--手工同步索引(如果使用自動同步,這步可省略)
begin
ctx_ddl.sync_index('INFOGRID_COM_FULL_IDX');
end;
?
4.使用全文索引查詢(按照出現頻率排序)
select?score(0),t.*?from?g2b_com t?where?contains(NAME,'條件一,條件二',0)>0?order?by?score(0)?desc
?
5.刪除全文索引(刪除詞法分析器,刪除數據存儲定義,刪除索引)
begin
ctx_ddl.drop_preference('INFOGRID_COM_DATASTORE');--刪除數據儲存定義DataStore
ctx_ddl.drop_preference('INFOGRID_LEXER');--刪除詞法分析器
end;
drop?index?INFOGRID_COM_FULL_IDX;--刪除索引
ORA-06553: PLS-907: cannot load library unit XDB.XDB_FUNCIMPL during export?
文檔 ID:?785728.1?類型:?PROBLEM?Modified Date:?25-FEB-2009?狀態:?REVIEWED
In this Document
??Symptoms
??Cause
??Solution
??References
This problem can occur on any platform.
SymptomsExport is failing with below errors:
????
??????? EXP-00056: ORACLE error 29900 encountered
??????? ORA-29900: operator binding does not exist
??????? ORA-06540: PL/SQL: compilation error
??????? ORA-06553: PLS-907: cannot load library unit XDB.XDB_FUNCIMPL (referenced by XDB.UNDER_PATH)
??????? EXP-00000: Export terminated unsuccessfullyCause
It seems you have some invalid objects in XDB schema or XDB installation is INVALID.
Solution-- check DBA Registry for status of installed componentsset pages 1000
column comp_id format A10
column version like comp_id
column comp_name format A35
col status for a15
select comp_id, status, version, comp_name from dba_registry order by 1;
-- check for invalid XDB objects
col owner for a15
col object_name for a30
col object_type for a30
select owner,object_name,object_type from dba_objects
where status != 'VALID'
and wner = 'XDB'
/
You can try to recompile all invalid objects using:
SQL> connect / as sysdba
SQL> @?/rdbms/admin/utlrp.sql
Then check the invalid XDB objects again. If there are no invalid XDB objects then try the export again.
If you still have invalid XDB objects which cannot be compiled then you may have to re-install XDB as described in the note specified below:
Note 243554.1?- How to Deinstall and Reinstall XML Database (XDB)
About Me
| ............................................................................................................................... ● 本文整理自網絡 ● 本文在itpub(http://blog.itpub.net/26736162)、博客園(http://www.cnblogs.com/lhrbest)和個人微信公眾號(xiaomaimiaolhr)上有同步更新 ● 本文itpub地址:http://blog.itpub.net/26736162/abstract/1/ ● 本文博客園地址:http://www.cnblogs.com/lhrbest ● 本文pdf版及小麥苗云盤地址:http://blog.itpub.net/26736162/viewspace-1624453/ ● 數據庫筆試面試題庫及解答:http://blog.itpub.net/26736162/viewspace-2134706/ ● QQ群:230161599???? 微信群:私聊 ● 聯系我請加QQ好友(646634621),注明添加緣由 ● 于 2017-05-09 09:00 ~ 2017-05-30 22:00 在魔都完成 ● 文章內容來源于小麥苗的學習筆記,部分整理自網絡,若有侵權或不當之處還請諒解 ● 版權所有,歡迎分享本文,轉載請保留出處 ............................................................................................................................... 拿起手機使用微信客戶端掃描下邊的左邊圖片來關注小麥苗的微信公眾號:xiaomaimiaolhr,掃描右邊的二維碼加入小麥苗的QQ群,學習最實用的數據庫技術。
|
| | | |
來自 “ ITPUB博客 ” ,鏈接:http://blog.itpub.net/26736162/viewspace-2139034/,如需轉載,請注明出處,否則將追究法律責任。
總結
以上是生活随笔為你收集整理的Oracle全文索引的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 盒子塌陷问题
- 下一篇: [转]常用CASE工具介绍