ORA errors encountered in projects and their solutions: ORA-07445
Purpose
This document introduces the ORA-07445 error and gives suggestions for diagnosing it further. It is written mainly with Unix systems in mind, but the principles apply generally.
Scope
It is intended mainly for DBAs handling ORA-07445 errors on their systems.
Definition of the ORA-07445 error
An ORA-07445 error is raised when an Oracle server process receives a fatal signal from the operating system. It can be triggered in either an Oracle background process or a user process. When the error is raised, Oracle first writes an entry to the alert.log file, then writes a trace file to user_dump_dest or background_dump_dest, and finally dumps the process memory (a core dump) to core_dump_dest.
The operating system defines many kinds of illegal operations. A common case is a process accessing an invalid address (for example, an address reserved by the system), which raises a fatal error.
ORA-07445 is a very generic error that can arise anywhere in the Oracle code. A more precise description of a particular occurrence requires examining its trace file.
How ORA-07445 appears
The error can look different on different platforms. The two most common forms are shown below:
Example 1
ORA-07445: exception encountered: core dump [run_some_SQL()+268] [SIGBUS] [Invalid address alignment] [] [] []
Example 2
ORA-07445: exception encountered: core dump [10] [2122262800] [261978112] [] [] []
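Notes on Example 1:
- The error occurred in the function run_some_SQL()
- The signal received by the process was SIGBUS
- Some additional related information follows.
Example 2 gives comparatively less information:
- The name of the function that caused the error is not given
- The signal received by the process was signal 10
- The remaining values are of no use for this particular error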
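The two forms above can be told apart mechanically. The following is a minimal Python sketch, not an Oracle tool: it splits the message into its bracketed arguments and checks whether the first one looks like `function()+offset` or a bare signal number. The function name `parse_ora_07445` and the returned dictionary layout are assumptions made for illustration.

```python
import re

# Matches each bracketed argument, including empty ones such as [].
ARG_PATTERN = re.compile(r"\[([^\]]*)\]")


def parse_ora_07445(line):
    """Split an ORA-07445 message into its bracketed arguments.

    Returns None if the line is not an ORA-07445 message.
    """
    if "ORA-07445" not in line:
        return None
    args = ARG_PATTERN.findall(line)
    first = args[0] if args else ""
    # The first argument is either "function()+offset" or a bare signal number.
    match = re.fullmatch(r"(\w+)\(\)\+(\d+)", first)
    if match:
        return {
            "function": match.group(1),
            "offset": int(match.group(2)),
            "signal": args[1] if len(args) > 1 else "",
            "args": args,
        }
    return {"function": None, "offset": None, "signal": first, "args": args}


example_1 = ("ORA-07445: exception encountered: core dump "
             "[run_some_SQL()+268] [SIGBUS] [Invalid address alignment] [] [] []")
example_2 = ("ORA-07445: exception encountered: core dump "
             "[10] [2122262800] [261978112] [] [] []")

print(parse_ora_07445(example_1))
print(parse_ora_07445(example_2))
```

For the first message the sketch reports the function and offset; for the second it reports only the signal number, mirroring the notes above.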
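Information to collect when the error occurs
1. The alert.log file. At a minimum it shows other related errors raised around the time of the ORA-07445. Confirm that the init.ora parameter information is included in it as well.
2. All ORA-07445 and ORA-00600 errors since the instance was last started, together with their trace files.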
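Collecting these entries by hand is tedious on a busy system. Below is a small Python sketch that scans alert.log lines for ORA-07445 and ORA-00600 and pairs each hit with the trace file named just before it. It assumes the trace file is announced by a line beginning with "Errors in file", which should be verified against your own alert.log. The sample lines, the path and the ORA-00600 argument are invented for illustration.

```python
import re

ERROR_PATTERN = re.compile(r"ORA-(?:07445|00600)\b")
# Assumption: the trace file is announced as "Errors in file <path>:".
TRACE_PATTERN = re.compile(r"Errors in file (\S+?)(?::|\s|$)")


def scan_alert_log(lines):
    """Return (line_number, trace_file, message) for each ORA-07445/ORA-00600."""
    hits = []
    last_trace = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        trace = TRACE_PATTERN.search(line)
        if trace:
            last_trace = trace.group(1)
        elif ERROR_PATTERN.search(line):
            hits.append((number, last_trace, line.strip()))
        elif not line.startswith("ORA-"):
            # Any unrelated line ends the current error block.
            last_trace = None
    return hits


# Hypothetical alert.log excerpt, used only to exercise the function.
sample_alert = [
    "Wed May 08 23:35:19 2002",
    "Errors in file /u01/app/oracle/admin/orcl/udump/orcl_ora_1234.trc:",
    "ORA-07445: exception encountered: core dump [kjrfnd()+44] [SIGBUS] "
    "[Invalid address alignment] [16871] [] []",
    "Wed May 08 23:40:02 2002",
    "ORA-00600: internal error code, arguments: [12345], [], [], [], [], [], [], []",
    "Thread 1 advanced to log sequence 42",
]

for hit in scan_alert_log(sample_alert):
    print(hit)
```

To run it against a real file, pass an open file object, for example `scan_alert_log(open(path))`, where `path` points at your alert.log.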
Notes on the information in the trace file
*** 2002-05-08 23:35:18.224     <---timestamp
*** SESSION ID:(194.14075) 2002-05-08 23:35:18.202
Exception signal: 10 (SIGBUS), code: 1 (Invalid address alignment), addr: 0x41e7, PC: kjrfnd()+44
*** 2002-05-08 23:35:19.404
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kjrfnd()+44] [SIGBUS] [Invalid
address alignment] [16871] [] []     <----the error
Current SQL statement for this session:   <---the current SQL statement
DELETE FROM MY_TABLE WHERE COL1 < :b1
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
e560c680        35  anonymous block
e560c680       290  anonymous block
----- Call Stack Trace -----           <----Stack trace starts here
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+168         CALL     ksedst()+0           540 ? 0 ? FFBE4F98 ?
                                                   FFBE4A3C ? FFBE4A20 ? 0 ?
ssexhd()+380         CALL     ksedmp()+0           3 ? 0 ? 1 ? FFBE56B8 ? 1 ?
                                                   6 ?
sigacthandler()+40   PTR_CALL 00000000             A ? FFBE5F10 ? 19FE000 ?
                                                   19FE000 ? 0 ? 0 ?
kjrfnd()+44          PTR_CALL 00000000             A ? FFBE5F10 ? FFBE5C58 ?
kjrref()+176         CALL     kjrfnd()+0           4177 ? F6A7F020 ? 0 ? 41DF ?
kjuocl()+732         CALL     kjrref()+0           FFBE63AC ? 19FA400 ?
kjusuc()+1260        CALL     kjuocl()+0           FFBE6218 ? EB5FB9A8 ?
                                                   EB5FB9A8 ? 5 ? 5 ? 0 ?
ksipget()+832        CALL     kjusuc()+0           19FA400 ? FFBE63AC ? 0 ?
                                                   E2A2ED40 ? 19FA400 ? 8 ?
ksqcmi()+3356        CALL     ksipget()+0          10020 ? FFBE6648 ? EE15430C ?
                                                   0 ? 0 ? 0 ?
ksqgtl()+944         CALL     ksqcmi()+0           FFBE65A8 ? 1 ? EDEB4C90 ?
                                                   EE1542D4 ? 1 ? 0 ?
<... lots of stuff deleted here ...>
sou2o()+20           CALL     opidrv()+0           3C ? FFBEF784 ? 19F8000 ?
                                                   2F6C6F67 ? 0 ? 0 ?
main()+160           CALL     sou2o()+0            FFBEFA80 ? 3C ? 4 ?
                                                   FFBEFA70 ? 1746CF4 ?
                                                   1A06318 ?
_start()+220         CALL     main()+0             0 ? FFBEFC2C ? 1A1D478 ?
                                                   19F8000 ? 0 ? 0 ?
                                       <----Stack trace ends here
----- Argument/Register Address Dump -----
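When comparing several trace files it can help to reduce each call stack to the ordered list of calling functions. The Python sketch below does that for the section between "----- Call Stack Trace -----" and the next "----- ... -----" header. It assumes frame lines start with `name()+offset` followed by a call type, as in the trace above; the helper name `call_stack_functions` is made up for this example.

```python
import re

# A frame line starts with "name()+offset" followed by a call type column.
FRAME_PATTERN = re.compile(r"^\s*(\w+)\(\)\+(\d+)\s+\S+")


def call_stack_functions(lines):
    """Return (function, offset) pairs from the Call Stack Trace section."""
    frames = []
    in_stack = False
    for line in lines:
        if "----- Call Stack Trace -----" in line:
            in_stack = True
            continue
        if not in_stack:
            continue
        if line.startswith("----- "):
            # The next section header ends the call stack.
            break
        match = FRAME_PATTERN.match(line)
        if match:
            frames.append((match.group(1), int(match.group(2))))
    return frames


sample_trace = [
    "----- Call Stack Trace -----",
    "calling              call     entry                argument values in hex",
    "location             type     point                (? means dubious value)",
    "-------------------- -------- -------------------- ----------------------------",
    "ksedmp()+168         CALL     ksedst()+0           540 ? 0 ? FFBE4F98 ?",
    "                                                   FFBE4A3C ? FFBE4A20 ? 0 ?",
    "kjrfnd()+44          PTR_CALL 00000000             A ? FFBE5F10 ? FFBE5C58 ?",
    "kjrref()+176         CALL     kjrfnd()+0           4177 ? F6A7F020 ? 0 ? 41DF ?",
    "----- Argument/Register Address Dump -----",
    "ignored()+1          CALL     ignored()+0          0 ?",
]

print(call_stack_functions(sample_trace))
```

Continuation lines that carry only argument values are skipped because they do not start with a `name()+offset` token.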
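Reproducing the error
If the customer can reproduce the ORA-07445 error at will, the time needed to diagnose and resolve the problem is shortened. The first step in reproducing it is, of course, to find the current SQL that caused the error.
Note 154170.1 describes how to find the statement that was executing when the error occurred. Keep in mind, however, that whether the problem appears may depend on context built up by several earlier related statements, so you may find the current statement and still be unable to reproduce the problem.
Once you have found the statement that was executing when the error occurred, check the following:
- Does the error occur only with the current parameters?
- Does it occur at a fixed time each day?
- Which system operations is it mainly associated with, such as database backups or other resource-intensive operations?
- Does it occur only for a specific application and user, or for all applications and users?
- When was the error first reported, and what was changed on the system at that time?
- Are other errors raised along with it?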
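How to find the statement that was executing when ORA-07445 occurred
Finding the statement in the trace file takes two steps: first find the statement that was executing when the error occurred, then find the values of the bind variables in that statement.
Step 1: Find the SQL
Search the trace file for the string "Current cursor" (usually at the start of the cursor dump section). The number that follows it identifies the statement that was executing when the error occurred.
If the number is 0, no valid statement was dumped.
If the number n is not 0, search further down for the string "Cursor n", where n is the number just found. From version 10.2 onward you may need to search for "Cursor #n" instead. The statement that follows "cursor name" is the SQL we need.
Alternatively, search for the string "Current SQL statement for this session" to locate the SQL statement. It usually appears near the beginning of the trace file.
If the SQL statement references bind variables (:a1 ...), use step 2 below to find the bound values.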
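Step 1 can be sketched in Python as follows. This is only an illustration of the search described above, not an Oracle utility. It assumes the statement text after "cursor name:" may wrap onto following lines and ends at the "child pin:" line or at a dashed separator, which matches the dumps shown in this document but may differ between versions. The function name `find_current_sql` is hypothetical.

```python
import re


def find_current_sql(lines):
    """Return the SQL of the current cursor, or None if none was dumped."""
    current = None
    for line in lines:
        match = re.search(r"Current cursor:\s*(\d+)", line)
        if match:
            current = int(match.group(1))
            break
    if not current:
        # Either no cursor dump at all, or "Current cursor: 0".
        return None

    # "Cursor n" before 10.2, possibly "Cursor #n" from 10.2 onward.
    header = re.compile(r"Cursor #?%d\b" % current)
    in_cursor = False
    sql_parts = []
    for line in lines:
        text = line.strip()
        if not in_cursor:
            in_cursor = bool(header.search(text))
            continue
        if sql_parts:
            if text.startswith("child pin:") or text.startswith("-----"):
                break
            sql_parts.append(text)  # continuation of a wrapped statement
        elif text.startswith("cursor name:"):
            sql_parts.append(text[len("cursor name:"):].strip())
        elif text.startswith("-----"):
            break  # cursor section ended without a statement
    return " ".join(sql_parts) if sql_parts else None


sample_dump = [
    "Current cursor: 11, pgadep: 1",
    "Cursor Dump:",
    "----------------------------------------",
    "Cursor 11 (202cb9f0): CURBOUND curiob: 202f8b04",
    "curflg: dd curpar: 0 curusr: 0 curses 30047c7c",
    "cursor name: SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME =",
    ":b1 FOR UPDATE",
    "child pin: 0, child lock: 300dc9b4, parent lock: 301730b8",
]

print(find_current_sql(sample_dump))
```

The word boundary after the cursor number keeps a search for cursor 1 from matching cursor 11.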
Step 2: Find the values of the bind variables
If the SQL statement contains bind variables, strings of the form "bind *" appear after the cursor name, where * runs from 0 to n-1 and n is the number of bind variables in the statement.
Each bind variable is followed by a list of attributes, which are briefly described below.
Dty: datatype   1 varchar2 or nvarchar2
                2 number
                8 long
                11 rowid
                12 date
                23 raw
                24 long raw
                96 char
                112 clob or nclob
                113 blob
                114 bfile
Mxl: the maximum length
Scl: the scale (for number columns)
Pre: the precision (for number columns)
Value: the value of the bind variable
From the above you can work out the datatype of each bind variable and its value. In some (very rare) cases there is no value field after bind *, and the value of the bind variable cannot be obtained this way.
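The attribute list can be decoded with a short Python sketch. It assumes bind lines have the shape `bind N: dty=.. mxl=..(..) ... scl=.. pre=..` and that the value, when present, is on a later line starting with `value=`, as in the dumps in this document. The datatype table is the one listed above; the name `parse_binds` is invented for this example.

```python
import re

# Datatype codes as listed in this document.
DATATYPES = {
    1: "VARCHAR2 or NVARCHAR2", 2: "NUMBER", 8: "LONG", 11: "ROWID",
    12: "DATE", 23: "RAW", 24: "LONG RAW", 96: "CHAR",
    112: "CLOB or NCLOB", 113: "BLOB", 114: "BFILE",
}

BIND_PATTERN = re.compile(
    r"^\s*bind (\d+): dty=(\d+) mxl=(\d+)\(\d+\).*?scl=(\d+) pre=(\d+)")
VALUE_PATTERN = re.compile(r"^\s*value=(.*)$")


def parse_binds(lines):
    """Return one dict per bind variable; value is None when not dumped."""
    binds = []
    for line in lines:
        bind = BIND_PATTERN.match(line)
        if bind:
            position, dty, mxl, scl, pre = (int(g) for g in bind.groups())
            binds.append({
                "position": position,
                "dty": dty,
                "datatype": DATATYPES.get(dty, "unknown"),
                "mxl": mxl,
                "scl": scl,
                "pre": pre,
                "value": None,
            })
            continue
        value = VALUE_PATTERN.match(line)
        if value and binds:
            # A value line belongs to the most recent bind line.
            binds[-1]["value"] = value.group(1).strip()
    return binds


rowid_bind = [
    "bind 0: dty=11 mxl=16(16) mal=00 scl=00 pre=00 oacflg=18 oacfl2=1 size=16",
    "offset=0",
    "bfp=2013e9f4 bln=16 avl=16 flg=05",
    "value=0000138C.0046.0004",
]
unbound = [
    "bind 0: dty=1 mxl=32(00) mal=00 scl=00 pre=00 oacflg=01",
    "No bind buffers allocated",
]

print(parse_binds(rowid_bind))
print(parse_binds(unbound))
```

A bind with no `value=` line, such as one followed by "No bind buffers allocated", is reported with a value of `None`.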
Examples
In the following we will work through some examples of how to extract the SQL statement from trace files.
IMPORTANT: Replacing bind variables with literals can result in the optimizer choosing a different query path and thus the problem may not reproduce!
Example 1:
You should now be able to find the datatype of the bind variable (including length, scale, and precision if applicable) and the value.
The cursor dump starts with:
******************** Cursor Dump ************************
Current cursor: 2, pgadep: 1
Cursor Dump:
----------------------------------------
so we are looking for cursor 2:
----------------------------------------
Cursor 2 (20139ad0): CURFETCH curiob: 2013bca4
curflg: 7 curpar: 20139ab0 curusr: 0 curses 587a250c
cursor name: select text from view$ where rowid=:1
child pin: 50a5b650, child lock: 50a5a628, parent lock: 50a5a844
xscflg: 20141466, parent handle: 4f348490, xscfl2: 400
nxt: 2.0x0000006c nxt: 1.0x000001d8
Cursor frame allocation dump:
frm: -------- Comment -------- Size Seg Off
bhp size: 52/560
bind 0: dty=11 mxl=16(16) mal=00 scl=00 pre=00 oacflg=18 oacfl2=1 size=16
offset=0
bfp=2013e9f4 bln=16 avl=16 flg=05
value=0000138C.0046.0004
The current SQL is:
select text from view$ where rowid=:1
and the bind variable translates into:
:1 ~ bind 0 - ROWID (dty=11), value = 0000138C.0046.0004
so we can, e.g., reconstruct the original SQL statement as:
SQL> variable a1 varchar2(20)
SQL> exec :a1 := '0000138C.0046.0004';
SQL> select text from view$ where rowid=:a1;
Note that we construct the statement using a SQL*Plus bind variable in order to prevent the optimizer from choosing a different plan (not that it would make any difference for this particular example).
Example 2:
The cursor dump starts with:
Current cursor: 11, pgadep: 1
Cursor Dump:
----------------------------------------
i.e. we should look for cursor 11:
----------------------------------------
Cursor 11 (202cb9f0): CURBOUND curiob: 202f8b04
curflg: dd curpar: 0 curusr: 0 curses 30047c7c
cursor name: SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME =
:b1 FOR UPDATE
child pin: 0, child lock: 300dc9b4, parent lock: 301730b8
xscflg: 1151421, parent handle: 3025b4dc
bind 0: dty=1 mxl=32(00) mal=00 scl=00 pre=00 oacflg=01
No bind buffers allocated
----------------------------------------
The current SQL statement is then:
SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME = :b1 FOR UPDATE
The bind variable :b1 is of type VARCHAR2(32) (dty=1, mxl=32), but no value has been assigned to it at the time of the dump ("No bind buffers allocated").
Example 3:
Current cursor: 2, pgadep: 0
Cursor Dump:
----------------------------------------
...
----------------------------------------
Cursor 2 (20140444): CURNULL curiob: 0
curflg: 44 curpar: 0 curusr: 0 curses 701dc94c
----------------------------------------
In this case there is no SQL being executed at the time of the dump.
Example 4:
******************** Cursor Dump ************************
Current cursor: 1, pgadep: 0
pgactx: ccf361c0 ctxcbk: 0 ctxqbc: 0 ctxrws: 0
Cursor Dump:
----------------------------------------
Cursor 1 (400d9478): CURBOUND curiob: 400e43d8
curflg: 4c curpar: 0 curusr: 0 curses d5348f80
cursor name: BEGIN myparser.convert('/tmp','workflow000_2.log',2); END;
child pin: d14a4d70, child lock: d1589968, parent lock: d14c64a0
xscflg: 100064, parent handle: d083f1c0, xscfl2: 4040408
nxt: 1.0x000000a8
Cursor frame allocation dump:
frm: -------- Comment -------- Size Seg Off