Oracle 10g ORA-04030: Diagnosing an ORA-04030 Error (Part 1)
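Today the customer was preparing an application release and first tested it in the test environment. While testing an index rebuild there, the following error was reported:
ORA-04030: out of process memory when trying to allocate 64544 bytes (sort subheap,sort key)
When the customer sent me the problem, I first asked them to check the operating system resource limits for the oracle user,
because in my experience this error is usually caused by the data seg size limit on the oracle user's processes.
However, the information the customer sent back showed that the data segment of oracle user processes was not limited at all.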
$ id oracle
uid=400(oracle) gid=400(oinstall) groups=401(dba)
$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) 2000
$ ulimit -Ha
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited
Next I checked the customer's SGA and PGA settings:
SQL> show parameter sga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
lock_sga                             boolean     FALSE
pre_page_sga                         boolean     FALSE
sga_max_size                         big integer 15G
sga_target                           big integer 14G
SQL> show parameter pga
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target                 big integer 8G
$ lsattr -El mem0
goodsize 40960 Amount of usable physical memory in Mbytes False
size     40960 Total amount of physical memory in Mbytes  False
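The SGA target is 14G (sga_max_size is 15G) and pga_aggregate_target is 8G, while the host has 40G of physical memory, so memory is far from exhausted.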
C:\Documents and Settings\shoupeng.yan>sqlplus xxxx/xxxx
SQL*Plus: Release 10.2.0.1.0 - Production on Wed Mar 21 20:00:37 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining Scoring Engine
and Real Application Testing options
SQL> ALTER INDEX PK_ZJ_BDZTB_01 REBUILD;
ALTER INDEX PK_ZJ_BDZTB_01 REBUILD
*
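ERROR at line 1:
ORA-04030: out of process memory when trying to allocate 64544 bytes (sort subheap,sort key)
I connected to the database remotely with sqlplus on Windows and ran the index rebuild myself, monitoring the PGA usage of the server process while it ran. The ORA-04030 error was raised when the process's PGA usage reached roughly 110M. With pga_aggregate_target set to 8G for the whole instance, the PGA should not have run out.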
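The monitoring query itself is not shown above. A minimal sketch of one way to do it, assuming the SID of the rebuilding session is already known: poll V$PROCESS joined to V$SESSION. The helper name pga_query and its SID argument are hypothetical; the views and columns are the standard 10g ones.

```shell
#!/bin/sh
# pga_query (hypothetical helper): print a query that reports the PGA
# usage of the server process behind one session.
# $1 = SID of the session running the rebuild (assumed to be known).
pga_query() {
    sid=$1
    # \$ keeps the dollar sign of v$session / v$process literal
    # inside the unquoted here-document.
    cat <<EOF
SELECT p.spid, p.pga_used_mem, p.pga_alloc_mem, p.pga_max_mem
  FROM v\$session s, v\$process p
 WHERE s.paddr = p.addr
   AND s.sid = ${sid};
EOF
}

pga_query 42
```

The printed statement can be piped into `sqlplus -s "/ as sysdba"` from a second session and repeated while the rebuild runs; the values are reported in bytes.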
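So I still believed that an OS limit was the cause. If the data segment of a single process is capped, enabling parallelism for the rebuild should cut the memory each process needs by more than half.
After I set the parallel degree to 2, the index rebuild did succeed.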
SQL> ALTER INDEX PK_ZJ_BDZTB_01 REBUILD PARALLEL 2;
Index altered.
Stranger still, when I logged in to the database server directly and rebuilt the index there, it succeeded without any parallelism.
$ id oracle
uid=400(oracle) gid=400(oinstall) groups=401(dba)
$ sqlplus xxxx/xxxx
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Mar 21 20:07:20 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining Scoring Engine
and Real Application Testing options
SQL> ALTER INDEX PK_ZJ_BDZTB_01 REBUILD;
Index altered.
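One session connected from a Windows client and the other through local sqlplus on the server; one rebuild failed and the other succeeded. The only difference is that the client connection goes through the listener, so the problem should lie with the listener.
This led me to a hypothesis: although ulimit -a now shows no data segment limit for oracle user processes, the limit was probably changed after the listener had started. The listener inherited the ulimit settings that were in effect when it started, and later changes never reached it. When a client connects with sqlplus, the server process spawned by the listener inherits the listener's ulimit settings, so its data segment is still limited. A local connection does not go through the listener and inherits the current settings directly, which is why the local index rebuild does not raise ORA-04030.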
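The inheritance behaviour behind this hypothesis can be reproduced without Oracle. The sketch below is an illustration only: a subshell stands in for the listener, lowers its own soft data limit, and then starts a child that plays the role of the server process. The function name child_data_limit and the 131072 KB value are made up for the example.

```shell
#!/bin/sh
# child_data_limit (hypothetical helper): show the soft data-segment
# limit seen by a child whose parent runs with a lowered limit.
# $1 = soft data limit in KB to impose on the stand-in parent.
child_data_limit() {
    (
        # The subshell is the stand-in for the long-running listener.
        ulimit -S -d "$1" || exit 1
        # The child inherits the parent's limit at the moment it is started.
        sh -c 'ulimit -S -d'
    )
}

echo "current shell:              $(ulimit -S -d)"
echo "child of restricted parent: $(child_data_limit 131072)"
```

The calling shell's own limit is untouched by the call, which mirrors the situation described above: a fresh login shows the new limits while a process tree started earlier keeps the old ones.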
Then, in smit.script on the OS, I found the script that had changed the limits of the oracle user:
#    [Dec 21 2010, 16:36:27]
#
x() {
    if [ $# -ge 2 ]
    then
        for i in "$@"
        do
            spam="$spam \"$i\""
        done
        eval chuser $spam
    fi
}
x data='-1' stack='-1' oracle
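This command was executed on December 21, 2010.
The listener, however, was started at 10-DEC-2010 09:44:38 and has been up for 467 days, so it was started before the chuser data='-1' stack='-1' oracle command was run.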
$ lsnrctl status
LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Production on 21-MAR-2012 20:11:18
Copyright (c) 1991, 2007, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SXRYXDB)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Production
Start Date                10-DEC-2010 09:44:38
Uptime                    467 days 10 hr. 26 min. 40 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Log File         /u01/oracle/product/db10gr2/network/log/listener.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=SXRYXDB)(PORT=1521)))
Services Summary...
Service "zyxdb" has 1 instance(s).
Instance "zyxdb", status READY, has 1 handler(s) for this service...
Service "zyxdb_XPT" has 1 instance(s).
Instance "zyxdb", status READY, has 1 handler(s) for this service...
The command completed successfully
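The ordering of the two timestamps can also be checked mechanically. A small sketch, assuming the lsnrctl Start Date format DD-MON-YYYY HH:MM:SS with a two-digit day; lsnr_key is a hypothetical helper, and the smit.script timestamp is keyed by hand because it uses a different format.

```shell
#!/bin/sh
# lsnr_key (hypothetical helper): turn an lsnrctl-style date such as
# "10-DEC-2010 09:44:38" into a sortable YYYYMMDDHHMMSS key.
lsnr_key() {
    date_part=${1%% *}      # 10-DEC-2010
    time_part=${1#* }       # 09:44:38
    day=${date_part%%-*}
    rest=${date_part#*-}
    mon=${rest%%-*}
    year=${rest#*-}
    case $mon in
        JAN) m=01 ;; FEB) m=02 ;; MAR) m=03 ;; APR) m=04 ;;
        MAY) m=05 ;; JUN) m=06 ;; JUL) m=07 ;; AUG) m=08 ;;
        SEP) m=09 ;; OCT) m=10 ;; NOV) m=11 ;; DEC) m=12 ;;
        *) return 1 ;;
    esac
    printf '%s%s%s%s\n' "$year" "$m" "$day" "$(printf '%s' "$time_part" | tr -d ':')"
}

listener_start=$(lsnr_key "10-DEC-2010 09:44:38")
chuser_run=20101221163627   # "Dec 21 2010, 16:36:27" from smit.script, keyed by hand

if [ "$listener_start" -lt "$chuser_run" ]; then
    echo "listener started before the limits were changed"
else
    echo "listener started after the limits were changed"
fi
```

With the values from this case the first branch is taken, which is consistent with the hypothesis that the listener still carries the old limits.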
Since the listener is the cause, restarting it so that it picks up the changed ulimits should solve the problem.
$ lsnrctl stop
LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Production on 21-MAR-2012 20:11:32
Copyright (c) 1991, 2007, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SXRYXDB)(PORT=1521)))
The command completed successfully
$ lsnrctl start
LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Production on 21-MAR-2012 20:11:43
Copyright (c) 1991, 2007, Oracle.  All rights reserved.
Starting /u01/oracle/product/db10gr2/bin/tnslsnr: please wait...
TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Production
System parameter file is /u01/oracle/product/db10gr2/network/admin/listener.ora
Log messages written to /u01/oracle/product/db10gr2/network/log/listener.log
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=SXRYXDB)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC0)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=SXRYXDB)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.4.0 - Production
Start Date                21-MAR-2012 20:11:43
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/oracle/product/db10gr2/network/admin/listener.ora
Listener Log File         /u01/oracle/product/db10gr2/network/log/listener.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=SXRYXDB)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC0)))
Services Summary...
Service "PLSExtProc" has 1 instance(s).
Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Wed Mar 21 20:11:47 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining Scoring Engine
and Real Application Testing options
SQL> alter system register;
System altered.
SQL>
Running the index rebuild from the client again shows that the problem is solved:
C:\Documents and Settings\shoupeng.yan>sqlplus xxxx/xxxx
SQL*Plus: Release 10.2.0.1.0 - Production on Wed Mar 21 20:12:18 2012
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining Scoring Engine
and Real Application Testing options
SQL> ALTER INDEX PK_ZJ_BDZTB_01 REBUILD;
Index altered.
SQL>