Introduction to Oracle systemstate dump
When a database has a serious performance problem or hangs, a systemstate dump is the main way to find out what each process is doing, what it is waiting for, who holds a resource, and who is blocking whom. Collecting a systemstate dump promptly while the problem is happening helps a great deal with root-cause analysis. Oracle Support engineers will generally also ask you for the trace file generated by the systemstate dump. There is no detailed official documentation on systemstate dump; what exists is scattered across various notes.
When the database has a serious performance problem or hangs, a sqlplus connection from the server is either very slow or impossible. Starting with Oracle 10g, sqlplus provides the -prelim option, which lets you log in to the database when a normal sqlplus connection cannot be made. The following is a summary of these points.

There are two ways to connect to sqlplus using a preliminary connection.
sqlplus -prelim / as sysdba

sqlplus /nolog
set _prelim on
connect / as sysdba

Log in to the database as sysdba:
$sqlplus / as sysdba
or
$sqlplus -prelim / as sysdba    <== when the database is already so slow or hung that a normal connection fails

The following MetaLink guidance describes how to collect Systemstate or Hanganalyze dumps in a single-instance or RAC environment (for details, see the reference material).

Collection commands for Hanganalyze and Systemstate: Non-RAC:
Sometimes the database may just be very slow rather than actually hung. It is therefore recommended, where possible, to take 2 hanganalyze and 2 systemstate dumps in order to determine whether processes are moving at all or whether they are "frozen".
Hanganalyze
sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug hanganalyze 3
-- Wait one minute before getting the second hanganalyze
oradebug hanganalyze 3
oradebug tracefile_name
exit
Systemstate
sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug dump systemstate 266
oradebug dump systemstate 266
oradebug tracefile_name
exit
Collection commands for Hanganalyze and Systemstate: RAC
There are 2 bugs affecting RAC that, unless the relevant patches have been applied on your system, make using level 266 or 267 very costly. Without these fixes in place it is therefore highly inadvisable to use these levels.
For information on these patches see:
Document 11800959.8 Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances
Document 11827088.8 Bug 11827088 - Latch 'gc element' contention, LMHB terminates the instance

Note: both bugs are fixed in 11.2.0.3.

Collection commands for Hanganalyze and Systemstate: RAC with fixes for bug 11800959 and bug 11827088
For 11g:
sqlplus '/ as sysdba'
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
oradebug -g all dump systemstate 266
exit
Collection commands for Hanganalyze and Systemstate: RAC without fixes for Bug 11800959 and Bug 11827088
sqlplus '/ as sysdba'
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
oradebug -g all dump systemstate 258
exit
For 10g, run oradebug setmypid instead of oradebug setorapname reco:
sqlplus '/ as sysdba'
oradebug setmypid
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 258
oradebug -g all dump systemstate 258
exit
In a RAC environment, a dump is created for every RAC instance, in the DIAG trace file of each instance.
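The RAC variants above differ only in the attach command and the systemstate level, so the choice can be sketched as a small helper. This is a minimal illustration rather than an Oracle tool: the function names `parse_version` and `rac_collection_commands` are hypothetical, the version check relies on the note that both bugs are fixed in 11.2.0.3, and treating releases newer than 11g like 11g is an assumption.

```python
def parse_version(version):
    """Turn a dotted version such as '11.2.0.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def rac_collection_commands(version, fixes_applied=None):
    """Return the oradebug command sequence for a RAC-wide collection.

    fixes_applied: True/False when you know whether the fixes for bugs
    11800959 and 11827088 are installed; None means infer it from the version.
    """
    parsed = parse_version(version)
    if fixes_applied is None:
        # Assumption: rely on the note that both bugs are fixed in 11.2.0.3.
        fixes_applied = parsed >= (11, 2, 0, 3)
    # The commands above show level 266 only for 11g with the fixes in place;
    # every other case, including 10g, uses level 258.
    level = 266 if fixes_applied and parsed[0] >= 11 else 258
    # Assumption: releases after 11g attach the same way as 11g.
    attach = "oradebug setorapname reco" if parsed[0] >= 11 else "oradebug setmypid"
    return [
        attach,
        "oradebug unlimit",
        "oradebug -g all hanganalyze 3",
        "oradebug -g all hanganalyze 3",
        f"oradebug -g all dump systemstate {level}",
        f"oradebug -g all dump systemstate {level}",
    ]


if __name__ == "__main__":
    for command in rac_collection_commands("11.2.0.2"):
        print(command)
```

If a one-off patch for both bugs is installed on a release older than 11.2.0.3, pass `fixes_applied=True` explicitly instead of relying on the version check.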

Now let's look at an example:
[oracle@DB-Server ~]$ sqlplus -prelim / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Wed Mar 2 16:31:03 2016

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

SQL> oradebug setmypid
Statement processed.
SQL> oradebug unlimit
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> oradebug dump systemstate 266
Statement processed.
SQL> oradebug tracefile_name
/u01/app/oracle/admin/SCM2/udump/scm2_ora_13598.trc
SQL> exit
Disconnected from ORACLE
The alert log will contain entries similar to these:
Wed Mar 02 16:32:08 CST 2016
System State dumped to trace file
Wed Mar 02 16:32:48 CST 2016
System State dumped to trace file /u01/app/oracle/admin/xxx/udump/scm2_ora_13598.trc
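Find the corresponding trc file under $ORACLE_BASE/admin/ORACLE_SID/udump/. In it you will see a large amount of state information for every process in the system. Each process has its own section in the trace file reflecting that process's state, including process information, session information, enqueue information (mainly lock information), and so on.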
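When the alert log is long, the trace file path can be pulled out of it with a few lines of Python. This is a minimal sketch that assumes the alert-log entries look like the excerpt above; `find_systemstate_traces` is a hypothetical helper name, and entries that carry no path are skipped rather than guessed.

```python
import re

# Matches the alert-log entry shown above; the path part is optional
# because some entries carry no file name.
TRACE_LINE = re.compile(r"System State dumped to trace file(?:\s+(\S+))?$")

SAMPLE = """\
Wed Mar 02 16:32:08 CST 2016
System State dumped to trace file
Wed Mar 02 16:32:48 CST 2016
System State dumped to trace file /u01/app/oracle/admin/xxx/udump/scm2_ora_13598.trc
"""


def find_systemstate_traces(alert_log_text):
    """Return the trace file paths mentioned in systemstate alert-log entries."""
    paths = []
    for line in alert_log_text.splitlines():
        match = TRACE_LINE.match(line.strip())
        if match and match.group(1):
            paths.append(match.group(1))
    return paths


if __name__ == "__main__":
    for path in find_systemstate_traces(SAMPLE):
        print(path)
```

Running `oradebug tracefile_name` in the same session, as in the example above, remains the most direct way to get the file name; the script is only useful after the fact.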

systemstate dump has several levels:
2: dump (excluding lock elements)
10: dump
11: dump + global cache of RAC
256: short stack (function call stack)
258: 256+2 --> short stack + dump (excluding lock elements)
266: 256+10 --> short stack + dump
267: 256+11 --> short stack + dump + global cache of RAC
Levels 11 and 267 dump the global cache and produce large trace files, so they are generally not recommended. In most cases, if the number of processes is not too large, level 266 is recommended, because it dumps each process's function call stack, which can be used to analyze what the process is executing. Generating short stacks is time-consuming, however: with a very large number of processes, for example 2000, it may take more than 30 minutes. In that case, use level 10 or level 258. Level 258 collects short stacks, which level 10 does not, but it collects less lock element data than level 10.
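The level arithmetic above (256 for the short stack plus a base dump level) can be made explicit with a small decoder. This is only an illustration of the table above, not part of any Oracle tooling; `describe_level` is a hypothetical name, and levels not listed in the table are rejected instead of being interpreted.

```python
SHORT_STACK = 256

# Base levels exactly as listed in the table above.
BASE_LEVELS = {
    2: "dump (excluding lock elements)",
    10: "dump",
    11: "dump + global cache of RAC",
}


def describe_level(level):
    """Describe a systemstate level as 'short stack' plus a base dump level."""
    has_stack = level >= SHORT_STACK
    base = level - SHORT_STACK if has_stack else level
    parts = []
    if has_stack:
        parts.append("short stack")
    if base:
        if base not in BASE_LEVELS:
            raise ValueError(f"level {level} is not in the table")
        parts.append(BASE_LEVELS[base])
    if not parts:
        raise ValueError(f"level {level} is not in the table")
    return " + ".join(parts)


if __name__ == "__main__":
    for level in (2, 10, 11, 256, 258, 266, 267):
        print(level, "->", describe_level(level))
```

For instance, `describe_level(258)` returns `short stack + dump (excluding lock elements)`, which is why 258 is the lighter choice when lock element data is expensive to collect.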
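
The trace file generated by a systemstate dump can be very large, typically several hundred megabytes or more. Although the dump captures the relevant process information, interpreting that information effectively and using it to diagnose the problem is a considerable challenge.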
總結(jié)
以上是生活随笔為你收集整理的Oracle systemstate dump介绍的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: overfit underfit
- 下一篇: Maven 搭建spring boot多