RAC 11.2.0.3 LISTENER Terminates Abnormally

Published: 2023/12/16
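Today a colleague ran into a RAC listener that terminated abnormally. The version is 11.2.0.3 and the operating system is AIX 6.1. The listener log shows: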
10-MAY-2014 11:44:16 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))(SERVER=DEDICATED)(SERVICE_NAME=zhdw)) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.195.160.162)(PORT=59755)) * establish * zhdw * 0
Sat May 10 11:44:27 2014
10-MAY-2014 11:44:27 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=zhdwdb1)(USER=grid))(COMMAND=status)(ARGUMENTS=64)(SERVICE=LISTENER_SCAN1)(VERSION=186647296)) * status * 0
10-MAY-2014 11:44:35 * service_died * LsnrAgt * 12537
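Note the service_died entry. I have seen service_died before when the database instance itself had terminated abnormally, and I paid close attention to it at the time. This case is different, because the instance was perfectly healthy. So my colleague and I analyzed it together.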
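To locate such entries in a large listener log, a small scan can help. The following is a minimal sketch, assuming the event lines have the same shape as the service_died line above; the function name find_service_died and the sample list are hypothetical, not part of any Oracle tooling.

```python
import re

# Matches listener-log event lines shaped like:
#   10-MAY-2014 11:44:35 * service_died * LsnrAgt * 12537
SERVICE_DIED = re.compile(
    r"^(?P<ts>\d{2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \* service_died \* "
    r"(?P<service>\S+) \* (?P<code>\d+)$"
)


def find_service_died(lines):
    """Return (timestamp, service, code) for every service_died entry."""
    events = []
    for line in lines:
        m = SERVICE_DIED.match(line.strip())
        if m:
            events.append((m.group("ts"), m.group("service"), int(m.group("code"))))
    return events


# Sample lines copied from the listener log excerpt above.
sample = [
    "Sat May 10 11:44:27 2014",
    "10-MAY-2014 11:44:35 * service_died * LsnrAgt * 12537",
]
print(find_service_died(sample))
```

The trailing number is returned as an integer so that it can be filtered on, for example to pick out only entries with code 12537.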


Since this is RAC, we had to consider whether CRS itself had stopped the listener, so we first looked at the CRS alert log:
2014-05-13 11:59:40.629
[/u01/app/11.2.3/grid/bin/oraagent.bin(8847514)]CRS-5016:Process "/u01/app/11.2.3/grid/bin/lsnrctl" spawned by agent "/u01/app/11.2.3/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.3/grid/log/zhdwdb2/agent/crsd/oraagent_grid/oraagent_grid.log"
2014-05-13 11:59:40.907
[/u01/app/11.2.3/grid/bin/oraagent.bin(8847514)]CRS-5016:Process "/u01/app/11.2.3/grid/opmn/bin/onsctli" spawned by agent "/u01/app/11.2.3/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11.2.3/grid/log/zhdwdb2/agent/crsd/oraagent_grid/oraagent_grid.log"
2014-05-13 12:01:23.234
The check actions run by oraagent failed, so next we looked at oraagent_grid.log:
2014-05-13 11:59:40.257: [    AGFW][2314] {0:3:13726} Agent received the message: RESOURCE_STOP[ora.ons zhdwdb2 1] ID 4099:28335
2014-05-13 11:59:40.258: [    AGFW][2314] {0:3:13726} Preparing STOP command for: ora.ons zhdwdb2 1
2014-05-13 11:59:40.258: [    AGFW][2314] {0:3:13726} ora.ons zhdwdb2 1 state changed from: ONLINE to: STOPPING
2014-05-13 11:59:40.262: [    AGFW][2314] {0:3:13726} Agent received the message: RESOURCE_STOP[ora.LISTENER.lsnr zhdwdb2 1] ID 4099:28336
2014-05-13 11:59:40.262: [    AGFW][2314] {0:3:13726} Preparing STOP command for: ora.LISTENER.lsnr zhdwdb2 1
2014-05-13 11:59:40.262: [ora.ons][2572] {0:3:13726} [stop] (:CLSN00108:) clsn_agent::stop {
2014-05-13 11:59:40.262: [    AGFW][2314] {0:3:13726} ora.LISTENER.lsnr zhdwdb2 1 state changed from: ONLINE to: STOPPING
2014-05-13 11:59:40.263: [ora.ons][2572] {0:3:13726} [stop] OnsAgent::stop {
2014-05-13 11:59:40.268: [ora.LISTENER.lsnr][1800] {0:3:13726} [stop] (:CLSN00108:) clsn_agent::stop {
2014-05-13 11:59:40.268: [ora.LISTENER.lsnr][1800] {0:3:13726} [stop] LsnrAgent::stop {
2014-05-13 11:59:40.268: [ USRTHRD][1800] {0:3:13726} Thread:RegEndpointThread:LISTENER stop {
2014-05-13 11:59:40.270: [ USRTHRD][5157] {0:3:13726} Thread:RegEndpointThread:LISTENER isRunning is reset to false here
2014-05-13 11:59:40.270: [ USRTHRD][1800] {0:3:13726} Thread:RegEndpointThread:LISTENER stop }
2014-05-13 11:59:40.270: [ora.LISTENER.lsnr][1800] {0:3:13726} [stop] lsnrctl stop LISTENER
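
The stop requests in the excerpt above can be extracted mechanically, which is useful when the agent log is long. This is a minimal sketch, assuming the message format seen in this excerpt; the function name find_stop_requests is hypothetical, and the pattern deliberately ignores the bracketed module tag after the timestamp.

```python
import re

# Matches agent-log lines that carry a RESOURCE_STOP message, e.g.
#   2014-05-13 11:59:40.257: [...][2314] {0:3:13726} Agent received the message:
#   RESOURCE_STOP[ora.ons zhdwdb2 1] ID 4099:28335
STOP_REQUEST = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*"
    r"Agent received the message: "
    r"RESOURCE_STOP\[(?P<res>\S+) (?P<node>\S+) (?P<inst>\d+)\]"
)


def find_stop_requests(lines):
    """Return (timestamp, resource, node) for each RESOURCE_STOP message."""
    found = []
    for line in lines:
        m = STOP_REQUEST.match(line)
        if m:
            found.append((m.group("ts"), m.group("res"), m.group("node")))
    return found


# Sample lines copied from the oraagent_grid.log excerpt above.
sample_stop = [
    "2014-05-13 11:59:40.257: [    AGFW][2314] {0:3:13726} Agent received the message: RESOURCE_STOP[ora.ons zhdwdb2 1] ID 4099:28335",
    "2014-05-13 11:59:40.258: [    AGFW][2314] {0:3:13726} Preparing STOP command for: ora.ons zhdwdb2 1",
    "2014-05-13 11:59:40.262: [    AGFW][2314] {0:3:13726} Agent received the message: RESOURCE_STOP[ora.LISTENER.lsnr zhdwdb2 1] ID 4099:28336",
]
for ts, res, node in find_stop_requests(sample_stop):
    print(ts, res, node)
```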


So CRS triggered the listener shutdown. We then checked orarootagent_root.log to see whether the network resource had misbehaved, and found:


2014-05-13 11:59:40.184: [ora.net1.network][2057] {2:64347:2} [check] got lock
2014-05-13 11:59:40.184: [ora.net1.network][2057] {2:64347:2} [check] tryActionLock }
2014-05-13 11:59:40.184: [ora.net1.network][2057] {2:64347:2} [check] abort }
2014-05-13 11:59:40.185: [ora.net1.network][2057] {2:64347:2} [check] (:CLSN00110:) clsn_agent::abort }
2014-05-13 11:59:40.185: [    AGFW][2057] {2:64347:2} Command: check for resource: ora.net1.network zhdwdb2 1 completed with status: TIMEDOUT
2014-05-13 11:59:40.185: [    AGFW][2314] {2:64347:2} ora.net1.network zhdwdb2 1 state changed from: ONLINE to: UNKNOWN
2014-05-13 11:59:40.185: [    AGFW][2314] {2:64347:2} ora.net1.network zhdwdb2 1 would be continued to monitored!
2014-05-13 11:59:40.185: [    AGFW][2314] {2:64347:2} Switching offline monitor to intermedite one
2014-05-13 11:59:40.186: [    AGFW][2314] {2:64347:2} Started implicit monitor for [ora.net1.network zhdwdb2 1] interval=1000 delay=1000
2014-05-13 11:59:40.186: [    AGFW][2314] {2:64347:2} ora.net1.network zhdwdb2 1 state details has changed from:  to: CHECK TIMED OUT
2014-05-13 11:59:40.186: [    AGFW][2314] {0:3:13726} Generating new Tint for unplanned state change. Original Tint: {2:64347:2}
2014-05-13 11:59:40.186: [    AGFW][2314] {0:3:13726} Agent sending message to PE: RESOURCE_STATUS[Proxy] ID 20481:1574940
2014-05-13 11:59:41.228: [    AGFW][2314] {2:64347:2} ora.net1.network zhdwdb2 1 state changed from: UNKNOWN to: ONLINE
2014-05-13 11:59:41.228: [    AGFW][2314] {2:64347:2} Switching offline monitor to online one
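The network resource went from ONLINE to UNKNOWN after its check timed out, and then from UNKNOWN back to ONLINE about one second later. This may have been caused by a network hiccup.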
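
A short ONLINE to UNKNOWN to ONLINE flap like this one can be found and timed automatically. The following is a minimal sketch, assuming the "state changed from: X to: Y" line format seen in the excerpt above; find_flaps is a hypothetical helper name, and the pattern ignores the bracketed module tag between the timestamp and the resource name.

```python
import re
from datetime import datetime

# Matches agent-log state transitions, e.g.
#   2014-05-13 11:59:40.185: [...][2314] {2:64347:2} ora.net1.network zhdwdb2 1
#   state changed from: ONLINE to: UNKNOWN
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*?"
    r"(?P<res>ora\.\S+) (?P<node>\S+) (?P<inst>\d+) "
    r"state changed from: (?P<old>\w+) to: (?P<new>\w+)"
)


def find_flaps(lines):
    """Return (resource, node, seconds) for each ONLINE->UNKNOWN->ONLINE flap."""
    went_unknown = {}  # (resource, node) -> time it left ONLINE
    flaps = []
    for line in lines:
        m = STATE_CHANGE.match(line)
        if not m:
            continue
        key = (m.group("res"), m.group("node"))
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        if m.group("old") == "ONLINE" and m.group("new") == "UNKNOWN":
            went_unknown[key] = ts
        elif m.group("old") == "UNKNOWN" and m.group("new") == "ONLINE" and key in went_unknown:
            duration = (ts - went_unknown.pop(key)).total_seconds()
            flaps.append((key[0], key[1], duration))
    return flaps


# Sample lines copied from the orarootagent_root.log excerpt above.
sample_flap = [
    "2014-05-13 11:59:40.185: [    AGFW][2314] {2:64347:2} ora.net1.network zhdwdb2 1 state changed from: ONLINE to: UNKNOWN",
    "2014-05-13 11:59:40.185: [    AGFW][2314] {2:64347:2} Switching offline monitor to intermedite one",
    "2014-05-13 11:59:41.228: [    AGFW][2314] {2:64347:2} ora.net1.network zhdwdb2 1 state changed from: UNKNOWN to: ONLINE",
]
for res, node, seconds in find_flaps(sample_flap):
    print(f"{res} on {node} was UNKNOWN for {seconds:.3f}s")
```

Comparing the flap times with the RESOURCE_STOP times in oraagent_grid.log is one way to confirm that the listener stop followed the network check timeout.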


So the problem is essentially located: it was caused by trouble with the network resource.


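Here is why a problem with the network resource also affects the listener:
LISTENER -- VIP -- PUBLIC IP -- network (NIC)
SCAN LISTENER -- SCAN IP -- PUBLIC IP -- network (NIC)
As the chains show, any problem with the network resource affects all the resources layered above it.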
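
The two chains can be modeled as a small dependency graph to show which resources a failure reaches. This is only an illustrative sketch of the simplified diagram above, not the actual CRS dependency attributes; the DEPENDENTS table and the affected_by function are hypothetical names.

```python
from collections import deque

# Edges point from a resource to the resources layered directly on top of it,
# following the two chains in the text.
DEPENDENTS = {
    "network": ["PUBLIC IP"],
    "PUBLIC IP": ["VIP", "SCAN IP"],
    "VIP": ["LISTENER"],
    "SCAN IP": ["SCAN LISTENER"],
}


def affected_by(resource):
    """Breadth-first walk: every resource that sits above `resource`."""
    seen = {resource}
    queue = deque([resource])
    order = []
    while queue:
        current = queue.popleft()
        for dependent in DEPENDENTS.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


print(affected_by("network"))
print(affected_by("VIP"))
```

A failure at the bottom of the graph reaches both listeners, while a failure of a single VIP reaches only the local listener above it.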


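Next, why a problem with the local listener also causes connection failures:
client --- SCAN listener (dispatches by load and decides which instance to use) --- local listener (a failure at this hop still breaks the connection) --- database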


So a problem with the local listener alone is enough to make connections fail.
The possible causes are therefore:
1. A network hiccup
2. Various bugs


A search of METALINK found two notes that describe this problem:
1.

Agent Check Timeout for Network and ACFS resources occasionally in AIX (Doc ID 1558510.1)


APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 and later
IBM AIX on POWER Systems (64-bit)
SYMPTOMS

This is for AIX only. The problem is AIX specific.

The cluster alert.log shows check timeout errors for acfs and network resource occasionally:

< [/u01/app/11.2.0/grid/bin/orarootagent.bin(7143488)]CRS-5818:Aborted command 'check for resource: ora.drivers.acfs 1 1' for resource 'ora.drivers.acfs'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/p01dou412/agent/ohasd/orarootagent_root/orarootagent_root.log.
< [/u01/app/11.2.0/grid/bin/orarootagent.bin(7143488)]CRS-5014:Agent "/u01/app/11.2.0/grid/bin/orarootagent.bin" timed out starting process "/u01/app/11.2.0/grid/bin/acfsload" for action "check": details at "(:CLSN00009:)" in "/u01/app/11.2.0/grid/log/p01dou412/agent/ohasd/orarootagent_root/orarootagent_root.log"
< [/u01/app/11.2.0/grid/bin/orarootagent.bin(20054094)]CRS-5818:Aborted command 'check for resource: ora.net1.network p01dou412 1' for resource 'ora.net1.network'. Details at (:CRSAGF00113:) {0:17:2} in /u01/app/11.2.0/grid/log/p01dou412/agent/crsd/orarootagent_root/orarootagent_root.log.

CAUSE

A defect with AIX causes the unix command "pwd" to hang for a minute or two.
This affects all LPARs on the server at the same time.

The OSWatcher also hangs for a minute or two because it uses pwd command, so have the customer set up OSwatcher and check if the OSWatcher hangs at the same time the check timeout error occurs. If OSWatcher hangs at the same time for about a minute or two, then the server is running into this problem.

SOLUTION

Resolved by upgrading AIX and firmware to the following:

Firmware
IBM,9117-MMB
The current permanent system firmware image is AM730_087
The current temporary system firmware image is AM730_087
VIO server version 2.2.1.4
AIX version 6100-07-04-1216

After the above patching, the problem is resolved.

2.
VIP, SCAN VIP/Listener Fails Over and Listener Stops After Short Public Network Hiccup (Doc ID 1333165.1)

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.1 to 12.1.0.1 [Release 11.2 to 12.1]
Information in this document applies to any platform.
SYMPTOMS

After check timed out, 11gR2 Grid Infrastructure network resource (usually ora.net1.network) goes to INTERMEDIATE state, then goes back to ONLINE very shortly. This note will not discuss cause of check time out, but most common cause is public network hiccup.

Once network resource goes into INTERMEDIATE state, it may trigger VIP, service, SCAN VIP/SCAN listener, ora.cvu and ora.ons etc to be failed over/go offline due to resource dependence, which could result in unnecessary connectivity issue for that period of time. After network resource is back online, affected resources may not come back online.

$GRID_HOME/log//crsd/crsd.log
2011-06-12 07:12:31.261: [    AGFW][10796] {0:1:2881} Received state change for ora.net1.network racnode1 1 [old state = ONLINE, new state = UNKNOWN]
2011-06-12 07:12:31.261: [    AGFW][10796] {0:1:2881} Received state LABEL change for ora.net1.network racnode1 1 [old label  = , new label = CHECK TIMED OUT]
..
2011-06-12 07:12:31.297: [   CRSPE][12081] {0:1:2881} RI [ora.net1.network racnode1 1] new external state [INTERMEDIATE] old value: [ONLINE] on racnode1 label = [CHECK TIMED OUT]
..
2011-06-12 07:12:31.981: [    AGFW][10796] {0:1:2882} Received state change for ora.net1.network racnode1 1 [old state = UNKNOWN, new state = ONLINE]
..
2011-06-12 07:12:32.307: [   CRSPE][12081] {0:1:2881} RI [ora.LISTENER.lsnr racnode1 1] new internal state: [STOPPING] old value: [STABLE]
2011-06-12 07:12:32.308: [   CRSPE][12081] {0:1:2881} CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'racnode1'
$GRID_HOME/log//agent/crsd/orarootagent_root/orarootagent_root.log
2011-06-12 07:12:08.965: [    AGFW][2070] {1:27767:2} Created alert : (:CRSAGF00113:) :  Aborting the command: check for resource: ora.net1.network racnode1 1
2011-06-12 07:12:08.966: [ora.net1.network][2070] {1:27767:2} [check] clsn_agent::abort {
..
2011-06-12 07:12:31.257: [    AGFW][2070] {1:27767:2} Command: check for resource: ora.net1.network racnode1 1 completed with status: TIMEDOUT
2011-06-12 07:12:31.258: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 state changed from: ONLINE to: UNKNOWN
2011-06-12 07:12:31.258: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 would be continued to monitored!
2011-06-12 07:12:31.258: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 state details has changed from:  to: CHECK TIMED OUT
..
2011-06-12 07:12:31.923: [ora.net1.network][2314][F-ALGO] {1:27767:2} CHECK initiated by timer for: ora.net1.network racnode1 1
..
2011-06-12 07:12:31.973: [ora.net1.network][8502][F-ALGO] {1:27767:2} [check] Command check for resource: ora.net1.network racnode1 1 completed with status ONLINE
2011-06-12 07:12:31.978: [    AGFW][2314] {1:27767:2} ora.net1.network racnode1 1 state changed from: UNKNOWN to: ONLINE
$GRID_HOME/log//agent/crsd/oraagent_/oraagent_.log
2011-06-12 07:12:32.335: [    AGFW][2314] {0:1:2881} Agent received the message: RESOURCE_STOP[ora.LISTENER.lsnr racnode1 1] ID 4099:14792
2011-06-12 07:12:32.335: [    AGFW][2314] {0:1:2881} Preparing STOP command for: ora.LISTENER.lsnr racnode1 1
2011-06-12 07:12:32.335: [    AGFW][2314] {0:1:2881} ora.LISTENER.lsnr racnode1 1 state changed from: ONLINE to: STOPPING

$GRID_HOME/log//alert.log
2012-01-10 06:48:18.474 [/ocw/grid/bin/orarootagent.bin(10485902)]CRS-5818:Aborted command 'check for resource: ora.net1.network racnode1 1' for resource 'ora.net1.network'. Details at (:CRSAGF00113:) {1:24200:2} in /ocw/grid/log/racnode1/agent/crsd/orarootagent_root/orarootagent_root.log.
2012-01-10 06:48:43.481 [/ocw/grid/bin/oraagent.bin(8847542)]CRS-5016:Process "/ocw/grid/bin/lsnrctl" spawned by agent "/ocw/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/ocw/grid/log/racnode1/agent/crsd/oraagent_grid/oraagent_grid.log"
2012-01-10 06:48:43.552 [/ocw/grid/bin/oraagent.bin(8847542)]CRS-5016:Process "/ocw/grid/opmn/bin/onsctli" spawned by agent "/ocw/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/ocw/grid/log/racnode1/agent/crsd/oraagent_grid/oraagent_grid.log"

CAUSE

SOLUTION
The issue is fixed in a few different bugs:

1. bug 12680491 fixes the dependence between network and VIP

The fix of bug 12680491 will add intermediate modifier to stop dependency between network resource and VIP to avoid unnecessary resource state change, it's included in 11.2.0.2 GI PSU4, 11.2.0.3 GI PSU3, 11.2.0.3 Windows Patch 7, 11.2.0.4 and above. This fix is recommended instead of fix for bug 12378938 to avoid the issue in first place.

Once patch for this bug is applied, the following needs to be executed to change the dependence for all VIPs:

# $GRID_HOME/bin/crsctl modify res ora..vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora..network)"

For example:

# /ocw/grid/bin/crsctl modify res ora.racnode1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
Once the attribute is changed, a restart of nodeapps/VIP is needed to be in effect

2. bug 13582411 fixes the dependence between network and SCAN VIP/listener

The fix of bug 13582411 will add intermediate modifier to stop dependency between network resource and SCAN VIP to avoid unnecessary resource state change, it's included in 11.2.0.3 GI PSU4, 11.2.0.4 and above.

Once patch for this bug is applied, the following needs to be executed to change the dependence for all SCAN VIPs and to restart SCAN VIPs:

# $GRID_HOME/bin/crsctl modify res ora.scan.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net.network)"
For example:

# /ocw/grid/bin/crsctl modify res ora.scan1.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
# /ocw/grid/bin/crsctl modify res ora.scan2.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
# /ocw/grid/bin/crsctl modify res ora.scan3.vip -attr "STOP_DEPENDENCIES=hard(intermediate:ora.net1.network)"
# /ocw/grid/bin/srvctl stop scan -f
$ /ocw/grid/bin/srvctl start scan_listener

3. bug 17435488 fixes the dependence between network and ora.cvu and ora.ons

The fix will add intermediate modifier to stop dependency between network resource and ora.cvu and ora.ons to avoid unnecessary resource state change, it's included in 12.1.0.2
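
The crsctl modify commands quoted in the note for VIPs and SCAN VIPs all follow one template. As a minimal sketch, the commands for a set of VIP resources could be generated as strings for review before anything is run. The function name and parameters are hypothetical, the code does not invoke crsctl, and the default network resource name ora.net1.network is an assumption taken from the note's examples.

```python
def stop_dependency_commands(grid_home, vip_resources,
                             network_resource="ora.net1.network"):
    """Build (but do not run) one crsctl modify command per VIP resource."""
    attr = f"STOP_DEPENDENCIES=hard(intermediate:{network_resource})"
    return [
        f'{grid_home}/bin/crsctl modify res {res} -attr "{attr}"'
        for res in vip_resources
    ]


# Resource names and Grid home taken from the note's own examples.
commands = stop_dependency_commands(
    "/ocw/grid",
    ["ora.racnode1.vip", "ora.scan1.vip", "ora.scan2.vip", "ora.scan3.vip"],
)
for command in commands:
    print(command)
```

Printing the commands first makes it easy to check each resource name against the cluster before applying the change, which according to the note also requires the corresponding patch and a restart of the affected VIPs.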

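So our preliminary conclusion is that we need to:
1. Change the dependency attributes of the VIPs and SCAN VIPs in RAC
2. Check the AIX firmware level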




Source: ITPUB blog, http://blog.itpub.net/7728585/viewspace-1160374/. Please credit the source when reposting.