Handling an instance eviction failure on 11g RAC 11.2.0.1.0
AIX 7.1
References:
https://blogs.oracle.com/database4cn/rac
Resolving ORA-481 and "terminating the instance due to error 481" (Doc ID 1950963.1)
ORA-00481 After "The instance eviction reason is 0x2" due to Lack of Ticket (Doc ID 1644015.1)
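Ultimately this is a flow-control problem in message transfer between the RAC nodes. Apart from network and hardware causes, it has to be fixed by patching.
Below is the alert log from before the first round of fixes.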
◆node1
Wed May 24 01:59:24 2017
Remote instance 2 kill is issued with system inc 228
LMON received an instance eviction notification from instance 1
The instance eviction reason is 0x2
The instance eviction map is 2
Reconfiguration started (old inc 228, new inc 230)
List of instances:
1 (myinst: 1)
Wed May 24 01:59:27 2017
Trace dumping is performing id=[cdmp_20170524015904]
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Wed May 24 01:59:28 2017
LMS 1: 14 GCS shadows cancelled, 1 closed, 0 Xw survived
Wed May 24 01:59:28 2017
LMS 0: 23 GCS shadows cancelled, 0 closed, 0 Xw survived
Wed May 24 01:59:28 2017
LMS 2: 21 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Wed May 24 01:59:32 2017
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
Wed May 24 01:59:48 2017
parallel recovery started with 32 processes
Started redo scan
Wed May 24 01:59:49 2017
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Wed May 24 02:00:00 2017
Completed redo scan
read 2094379 KB redo, 237345 data blocks need recovery
Wed May 24 02:00:03 2017
Reconfiguration started (old inc 230, new inc 232)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Wed May 24 02:00:04 2017
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Wed May 24 02:00:04 2017
Wed May 24 02:00:04 2017
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Wed May 24 02:02:27 2017
,
, many 12170 errors logged here
,
Wed May 24 02:05:34 2017
LMON (ospid: 35979900): terminating the instance due to error 481
Wed May 24 02:05:34 2017
System state dump is made for local instance
System State dumped to trace file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_40894656.trc
Instance terminated by LMON, pid = 35979900
Wed May 24 08:12:27 2017
Starting ORACLE instance (normal)
sskgpgetexecname failed to get name
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 en1 192.168.0.0 configured from GPnP Profile for use as a cluster interconnect
Interface type 1 en0 10.209.199.0 configured from GPnP Profile for use as a public interface
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options.
Using parameter settings in server-side pfile /u01/app/11.2.0/oracle/product/11.2.0/db_1/dbs/initorcl1.ora
System parameters with non-default values:
processes = 2000
timed_statistics = TRUE
sga_max_size = 70G
spfile = "+DATA1/orcl/spfileorcl.ora"
sga_target = 50G
control_files = "+DATA1/orcl/controlfile/current.261.832429113"
control_files = "+ARCDG/orcl/controlfile/current.259.832429113"
db_block_size = 8192
compatible = "11.2.0.0.0"
log_archive_format = "%t_%s_%r.dbf"
cluster_database = TRUE
db_create_file_dest = "+DATA1"
db_recovery_file_dest = "+ARCDG"
db_recovery_file_dest_size= 2048G
thread = 1
undo_tablespace = "UNDOTBS1"
undo_retention = 10800
instance_number = 1
remote_login_passwordfile= "NONE"
db_domain = ""
service_names = "orcl"
dispatchers = "(PROTOCOL=TCP) (SERVICE=orclXDB)"
local_listener = "(ADDRESS = (PROTOCOL = TCP)(HOST = 10.209.199.4)(PORT = 1521))"
remote_listener = "rac-scan:1521"
result_cache_max_size = 268736K
audit_file_dest = "/u01/app/11.2.0/oracle/admin/orcl/adump"
audit_trail = "DB"
db_name = "orcl"
open_cursors = 1000
sql_trace = FALSE
optimizer_index_caching = 90
pga_aggregate_target = 20G
deferred_segment_creation= FALSE
aq_tm_processes = 5
diagnostic_dest = "/u01/app/11.2.0/oracle"
Deprecated system parameters with specified values:
sql_trace
End of deprecated system parameter listing
Cluster communication is configured to use the following interface(s) for this instance
192.168.0.1
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Wed May 24 08:12:32 2017
PMON started with pid=2, OS id=35652038
Wed May 24 08:12:32 2017
VKTM started with pid=3, OS id=33489120 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Wed May 24 08:12:32 2017
GEN0 started with pid=4, OS id=34209898
Wed May 24 08:12:32 2017
DIAG started with pid=5, OS id=36438538
Wed May 24 08:12:33 2017
DBRM started with pid=6, OS id=24576446
Wed May 24 08:12:33 2017
PING started with pid=7, OS id=29687920
Wed May 24 08:12:33 2017
PSP0 started with pid=8, OS id=36635296
Wed May 24 08:12:33 2017
ACMS started with pid=9, OS id=34341462
Wed May 24 08:12:33 2017
DIA0 started with pid=10, OS id=32047870
Wed May 24 08:12:33 2017
LMON started with pid=11, OS id=36045304
Wed May 24 08:12:35 2017
LMD0 started with pid=12, OS id=31391860
Wed May 24 08:12:35 2017
LMS0 started with pid=13, OS id=35979918 at elevated priority
Wed May 24 08:12:35 2017
LMS1 started with pid=14, OS id=22741294 at elevated priority
Wed May 24 08:12:36 2017
LMS2 started with pid=15, OS id=29229354 at elevated priority
Wed May 24 08:12:36 2017
RMS0 started with pid=16, OS id=20185780
Wed May 24 08:12:36 2017
LMHB started with pid=17, OS id=29425998
Wed May 24 08:12:36 2017
MMAN started with pid=18, OS id=9699954
Wed May 24 08:12:36 2017
DBW0 started with pid=19, OS id=31850686
Wed May 24 08:12:36 2017
DBW1 started with pid=20, OS id=36045442
Wed May 24 08:12:36 2017
DBW2 started with pid=21, OS id=40894536
Wed May 24 08:12:36 2017
DBW3 started with pid=22, OS id=29819564
Wed May 24 08:12:36 2017
DBW4 started with pid=23, OS id=36634890
Wed May 24 08:12:36 2017
LGWR started with pid=24, OS id=35783370
Wed May 24 08:12:37 2017
CKPT started with pid=25, OS id=35914258
Wed May 24 08:12:37 2017
SMON started with pid=26, OS id=32374796
Wed May 24 08:12:37 2017
RECO started with pid=27, OS id=35652280
Wed May 24 08:12:37 2017
RBAL started with pid=28, OS id=36438322
Wed May 24 08:12:37 2017
ASMB started with pid=29, OS id=39387144
Wed May 24 08:12:37 2017
MMON started with pid=30, OS id=2621900
NOTE: initiating MARK startup
Wed May 24 08:12:37 2017
Starting background process MARK
MMNL started with pid=31, OS id=35979732
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Wed May 24 08:12:37 2017
MARK started with pid=32, OS id=35717688
NOTE: MARK has subscribed
starting up 1 shared server(s) ...
lmon registered with NM - instance number 1 (internal mem no 0)
Reconfiguration started (old inc 0, new inc 236)
List of instances:
1 2 (myinst: 1)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* domain 0 valid according to instance 2
* domain 0 valid = 1 according to instance 2
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Wed May 24 08:12:45 2017
Submitted all GCS remote-cache requests
Fix write in gcs resources
Reconfiguration complete
Wed May 24 08:12:52 2017
LCK0 started with pid=34, OS id=24379778
Wed May 24 08:12:52 2017
Starting background process RSMN
Wed May 24 08:12:52 2017
RSMN started with pid=36, OS id=30802026
ORACLE_BASE from environment = /u01/app/11.2.0/oracle
Wed May 24 08:12:53 2017
ALTER DATABASE MOUNT
Wed May 24 08:12:53 2017
NOTE: Loaded library: System
Wed May 24 08:12:53 2017
SUCCESS: diskgroup DATA1 was mounted
SUCCESS: diskgroup ARCDG was mounted
Wed May 24 08:12:53 2017
NOTE: dependency between database orcl and diskgroup resource ora.DATA1.dg is established
NOTE: dependency between database orcl and diskgroup resource ora.ARCDG.dg is established
Wed May 24 08:12:57 2017
Successful mount of redo thread 1, with mount id 1472569957
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT
Wed May 24 08:12:58 2017
ALTER DATABASE OPEN
Block change tracking file is current.
Picked broadcast on commit scheme to generate SCNs
Wed May 24 08:12:58 2017
SUCCESS: diskgroup DATA2 was mounted
NOTE: dependency between database orcl and diskgroup resource ora.DATA2.dg is established
SUCCESS: diskgroup DATA3 was mounted
NOTE: dependency between database orcl and diskgroup resource ora.DATA3.dg is established
Thread 1 advanced to log sequence 255944 (thread open)
Thread 1 opened at log sequence 255944
Current log# 2 seq# 255944 mem# 0: +ARCDG/orcl/onlinelog/group_2.261.840376035
Current log# 2 seq# 255944 mem# 1: +ARCDG/orcl/onlinelog/group_2.2113.840376039
Current log# 2 seq# 255944 mem# 2: +ARCDG/orcl/onlinelog/group_2.2114.840376041
Current log# 2 seq# 255944 mem# 3: +ARCDG/orcl/onlinelog/group_2.2116.840376043
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Starting background process CTWR
Wed May 24 08:12:59 2017
CTWR started with pid=39, OS id=32768642
Block change tracking service is active.
Wed May 24 08:13:00 2017
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
Starting background process GTX0
Wed May 24 08:13:07 2017
GTX0 started with pid=42, OS id=35062526
Starting background process RCBG
Wed May 24 08:13:07 2017
RCBG started with pid=43, OS id=27721944
replication_dependency_tracking turned off (no async multimaster replication found)
Wed May 24 08:13:08 2017
Starting background process QMNC
Wed May 24 08:13:08 2017
QMNC started with pid=44, OS id=34013894
Completed: ALTER DATABASE OPEN
Wed May 24 08:13:17 2017
Starting background process CJQ0
Wed May 24 08:13:17 2017
CJQ0 started with pid=41, OS id=11469004
Wed May 24 08:13:43 2017
Starting background process SMCO
Wed May 24 08:13:43 2017
SMCO started with pid=120, OS id=42336480
,
,
,
Wed May 24 09:08:16 2017
Errors in file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_j001_33161436.trc:
ORA-12012: error on auto execute of job 239
ORA-00001: unique constraint (CM.UK_LTE_ZY_CHECK_HISTORY) violated
ORA-06512: at "CM.JOB_LTE_ZY_CHECK", line 27
ORA-06512: at line 1
◆node2
Wed May 24 01:59:01 2017
LMS1 (ospid: 26411086) received an instance eviction notification from instance 1 [2]
Wed May 24 01:59:01 2017
LMON received an instance eviction notification from instance 1
The instance eviction reason is 0x2
The instance eviction map is 2
Wed May 24 01:59:04 2017
PMON (ospid: 50725354): terminating the instance due to error 481
Wed May 24 01:59:04 2017
System state dump is made for local instance
System State dumped to trace file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_42336686.trc
Wed May 24 01:59:04 2017
ORA-1092 : opitsk aborting process
Wed May 24 01:59:04 2017
License high water mark = 516
Instance terminated by PMON, pid = 50725354
USER (ospid: 50987430): terminating the instance
Instance terminated by USER, pid = 50987430
Wed May 24 01:59:29 2017
Starting ORACLE instance (normal)
sskgpgetexecname failed to get name
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface type 1 en1 192.168.0.0 configured from GPnP Profile for use as a cluster interconnect
Interface type 1 en0 10.209.199.0 configured from GPnP Profile for use as a public interface
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options.
Using parameter settings in server-side pfile /u01/app/11.2.0/oracle/product/11.2.0/db_1/dbs/initorcl2.ora
System parameters with non-default values:
processes = 2000
timed_statistics = TRUE
sga_max_size = 70G
spfile = "+DATA1/orcl/spfileorcl.ora"
sga_target = 50G
control_files = "+DATA1/orcl/controlfile/current.261.832429113"
control_files = "+ARCDG/orcl/controlfile/current.259.832429113"
db_block_size = 8192
compatible = "11.2.0.0.0"
log_archive_format = "%t_%s_%r.dbf"
cluster_database = TRUE
db_create_file_dest = "+DATA1"
db_recovery_file_dest = "+ARCDG"
db_recovery_file_dest_size= 2048G
thread = 2
undo_tablespace = "UNDOTBS2"
undo_retention = 10800
instance_number = 2
remote_login_passwordfile= "NONE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=orclXDB)"
local_listener = "(ADDRESS = (PROTOCOL = TCP)(HOST = 10.209.199.5)(PORT = 1521))"
remote_listener = "rac-scan:1521"
result_cache_max_size = 268736K
audit_file_dest = "/u01/app/11.2.0/oracle/admin/orcl/adump"
audit_trail = "DB"
db_name = "orcl"
open_cursors = 1000
sql_trace = FALSE
optimizer_index_caching = 90
pga_aggregate_target = 20G
deferred_segment_creation= FALSE
aq_tm_processes = 5
diagnostic_dest = "/u01/app/11.2.0/oracle"
Deprecated system parameters with specified values:
sql_trace
End of deprecated system parameter listing
Cluster communication is configured to use the following interface(s) for this instance
192.168.0.2
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
Wed May 24 01:59:34 2017
PMON started with pid=2, OS id=66584656
Wed May 24 01:59:34 2017
VKTM started with pid=3, OS id=66846744 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Wed May 24 01:59:34 2017
GEN0 started with pid=4, OS id=26608090
Wed May 24 01:59:34 2017
DIAG started with pid=5, OS id=26083716
Wed May 24 01:59:34 2017
DBRM started with pid=6, OS id=24510972
Wed May 24 01:59:34 2017
PING started with pid=7, OS id=65077306
Wed May 24 01:59:34 2017
PSP0 started with pid=8, OS id=66781402
Wed May 24 01:59:34 2017
ACMS started with pid=9, OS id=66978040
Wed May 24 01:59:34 2017
DIA0 started with pid=10, OS id=66519050
Wed May 24 01:59:34 2017
LMON started with pid=11, OS id=66453694
Wed May 24 01:59:37 2017
LMD0 started with pid=12, OS id=23658758
Wed May 24 01:59:37 2017
LMS0 started with pid=13, OS id=66322614 at elevated priority
Wed May 24 01:59:37 2017
LMS1 started with pid=14, OS id=65798210 at elevated priority
Wed May 24 01:59:37 2017
LMS2 started with pid=15, OS id=15204812 at elevated priority
Wed May 24 01:59:38 2017
RMS0 started with pid=16, OS id=65732820
Wed May 24 01:59:38 2017
LMHB started with pid=17, OS id=65339438
Wed May 24 01:59:38 2017
MMAN started with pid=18, OS id=52298036
Wed May 24 01:59:38 2017
DBW0 started with pid=19, OS id=52232458
Wed May 24 01:59:38 2017
DBW1 started with pid=20, OS id=65273872
Wed May 24 01:59:38 2017
DBW2 started with pid=21, OS id=65208354
Wed May 24 01:59:38 2017
DBW3 started with pid=22, OS id=65994980
Wed May 24 01:59:38 2017
DBW4 started with pid=23, OS id=61210752
Wed May 24 01:59:38 2017
LGWR started with pid=24, OS id=6095298
Wed May 24 01:59:38 2017
CKPT started with pid=25, OS id=65470648
Wed May 24 01:59:38 2017
SMON started with pid=26, OS id=64946374
Wed May 24 01:59:38 2017
RECO started with pid=27, OS id=51642854
Wed May 24 01:59:39 2017
RBAL started with pid=28, OS id=64618728
Wed May 24 01:59:39 2017
ASMB started with pid=29, OS id=64553208
Wed May 24 01:59:39 2017
MMON started with pid=30, OS id=64422054
NOTE: initiating MARK startup
Wed May 24 01:59:39 2017
MMNL started with pid=31, OS id=42336716
Starting background process MARK
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Wed May 24 01:59:39 2017
MARK started with pid=32, OS id=51577168
NOTE: MARK has subscribed
Wed May 24 01:59:39 2017
starting up 1 shared server(s) ...
lmon registered with NM - instance number 2 (internal mem no 1)
Reconfiguration started (old inc 0, new inc 232)
List of instances:
1 2 (myinst: 2)
Global Resource Directory frozen
* allocate domain 0, invalid = TRUE
Communication channels reestablished
* domain 0 valid = 0 according to instance 1
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Wed May 24 01:59:48 2017
Submitted all GCS remote-cache requests
Wed May 24 02:05:10 2017
Trace dumping is performing id=[cdmp_20170524020534]
Wed May 24 02:05:18 2017
Reconfiguration started (old inc 232, new inc 234)
List of instances:
2 (myinst: 2)
Nested reconfiguration detected.
Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Wed May 24 02:05:19 2017
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Wed May 24 02:05:19 2017
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Wed May 24 02:05:19 2017
LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
Set master node info
Submitted all remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Post SMON to start 1st pass IR
Submitted all GCS remote-cache requests
Post SMON to start 1st pass IR
Fix write in gcs resources
Reconfiguration complete
Wed May 24 02:05:25 2017
LCK0 started with pid=34, OS id=51118396
Wed May 24 02:05:25 2017
Starting background process RSMN
Wed May 24 02:05:25 2017
RSMN started with pid=36, OS id=51773786
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Reusing ORACLE_BASE from an earlier startup = /u01/app/11.2.0/oracle
Wed May 24 02:05:25 2017
ALTER DATABASE MOUNT
This instance was first to mount
Wed May 24 02:05:26 2017
NOTE: Loaded library: System
Wed May 24 02:05:26 2017
SUCCESS: diskgroup DATA1 was mounted
Wed May 24 02:05:26 2017
NOTE: dependency between database orcl and diskgroup resource ora.DATA1.dg is established
SUCCESS: diskgroup ARCDG was mounted
NOTE: dependency between database orcl and diskgroup resource ora.ARCDG.dg is established
Wed May 24 02:05:30 2017
Successful mount of redo thread 2, with mount id 1472569957
Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
Lost write protection disabled
Completed: ALTER DATABASE MOUNT
ALTER DATABASE OPEN
This instance was first to open
Wed May 24 02:05:30 2017
SUCCESS: diskgroup DATA2 was mounted
NOTE: dependency between database orcl and diskgroup resource ora.DATA2.dg is established
SUCCESS: diskgroup DATA3 was mounted
NOTE: dependency between database orcl and diskgroup resource ora.DATA3.dg is established
Block change tracking file is current.
Beginning crash recovery of 2 threads
parallel recovery started with 32 processes
Started redo scan
Wed May 24 02:05:44 2017
Completed redo scan
read 2158203 KB redo, 230150 data blocks need recovery
Started redo application at
Thread 1: logseq 255942, block 163388
Thread 2: logseq 236286, block 2067920
Recovery of Online Redo Log: Thread 1 Group 3 Seq 255942 Reading mem 0
Mem# 0: +ARCDG/orcl/onlinelog/group_3.2085.840375427
Mem# 1: +ARCDG/orcl/onlinelog/group_3.2084.840375445
Mem# 2: +ARCDG/orcl/onlinelog/group_3.263.840375447
Mem# 3: +ARCDG/orcl/onlinelog/group_3.2089.840375449
Recovery of Online Redo Log: Thread 2 Group 7 Seq 236286 Reading mem 0
Mem# 0: +ARCDG/orcl/onlinelog/group_7.2049.840374587
Mem# 1: +ARCDG/orcl/onlinelog/group_7.2052.840374591
Mem# 2: +ARCDG/orcl/onlinelog/group_7.2057.840374593
Mem# 3: +ARCDG/orcl/onlinelog/group_7.2058.840374597
Recovery of Online Redo Log: Thread 2 Group 5 Seq 236287 Reading mem 0
Mem# 0: +ARCDG/orcl/onlinelog/group_5.2025.840374445
Mem# 1: +ARCDG/orcl/onlinelog/group_5.2026.840374483
Mem# 2: +ARCDG/orcl/onlinelog/group_5.2029.840374489
Mem# 3: +ARCDG/orcl/onlinelog/group_5.2034.840374493
Wed May 24 02:06:05 2017
Completed redo application of 1702.07MB
Completed crash recovery at
Thread 1: logseq 255942, block 291036, scn 12495715727030
Thread 2: logseq 236287, block 2742038, scn 12495715600520
230150 data blocks read, 229989 data blocks written, 2158203 redo k-bytes read
Thread 1 advanced to log sequence 255943 (thread recovery)
Picked broadcast on commit scheme to generate SCNs
Wed May 24 02:06:06 2017
Thread 2 advanced to log sequence 236288 (thread open)
Thread 2 opened at log sequence 236288
Current log# 6 seq# 236288 mem# 0: +ARCDG/orcl/onlinelog/group_6.2040.840374575
Current log# 6 seq# 236288 mem# 1: +ARCDG/orcl/onlinelog/group_6.2041.840374579
Current log# 6 seq# 236288 mem# 2: +ARCDG/orcl/onlinelog/group_6.2042.840374581
Current log# 6 seq# 236288 mem# 3: +ARCDG/orcl/onlinelog/group_6.2048.840374585
Successful open of redo thread 2
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Starting background process CTWR
Wed May 24 02:06:07 2017
CTWR started with pid=89, OS id=49349114
Block change tracking service is active.
Wed May 24 02:06:07 2017
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 4.
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
Starting background process GTX0
Wed May 24 02:06:11 2017
GTX0 started with pid=98, OS id=42992042
Starting background process RCBG
Wed May 24 02:06:11 2017
RCBG started with pid=52, OS id=40370648
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Wed May 24 02:06:13 2017
QMNC started with pid=119, OS id=47514086
Completed: ALTER DATABASE OPEN
SMON: Parallel transaction recovery tried
Starting background process SMCO
Wed May 24 02:06:17 2017
SMCO started with pid=153, OS id=50724962
Wed May 24 02:06:17 2017
db_recovery_file_dest_size of 2097152 MB is 4.35% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Wed May 24 02:06:19 2017
Starting background process CJQ0
Wed May 24 02:06:19 2017
CJQ0 started with pid=46, OS id=43581760
Wed May 24 02:07:35 2017
Thread 2 advanced to log sequence 236289 (LGWR switch)
Current log# 7 seq# 236289 mem# 0: +ARCDG/orcl/onlinelog/group_7.2049.840374587
Current log# 7 seq# 236289 mem# 1: +ARCDG/orcl/onlinelog/group_7.2052.840374591
Current log# 7 seq# 236289 mem# 2: +ARCDG/orcl/onlinelog/group_7.2057.840374593
Current log# 7 seq# 236289 mem# 3: +ARCDG/orcl/onlinelog/group_7.2058.840374597
Wed May 24 02:08:09 2017
Thread 2 advanced to log sequence 236290 (LGWR switch)
Current log# 5 seq# 236290 mem# 0: +ARCDG/orcl/onlinelog/group_5.2025.840374445
Current log# 5 seq# 236290 mem# 1: +ARCDG/orcl/onlinelog/group_5.2026.840374483
Current log# 5 seq# 236290 mem# 2: +ARCDG/orcl/onlinelog/group_5.2029.840374489
Current log# 5 seq# 236290 mem# 3: +ARCDG/orcl/onlinelog/group_5.2034.840374493
Wed May 24 02:08:51 2017
Thread 2 advanced to log sequence 236291 (LGWR switch)
Current log# 6 seq# 236291 mem# 0: +ARCDG/orcl/onlinelog/group_6.2040.840374575
Current log# 6 seq# 236291 mem# 1: +ARCDG/orcl/onlinelog/group_6.2041.840374579
Current log# 6 seq# 236291 mem# 2: +ARCDG/orcl/onlinelog/group_6.2042.840374581
Current log# 6 seq# 236291 mem# 3: +ARCDG/orcl/onlinelog/group_6.2048.840374585
Wed May 24 02:09:30 2017
Thread 2 advanced to log sequence 236292 (LGWR switch)
Current log# 7 seq# 236292 mem# 0: +ARCDG/orcl/onlinelog/group_7.2049.840374587
Current log# 7 seq# 236292 mem# 1: +ARCDG/orcl/onlinelog/group_7.2052.840374591
Current log# 7 seq# 236292 mem# 2: +ARCDG/orcl/onlinelog/group_7.2057.840374593
Current log# 7 seq# 236292 mem# 3: +ARCDG/orcl/onlinelog/group_7.2058.840374597
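The two alert logs above are easier to compare once reduced to a timeline of eviction, termination, restart and reconfiguration messages. The following is a minimal Python sketch of that idea, not a tool used in the original troubleshooting. The function name `timeline` and the choice of key messages are assumptions made for illustration, based only on the excerpts shown here.

```python
import re
from datetime import datetime

# Alert-log timestamp lines look like "Wed May 24 01:59:24 2017".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2} \d{4}$")

# Messages worth keeping, chosen from the excerpts above (not an exhaustive list).
KEY_RE = re.compile(
    r"eviction|terminating the instance|Instance terminated"
    r"|Reconfiguration (?:started|complete)|Starting ORACLE instance"
)

def timeline(lines):
    """Yield (timestamp, message) for key events, carrying the last timestamp forward."""
    current = None
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
        elif KEY_RE.search(line):
            yield current, line

# A few lines copied from the node2 log above.
sample = """\
Wed May 24 01:59:01 2017
LMON received an instance eviction notification from instance 1
The instance eviction reason is 0x2
The instance eviction map is 2
Wed May 24 01:59:04 2017
PMON (ospid: 50725354): terminating the instance due to error 481
""".splitlines()

events = list(timeline(sample))
for ts, msg in events:
    print(ts, msg)
```

Running the same function over both nodes' logs and merging the results by timestamp shows the order of events: node2 is evicted at 01:59, and node1 terminates itself with error 481 at 02:05.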
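A number of configuration corrections and optimizations were made to this RAC cluster. The main ones are described below.
◆Database configuration tuning
0. Earlier on, large numbers of log switches could not complete, which hung the database. Redo log groups were added and redundant, unused log members were dropped.
1. Increased the ASM instance memory from 350 MB to 2 GB.
2. Forced RAC parallel processes to be allocated on the local instance.
3. Disabled auditing to reduce its performance impact.
4. Increased AWR retention from 7 days to 30 days.
5. Increased the SGA by 10 GB, to 60 GB.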
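Changes such as the SGA increase can be sanity-checked against the "System parameters with non-default values" block that each instance prints in its alert log at startup. The sketch below is only an illustration: the helper names `to_bytes` and `parse_params` are hypothetical, and it assumes the SGA item refers to `sga_target`, which was 50G in the log with `sga_max_size` at 70G.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def to_bytes(value):
    """Convert an Oracle-style size such as '50G' or '268736K' to bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)

def parse_params(lines):
    """Parse 'name = value' lines from the alert-log parameter block into a dict.

    Repeated names (for example control_files) keep only the last value.
    """
    params = {}
    for line in lines:
        name, sep, value = line.partition("=")
        if sep:
            params[name.strip()] = value.strip().strip('"')
    return params

# Lines copied from the startup section of the alert log.
before = parse_params([
    "sga_max_size = 70G",
    "sga_target = 50G",
    "pga_aggregate_target = 20G",
])

# Assumption: the 10 GB increase was applied to sga_target.
new_target = to_bytes(before["sga_target"]) + to_bytes("10G")
fits = new_target <= to_bytes(before["sga_max_size"])
print(new_target // UNITS["G"], "GB, fits under sga_max_size:", fits)
```

The check matters because `sga_target` cannot be raised above `sga_max_size`. Under the assumption above, a 60G target still fits below the existing 70G ceiling.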
◆Network-related kernel parameter changes
/usr/sbin/no -p -o tcp_sendspace=4194304
/usr/sbin/no -p -o tcp_recvspace=4194304
/usr/sbin/no -p -o rfc1323=1
/usr/sbin/no -p -o sb_max=8388608
/usr/sbin/no -p -o udp_ephemeral_low=9000
/usr/sbin/no -p -o tcp_ephemeral_low=9000
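After applying these settings on both nodes, it is worth confirming what is actually in effect. The sketch below compares captured tunable output against the values from the commands above. It assumes the output has been saved as text with one `name = value` pair per line, which is an assumption about the format of `no -a`. The helper names are hypothetical.

```python
# Values taken from the `no -p -o` commands above.
WANTED = {
    "tcp_sendspace": 4194304,
    "tcp_recvspace": 4194304,
    "rfc1323": 1,
    "sb_max": 8388608,
    "udp_ephemeral_low": 9000,
    "tcp_ephemeral_low": 9000,
}

def parse_tunables(text):
    """Parse 'name = value' lines into a dict of ints, skipping anything else."""
    found = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            found[name.strip()] = int(value.strip())
    return found

def mismatches(found, wanted=WANTED):
    """Return {name: (actual, expected)} for tunables that differ or are missing."""
    return {
        name: (found.get(name), expected)
        for name, expected in wanted.items()
        if found.get(name) != expected
    }

def buffers_fit(found):
    """Socket buffer sizes should not exceed sb_max, the per-socket cap."""
    cap = found["sb_max"]
    return all(found[k] <= cap for k in ("tcp_sendspace", "tcp_recvspace"))

# Illustrative input only, not output captured from the real system.
sample = """
        tcp_sendspace = 4194304
        tcp_recvspace = 262144
              rfc1323 = 1
               sb_max = 8388608
"""
found = parse_tunables(sample)
print(mismatches(found))
print(buffers_fit(found))
```

The `buffers_fit` check reflects the constraint that `tcp_sendspace` and `tcp_recvspace` must stay at or below `sb_max`. The values chosen above (4194304 against 8388608) satisfy it.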
◆Interconnect NIC parameter changes
chdev -l en1 -a tcp_sendspace=1048576 -a rfc1323=1 -a tcp_recvspace=1048576
◆SWAP
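Physical memory is larger than 16 GB but swap space is below 16 GB, which does not meet Oracle's standard installation recommendation. The paging space was increased:
chps -s 192 hd6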
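Note that the `-s` flag of `chps` takes a number of logical partitions to add, not a size in megabytes, so the space gained depends on the physical partition (PP) size of the volume group. The article does not state the PP size. The values in this small worked example are illustrative assumptions only, and mirroring is ignored.

```python
def added_paging_mb(logical_partitions, pp_size_mb):
    """Space added by `chps -s <logical_partitions>`: each LP maps to one PP of pp_size_mb."""
    return logical_partitions * pp_size_mb

# The PP size is NOT given in the article; these values are examples only.
for pp in (64, 128, 256):
    gb = added_paging_mb(192, pp) / 1024
    print(f"PP size {pp} MB: chps -s 192 adds {gb:g} GB")
```

On AIX the actual PP size can be read from `lsvg rootvg`, and the resulting paging space from `lsps -a`.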
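◆Time zone and clock check: left unchanged for now, to be safe.
◆NTP check: the GI cluster's own time synchronization is used to keep the nodes in sync. The current offset is a few tens of seconds, which is not considered a problem.
◆HA cluster configuration changes: parts of the original cluster installation did not follow the official documentation. This was partly remedied this time, but not completely, so unknown risks may remain.
After the restart on May 25 and a period of observation, the situation improved, but at 09:00 on June 7 node 2 went down again.
This time, however, it was not a node eviction, which suggests that the earlier corrections and tuning had an effect. The preliminary judgment is that the improvement is related to the network parameter changes. The log follows; it contains no eviction messages.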
Wed Jun 07 09:02:57 2017
Errors in file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_lms0_7930030.trc (incident=1023882):
ORA-00600: internal error code, arguments: [kjbrref:pkey], [3881577], [6], [7493495], [0], [], [], [], [], [], [], []
Incident details in: /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/incident/incdir_1023882/orcl2_lms0_7930030_i1023882.trc
Wed Jun 07 09:02:59 2017
Trace dumping is performing id=[cdmp_20170607090259]
Errors in file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_lms0_7930030.trc:
ORA-00600: internal error code, arguments: [kjbrref:pkey], [3881577], [6], [7493495], [0], [], [], [], [], [], [], []
LMS0 (ospid: 7930030): terminating the instance due to error 484
Instance terminated by LMS0, pid = 7930030
Wed Jun 07 09:14:01 2017
Starting ORACLE instance (normal)
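When searching for a matching bug, the useful parts of this excerpt are the first ORA-00600 argument (here `kjbrref:pkey`), which identifies where in the code the error was raised, and the process that terminated the instance. The following Python sketch pulls both out of alert-log lines. The function names are hypothetical and the patterns are based only on the lines shown in this article.

```python
import re

ORA600_RE = re.compile(r"ORA-00600: internal error code, arguments: (.*)$")
TERM_RE = re.compile(
    r"^(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)"
)

def ora600_args(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line, or None."""
    m = ORA600_RE.search(line)
    if not m:
        return None
    return [a for a in re.findall(r"\[([^\]]*)\]", m.group(1)) if a]

def terminator(line):
    """Return (process, ospid, error) for a 'terminating the instance' line, or None."""
    m = TERM_RE.match(line.strip())
    return (m.group(1), int(m.group(2)), int(m.group(3))) if m else None

# Lines copied from the June 7 log above.
err = ("ORA-00600: internal error code, arguments: [kjbrref:pkey], [3881577], "
       "[6], [7493495], [0], [], [], [], [], [], [], []")
end = "LMS0 (ospid: 7930030): terminating the instance due to error 484"

args = ora600_args(err)
who = terminator(end)
print(args[0], who)
```

Applied to the May 24 logs, `terminator` returns LMON or PMON with error 481, while here it returns LMS0 with error 484, which makes the difference between the two incidents explicit.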
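Preliminary conclusion on this ORA-600 internal error:
After the tuning and restart on May 25, one more node-down incident occurred on June 7. Investigation has essentially confirmed that it is an Oracle bug that can bring a node down under certain conditions. The database is currently at PSU level 11.2.0.1.0 and needs to be patched to the latest level, 11.2.0.1.6. The patch should be applied at a suitable opportunity.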