11G RAC 11.2.0.1.0实例evict故障处理
Aix 7.1
参考文档:
https://blogs.oracle.com/database4cn/rac
Resolving ORA-481 and "terminating the instance due to error 481" (Doc ID 1950963.1)
ORA-00481 After "The instance eviction reason is 0x2" due to Lack of Ticket (Doc ID 1644015.1)
归根结底是RAC节点之间消息传输流量控制的问题,除网络、硬件原因外,需要打补丁解决。
以下为第一次处理前具体的alertlog。
- ◆node1
- Wed May 24 01:59:24 2017
- Remote instance 2 kill is issued with system inc 228
- LMON received an instance eviction notification from instance 1
- The instance eviction reason is 0x2
- The instance eviction map is 2
- Reconfiguration started (old inc 228, new inc 230)
- List of instances:
- 1 (myinst: 1)
- Wed May 24 01:59:27 2017
- Trace dumping is performing id=[cdmp_20170524015904]
- Global Resource Directory frozen
- * dead instance detected - domain 0 invalid = TRUE
- Communication channels reestablished
- Master broadcasted resource hash value bitmaps
- Non-local Process blocks cleaned out
- Wed May 24 01:59:28 2017
- LMS 1: 14 GCS shadows cancelled, 1 closed, 0 Xw survived
- Wed May 24 01:59:28 2017
- LMS 0: 23 GCS shadows cancelled, 0 closed, 0 Xw survived
- Wed May 24 01:59:28 2017
- LMS 2: 21 GCS shadows cancelled, 0 closed, 0 Xw survived
- Set master node info
- Submitted all remote-enqueue requests
- Dwn-cvts replayed, VALBLKs dubious
- All grantable enqueues granted
- Post SMON to start 1st pass IR
- Wed May 24 01:59:32 2017
- Instance recovery: looking for dead threads
- Beginning instance recovery of 1 threads
- Wed May 24 01:59:48 2017
- parallel recovery started with 32 processes
- Started redo scan
- Wed May 24 01:59:49 2017
- Submitted all GCS remote-cache requests
- Post SMON to start 1st pass IR
- Fix write in gcs resources
- Reconfiguration complete
- Wed May 24 02:00:00 2017
- Completed redo scan
- read 2094379 KB redo, 237345 data blocks need recovery
- Wed May 24 02:00:03 2017
- Reconfiguration started (old inc 230, new inc 232)
- List of instances:
- 1 2 (myinst: 1)
- Global Resource Directory frozen
- Communication channels reestablished
- Master broadcasted resource hash value bitmaps
- Non-local Process blocks cleaned out
- Wed May 24 02:00:04 2017
- LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Wed May 24 02:00:04 2017
- Wed May 24 02:00:04 2017
- LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Set master node info
- Submitted all remote-enqueue requests
- Dwn-cvts replayed, VALBLKs dubious
- All grantable enqueues granted
- Post SMON to start 1st pass IR
- Wed May 24 02:02:27 2017
- ,
- ,此处大量12170
- ,
- Wed May 24 02:05:34 2017
- LMON (ospid: 35979900): terminating the instance due to error 481
- Wed May 24 02:05:34 2017
- System state dump is made for local instance
- System State dumped to trace file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_40894656.trc
- Instance terminated by LMON, pid = 35979900
- Wed May 24 08:12:27 2017
- Starting ORACLE instance (normal)
- sskgpgetexecname failed to get name
- LICENSE_MAX_SESSION = 0
- LICENSE_SESSIONS_WARNING = 0
- Interface type 1 en1 192.168.0.0 configured from GPnP Profile for use as a cluster interconnect
- Interface type 1 en0 10.209.199.0 configured from GPnP Profile for use as a public interface
- Picked latch-free SCN scheme 3
- Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
- Autotune of undo retention is turned on.
- LICENSE_MAX_USERS = 0
- SYS auditing is disabled
- Starting up:
- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
- With the Partitioning, Real Application Clusters, OLAP, Data Mining
- and Real Application Testing options.
- Using parameter settings in server-side pfile /u01/app/11.2.0/oracle/product/11.2.0/db_1/dbs/initorcl1.ora
- System parameters with non-default values:
- processes = 2000
- timed_statistics = TRUE
- sga_max_size = 70G
- spfile = "+DATA1/orcl/spfileorcl.ora"
- sga_target = 50G
- control_files = "+DATA1/orcl/controlfile/current.261.832429113"
- control_files = "+ARCDG/orcl/controlfile/current.259.832429113"
- db_block_size = 8192
- compatible = "11.2.0.0.0"
- log_archive_format = "%t_%s_%r.dbf"
- cluster_database = TRUE
- db_create_file_dest = "+DATA1"
- db_recovery_file_dest = "+ARCDG"
- db_recovery_file_dest_size= 2048G
- thread = 1
- undo_tablespace = "UNDOTBS1"
- undo_retention = 10800
- instance_number = 1
- remote_login_passwordfile= "NONE"
- db_domain = ""
- service_names = "orcl"
- dispatchers = "(PROTOCOL=TCP) (SERVICE=orclXDB)"
- local_listener = "(ADDRESS = (PROTOCOL = TCP)(HOST = 10.209.199.4)(PORT = 1521))"
- remote_listener = "rac-scan:1521"
- result_cache_max_size = 268736K
- audit_file_dest = "/u01/app/11.2.0/oracle/admin/orcl/adump"
- audit_trail = "DB"
- db_name = "orcl"
- open_cursors = 1000
- sql_trace = FALSE
- optimizer_index_caching = 90
- pga_aggregate_target = 20G
- deferred_segment_creation= FALSE
- aq_tm_processes = 5
- diagnostic_dest = "/u01/app/11.2.0/oracle"
- Deprecated system parameters with specified values:
- sql_trace
- End of deprecated system parameter listing
- Cluster communication is configured to use the following interface(s) for this instance
- 192.168.0.1
- cluster interconnect IPC version:Oracle UDP/IP (generic)
- IPC Vendor 1 proto 2
- Wed May 24 08:12:32 2017
- PMON started with pid=2, OS id=35652038
- Wed May 24 08:12:32 2017
- VKTM started with pid=3, OS id=33489120 at elevated priority
- VKTM running at (10)millisec precision with DBRM quantum (100)ms
- Wed May 24 08:12:32 2017
- GEN0 started with pid=4, OS id=34209898
- Wed May 24 08:12:32 2017
- DIAG started with pid=5, OS id=36438538
- Wed May 24 08:12:33 2017
- DBRM started with pid=6, OS id=24576446
- Wed May 24 08:12:33 2017
- PING started with pid=7, OS id=29687920
- Wed May 24 08:12:33 2017
- PSP0 started with pid=8, OS id=36635296
- Wed May 24 08:12:33 2017
- ACMS started with pid=9, OS id=34341462
- Wed May 24 08:12:33 2017
- DIA0 started with pid=10, OS id=32047870
- Wed May 24 08:12:33 2017
- LMON started with pid=11, OS id=36045304
- Wed May 24 08:12:35 2017
- LMD0 started with pid=12, OS id=31391860
- Wed May 24 08:12:35 2017
- LMS0 started with pid=13, OS id=35979918 at elevated priority
- Wed May 24 08:12:35 2017
- LMS1 started with pid=14, OS id=22741294 at elevated priority
- Wed May 24 08:12:36 2017
- LMS2 started with pid=15, OS id=29229354 at elevated priority
- Wed May 24 08:12:36 2017
- RMS0 started with pid=16, OS id=20185780
- Wed May 24 08:12:36 2017
- LMHB started with pid=17, OS id=29425998
- Wed May 24 08:12:36 2017
- MMAN started with pid=18, OS id=9699954
- Wed May 24 08:12:36 2017
- DBW0 started with pid=19, OS id=31850686
- Wed May 24 08:12:36 2017
- DBW1 started with pid=20, OS id=36045442
- Wed May 24 08:12:36 2017
- DBW2 started with pid=21, OS id=40894536
- Wed May 24 08:12:36 2017
- DBW3 started with pid=22, OS id=29819564
- Wed May 24 08:12:36 2017
- DBW4 started with pid=23, OS id=36634890
- Wed May 24 08:12:36 2017
- LGWR started with pid=24, OS id=35783370
- Wed May 24 08:12:37 2017
- CKPT started with pid=25, OS id=35914258
- Wed May 24 08:12:37 2017
- SMON started with pid=26, OS id=32374796
- Wed May 24 08:12:37 2017
- RECO started with pid=27, OS id=35652280
- Wed May 24 08:12:37 2017
- RBAL started with pid=28, OS id=36438322
- Wed May 24 08:12:37 2017
- ASMB started with pid=29, OS id=39387144
- Wed May 24 08:12:37 2017
- MMON started with pid=30, OS id=2621900
- NOTE: initiating MARK startup
- Wed May 24 08:12:37 2017
- Starting background process MARKMMNL started with pid=31, OS id=35979732
- starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
- Wed May 24 08:12:37 2017
- MARK started with pid=32, OS id=35717688
- NOTE: MARK has subscribed
- starting up 1 shared server(s) ...
- lmon registered with NM - instance number 1 (internal mem no 0)
- Reconfiguration started (old inc 0, new inc 236)
- List of instances:
- 1 2 (myinst: 1)
- Global Resource Directory frozen
- * allocate domain 0, invalid = TRUE
- Communication channels reestablished
- * domain 0 valid according to instance 2
- * domain 0 valid = 1 according to instance 2
- Master broadcasted resource hash value bitmaps
- Non-local Process blocks cleaned out
- LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Set master node info
- Submitted all remote-enqueue requests
- Dwn-cvts replayed, VALBLKs dubious
- All grantable enqueues granted
- Wed May 24 08:12:45 2017
- Submitted all GCS remote-cache requests
- Fix write in gcs resources
- Reconfiguration complete
- Wed May 24 08:12:52 2017
- LCK0 started with pid=34, OS id=24379778
- Wed May 24 08:12:52 2017
- Starting background process RSMN
- Wed May 24 08:12:52 2017
- RSMN started with pid=36, OS id=30802026
- ORACLE_BASE from environment = /u01/app/11.2.0/oracle
- Wed May 24 08:12:53 2017
- ALTER DATABASE MOUNT
- Wed May 24 08:12:53 2017
- NOTE: Loaded library: System
- Wed May 24 08:12:53 2017
- SUCCESS: diskgroup DATA1 was mounted
- SUCCESS: diskgroup ARCDG was mounted
- Wed May 24 08:12:53 2017
- NOTE: dependency between database orcl and diskgroup resource ora.DATA1.dg is established
- NOTE: dependency between database orcl and diskgroup resource ora.ARCDG.dg is established
- Wed May 24 08:12:57 2017
- Successful mount of redo thread 1, with mount id 1472569957
- Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
- Lost write protection disabled
- Completed: ALTER DATABASE MOUNT
- Wed May 24 08:12:58 2017
- ALTER DATABASE OPEN
- Block change tracking file is current.
- Picked broadcast on commit scheme to generate SCNs
- Wed May 24 08:12:58 2017
- SUCCESS: diskgroup DATA2 was mounted
- NOTE: dependency between database orcl and diskgroup resource ora.DATA2.dg is established
- SUCCESS: diskgroup DATA3 was mounted
- NOTE: dependency between database orcl and diskgroup resource ora.DATA3.dg is established
- Thread 1 advanced to log sequence 255944 (thread open)
- Thread 1 opened at log sequence 255944
- Current log# 2 seq# 255944 mem# 0: +ARCDG/orcl/onlinelog/group_2.261.840376035
- Current log# 2 seq# 255944 mem# 1: +ARCDG/orcl/onlinelog/group_2.2113.840376039
- Current log# 2 seq# 255944 mem# 2: +ARCDG/orcl/onlinelog/group_2.2114.840376041
- Current log# 2 seq# 255944 mem# 3: +ARCDG/orcl/onlinelog/group_2.2116.840376043
- Successful open of redo thread 1
- MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
- Starting background process CTWR
- Wed May 24 08:12:59 2017
- CTWR started with pid=39, OS id=32768642
- Block change tracking service is active.
- Wed May 24 08:13:00 2017
- SMON: enabling cache recovery
- Successfully onlined Undo Tablespace 2.
- Verifying file header compatibility for 11g tablespace encryption..
- Verifying 11g file header compatibility for tablespace encryption completed
- SMON: enabling tx recovery
- Database Characterset is AL32UTF8
- No Resource Manager plan active
- Starting background process GTX0
- Wed May 24 08:13:07 2017
- GTX0 started with pid=42, OS id=35062526
- Starting background process RCBG
- Wed May 24 08:13:07 2017
- RCBG started with pid=43, OS id=27721944
- replication_dependency_tracking turned off (no async multimaster replication found)
- Wed May 24 08:13:08 2017
- Starting background process QMNC
- Wed May 24 08:13:08 2017
- QMNC started with pid=44, OS id=34013894
- Completed: ALTER DATABASE OPEN
- Wed May 24 08:13:17 2017
- Starting background process CJQ0
- Wed May 24 08:13:17 2017
- CJQ0 started with pid=41, OS id=11469004
- Wed May 24 08:13:43 2017
- Starting background process SMCO
- Wed May 24 08:13:43 2017
- SMCO started with pid=120, OS id=42336480
- ,
- ,
- ,
- Wed May 24 09:08:16 2017
- Errors in file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_j001_33161436.trc:
- ORA-12012: 鑎仟錨狮鎊骚琛^啼絕茕竈239 錨呛閊証
- ORA-00001: 杩^蒎^蟐湾^辕涓^犁害鎊較哝^荨浠(CM.UK_LTE_ZY_CHECK_HISTORY)
- ORA-06512: 錨塄 "CM.JOB_LTE_ZY_CHECK", line 27
- ORA-06512: 錨塄 line 1
- ◆node2
- Wed May 24 01:59:01 2017
- LMS1 (ospid: 26411086) received an instance eviction notification from instance 1 [2]
- Wed May 24 01:59:01 2017
- LMON received an instance eviction notification from instance 1
- The instance eviction reason is 0x2
- The instance eviction map is 2
- Wed May 24 01:59:04 2017
- PMON (ospid: 50725354): terminating the instance due to error 481
- Wed May 24 01:59:04 2017
- System state dump is made for local instance
- System State dumped to trace file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_diag_42336686.trc
- Wed May 24 01:59:04 2017
- ORA-1092 : opitsk aborting process
- Wed May 24 01:59:04 2017
- License high water mark = 516
- Instance terminated by PMON, pid = 50725354
- USER (ospid: 50987430): terminating the instance
- Instance terminated by USER, pid = 50987430
- Wed May 24 01:59:29 2017
- Starting ORACLE instance (normal)
- sskgpgetexecname failed to get name
- LICENSE_MAX_SESSION = 0
- LICENSE_SESSIONS_WARNING = 0
- Interface type 1 en1 192.168.0.0 configured from GPnP Profile for use as a cluster interconnect
- Interface type 1 en0 10.209.199.0 configured from GPnP Profile for use as a public interface
- Picked latch-free SCN scheme 3
- Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
- Autotune of undo retention is turned on.
- LICENSE_MAX_USERS = 0
- SYS auditing is disabled
- Starting up:
- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
- With the Partitioning, Real Application Clusters, OLAP, Data Mining
- and Real Application Testing options.
- Using parameter settings in server-side pfile /u01/app/11.2.0/oracle/product/11.2.0/db_1/dbs/initorcl2.ora
- System parameters with non-default values:
- processes = 2000
- timed_statistics = TRUE
- sga_max_size = 70G
- spfile = "+DATA1/orcl/spfileorcl.ora"
- sga_target = 50G
- control_files = "+DATA1/orcl/controlfile/current.261.832429113"
- control_files = "+ARCDG/orcl/controlfile/current.259.832429113"
- db_block_size = 8192
- compatible = "11.2.0.0.0"
- log_archive_format = "%t_%s_%r.dbf"
- cluster_database = TRUE
- db_create_file_dest = "+DATA1"
- db_recovery_file_dest = "+ARCDG"
- db_recovery_file_dest_size= 2048G
- thread = 2
- undo_tablespace = "UNDOTBS2"
- undo_retention = 10800
- instance_number = 2
- remote_login_passwordfile= "NONE"
- db_domain = ""
- dispatchers = "(PROTOCOL=TCP) (SERVICE=orclXDB)"
- local_listener = "(ADDRESS = (PROTOCOL = TCP)(HOST = 10.209.199.5)(PORT = 1521))"
- remote_listener = "rac-scan:1521"
- result_cache_max_size = 268736K
- audit_file_dest = "/u01/app/11.2.0/oracle/admin/orcl/adump"
- audit_trail = "DB"
- db_name = "orcl"
- open_cursors = 1000
- sql_trace = FALSE
- optimizer_index_caching = 90
- pga_aggregate_target = 20G
- deferred_segment_creation= FALSE
- aq_tm_processes = 5
- diagnostic_dest = "/u01/app/11.2.0/oracle"
- Deprecated system parameters with specified values:
- sql_trace
- End of deprecated system parameter listing
- Cluster communication is configured to use the following interface(s) for this instance
- 192.168.0.2
- cluster interconnect IPC version:Oracle UDP/IP (generic)
- IPC Vendor 1 proto 2
- Wed May 24 01:59:34 2017
- PMON started with pid=2, OS id=66584656
- Wed May 24 01:59:34 2017
- VKTM started with pid=3, OS id=66846744 at elevated priority
- VKTM running at (10)millisec precision with DBRM quantum (100)ms
- Wed May 24 01:59:34 2017
- GEN0 started with pid=4, OS id=26608090
- Wed May 24 01:59:34 2017
- DIAG started with pid=5, OS id=26083716
- Wed May 24 01:59:34 2017
- DBRM started with pid=6, OS id=24510972
- Wed May 24 01:59:34 2017
- PING started with pid=7, OS id=65077306
- Wed May 24 01:59:34 2017
- PSP0 started with pid=8, OS id=66781402
- Wed May 24 01:59:34 2017
- ACMS started with pid=9, OS id=66978040
- Wed May 24 01:59:34 2017
- DIA0 started with pid=10, OS id=66519050
- Wed May 24 01:59:34 2017
- LMON started with pid=11, OS id=66453694
- Wed May 24 01:59:37 2017
- LMD0 started with pid=12, OS id=23658758
- Wed May 24 01:59:37 2017
- LMS0 started with pid=13, OS id=66322614 at elevated priority
- Wed May 24 01:59:37 2017
- LMS1 started with pid=14, OS id=65798210 at elevated priority
- Wed May 24 01:59:37 2017
- LMS2 started with pid=15, OS id=15204812 at elevated priority
- Wed May 24 01:59:38 2017
- RMS0 started with pid=16, OS id=65732820
- Wed May 24 01:59:38 2017
- LMHB started with pid=17, OS id=65339438
- Wed May 24 01:59:38 2017
- MMAN started with pid=18, OS id=52298036
- Wed May 24 01:59:38 2017
- DBW0 started with pid=19, OS id=52232458
- Wed May 24 01:59:38 2017
- DBW1 started with pid=20, OS id=65273872
- Wed May 24 01:59:38 2017
- DBW2 started with pid=21, OS id=65208354
- Wed May 24 01:59:38 2017
- DBW3 started with pid=22, OS id=65994980
- Wed May 24 01:59:38 2017
- DBW4 started with pid=23, OS id=61210752
- Wed May 24 01:59:38 2017
- LGWR started with pid=24, OS id=6095298
- Wed May 24 01:59:38 2017
- CKPT started with pid=25, OS id=65470648
- Wed May 24 01:59:38 2017
- SMON started with pid=26, OS id=64946374
- Wed May 24 01:59:38 2017
- RECO started with pid=27, OS id=51642854
- Wed May 24 01:59:39 2017
- RBAL started with pid=28, OS id=64618728
- Wed May 24 01:59:39 2017
- ASMB started with pid=29, OS id=64553208
- Wed May 24 01:59:39 2017
- MMON started with pid=30, OS id=64422054
- NOTE: initiating MARK startup
- Wed May 24 01:59:39 2017
- MMNL started with pid=31, OS id=42336716
- Starting background process MARK
- starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
- Wed May 24 01:59:39 2017
- MARK started with pid=32, OS id=51577168
- NOTE: MARK has subscribed
- Wed May 24 01:59:39 2017
- starting up 1 shared server(s) ...
- lmon registered with NM - instance number 2 (internal mem no 1)
- Reconfiguration started (old inc 0, new inc 232)
- List of instances:
- 1 2 (myinst: 2)
- Global Resource Directory frozen
- * allocate domain 0, invalid = TRUE
- Communication channels reestablished
- * domain 0 valid = 0 according to instance 1
- Master broadcasted resource hash value bitmaps
- Non-local Process blocks cleaned out
- LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Set master node info
- Submitted all remote-enqueue requests
- Dwn-cvts replayed, VALBLKs dubious
- All grantable enqueues granted
- Wed May 24 01:59:48 2017
- Submitted all GCS remote-cache requests
- Wed May 24 02:05:10 2017
- Trace dumping is performing id=[cdmp_20170524020534]
- Wed May 24 02:05:18 2017
- Reconfiguration started (old inc 232, new inc 234)
- List of instances:
- 2 (myinst: 2)
- Nested reconfiguration detected.
- Global Resource Directory frozen
- * dead instance detected - domain 0 invalid = TRUE
- Communication channels reestablished
- Master broadcasted resource hash value bitmaps
- Non-local Process blocks cleaned out
- Wed May 24 02:05:19 2017
- LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Wed May 24 02:05:19 2017
- LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Wed May 24 02:05:19 2017
- LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
- Set master node info
- Submitted all remote-enqueue requests
- Dwn-cvts replayed, VALBLKs dubious
- All grantable enqueues granted
- Post SMON to start 1st pass IR
- Submitted all GCS remote-cache requests
- Post SMON to start 1st pass IR
- Fix write in gcs resources
- Reconfiguration complete
- Wed May 24 02:05:25 2017
- LCK0 started with pid=34, OS id=51118396
- Wed May 24 02:05:25 2017
- Starting background process RSMN
- Wed May 24 02:05:25 2017
- RSMN started with pid=36, OS id=51773786
- ORACLE_BASE not set in environment. It is recommended
- that ORACLE_BASE be set in the environment
- Reusing ORACLE_BASE from an earlier startup = /u01/app/11.2.0/oracle
- Wed May 24 02:05:25 2017
- ALTER DATABASE MOUNT
- This instance was first to mount
- Wed May 24 02:05:26 2017
- NOTE: Loaded library: System
- Wed May 24 02:05:26 2017
- SUCCESS: diskgroup DATA1 was mounted
- Wed May 24 02:05:26 2017
- NOTE: dependency between database orcl and diskgroup resource ora.DATA1.dg is established
- SUCCESS: diskgroup ARCDG was mounted
- NOTE: dependency between database orcl and diskgroup resource ora.ARCDG.dg is established
- Wed May 24 02:05:30 2017
- Successful mount of redo thread 2, with mount id 1472569957
- Database mounted in Shared Mode (CLUSTER_DATABASE=TRUE)
- Lost write protection disabled
- Completed: ALTER DATABASE MOUNT
- ALTER DATABASE OPEN
- This instance was first to open
- Wed May 24 02:05:30 2017
- SUCCESS: diskgroup DATA2 was mounted
- NOTE: dependency between database orcl and diskgroup resource ora.DATA2.dg is established
- SUCCESS: diskgroup DATA3 was mounted
- NOTE: dependency between database orcl and diskgroup resource ora.DATA3.dg is established
- Block change tracking file is current.
- Beginning crash recovery of 2 threads
- parallel recovery started with 32 processes
- Started redo scan
- Wed May 24 02:05:44 2017
- Completed redo scan
- read 2158203 KB redo, 230150 data blocks need recovery
- Started redo application at
- Thread 1: logseq 255942, block 163388
- Thread 2: logseq 236286, block 2067920
- Recovery of Online Redo Log: Thread 1 Group 3 Seq 255942 Reading mem 0
- Mem# 0: +ARCDG/orcl/onlinelog/group_3.2085.840375427
- Mem# 1: +ARCDG/orcl/onlinelog/group_3.2084.840375445
- Mem# 2: +ARCDG/orcl/onlinelog/group_3.263.840375447
- Mem# 3: +ARCDG/orcl/onlinelog/group_3.2089.840375449
- Recovery of Online Redo Log: Thread 2 Group 7 Seq 236286 Reading mem 0
- Mem# 0: +ARCDG/orcl/onlinelog/group_7.2049.840374587
- Mem# 1: +ARCDG/orcl/onlinelog/group_7.2052.840374591
- Mem# 2: +ARCDG/orcl/onlinelog/group_7.2057.840374593
- Mem# 3: +ARCDG/orcl/onlinelog/group_7.2058.840374597
- Recovery of Online Redo Log: Thread 2 Group 5 Seq 236287 Reading mem 0
- Mem# 0: +ARCDG/orcl/onlinelog/group_5.2025.840374445
- Mem# 1: +ARCDG/orcl/onlinelog/group_5.2026.840374483
- Mem# 2: +ARCDG/orcl/onlinelog/group_5.2029.840374489
- Mem# 3: +ARCDG/orcl/onlinelog/group_5.2034.840374493
- Wed May 24 02:06:05 2017
- Completed redo application of 1702.07MB
- Completed crash recovery at
- Thread 1: logseq 255942, block 291036, scn 12495715727030
- Thread 2: logseq 236287, block 2742038, scn 12495715600520
- 230150 data blocks read, 229989 data blocks written, 2158203 redo k-bytes read
- Thread 1 advanced to log sequence 255943 (thread recovery)
- Picked broadcast on commit scheme to generate SCNs
- Wed May 24 02:06:06 2017
- Thread 2 advanced to log sequence 236288 (thread open)
- Thread 2 opened at log sequence 236288
- Current log# 6 seq# 236288 mem# 0: +ARCDG/orcl/onlinelog/group_6.2040.840374575
- Current log# 6 seq# 236288 mem# 1: +ARCDG/orcl/onlinelog/group_6.2041.840374579
- Current log# 6 seq# 236288 mem# 2: +ARCDG/orcl/onlinelog/group_6.2042.840374581
- Current log# 6 seq# 236288 mem# 3: +ARCDG/orcl/onlinelog/group_6.2048.840374585
- Successful open of redo thread 2
- MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
- Starting background process CTWR
- Wed May 24 02:06:07 2017
- CTWR started with pid=89, OS id=49349114
- Block change tracking service is active.
- Wed May 24 02:06:07 2017
- SMON: enabling cache recovery
- Successfully onlined Undo Tablespace 4.
- Verifying file header compatibility for 11g tablespace encryption..
- Verifying 11g file header compatibility for tablespace encryption completed
- SMON: enabling tx recovery
- Database Characterset is AL32UTF8
- No Resource Manager plan active
- Starting background process GTX0
- Wed May 24 02:06:11 2017
- GTX0 started with pid=98, OS id=42992042
- Starting background process RCBG
- Wed May 24 02:06:11 2017
- RCBG started with pid=52, OS id=40370648
- replication_dependency_tracking turned off (no async multimaster replication found)
- Starting background process QMNC
- Wed May 24 02:06:13 2017
- QMNC started with pid=119, OS id=47514086
- Completed: ALTER DATABASE OPEN
- SMON: Parallel transaction recovery tried
- Starting background process SMCO
- Wed May 24 02:06:17 2017
- SMCO started with pid=153, OS id=50724962
- Wed May 24 02:06:17 2017
- db_recovery_file_dest_size of 2097152 MB is 4.35% used. This is a
- user-specified limit on the amount of space that will be used by this
- database for recovery-related files, and does not reflect the amount of
- space available in the underlying filesystem or ASM diskgroup.
- Wed May 24 02:06:19 2017
- Starting background process CJQ0
- Wed May 24 02:06:19 2017
- CJQ0 started with pid=46, OS id=43581760
- Wed May 24 02:07:35 2017
- Thread 2 advanced to log sequence 236289 (LGWR switch)
- Current log# 7 seq# 236289 mem# 0: +ARCDG/orcl/onlinelog/group_7.2049.840374587
- Current log# 7 seq# 236289 mem# 1: +ARCDG/orcl/onlinelog/group_7.2052.840374591
- Current log# 7 seq# 236289 mem# 2: +ARCDG/orcl/onlinelog/group_7.2057.840374593
- Current log# 7 seq# 236289 mem# 3: +ARCDG/orcl/onlinelog/group_7.2058.840374597
- Wed May 24 02:08:09 2017
- Thread 2 advanced to log sequence 236290 (LGWR switch)
- Current log# 5 seq# 236290 mem# 0: +ARCDG/orcl/onlinelog/group_5.2025.840374445
- Current log# 5 seq# 236290 mem# 1: +ARCDG/orcl/onlinelog/group_5.2026.840374483
- Current log# 5 seq# 236290 mem# 2: +ARCDG/orcl/onlinelog/group_5.2029.840374489
- Current log# 5 seq# 236290 mem# 3: +ARCDG/orcl/onlinelog/group_5.2034.840374493
- Wed May 24 02:08:51 2017
- Thread 2 advanced to log sequence 236291 (LGWR switch)
- Current log# 6 seq# 236291 mem# 0: +ARCDG/orcl/onlinelog/group_6.2040.840374575
- Current log# 6 seq# 236291 mem# 1: +ARCDG/orcl/onlinelog/group_6.2041.840374579
- Current log# 6 seq# 236291 mem# 2: +ARCDG/orcl/onlinelog/group_6.2042.840374581
- Current log# 6 seq# 236291 mem# 3: +ARCDG/orcl/onlinelog/group_6.2048.840374585
- Wed May 24 02:09:30 2017
- Thread 2 advanced to log sequence 236292 (LGWR switch)
- Current log# 7 seq# 236292 mem# 0: +ARCDG/orcl/onlinelog/group_7.2049.840374587
- Current log# 7 seq# 236292 mem# 1: +ARCDG/orcl/onlinelog/group_7.2052.840374591
- Current log# 7 seq# 236292 mem# 2: +ARCDG/orcl/onlinelog/group_7.2057.840374593
- Current log# 7 seq# 236292 mem# 3: +ARCDG/orcl/onlinelog/group_7.2058.840374597
针对这个RAC集群,做了加多的配置修正、优化,主要是如下所述。
◆数据库配置优化
0、前期大量的日志切换无法完成,导致数据库挂起,为此新增了loggroup,并删掉无用多余的logmember。
1、增大了ASM实例的内存,由350M--->2G
2、RAC并行进程强制在本地实例分配
3、为减少对性能的影响,关闭audit审计
4、AWR 资料保存时间由7天增加到30天
5、SGA增大10G至60G
◆修改网络相关的系统核心参数
/usr/sbin/no -p -o tcp_sendspace=4194304
/usr/sbin/no -p -o tcp_recvspace=4194304
/usr/sbin/no -p -o rfc1323=1
/usr/sbin/no -p -o sb_max=8388608
/usr/sbin/no -p -o udp_ephemeral_low=9000
/usr/sbin/no -p -o tcp_ephemeral_low=9000
◆修改内联网卡的参数
chdev -l en1 -a tcp_sendspace=1048576 -a rfc1323=1 -a tcp_recvspace=1048576
◆SWAP
物理内存大于 16GB,但 SWAP 空间低于 16 GB,不符合 Oracle 安装的标准建议。增大PS。chps -s 192 hd6
◆时区及时间确认,安全起见,暂不修改。
◆ntp确认,使用GI集群自身的时间同步,确保各个节点的时间同步,目前是几十秒的偏差,不是问题。
◆HA集群配置修改,当初安装集群时部分内容没有按照官方文档实施,本次做了补救,但不全面,可能仍存在未知的风险。
5月25日重启后,经多一段时间的观察,有所改善,但于6月7日9点又发生一次节点2当。
但此次已经不是节点的驱逐,说明上次的修正与配置优化起到了作用,初步断定与网络参数配置修改相关。日志如下,无evict日志。
- Wed Jun 07 09:02:57 2017
- Errors in file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_lms0_7930030.trc (incident=1023882):
- ORA-00600: internal error code, arguments: [kjbrref:pkey], [3881577], [6], [7493495], [0], [], [], [], [], [], [], []
- Incident details in: /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/incident/incdir_1023882/orcl2_lms0_7930030_i1023882.trc
- Wed Jun 07 09:02:59 2017
- Trace dumping is performing id=[cdmp_20170607090259]
- Errors in file /u01/app/11.2.0/oracle/diag/rdbms/orcl/orcl2/trace/orcl2_lms0_7930030.trc:
- ORA-00600: internal error code, arguments: [kjbrref:pkey], [3881577], [6], [7493495], [0], [], [], [], [], [], [], []
- LMS0 (ospid: 7930030): terminating the instance due to error 484
- Instance terminated by LMS0, pid = 7930030
- Wed Jun 07 09:14:01 2017
- Starting ORACLE instance (normal)
针对此次的600内部错误,初步结论:
5月25日优化重启后,6月7日又发生过1次节点down的情况,经排查,基本确认是oracle BUG,在某些情况下会发生节点down。目前数据库PSU版本11.2.0.1.0,需要打补丁到最新版本11.2.0.1.6。后续需择机打补丁。
11G RAC 11.2.0.1.0实例evict故障处理的更多相关文章
- oracle 11G rac 11.2.0.1 打补丁9413827
这是升级到以后11.2.0.2.11.2.0.3.11.2.0.4的基础 主要参考两篇文章: Upgrade_11.2.0.1_GI_CRS_to_11.2.0.2_in_Linux.PDF文件 ht ...
- Oracle Database 11g Release 2(11.2.0.3.0) RAC On Redhat Linux 5.8 Using Vmware Workstation 9.0
一,简介 二,配置虚拟机 1,创建虚拟机 (1)添加三块儿网卡: 主节点 二节点 eth0: 公网 192.168.1.20/24 NAT eth0: 公网 192.168.1 ...
- Oracle 11g R2(11.2.0.4) RAC 数据文件路径错误解决--ORA-01157 ORA-01110: 数据文件
Oracle 11g R2(11.2.0.1) RAC 数据文件路径错误解决--ORA-01157 ORA-01110: 数据文件 oracle 11g R2(11.2.0.4) rac--scan ...
- Oracle Linux 6.3下安装Oracle 11g R2(11.2.0.3)
本文主要描写叙述了在Oracle Linux 6.3下安装Oracle 11gR2(11.2.0.3).从Oracle 11g開始,Oracle官方站点不再提供其Patch的下载链接,须要使用Meat ...
- Vmware Workstation实现CentOS6.10_x64 下ORACLE RAC 11.2.0.4的搭建
想必大家在学习ORACLE 11g时,都想搭建一个RAC的实验环境.在搭建RAC实验环境时,会碰到诸如IP怎么规划.虚拟机环境下怎么共享磁盘.ASM磁盘创建,以及安装过程中会遇到这样那样的问题.搭建一 ...
- ORACLE RAC 11.2.0.4 CentOS release 6.9 静默安装1.0版本
RAC11.2.0.4静默安装 1.0版本,20180613 #本文档IP地址使用X隐藏,个人可按照自己的当前环境IP进行适当修改 1. 清除原环境中的单实例软件 #清除原环境: 删除/etc/ora ...
- Oracle Study之--Oracle 单实例11.2.0.1.0升级到11.2.0.3.0
Oracle Study之--Oracle 单实例11.2.0.1.0升级到11.2.0.3.0 系统环境: 操作系统:RedHat EL6(64位) Oracle: Oracle 11gR2 ...
- Oracle 11.2.0.4单实例打PSU,OJVM PSU补丁快速参考
写在前面: 1.Oracel打每个补丁的操作有时存在差异,所以不管多熟悉,都应该在打任何补丁之前阅读新补丁中附带的readme. 2.Oracle每季度都会更新一个最新的PSU,本文最新指的是当前最新 ...
- oracle 11.2.0.4单实例文件系统安装与补丁
[TOC] 一,预安装处理 1.版本准备 操作系统:RHEL 6.5 数据库版本:Oracle 11.2.0.4 相关包:p13390677_112040_Linux-x86-64_1of7.zip ...
- Asianux 7.3安装Oracle 11.2.0.4单实例体验
环境:Asianux 7.3 需求:安装Oracle 11.2.0.4 单实例 背景:系统使用默认的最小安装部署,Oracle安装额外需要的包统一使用yum安装. 查看当前系统相关信息: [root@ ...
随机推荐
- 4. DI相关内容
我们先来思考 向一个类中传递数据的方式有几种? 普通方法(set 方法) 构造方法 依赖注入描述了在容器中建立 bean 与 bean 之间的依赖关系的过程,如果 bean 运行需要的是数字或字符串呢 ...
- python学习--解析网页
# -*- coding: utf-8 -*-"""Created on Thu Oct 17 14:04:21 2019 @author: DELL"&quo ...
- Python 运行 shell 命令的一些方法
哈喽大家好,我是咸鱼 我们知道,python 在自动化领域中被广泛应用,可以很好地自动化处理一些任务 就比如编写 Python 脚本自动化执行重复性的任务,如文件处理.数据处理.系统管理等需要运行其他 ...
- 如何修改电脑的BIOS密码?
本文介绍设置.修改Windows电脑BIOS模式密码的具体方法. 一般的,电脑默认都是不含有BIOS密码的,可以直接在开机时不输入任何密码进入BIOS模式:而在某些特定的场合,我们可能需要对其 ...
- fread()模板
char buf[1<<20],*p1,*p2;#define GC (p1==p2&&(p2=(p1=buf)+fread(buf,1,1<<20,stdin ...
- Linux 命令:grub2-mkconfig
检索这个命令的,肯定都知道 grub 是 bootloader 程序,用于引导系统启动.配置文件是 grub.conf,现在一般的 grub 版本是grub2. 当机器上安装有多个内核.或者多个操作系 ...
- 2021-10-09 Core学习
控制器学习 如果有ID参数,根据前面定义的{controller=Home}/{action=Index}/{id?} 可以换成一下格式 页面学习 视图 基架搭建 然后在nuget控制台添加 Add- ...
- ubuntu安装msf签名认证失败
添加命令 apt-get --allow-unauthenticated upgrade 来允许未认证签名软件安装,但是可能有恶意软件安装进来,可以使用 sudo apt-key adv --keys ...
- Nacos启动时报错No DataSource set排查
问题描述 最近在学习Nacos组件,使用的是最新版本:2.2.3. 在本地虚拟机CentOS 8.5.2111环境中安装Nacos,并使用standalone模式启动,同时配置使用外部MySQL数据库 ...
- 洛谷 P1122 最大子树和 题解
一道入门的树形DP. 首先我们对于数据进行有序化处理,这便于我们利用数据结构特点(可排序性)来发觉数据性质(有序.单调.子问题等等性质),以便于后续的转化.推理和处理.有序化可以"转化和创造 ...