Oracle 10.2.0.4 RAC crash on HP-UX

Fri Oct 30 10:04:32 2009
Errors in file /oracle/product/db/10.2/admin/xxdlyx/udump/xxdlyx1_ora_14693.trc:
ORA-00603: Message 603 not found; No message file for product=RDBMS, facility=ORA
ORA-27544: Message 27544 not found; No message file for product=RDBMS, facility=ORA
ORA-27300: Message 27300 not found; No message file for product=RDBMS, facility=ORA; arguments: [socket] [23]
ORA-27301: Message 27301 not found; No message file for product=RDBMS, facility=ORA; arguments: [File table overflow]
ORA-27302: Message 27302 not found; No message file for product=RDBMS, facility=ORA; arguments: [sskgxpcre1]
Fri Oct 30 10:04:32 2009
Trace dumping is performing id=[cdmp_20091030100432]
Fri Oct 30 10:04:43 2009
Process PZ97 died, see its trace file
Fri Oct 30 10:04:43 2009
Errors in file /oracle/product/db/10.2/admin/xxdlyx/udump/xxdlyx1_ora_14818.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-27544: Failed to map memory region for export
ORA-27300: OS system dependent operation:socket failed with status: 23
ORA-27301: OS failure message: File table overflow
ORA-27302: failure occurred at: sskgxpcre1
Fri Oct 30 10:12:53 2009
Errors in file /oracle/product/db/10.2/admin/xxdlyx/udump/xxdlyx1_ora_18058.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/dev/vgdlyx01/rsystem_4g'
ORA-27041: unable to open file
HPUX-ia64 Error: 23: File table overflow
Additional information: 3
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/dev/vgdlyx01/rsystem_4g'
ORA-27041: unable to open file
HPUX-ia64 Error: 23: File table overflow
Additional information: 3
ORA-01116: error in opening database file 24
ORA-01110: data file 24: '/dev/vgdlyx01/rvgyx_16g_019'
ORA-27041: unable to open file
HPUX-ia64 Error: 23: File table overflow
Additional information: 3
……
Sat Oct 31 16:23:14 2009
Process q002 died, see its trace file
Sat Oct 31 16:23:19 2009
ksvcreate: Process(q002) creation failed
Sat Oct 31 16:25:00 2009
IPC Send timeout detected. Receiver ospid 7729
Sat Oct 31 16:25:01 2009
Errors in file /oracle/product/db/10.2/admin/xxdlyx/bdump/xxdlyx1_lmd0_7729.trc:
Sat Oct 31 16:25:14 2009
Errors in file /oracle/product/db/10.2/admin/xxdlyx/bdump/xxdlyx1_diag_7718.trc:
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x9FFFFFFFFC560010], [], [], [], [], [], []
Sat Oct 31 16:25:34 2009
Shutting down instance (abort)
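
To see how often the instance hit the OS file table limit before the abort, the alert log can be scanned for entries that mention "File table overflow". The following is a minimal sketch, assuming the timestamp layout shown in the log above; the function name overflow_timestamps is made up for illustration and is not part of any Oracle tool.

```python
import re

# Timestamp lines in this alert log look like "Fri Oct 30 10:04:32 2009".
TS_RE = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$"
)


def overflow_timestamps(alert_text):
    """Return the distinct timestamps of entries mentioning 'File table overflow'."""
    hits = []
    current = None  # timestamp of the entry currently being read
    for raw in alert_text.splitlines():
        line = raw.strip()
        if TS_RE.match(line):
            current = line
        elif "File table overflow" in line and current is not None:
            # Record each timestamp once, even if the entry repeats the error.
            if not hits or hits[-1] != current:
                hits.append(current)
    return hits


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python scan_alert.py alert_xxdlyx1.log
    with open(sys.argv[1], errors="replace") as fh:
        for ts in overflow_timestamps(fh.read()):
            print(ts)
```

A growing list of timestamps in the hours before the "Shutting down instance (abort)" line suggests the file table was filling up gradually rather than in one burst.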

Metalink Note 739557.1 says:

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.3 to 11.1.0.6
HP-UX Itanium
HP-UX PA-RISC (64-bit)
HP IA64 HPUNIX
HP 9000 Series HP-UX (64-bit)
Symptoms
There are several symptoms for this issue:

The racgimon log reports these errors:
imon_r1024.log:

2008-07-04 16:18:24.791: [RACG][20] [25433][20][ora.r1024.r10241.inst]:
GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation:mmap failed with status: 12
GIM-00091: OS failure message: Not enough space
GIM-00092: OS failure occurred at: sskgmsmr_13

You may see a very large number of open file handles on the $ORACLE_HOME/dbs/hc_<SID>.dat file;
run lsof or glance to verify.
(See Note 390474.1 What Is The $ORACLE_HOME/dbs/hc_<ORACLE_SID>.dat File?)
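
The lsof check can be turned into a per-process count. The following is a minimal sketch, assuming lsof is installed on the node and was run with its field output option (for example `lsof -F pfn`), where each output line starts with a one-letter field id: p for the PID, f for the descriptor, n for the file name. The function name count_handles and the "/dbs/hc_" search string are illustrative choices, not taken from the note.

```python
from collections import Counter


def count_handles(lsof_field_output, needle):
    """Count open descriptors per PID whose file name contains `needle`.

    `lsof_field_output` is the text printed by lsof in -F field mode.
    """
    counts = Counter()
    pid = None  # PID of the process set currently being read
    for line in lsof_field_output.splitlines():
        if not line:
            continue
        tag, value = line[0], line[1:]
        if tag == "p":
            pid = int(value)
        elif tag == "n" and pid is not None and needle in value:
            counts[pid] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: lsof -F pfn | python count_hc.py
    for pid, n in count_handles(sys.stdin.read(), "/dbs/hc_").most_common(10):
        print(pid, n)
```

If one process (racgimon, according to the bug title quoted below) holds thousands of descriptors on the same hc_<SID>.dat file, that matches the leak described in the note.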

A RAC instance may crash with the following errors:

Alert.log
Tue Jun 3 09:21:32 2008
Errors in file /opt/ora10/app/oracle/admin/ABACUS1/udump/mydb1a_ora_20420.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-27544: Failed to map memory region for export
ORA-27300: OS system dependent operation:socket failed with status: 23
ORA-27301: OS failure message: File table overflow
ORA-27302: failure occurred at: sskgxpcre1
Tue Jun 3 09:22:49 2008
Shutting down instance (abort)
The system messages show the following errors around the time of the crash:

/var/adm/syslog/syslog.log
Jun 3 09:22:26 nPar0 vmunix: file: table is full
Jun 3 09:22:27 nPar0 vmunix: file: table is full
Changes
This occurs frequently after a RAC upgrade to 10.2.0.4, but can also occur in 10.2.0.3.
The issue also occurs on 11.1.0.6 and is fixed in 11.1.0.7.
Cause
The cause of this problem has been identified and verified in unpublished Bug 6931689. It is caused by an mmap error.

Bug 7235094 RACGIMON HAS FILE HANDLE LEAK ON HEALTHCHECK FILE
and
Bug 8203436 INSTANCE CAN'T STOP USING SRVCTL
were closed as duplicates of unpublished Bug 6931689.
Solution
To fix this issue, apply one of the following patches:
Patch 7298531 CRS MLR#2 ON TOP OF 10.2.0.4 FOR BUGS 6931689 7174111 6912026 7116314
or
Patch 7493592 CRS 10.2.0.4 Bundle Patch #2

Be aware that the fix has to be applied to the 10.2.0.4 database home to fix the problem.
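
To confirm that one of the two patches is present in a given home, the output of `opatch lsinventory` for that home can be searched for the patch numbers. The following is a minimal sketch, assuming the inventory listing has been saved to a text file; it only looks for the patch numbers as whole words and does not depend on the exact layout of the OPatch output, which varies between versions. The function name applied_fix_patches is hypothetical.

```python
import re

# Patch numbers quoted in the note; either one carries the fix.
FIX_PATCHES = ("7298531", "7493592")


def applied_fix_patches(inventory_text):
    """Return the fix patch numbers that appear as whole words in the text."""
    return [
        p for p in FIX_PATCHES
        if re.search(r"\b" + p + r"\b", inventory_text)
    ]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): opatch lsinventory > inv.txt
    #                                 python check_fix.py inv.txt
    with open(sys.argv[1], errors="replace") as fh:
        found = applied_fix_patches(fh.read())
    print("fix patches found:", ", ".join(found) if found else "none")
```

Run it once against the inventory of each home you care about, since the note stresses that the database home needs the fix as well.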
