Tuesday, June 21, 2011

Death by a thousand trace files


I found a large number of trace files being generated in the udump directory under the database's admin directory.

Each file looked like the following:
$ cat orcl_ora_12191.trc
/u01/app/oracle/admin/ORCL/udump/orcl_ora_12191.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name: SunOS
Node name: hostname
Release: 5.10
Version: Generic_144488-14
Machine: sun4u
Instance name: ORCL
Redo thread mounted by this instance: 1
Oracle process number: 389
Unix process pid: 12191, image: oracle@hostname

opiino: Attach failed! error=-1 ifvp=0

Neither support.oracle.com nor Google offered much help in diagnosing this error.

Oracle Support helped with some tracing, which showed that these were failed connection attempts rather than existing connections that had been lost.

With that information in hand, we correlated the timestamps of the trace files with the timestamps of connections in the listener log. This narrowed the issue down to a specific IP address, and from there to a specific application.
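
For anyone wanting to repeat that correlation, here is a rough Python sketch of the idea, not the exact procedure we used. It assumes the 10g text listener.log format, where each connection is logged as `DD-MON-YYYY HH:MM:SS * (CONNECT_DATA=...) * (ADDRESS=(PROTOCOL=tcp)(HOST=<client ip>)(PORT=...)) * establish * <service> * <rc>`. The function names, the two-second window and the use of file modification times as the trace timestamps are all my own assumptions.

```python
import glob
import os
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumed shape of a 10g listener.log "establish" entry; adjust if yours differs.
LINE_RE = re.compile(
    r"^(\d{2})-([A-Z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2}) \* .*"
    r"\(ADDRESS=\(PROTOCOL=\w+\)\(HOST=([^)]+)\)\(PORT=\d+\)\) \* establish \*"
)
MONTHS = {name: num for num, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}


def parse_listener_log(lines):
    """Yield (timestamp, client_address) for each 'establish' entry."""
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group(2) not in MONTHS:
            continue  # not a connection entry (e.g. service_update)
        day, mon, year, hh, mi, ss, host = m.groups()
        yield datetime(int(year), MONTHS[mon], int(day),
                       int(hh), int(mi), int(ss)), host


def trace_mtimes(pattern):
    """Modification times (local time) of the trace files matching pattern."""
    return [datetime.fromtimestamp(os.path.getmtime(p))
            for p in glob.glob(pattern)]


def suspects(trace_times, connections, window_seconds=2):
    """Count, per client address, how many trace files have a connection
    from that address within window_seconds of the trace timestamp."""
    connections = list(connections)
    window = timedelta(seconds=window_seconds)
    counts = Counter()
    for t in trace_times:
        nearby = {host for ts, host in connections if abs(ts - t) <= window}
        counts.update(nearby)  # each address counted once per trace file
    return counts.most_common()
```

Run on the database host (so both timestamps are in the same local time), something like `suspects(trace_mtimes("/u01/app/oracle/admin/ORCL/udump/*.trc"), parse_listener_log(open(listener_log_path)))` should put the offending client address at the top of the list, where `listener_log_path` is wherever your listener writes its log.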

Our lead application administrator was able to determine the root cause of the failed connections: an outdated CoreLab Oracle library used in developing the application.
From what I can tell, CoreLab Oracle is now dotConnect for Oracle from Devart. They resolved this issue in version 5.70.140.

Whew.
