After getting a Heartbeat Failure alert, I cross-checked the scxadmin status:
#scxadmin -status
scxcimserver: is running
scxcimprovagt: is stopped
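For anyone who wants to automate this check, here is a minimal sketch of how the status output could be filtered. It assumes `scxadmin -status` writes lines in exactly the `name: is stopped` form shown above to standard output; the function name `scx_stopped_components` is my own and is not part of the SCX agent.

```shell
# Hypothetical helper (not shipped with the SCX agent).
# Reads `scxadmin -status` style output on stdin and prints the name of
# every component whose line contains "is stopped".
scx_stopped_components() {
    # Split each line on ": " and print the component name (first field)
    # only for lines that report a stopped component.
    awk -F': *' '/is stopped/ { print $1 }'
}

# Example usage on a live system (assumes scxadmin is in PATH):
#   scxadmin -status | scx_stopped_components
```

If the function prints nothing, no component was reported as stopped in the text it was given.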
I also noticed that the SCOM process scxcimprovagt is generating core dump files on our production cluster nodes sprodn2 and sprodn1 (see below).
root@sprodn2:/:# uname -a;date
SunOS sprodn2 5.10 Generic_148888-05 sun4u sparc SUNW,SPARC-Enterprise
Tue Dec 24 12:18:49 AST 2013
root@sprodn2:/:#
root@sprodn2:/:# ls -lt core*
-rw------- 1 root root 66860506 Dec 19 07:10 core
core_21446:
total 148288
-rw------- 1 root root 75868658 Dec 20 15:36 core
root@sprodn2:/:#
root@sprodn2:/:# file /core
/core: ELF 32-bit MSB core file SPARC Version 1, from 'scxcimprovagt'
root@sprodn2:/:#
root@sprodn2:/:# file /core_21446/core
/core_21446/core: ELF 32-bit MSB core file SPARC Version 1, from 'scxcimprovagt'
root@sprodn2:/:#
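When several core files are lying around, the originating program can be pulled out of the `file` output instead of reading each line by eye. This is only a sketch: it assumes the `from '<program>'` wording shown in the two `file` results above, and `core_origin` is a name I made up for illustration.

```shell
# Hypothetical helper: reads `file` output on stdin and prints the program
# name that each core file came from (the text between from ' and ').
core_origin() {
    # Lines that do not describe a core file produce no output.
    sed -n "s/.*core file.*, from '\([^']*\)'.*/\1/p"
}

# Example usage (paths taken from the listing above):
#   file /core /core_21446/core | core_origin | sort | uniq -c
```

The `sort | uniq -c` at the end of the usage line just counts how many cores each program produced.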
root@sprodn2:/:# ls -lt /opt/microsoft/scx/bin/scxcimprovagt
-rwxr-xr-x 1 root root 54704 Mar 22 2011 /opt/microsoft/scx/bin/scxcimprovagt
root@sprodn2:/:#
root@sprodn2:/:# dbx /opt/microsoft/scx/bin/scxcimprovagt ./core
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
Reading scxcimprovagt
core file header read successfully
Reading ld.so.1
Reading libpegpmservice.so.1
Reading libpegprovidermanager.so.1
Reading libDefaultProviderManager.so.1
Reading libpegprovider.so.1
Reading libpegconfig.so.1
Reading libpegclient.so.1
Reading libpegqueryexpression.so.1
Reading libpegwql.so.1
Reading libpegquerycommon.so.1
Reading libpegcommon.so.1
Reading libpthread.so.1
Reading libdl.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading libxnet.so.1
Reading libCstd.so.1
Reading librt.so.1
Reading libpam.so.1
Reading libCrun.so.1
Reading libm.so.2
Reading libthread.so.1
Reading libc.so.1
Reading libpegprm.so.1
Reading libpegrepository.so.1
Reading libssl.so.0.9.7
Reading libcrypto.so.0.9.7
Reading libaio.so.1
Reading libmd.so.1
Reading libcmd.so.1
Reading libCstd_isa.so.1
Reading libssl_extra.so.0.9.7
Reading libcrypto_extra.so.0.9.7
Reading libc_psr.so.1
Reading libCMPIProviderManager.so.1
Reading libscf.so.1
Reading libdoor.so.1
Reading libuutil.so.1
Reading libgen.so.1
Reading libmp.so.2
Reading libBridgeWaysOracleProviderModule.so.21209.0.0
Reading libclntsh.so.11.1
Reading libnnz11.so
Reading libstdc++.so.6.0.3
Reading libgcc_s.so.1
Reading libkstat.so.1
Reading libresolv.so.2
Reading libsched.so.1
Reading libm.so.1
t@1 (l@1) terminated by signal ABRT (Abort)
0xfea4af84: __lwp_park+0x0014: bcc,a,pt %icc,__lwp_park+0x24 ! 0xfea4af94
(dbx) where
current thread: t@1
=>[1] __lwp_park(0x4, 0x0, 0x0, 0x0, 0xfec70000, 0x1), at 0xfea4af84
[2] mutex_lock_queue(0xfec52a00, 0x0, 0xfeac5a60, 0x0, 0x1c00, 0x1d3c), at 0xfea432e0
[3] malloc(0x6b1, 0x1, 0xea654, 0xfef95748, 0xfeac23f0, 0xfeacc5e0), at 0xfe9d7dd8
[4] Pegasus::AnonymousPipe::readMessage(0xffbffd04, 0xffbff664, 0xff104aa0, 0xff0f638c, 0x32800, 0x6b0), at 0xfef95fc0
[5] Pegasus::ProviderAgent::_readAndProcessRequest(0xffbffb88, 0x0, 0x10000000, 0x2ae98, 0x1a131, 0x13c00), at 0x15a10
[6] Pegasus::ProviderAgent::run(0xffbffb88, 0xffbff79c, 0xffbff784, 0xffbff710, 0x2b784, 0xffbffbc8), at 0x153dc
[7] main(0xff100e14, 0xc, 0x4e8f0, 0x2ae98, 0xfffed4d0, 0x800), at 0x18510
(dbx) quit
dbx: internal warning: td_ta_clear_event() failed -- debugger service failed
dbx: internal warning: td_ta_sync_tracking_enable(0) failed -- debugger service failed
root@sprodn2:/:#
root@sprodn2:/:# scxadmin -status
scxcimserver: is running
scxcimprovagt: 3 instances running
root@sprodn2:/:#
root@sprodn2:/:# scxadmin -restart
svc:/application/management/scx-cimd:default enabled.
svcadm: Instance "svc:/application/management/scx-cimd:default" is in maintenance state.
RETURN CODE: 3
root@sprodn2:/:# scxadmin -status
scxcimserver: is stopped
scxcimprovagt: is stopped
root@sprodn2:/:#
root@sprodn2:/:# scxadmin -start
svc:/application/management/scx-cimd:default enabled.
svcadm: Instance "svc:/application/management/scx-cimd:default" is in maintenance state.
RETURN CODE: 3
root@sprodn2:/:#
root@sprodn2:/:# svcs -a |grep scx
maintenance 12:35:03 svc:/application/management/scx-cimd:default
root@sprodn2:/:# svcadm disable svc:/application/management/scx-cimd:default
root@sprodn2:/:# svcs -a |grep scx
disabled 12:37:54 svc:/application/management/scx-cimd:default
root@sprodn2:/:# svcadm enable svc:/application/management/scx-cimd:default
root@sprodn2:/:#
root@sprodn2:/:# svcs -a |grep scx
online 12:38:06 svc:/application/management/scx-cimd:default
root@sprodn2:/:#
root@sprodn2:/:# scxadmin -status
scxcimserver: is running
scxcimprovagt: 1 instance running
root@sprodn2:/:#
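The recovery above was done with a disable followed by an enable. The usual SMF way to take an instance out of the maintenance state is `svcadm clear`, ideally after checking the reason with `svcs -xv`. Below is a small dry-run sketch that only prints the command it would suggest for a given state; the function name `smf_recovery_cmd` is hypothetical, and the state-to-command mapping is my own assumption about a sensible default, not something the SCX agent provides.

```shell
# Hypothetical dry-run helper: given an SMF state and an FMRI, print the
# svcadm command that would normally bring the instance back online.
# It never runs svcadm itself.
smf_recovery_cmd() {
    state=$1
    fmri=$2
    case "$state" in
        maintenance) echo "svcadm clear $fmri" ;;   # tell the restarter the fault is repaired
        disabled)    echo "svcadm enable $fmri" ;;  # instance was switched off
        online)      : ;;                           # nothing to do
        *)           echo "svcs -xv $fmri" ;;       # unknown state: ask SMF for an explanation
    esac
}

# Example usage on a live Solaris 10 system:
#   fmri=svc:/application/management/scx-cimd:default
#   smf_recovery_cmd "$(svcs -H -o state "$fmri")" "$fmri"
```

Clearing the maintenance state does not fix whatever made scxcimprovagt abort, so the core dumps still need to be investigated separately.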
- Edited by machu007, Tuesday, January 07, 2014 7:50 AM: added more details