This section describes a technique used by TXU on its AIX systems to automatically notify the enterprise error logging systems when hardware errors occur. The technique may use several notification mechanisms, but all of them are fed by the single error logging system built into AIX.
To configure the AIX Error Logging system, perform the following steps.
Create a file named "/tmp/hardware.add" containing the following Error Notification object definition:
errnotify:
        en_type = PERM
        en_class = "H"
        en_method = "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9"
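The stanza file can also be written from a script rather than by hand. The following is a minimal sketch, not part of the original procedure; the function name `write_hardware_stanza` is hypothetical. The point it illustrates is the quoted here-document delimiter, which keeps the shell from expanding `$1` through `$9` so they reach the ODM as literal placeholders.

```shell
# Write the Error Notification stanza to the file named by $1.
# The quoted 'EOF' delimiter disables parameter expansion inside the
# here-document, so $1..$9 are written literally.
write_hardware_stanza() {
    cat > "$1" <<'EOF'
errnotify:
        en_type = PERM
        en_class = "H"
        en_method = "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9"
EOF
}

# Example use on an AIX host (as root):
#   write_hardware_stanza /tmp/hardware.add && odmadd /tmp/hardware.add
```

With an unquoted delimiter the placeholders would be replaced by the calling script's own positional parameters, which are usually empty.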
To add the object to the Error Notification object class, run the command:
odmadd /tmp/hardware.add
The "odmadd" command adds the Error Notification object contained in "/tmp/hardware.add" to the errnotify file.
odmget -q"en_class='H' and en_type='PERM' and en_method='/usr/sbin/errnotify.ksh \$1 \$2 \$3 \$4 \$5 \$6 \$7 \$8 \$9'" errnotify
The "odmget" command locates the Error Notification object within the ODM that has an en_class value of "H", an en_type of "PERM", and an en_method of "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9", and displays the object. The following output is returned:
errnotify:
        en_pid = 0
        en_name = ""
        en_persistenceflg = 0
        en_label = ""
        en_crcid = 0
        en_class = "H"
        en_type = "PERM"
        en_alertflg = ""
        en_resource = ""
        en_rtype = ""
        en_rclass = ""
        en_symptom = ""
        en_err64 = ""
        en_dup = ""
        en_method = "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9"
Run the following command as "root":
savebase
odmdelete -q"en_class='H' and en_type='PERM' and en_method='/usr/sbin/errnotify.ksh \$1 \$2 \$3 \$4 \$5 \$6 \$7 \$8 \$9'" errnotify
The "odmdelete" command locates the Error Notification object within the ODM that has an en_class value of "H", an en_type of "PERM", and an en_method of "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9", and removes it from the Error Notification Object Class in the ODM.
NOTE: One problem with this error notification technique is that even though the "savebase" command may have been run to save the Error Notification object in the ODM, this ODM change is sometimes lost. To ensure the Error Notification method exists, the ODM should be checked for this method each time an AIX system is rebooted, and the method added if it does not exist. The script that checks the ODM for the PERMANENT HARDWARE Error Notification Method must exist on each AIX system and must have the file name, permissions, owner, and group as follows:
chmod 555 /usr/sbin/ckerrnotify.ksh
chown bin /usr/sbin/ckerrnotify.ksh
chgrp bin /usr/sbin/ckerrnotify.ksh
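The check described in the note can be sketched as follows. This is an illustration only and not the actual "ckerrnotify.ksh" source; the names `ERR_METHOD`, `ERR_QUERY`, `errnotify_present`, and `ensure_errnotify` are hypothetical. It assumes that "odmget" prints nothing when no object matches the query, and that the stanza file to restore from is passed as the first argument.

```shell
# Hypothetical check-and-restore helper; all names here are illustrative.
# Single quotes keep $1..$9 literal in the method string.
ERR_METHOD='/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9'
ERR_QUERY="en_class='H' and en_type='PERM' and en_method='${ERR_METHOD}'"

# Return 0 if the notification object is already in the ODM.
# Assumption: odmget produces no output when nothing matches.
errnotify_present() {
    odmget -q "$ERR_QUERY" errnotify | grep 'en_method' > /dev/null
}

# Add the object from stanza file $1 only when it is missing,
# then save the change with savebase.
ensure_errnotify() {
    if errnotify_present; then
        return 0
    fi
    odmadd "$1" && savebase
}

# Example use on an AIX host (as root):
#   ensure_errnotify /tmp/hardware.add
```

Checking before adding keeps the operation safe to repeat on every reboot, since running "odmadd" unconditionally would create duplicate objects.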
Script Source Code for "ckerrnotify.ksh"
This document contains the source code for the Disaster Recovery script "ckerrnotify.ksh".
This file last modified 02/04/09
To ensure the PERMANENT HARDWARE Error Notification Method is in the ODM, the "ckerrnotify.ksh" script must be executed every time the machine reboots. The following entry in the "/etc/inittab" will execute the "/etc/rc.local" script when the system is rebooted:
local:2:once:/etc/rc.local > /dev/console 2>&1
To add this entry to the "/etc/inittab" run the following command:
mkitab "local:2:once:/etc/rc.local > /dev/console 2>&1"
chmod 555 /etc/rc.local
chown bin /etc/rc.local
chgrp bin /etc/rc.local
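If this step is scripted, it helps to avoid adding the entry twice. The sketch below is an assumption-laden illustration: the function name `add_local_inittab` is hypothetical, and it relies on "lsitab" returning a non-zero exit status when no record with the given identifier exists.

```shell
# Add the rc.local entry only if no "local" record exists yet.
# Assumption: lsitab exits non-zero when the identifier is not found.
add_local_inittab() {
    if lsitab local > /dev/null 2>&1; then
        return 0
    fi
    mkitab "local:2:once:/etc/rc.local > /dev/console 2>&1"
}

# Example use on an AIX host (as root):
#   add_local_inittab
```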
The "ckerrnotify.ksh" script should be executed from within the "/etc/rc.local" script. A snippet of code from the "rc.local" script follows. This code checks to see if the "/usr/sbin/errnotify.ksh" script exists and is executable, and runs the "/usr/sbin/ckerrnotify.ksh" script to check the ODM:
... ... ...
Script Source Code for "odmrclocal.ksh"
This document contains the source code for the Disaster Recovery script "odmrclocal.ksh".
This file last modified 02/04/09
... ... ...
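Because the snippet itself is not reproduced here, the following is a minimal sketch of the guard the text describes, not the original "rc.local" code. The function name `run_odm_check` and its two optional path arguments are hypothetical; the arguments default to the paths named above and exist only so the sketch can be exercised outside a real AIX host.

```shell
# Run the ODM check only when both scripts are present and executable.
# $1 = notification script, $2 = ODM check script (defaults as in the text).
run_odm_check() {
    notify_script=${1:-/usr/sbin/errnotify.ksh}
    check_script=${2:-/usr/sbin/ckerrnotify.ksh}
    if [ -x "$notify_script" ] && [ -x "$check_script" ]; then
        "$check_script"
    else
        echo "rc.local: skipping ODM check, script missing or not executable" >&2
        return 1
    fi
}

# Example use inside /etc/rc.local:
#   run_odm_check
```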
The PERMANENT HARDWARE Error Notification Method specifies a script to run in the event an error of this type is logged to the AIX error log. This error notification script must exist on each AIX system and must have the file name, permissions, owner and group as follows:
chmod 555 /usr/sbin/errnotify.ksh
chown bin /usr/sbin/errnotify.ksh
chgrp bin /usr/sbin/errnotify.ksh
This script sends error notifications to designated targets such as Tivoli, e-mail, and OpenView.
Script Source Code for "errnotify.ksh"
This document contains the source code for the Disaster Recovery script "errnotify.ksh".
This file last modified 02/04/09
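As a rough illustration of the e-mail target only, and not the actual "errnotify.ksh" source, a notification method might look like the sketch below. AIX passes the error log sequence number as `$1` and the error label as `$9` to an errnotify method; the function name `send_error_mail` and the `NOTIFY_TO` recipient variable are hypothetical.

```shell
# Hypothetical recipient; override in the environment as needed.
NOTIFY_TO=${NOTIFY_TO:-root}

# Mail the detailed error report for one error log entry.
# $1 = sequence number, $9 = error label, as passed by the errnotify method.
send_error_mail() {
    seq_no=$1
    err_label=$9
    errpt -a -l "$seq_no" |
        mail -s "$(hostname): hardware error ${err_label}" "$NOTIFY_TO"
}

# Example use as the body of a notification script:
#   send_error_mail "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9"
```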
The following 3 records are added to each error message processed by the error notification script.
Machine Class: RS/6000
Machine Type: $( lsattr -El sys0 -a modelname | awk '{ print $2}' )
Operating System: AIX $( oslevel )
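The three records can be produced by a small helper that evaluates the command substitutions shown above. This is a sketch; the function name `machine_records` is hypothetical, and the commands are the ones already given in the text.

```shell
# Print the three records appended to each error message.
machine_records() {
    echo "Machine Class: RS/6000"
    echo "Machine Type: $( lsattr -El sys0 -a modelname | awk '{ print $2 }' )"
    echo "Operating System: AIX $( oslevel )"
}

# Example use: append the records to a message file.
#   machine_records >> /tmp/errmsg.txt
```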
Two other files are required by the PERMANENT HARDWARE Error Notification script. These files provide the communication mechanism to send messages from the script to Tivoli TEC and must exist on each AIX system in order for the error notification to work. Their actual location varies with the version of software installed, but the Error Notification script expects to run them under the following path names:
/usr/sbin/lcf_env.sh
/usr/sbin/wpostemsg
To provide consistency, a symbolic link is created to point to each of the required files. But before creating the link, the actual location of each file must be determined:
find / -name lcf_env.sh -print
find / -name wpostemsg -print
The results of the "find" commands are used with the symbolic link command to create a link to each of the required files:
ln -s <Full path to lcf_env.sh> /usr/sbin/lcf_env.sh
ln -s <Full path to wpostemsg> /usr/sbin/wpostemsg
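The find-then-link steps can be combined in a small helper. The sketch below is illustrative only; the function name `link_first_match` and its optional search-root argument are hypothetical. It assumes that the first regular file found is the one to link, which may not hold if several versions of the software are installed.

```shell
# Find the first regular file named $1 under $3 (default /) and
# create a symbolic link to it at $2.
link_first_match() {
    lfm_name=$1
    lfm_link=$2
    lfm_root=${3:-/}
    lfm_target=$(find "$lfm_root" -name "$lfm_name" -type f -print 2>/dev/null | head -n 1)
    if [ -z "$lfm_target" ]; then
        echo "not found: $lfm_name" >&2
        return 1
    fi
    ln -s "$lfm_target" "$lfm_link"
}

# Example use on an AIX host (as root):
#   link_first_match lcf_env.sh /usr/sbin/lcf_env.sh
#   link_first_match wpostemsg  /usr/sbin/wpostemsg
```

The `-type f` test skips existing symbolic links, so a previously created link is not mistaken for the real file.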