Mt Xia: Technical Consulting Group

Business Continuity / Disaster Recovery / High Availability
Data Center Automation / Audit Response / Audit Compliance

Permanent Hardware Error Notification Method

This document describes a technique used by TXU on its AIX systems to automatically notify the enterprise error logging systems when a hardware error occurs. Notification may be delivered through several mechanisms, but all of them are fed by the single error logging system built into AIX.

To configure the AIX Error Logging system, perform the following steps.

  1. Create a file called "/tmp/hardware.add" containing the following Error Notification object definition:

    
    
    errnotify:
         en_type = "PERM"
         en_class = "H"
         en_method = "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9"
    
    
    

    To add the object to the Error Notification object class, run the command:

    
    odmadd /tmp/hardware.add
    
    

    The "odmadd" command adds the Error Notification object defined in "/tmp/hardware.add" to the errnotify object class in the ODM.

  2. To verify that the Error Notification object was added to the object class, enter:

    
    odmget -q"en_class='H' and en_type='PERM' and en_method='/usr/sbin/errnotify.ksh \$1 \$2 \$3 \$4 \$5 \$6 \$7 \$8 \$9'" errnotify
    
    

    The "odmget" command locates the Error Notification object within the ODM that has an en_class value of "H", an en_type of "PERM", and an en_method of "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9", and displays the object. The dollar signs are escaped with backslashes on the command line so the shell passes them to "odmget" unexpanded. The following output is returned:

    
    
    errnotify:
        en_pid = 0
        en_name = ""
        en_persistenceflg = 0
        en_label = ""
        en_crcid = 0
        en_class = "H"
        en_type = "PERM"
        en_alertflg = ""
        en_resource = ""
        en_rtype = ""
        en_rclass = ""
        en_symptom = ""
        en_err64 = ""
        en_dup = ""
        en_method = "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9"
    
    

  3. To save the PERMANENT HARDWARE error notification object in the ODM so it will exist through successive reboots of the system, run the following command as "root":

    
    savebase
    
    

  4. If the PERMANENT HARDWARE Error Notification object ever needs to be removed from the Error Notification object class, enter:

    
    odmdelete -q"en_class='H' and en_type='PERM' and en_method='/usr/sbin/errnotify.ksh \$1 \$2 \$3 \$4 \$5 \$6 \$7 \$8 \$9'" errnotify
    
    

    The "odmdelete" command locates the Error Notification object within the ODM that has an en_class value of "H", an en_type of "PERM", and an en_method of "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9", and removes it from the Error Notification Object Class in the ODM.
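
The stanza file from step 1 can also be generated by a script. The sketch below is only an illustration: the function name "write_errnotify_stanza" is hypothetical and not part of the original procedure. It uses a here-document with a quoted delimiter so the shell does not expand $1 through $9 while writing the file, and it quotes the en_type value to match the "odmget" output shown in step 2.

```shell
# Hypothetical helper: write the Error Notification stanza to a file.
# Default path is the one used in step 1.
write_errnotify_stanza() {
    stanza_file=${1:-/tmp/hardware.add}
    # The quoted delimiter ('EOF') keeps $1 ... $9 literal in the output,
    # so the ODM receives the placeholders rather than empty strings.
    cat > "$stanza_file" <<'EOF'
errnotify:
     en_type = "PERM"
     en_class = "H"
     en_method = "/usr/sbin/errnotify.ksh $1 $2 $3 $4 $5 $6 $7 $8 $9"
EOF
}
```

A possible use on an AIX system would be: write_errnotify_stanza /tmp/hardware.add && odmadd /tmp/hardware.add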


NOTE: One problem with this error notification technique is that, even though the "savebase" command has been run, the Error Notification object is sometimes lost from the ODM after a reboot. (The "odmget" output in step 2 shows en_persistenceflg = 0; AIX documents objects with this value as non-persistent, meaning they are removed when the system restarts, so this is a likely cause.) To ensure the Error Notification method exists, the ODM should be checked for it each time an AIX system is rebooted, and the object added if it is missing. The script that performs this check must exist on each AIX system and must have the following file name, permissions, owner, and group:


chmod 555 /usr/sbin/ckerrnotify.ksh
chown bin /usr/sbin/ckerrnotify.ksh
chgrp bin /usr/sbin/ckerrnotify.ksh

Script Source Code for "ckerrnotify.ksh"

Source code for the Disaster Recovery script "ckerrnotify.ksh" (file last modified 02/04/09).
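
The source of "ckerrnotify.ksh" does not appear in this text. As a rough illustration of the check described in the NOTE above, and not the original script, the logic could look something like the following. The function name "ensure_errnotify", the variable name, and the reuse of "/tmp/hardware.add" as the stanza file are assumptions made for this sketch.

```shell
# Hypothetical check-and-add helper; this is NOT the original ckerrnotify.ksh.
# The query is the same one used with odmget in step 2.
ERRNOTIFY_QUERY="en_class='H' and en_type='PERM' and en_method='/usr/sbin/errnotify.ksh \$1 \$2 \$3 \$4 \$5 \$6 \$7 \$8 \$9'"

ensure_errnotify() {
    stanza_file=${1:-/tmp/hardware.add}
    # odmget prints the matching stanza, or nothing when no object matches.
    if odmget -q"$ERRNOTIFY_QUERY" errnotify 2>/dev/null | grep -q 'errnotify:'; then
        echo "errnotify object already present"
        return 0
    fi
    echo "errnotify object missing, adding it from $stanza_file"
    odmadd "$stanza_file"
}
```

On a system where the stanza file is kept in a permanent location, calling ensure_errnotify with that path at boot time would re-create the object only when it is absent.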


To ensure the PERMANENT HARDWARE Error Notification Method is in the ODM, the "ckerrnotify.ksh" script must be executed every time the machine reboots. The following entry in the "/etc/inittab" will execute the "/etc/rc.local" script when the system is rebooted:


local:2:once:/etc/rc.local > /dev/console 2>&1

To add this entry to "/etc/inittab" and to set the permissions, owner, and group of "/etc/rc.local", run the following commands:


mkitab "local:2:once:/etc/rc.local > /dev/console 2>&1"
chmod 555 /etc/rc.local
chown bin /etc/rc.local
chgrp bin /etc/rc.local
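
If the setup is run more than once, it may be useful to add the inittab record only when it is not already there. The sketch below assumes the AIX "lsitab" command prints the record for an identifier when one exists and prints nothing otherwise; the function name is hypothetical.

```shell
# Hypothetical wrapper: add the "local" inittab record only if it is missing.
add_rc_local_inittab() {
    # lsitab prints the record for the identifier, or nothing if none exists.
    if [ -n "$(lsitab local 2>/dev/null)" ]; then
        echo "inittab entry 'local' already exists"
    else
        mkitab "local:2:once:/etc/rc.local > /dev/console 2>&1"
    fi
}
```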

The "ckerrnotify.ksh" script should be executed from within the "/etc/rc.local" script. A snippet of code from the "rc.local" script follows. This code checks to see if the "/usr/sbin/errnotify.ksh" script exists and is executable, and runs the "/usr/sbin/ckerrnotify.ksh" script to check the ODM:


...
...
...


Script Source Code for "odmrclocal.ksh"

Source code for the Disaster Recovery script "odmrclocal.ksh" (file last modified 02/04/09).

... ... ...
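
The "rc.local" snippet itself is elided above. Going only by the description (check that "/usr/sbin/errnotify.ksh" exists and is executable, then run "/usr/sbin/ckerrnotify.ksh"), a minimal sketch might look like this. The function name and the extra check on the second script are assumptions, not taken from the original code.

```shell
# Hypothetical rc.local fragment; not the original odmrclocal.ksh.
run_ckerrnotify() {
    notify_script=${1:-/usr/sbin/errnotify.ksh}
    check_script=${2:-/usr/sbin/ckerrnotify.ksh}
    # Only check the ODM when the notification script can actually be run.
    if [ -x "$notify_script" ] && [ -x "$check_script" ]; then
        "$check_script"
    else
        echo "rc.local: skipping ODM check, $notify_script or $check_script is not executable" >&2
        return 1
    fi
}
```

Called with no arguments from "/etc/rc.local", the function uses the paths named in this document.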


The PERMANENT HARDWARE Error Notification Method specifies a script to run in the event an error of this type is logged to the AIX error log. This error notification script must exist on each AIX system and must have the file name, permissions, owner and group as follows:


chmod 555 /usr/sbin/errnotify.ksh
chown bin /usr/sbin/errnotify.ksh
chgrp bin /usr/sbin/errnotify.ksh

This script sends error notifications to designated targets such as Tivoli, e-mail, OpenView, etc.

Script Source Code for "errnotify.ksh"

Source code for the Disaster Recovery script "errnotify.ksh" (file last modified 02/04/09).
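
The source of "errnotify.ksh" does not appear in this text. To show how the nine arguments in the en_method string can be used, here is a small sketch based on the argument order documented for AIX error notification methods ($1 sequence number, $2 error ID, $3 class, $4 type, $5 alert flags, $6 resource name, $7 resource type, $8 resource class, $9 error label). The function name and the output layout are hypothetical.

```shell
# Hypothetical formatter for the nine notification arguments;
# not the original errnotify.ksh.
format_error_args() {
    cat <<EOF
Sequence Number: $1
Error ID: $2
Error Class: $3
Error Type: $4
Alert Flags: $5
Resource Name: $6
Resource Type: $7
Resource Class: $8
Error Label: $9
EOF
}
```

Inside a notification script, the result could be passed to any of the targets mentioned above, for example: format_error_args "$@" | mail -s "Hardware error on $(hostname)" root (the recipient here is only a placeholder).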


The following 3 records are added to each error message processed by the error notification script.


Machine Class: RS/6000
Machine Type: $( lsattr -El sys0 -a modelname | awk '{ print $2}' )
Operating System: AIX $( oslevel )
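
The three records above read as a template in which the $( ) parts are command substitutions evaluated on the local system. A minimal sketch of a function that emits them follows; the function name is hypothetical, and the commands are the ones shown in the records.

```shell
# Hypothetical helper that prints the three header records.
notification_header() {
    echo "Machine Class: RS/6000"
    # Second field of the lsattr output line, as in the record above.
    echo "Machine Type: $( lsattr -El sys0 -a modelname | awk '{ print $2 }' )"
    echo "Operating System: AIX $( oslevel )"
}
```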


Two other files are required by the PERMANENT HARDWARE Error Notification script. These files provide the communication mechanism used to send messages from the script to Tivoli TEC, and they must exist on each AIX system for the error notification to work. Their actual location varies with the version of the software installed, so the Error Notification script expects to find them at the following fixed paths:


/usr/sbin/lcf_env.sh
/usr/sbin/wpostemsg

To provide consistency, a symbolic link is created to point to each of the required files. But before creating the link, the actual location of each file must be determined:


find / -name lcf_env.sh -print
find / -name wpostemsg -print

The results of the "find" commands are used with the symbolic link command to create a link to each of the required files:


ln -s <Full path to lcf_env.sh> /usr/sbin/lcf_env.sh
ln -s <Full path to wpostemsg> /usr/sbin/wpostemsg
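
The find-then-link steps can be combined in a small helper. This is a sketch only: the function name and its parameters are hypothetical, and it simply takes the first match, which may not be the right one when several versions are installed.

```shell
# Hypothetical helper: locate a file by name and link it into a fixed directory.
link_required_file() {
    name=$1
    search_root=${2:-/}
    link_dir=${3:-/usr/sbin}
    # -type f skips existing symbolic links; only the first match is used.
    target=$(find "$search_root" -name "$name" -type f -print 2>/dev/null | head -n 1)
    if [ -z "$target" ]; then
        echo "$name not found under $search_root" >&2
        return 1
    fi
    ln -s "$target" "$link_dir/$name"
}
```

For example, link_required_file lcf_env.sh and link_required_file wpostemsg would search from "/" and create the links in "/usr/sbin". Reviewing the "find" output by hand first remains the safer approach.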
