Monday 28 April 2014

Automatic alerting on failed components with hpacucli on Linux

If you have an HP server, a practical way to get automatic alerts for a failed hard drive is the tool called hpacucli.

The command to check whether there are any failed components is:

# hpacucli ctrl all show config | egrep -i Failed

If everything is OK, this command prints nothing (egrep itself returns exit status 1, because nothing matched).
If a component has failed (for instance the cache battery), the output will look like this (the example uses show config detail):

# hpacucli ctrl all show config detail | egrep -i Failed
   Battery/Capacitor Status: Failed (Replace Batteries)
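
Since the check relies on whether egrep matched anything, a script can test the exit status of the pipeline directly instead of reading the output. The following is a minimal sketch, not the author's method: the function name report_failed is hypothetical, and it only assumes that failed components show up as lines containing "Failed", as in the output above.

```shell
#!/bin/bash
# report_failed: hypothetical helper, not part of hpacucli.
# Reads controller output on stdin and prints every line that mentions
# "failed" (case-insensitive). Its return status is grep's own:
# 0 if at least one line matched, 1 if none did.
report_failed() {
    grep -i 'failed'
}
```

It could be used as: if /usr/sbin/hpacucli ctrl all show config detail | report_failed; then echo "component failure detected"; fi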



You can create a script that runs this command periodically.
My script looks like this:

#!/bin/bash
# Count the lines that report a failed component and save the count for the monitoring system.
/usr/sbin/hpacucli ctrl all show config | egrep -i Failed | wc -l > /tmp/state_disks.txt
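
One detail of a plain redirect is that the shell truncates /tmp/state_disks.txt before hpacucli has finished, so a monitoring agent that reads the file at that moment may see it empty. A possible variation, sketched below under assumptions, writes the count to a temporary file first and then renames it over the state file. The function name write_state and the 644 permissions are illustrative choices, not something from the original script.

```shell
#!/bin/bash
# write_state: hypothetical helper. Reads controller output on stdin and
# replaces the state file with the number of lines mentioning "failed".
write_state() {
    local state_file="${1:-/tmp/state_disks.txt}"
    local tmp_file

    # Create the temporary file next to the target so that mv is a rename
    # on the same filesystem and readers never see a half-written file.
    tmp_file=$(mktemp "${state_file}.XXXXXX") || return 1

    # grep -c prints 0 and exits with status 1 when nothing matches;
    # "|| true" ignores that status because 0 is a valid result here.
    grep -ci 'failed' > "$tmp_file" || true

    # mktemp creates the file readable by its owner only; open it up so a
    # monitoring agent running as another user can read it (an assumption).
    chmod 644 "$tmp_file"
    mv -f "$tmp_file" "$state_file"
}
```

Usage would be /usr/sbin/hpacucli ctrl all show config | write_state /tmp/state_disks.txt, run periodically, for example from a cron entry that calls the script every few minutes.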


This writes the number of failed components to the file /tmp/state_disks.txt. Everything is OK when this number is 0. Your alerting system (Nagios, Zabbix, etc.) can read the content of this file and raise alerts based on it, or you can have the system send mail to the admin directly.
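
How the file is consumed depends on the monitoring system. As one illustration, the sketch below turns the state file into a Nagios-style check result, using the common plugin convention of exit code 0 for OK, 2 for CRITICAL and 3 for UNKNOWN. The function name check_state and the message texts are hypothetical; only the file path comes from the script above.

```shell
#!/bin/bash
# check_state: hypothetical helper that maps the state file to a
# Nagios-style result (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
check_state() {
    local state_file="${1:-/tmp/state_disks.txt}"
    local count

    # A missing or unreadable file usually means the collector has not run.
    if [ ! -r "$state_file" ]; then
        echo "UNKNOWN: cannot read $state_file"
        return 3
    fi

    # Strip whitespace, since some wc implementations pad their output.
    count=$(tr -d '[:space:]' < "$state_file")

    # Anything other than a plain non-negative integer is treated as unknown.
    case "$count" in
        ''|*[!0-9]*)
            echo "UNKNOWN: unexpected content in $state_file"
            return 3
            ;;
    esac

    if [ "$count" -eq 0 ]; then
        echo "OK: no failed components"
        return 0
    fi
    echo "CRITICAL: $count failed component(s)"
    return 2
}
```

Calling check_state with no argument reads /tmp/state_disks.txt. The printed line could also be piped to a mail command for direct notification, assuming a mailer is installed on the server.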

