This week’s TechMail, “Learning more about Nagios for server monitoring”, reviews the book Learning Nagios 3.0, published by Packt Publishing. It’s a pretty decent book for anyone interested in learning more about Nagios. It taught me a few new tricks, even though I’ve been using Nagios for some time. It’s not necessarily a good read, but it is a darn fine reference manual (although reading it front to back would be good for someone who really wants to get up to speed on the full power of Nagios quickly). Read the TechMail for the full review.
hddtemp wrapper for Nagios
I was bored tonight so I wrote a wrapper for hddtemp for Nagios monitoring. I have a bit of a quirky setup for Nagios where I run the local system checks on remote systems via netcat, ipsvd, and a script to handle the query. This allows me to monitor remote drive space, current users, total processes, and current load. Using hddtemp, I can now monitor the temperature of the drives in those machines (which also gives me an idea of how hot/cold the server room itself is).
This may need some tweaking to work with other Nagios setups, but it shouldn’t be too hard to adapt. One of these days I’ll do a writeup on my Nagios configuration. Anyway, the wrapper script is as follows. It could probably be optimized a bit more, but it works well enough. WordPress doesn’t handle the indents very well, so keep that in mind.
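#!/bin/sh
# hddtemp wrapper for Nagios: reports the temperature of each drive and
# flags WARN/CRITICAL when any drive reaches the given thresholds.

usage() {
    echo "${0} -w [warn] -c [crit] [drives]"
}

# Use "=" and separate tests: "==" and "-o" are not portable in /bin/sh.
if [ "${1}" = "-h" ] || [ "${1}" = "--help" ]; then
    usage
    exit 0
fi

# Warning threshold (required, must come first)
if [ "${1}" = "-w" ]; then
    shift
    warn="${1}"
    shift
else
    usage
    exit 1
fi

# Critical threshold (required, must come second)
if [ "${1}" = "-c" ]; then
    shift
    crit="${1}"
    shift
else
    usage
    exit 1
fi

# Everything left on the command line is a drive to check
while [ "${1}" != "" ]; do
    drives="${drives} ${1}"
    shift
done

if [ "${drives}" = "" ]; then
    usage
    exit 1
fi

status=0    # worst state seen so far: 0 OK, 1 WARN, 2 CRITICAL
smsg=""     # per-drive summary text
htemp=0     # highest temperature seen, reported as performance data

for drive in ${drives}; do
    stats=$(/usr/local/sbin/hddtemp "${drive}")
    # The temperature is the third colon-separated field; keep only the
    # digits so a degree sign or unit suffix can't break the numeric tests.
    temp=$(echo "${stats}" | cut -d ':' -f 3 | tr -cd '0-9')
    dev=$(echo "${drive}" | cut -d '/' -f 3)

    # No usable reading (e.g. hddtemp failed): report UNKNOWN and stop.
    if [ "${temp}" = "" ]; then
        echo "HDDTEMP UNKNOWN - could not read temperature for ${drive}"
        exit 3
    fi

    if [ "${temp}" -ge "${warn}" ]; then
        if [ "${status}" != "2" ]; then
            status=1
        fi
    fi
    if [ "${temp}" -ge "${crit}" ]; then
        status=2
    fi
    if [ "${temp}" -gt "${htemp}" ]; then
        htemp="${temp}"
    fi
    smsg="${smsg}${dev}=${temp}C; "
done

case "${status}" in
    2)
        wmsg="CRITICAL"
        ;;
    1)
        wmsg="WARN"
        ;;
    0)
        wmsg="OK"
        ;;
esac

# Status text, then performance data after the pipe: value;warn;crit;min
echo "HDDTEMP ${wmsg} - ${smsg}|hddtemp=${htemp};${warn};${crit};0"

# Standard Nagios setups read the state from the exit code, not the text.
exit "${status}"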
The output in Nagios’ status view looks like this (Nagios replaces the semicolons in plugin output with colons):
HDDTEMP OK - hda=22C: sda=24C: sdb=24C:
It’s called as “hddtemp-mon -w 30 -c 35 /dev/hda /dev/sda /dev/sdb”.
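If you want to poke at the parsing and threshold logic without a real drive handy, here is a minimal sketch that pulls those two steps out into functions. The names parse_temp and classify are made up for this example and are not part of the wrapper above, the sample line is invented, and the sketch assumes hddtemp prints "device: model: temperature" with no extra colons in the model name.

```shell
#!/bin/sh
# Sketch only: parse_temp and classify are hypothetical helper names.

# Extract the integer temperature from one line of hddtemp-style output.
# Assumes the temperature is the third colon-separated field.
parse_temp() {
    # Keep digits only, so "24 C" and "24°C" both yield 24.
    printf '%s\n' "$1" | cut -d ':' -f 3 | tr -cd '0-9'
}

# Map a temperature to a Nagios state code:
# 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN (no usable reading).
classify() {
    temp="$1"
    warn="$2"
    crit="$3"
    case "${temp}" in
        ''|*[!0-9]*)
            echo 3
            return
            ;;
    esac
    if [ "${temp}" -ge "${crit}" ]; then
        echo 2
    elif [ "${temp}" -ge "${warn}" ]; then
        echo 1
    else
        echo 0
    fi
}

# Invented sample line, using the same thresholds as the call above.
line='/dev/sda: WDC WD800JD: 24 C'
t=$(parse_temp "${line}")
echo "temp=${t} state=$(classify "${t}" 30 35)"
```

With the sample line this prints "temp=24 state=0". Checking the critical threshold first keeps the comparison chain short, and treating an empty reading as UNKNOWN avoids feeding an empty string to the numeric tests.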

