I've been very happy with a free, Perl-based
monitoring package called "mon", available from:
http://www.kernel.org/software/mon/
It can check everything we are interested in from
one place, and it is very configurable: it can nag
us at intervals, or just tell us when something
goes down and when it comes back up again.
Mon can contact us via e-mail or pager, which is
very handy.
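To make the two notification styles concrete
(nagging at intervals versus reporting only "down"
and then "up"), here is a rough Python sketch. It
is not mon's own code or configuration; the class
name AlertPolicy and the nag_interval parameter are
made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertPolicy:
    """Hypothetical notification policy, not part of mon itself."""

    # Seconds between repeat alerts while a service stays down.
    # None means "only report the down and up transitions".
    nag_interval: Optional[float] = None
    was_up: bool = True
    last_alert: Optional[float] = None

    def update(self, is_up: bool, now: float) -> Optional[str]:
        """Record one check result; return a message kind or None."""
        if is_up:
            recovered = not self.was_up
            self.was_up = True
            self.last_alert = None
            return "up" if recovered else None
        if self.was_up:
            # First failed check after a healthy one: always alert.
            self.was_up = False
            self.last_alert = now
            return "down"
        if (
            self.nag_interval is not None
            and self.last_alert is not None
            and now - self.last_alert >= self.nag_interval
        ):
            # Still down and the nag interval has elapsed.
            self.last_alert = now
            return "still down"
        return None
```

With nag_interval=600 a service that stays down
produces a reminder every ten minutes; with the
default of None you only hear about the outage and
the recovery.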
It's still true that there are a lot of false
reports, though: hiccups in the system, or services
that go down just briefly and come back up on their
own.
We run mon on a server that runs no
mission-critical applications, so we don't end up
in the catch-22 of losing the mission-critical
apps and the monitoring software at the same time.
We also wrote a small script that emails us once a
day to say that the monitoring software is still
running, so we don't have to worry that mon has
quit running for some reason.
-mark
Summersault
website desi...