I've been very happy with a free, Perl-based monitoring tool called "mon", available from http://www.kernel.org/software/mon/. It can check everything we are interested in from one place and is very configurable: it can nag us at intervals, or just tell us when something goes down and again when it comes back up. Mon can contact us via e-mail or pager, which is very handy. We do still get a lot of false reports, though, from hiccups in the system or from services that go down briefly and come back up on their own. We run mon on a server that hosts no mission-critical applications, so we don't end up in the catch-22 of losing the mission-critical apps and the monitoring software at the same time. We also wrote a small script that e-mails us once a day to confirm that mon is still running, so we don't have to worry that it has quietly stopped for some reason.
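
A daily heartbeat of the kind described above could be sketched roughly as follows. This is a Python illustration, not the original script: the function names, the e-mail addresses, the `pgrep -f` process check, and the use of an SMTP relay on localhost are all assumptions made for the example.

```python
import smtplib
import socket
import subprocess
from datetime import datetime
from email.message import EmailMessage


def process_is_running(pattern):
    """Return True if `pgrep -f pattern` matches at least one process.

    Assumption: the host has pgrep, and `pattern` matches the command
    line the monitoring daemon was started with.
    """
    result = subprocess.run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def build_heartbeat(running, host, when, sender, recipient):
    """Build the daily status e-mail without sending it."""
    status = "running" if running else "NOT running"
    msg = EmailMessage()
    msg["Subject"] = f"[heartbeat] mon is {status} on {host}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Heartbeat from {host} at {when:%Y-%m-%d %H:%M}: mon is {status}.\n"
    )
    return msg


def send_heartbeat(pattern="mon", sender="mon@example.org",
                   recipient="admins@example.org", smtp_host="localhost"):
    """Check the process once and mail the result (hypothetical defaults)."""
    msg = build_heartbeat(
        process_is_running(pattern),
        socket.gethostname(),
        datetime.now(),
        sender,
        recipient,
    )
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

Calling `send_heartbeat()` once a day from cron would do the job. The message goes out whether or not mon is running, so a missing daily e-mail is itself a signal that the monitoring host or the script has a problem.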