Learn me to trust my own fucking logs, will you.
One of the more useful monitoring tools I've got is a simple shell-wrapped "HEAD" script that polls our cluster and reports an "OK" or "ERR" (slow responses trigger a "Hrm.."), along with the current, median, and standard deviation of the response time, and total error counts. That sits in an omnipresent, always-on-top small-font terminal window.
Something like:
2012-03-30 12:03 i=9948
Host  Status   Cur   Med    sd  Err
www   OK      0.22  0.24  0.44    6
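For anyone who wants to build something similar, here is a rough Python sketch of the idea rather than the actual script, which is shell wrapped around a HEAD tool. Everything named here is an assumption made for illustration: the `SLOW_SECONDS` cutoff (the real threshold for "Hrm.." isn't given above), the column widths, and the helper names `probe`, `status_for`, `format_row`, and `watch`.

```python
import statistics
import time
import urllib.request
from datetime import datetime

SLOW_SECONDS = 1.0  # assumed cutoff for "Hrm.."; the real threshold is not stated
HEADER = f"{'Host':<6}{'Status':<7}{'Cur':>6}{'Med':>6}{'sd':>6}{'Err':>5}"


def probe(url, timeout=5.0):
    """Send one HEAD request and return (elapsed_seconds, ok)."""
    req = urllib.request.Request(url, method="HEAD")
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=timeout):
            ok = True
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        ok = False
    return time.monotonic() - start, ok


def status_for(elapsed, ok):
    """Map one probe result to the status word shown in the terminal."""
    if not ok:
        return "ERR"
    return "Hrm.." if elapsed > SLOW_SECONDS else "OK"


def format_row(host, samples, elapsed, ok, errors):
    """Build one report line: status, current, median, sd, error count."""
    med = statistics.median(samples)
    # stdev needs at least two samples, so report 0.0 until then.
    sd = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return (f"{host:<6}{status_for(elapsed, ok):<7}"
            f"{elapsed:>6.2f}{med:>6.2f}{sd:>6.2f}{errors:>5d}")


def watch(url, host, interval=5.0, iterations=3):
    """Poll url a few times, printing a report after each probe."""
    samples, errors = [], 0
    for i in range(1, iterations + 1):
        elapsed, ok = probe(url)
        samples.append(elapsed)
        errors += 0 if ok else 1
        print(f"{datetime.now():%Y-%m-%d %H:%M} i={i}")
        print(HEADER)
        print(format_row(host, samples, elapsed, ok, errors))
        time.sleep(interval)
```

Calling `watch("http://www.example.com/", "www")` would print three reports in roughly the layout above. Note that this sketch counts failed requests in the timing samples too, which may or may not match what the original does.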