We are looking for a network monitoring solution to replace our current one, Xymon, with something more robust. We will test various candidates until we find one that meets the requirements outlined below.
Whichever monitoring system(s) we select must (at minimum) be able to:
The new solution will provide features that our existing network monitor lacks, including configuration that can be managed externally and automated, useful graphical statistics (uptime, resource usage, etc.), and the other requirements listed above.
By the end of the semester, we hope to have found a suitable replacement for Xymon, or at the very least to have narrowed down the candidates by testing several of them. Ideally, our candidate successor will be configured to Chris A's standards and accompanied by documentation for setting it up in exactly the same manner. Several shorter-term goals, outlined below, will contribute to this.
We will require two blank machines on which to install the tools we test and, later on, an HTTP server that we can turn on and off at will.
There are those who argue (persuasively, in Chris A's view) that monitoring for the purpose of alerting about problems (e.g., "wake someone up so they fix it") and monitoring for the purpose of graphing trends over time (e.g., tracking resource consumption, planning new purchases and deployments) are two fundamentally different jobs, and are best served by different systems. It would obviously be easier to deploy only one system that does both tasks, but if the choice is between one system that does both tasks poorly and two systems that each do one task well, we should keep this distinction in mind.
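To make the distinction concrete, here is a minimal Python sketch of the two jobs working on the same measurements. The class names (`AlertCheck`, `TrendRecorder`), the threshold, and the sample values are all hypothetical and are not taken from Xymon or any candidate tool; the point is only that alerting cares about state changes right now, while trending cares about keeping every sample for later analysis.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class AlertCheck:
    """Alerting job: tracks current state and reports only transitions."""
    threshold: float       # hypothetical limit, e.g. percent disk usage
    failing: bool = False  # state remembered from the previous sample

    def observe(self, value: float) -> Optional[str]:
        now_failing = value > self.threshold
        event = None
        if now_failing and not self.failing:
            event = "ALERT"        # healthy -> failing: wake someone up
        elif not now_failing and self.failing:
            event = "RECOVERED"    # failing -> healthy: stand down
        self.failing = now_failing
        return event


@dataclass
class TrendRecorder:
    """Trending job: stores every sample so it can be graphed or summarized."""
    samples: List[float] = field(default_factory=list)

    def observe(self, value: float) -> None:
        self.samples.append(value)

    def average(self) -> float:
        return mean(self.samples)


if __name__ == "__main__":
    alert = AlertCheck(threshold=90.0)
    trend = TrendRecorder()
    for usage in [40.0, 95.0, 97.0, 50.0]:  # made-up usage percentages
        event = alert.observe(usage)
        trend.observe(usage)
        if event is not None:
            print(event, "at", usage)
    print("average usage:", trend.average())
```

Note that the alerting side discards history (it keeps a single boolean), while the trending side never raises an event. A real deployment would differ in many ways, but this asymmetry is one reason the two jobs may be better served by separate systems.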
Note: These are options to consider. You should not limit yourself to just these systems, nor should you assume that all of these systems are good options.