I'm trying to use the check_mk WATO API to configure a "scheduled downtime" when a machine is stopped, for example, or when using OpsWorks time-based instances, to avoid false alarms. I would also like to start monitoring new instances automatically, and stop monitoring them when they are terminated.
The hosts run Docker containers started by fig/docker-compose. Each fig.yml has its own 'monitor' container, which is configured to monitor all the containers running on that host. This way I can monitor the usual things on the host (CPU, disk, load, etc.) and need only one HTTP check per host, against my monitor container.
This is configured using the OpsWorks "custom JSON" and Chef.
I saw that there is a whole new world of monitoring tools out there (Prometheus, Bosun, Shinken, etc.) as well as SaaS offerings like Boundary, Datadog, etc.
My primary concern is to alert the sysadmins when an HTTP service (the monitor container) on one of the hosts returns anything other than an HTTP 200 OK status code, for example.
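To make the requirement concrete, the check I have in mind is roughly the following minimal Python sketch. It only uses the standard library; the function names `fetch_status` and `find_failures` and the host-to-URL mapping are hypothetical and just for illustration, not part of check_mk or any of the tools mentioned.

```python
import http.client
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, refused connection, timeout, broken response, etc.
        return None


def find_failures(monitor_urls, fetch=fetch_status):
    """Map each host whose monitor container did not answer 200 to its status.

    monitor_urls is a dict of host name -> monitor container URL (hypothetical
    layout). A status of None means the host could not be reached at all.
    """
    failures = {}
    for host, url in monitor_urls.items():
        status = fetch(url)
        if status != 200:
            failures[host] = status
    return failures
```

Calling `find_failures({"host-a": "http://host-a.example.com:8080/"})` (a made-up host and port) would return an empty dict when everything is healthy, and otherwise the hosts that should trigger an alert.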
Which tool or service would you suggest?
Thanks