Docker monitoring, and container monitoring in general, has historically been difficult. There has been a lot of progress in the last year or so to beef up container monitoring tools, but in my experience the tools have been either expensive or difficult to configure and complicated to use. The combination of Rancher and Prometheus has finally given me hope. Now it is easy to set up and configure a distributed monitoring solution without paying a high price.
Prometheus has recently added support for Rancher via the Rancher exporter, which is great news. This is by far the easiest method I have discovered thus far for experimenting with Prometheus.
For those who don’t know much about Prometheus, it is an up-and-coming project created by engineers at SoundCloud and hosted on GitHub. Prometheus is focused on monitoring, with particular attention to container and Docker monitoring. It uses a polling-based model for “scraping” metrics from predefined endpoints. The Prometheus Rancher exporter enables Prometheus to scrape Rancher server specific metrics, which are very useful to have. It is also worth mentioning that Prometheus has a very nice, flexible design built upon different client libraries, in a similar way to Graphite, so instrumenting code and adding support for different types of platforms is easy. Check out the list of exporters in the Prometheus docs for ideas on how to get started exporting metrics.
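To make the scraping model a little more concrete, here is a minimal sketch of an endpoint that Prometheus could poll. It uses only the Python standard library rather than an official client library, and the metric names, values, and port are made up for illustration. The only part taken from Prometheus itself is the plain-text exposition format (the HELP and TYPE comment lines followed by a name and value).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metrics: name -> (type, help text, current value).
# These names and numbers are invented for this example.
METRICS = {
    "demo_requests_total": ("counter", "Total requests handled by the demo app.", 42),
    "demo_queue_depth": ("gauge", "Items currently waiting in the demo queue.", 3),
}


def render_metrics(metrics):
    """Render metrics in the Prometheus plain-text exposition format."""
    lines = []
    for name, (kind, help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {kind}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the rendered metrics on /metrics and 404 on anything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and pointing a Prometheus scrape target at port 8000 of that host would be enough for Prometheus to start polling it. In practice the official client libraries and exporters take care of this for you, so treat this as an illustration of the idea, not a recommended way to instrument an application.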
This post won’t cover setting up Rancher server or any of the Rancher environment since that is well documented in other places. I won’t touch on alerting here either because I honestly haven’t had much time to dig into it yet. With that said, the first step I will focus on in this post is getting Prometheus set up and running. Luckily this is extremely easy to accomplish using the Rancher catalog and the Prometheus template.
Once Prometheus has been bootstrapped and everything is up, test it out by navigating to the Grafana home dashboard created by the bootstrap process. Since this is a simple demo, my dashboard is located at the IP of the server using port 3000 which is the only port that should need to be publicly exposed if you are interested in sharing the Grafana dashboard.
The default Grafana credentials for this catalog template are admin/admin for the username and password, which is noted in the catalog notes found here. The Prometheus tools ship with some nice preconfigured dashboards, so after you have things set up, it is definitely worth checking out some of them.
If you look around the dashboards you will probably notice that metrics for the Rancher server aren’t available by default. To enable these metrics we need to configure Prometheus to connect to the Rancher API, as noted in the Rancher monitoring guide.
Navigate to http://<SERVER_IP>:8080/v1/settings/graphite.host on your Rancher server, click edit in the top right, and then update the value to point to the address of the server where InfluxDB was deployed.
After this setting has been configured, restart the Rancher server container, wait a few minutes and then check Grafana.
As you can see, metrics are now flowing into the dashboard.
Now that we have the basics configured, we can drill down into individual containers to get a more granular view of what is happening in the environment. This type of granularity is great because it gives a very detailed view of exactly what is going on inside our environment and gives us an easy way to share visuals with other team members. Prometheus offers a web interface for interacting with the query language and visualizing results, which is useful for figuring out what kinds of things to visualize in Grafana.
Navigate to the server that the Prometheus server container is deployed to on port 9090. You should see a screen similar to the following.
There is documentation about how to get started with using this tool, so I recommend taking a look and playing around with it yourself. Once you find some useful metrics, visualized in the graph view, grab the query used to generate the graph and add a new dashboard to Grafana.
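If you would rather check a query from a script before adding it to Grafana, the same expression can be sent to the Prometheus HTTP API. The sketch below builds the request URL for the /api/v1/query endpoint and pulls the values out of a response body. The server address, the "up" expression, and the sample response are placeholders I made up for illustration, and nothing here actually contacts a server.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def extract_values(response_text):
    """Return (labels, value) pairs from an instant-query JSON response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error")))
    # Each sample's "value" is [unix timestamp, value encoded as a string].
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]


# Made-up response in the shape returned for an instant vector query.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "demo"},
             "value": [1465000000.0, "1"]},
        ],
    },
})

# <SERVER_IP> is a placeholder; replace it with the host running Prometheus.
url = build_query_url("http://<SERVER_IP>:9090", "up")
values = extract_values(SAMPLE_RESPONSE)
```

Fetching the URL with urllib.request.urlopen (or curl) against a running Prometheus server should return JSON in the same shape as the sample, which extract_values can then unpack.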
Prometheus offers a lot of power and flexibility and is a great tool for monitoring. I am still very new to Prometheus but so far it looks very promising and I have to say I’m really impressed with the amount of polish and detail I was able to get in just an afternoon of experimenting. I will be updating this post as I get more exposure to Prometheus and get more metrics and monitoring set up so stay tuned.