As an operations team engineer, monitoring virtual infrastructure health is a pretty big deal. You aim to catch and resolve issues with hardware, devices, storage, memory, network, hosts, and the like in computing grids and clusters as early as possible. Monitoring systems like Ganglia are a must-have for such purposes. I’ll show a quick setup of Ganglia monitoring via ElasticBox that you can add to any deployment to track infrastructure performance.
In ElasticBox, built-in monitoring is available as a service in a self-serve catalog that your engineering and operations teams can launch on any cloud infrastructure on-demand. ElasticBox supports a wide range of configuration management tools, orchestrates provisioning on popular cloud providers, and allows teams to collaborate on deployment assets. If you follow or use ElasticBox, you know that complex deployments happen in a few clicks instead of long hours.
Ganglia is a useful monitoring service for large-scale web applications. It provides distributed monitoring at scale for clusters and grids. It’s popular because it’s easy to set up and tracks a ton of metrics. It monitors computing systems including hardware, storage, network, and software. You can forward metrics for alerting and visualization to integrated services like Nagios, or use Ganglia’s built-in web UI.
Here I have Ganglia monitoring in unicast mode set up in a couple of boxes in an ElasticBox service catalog. A node box installs monitoring on the node. And a meta node box installs Ganglia monitoring and aggregates metrics from all the nodes connecting to it.
I’ll go over the box setup first.
Ganglia meta box
This box deploys a master node that monitors its resources and the metrics of all the nodes connecting to it. The configuration shows events to install, configure, and start the Ganglia services.
Based on the Linux distribution type, the box installs gmetad, the daemon that polls gmond instances and aggregates their cluster data. Then it installs RRDtool, which gmetad uses to store metrics in RRD files. And it installs the gmond daemon, PHP runtime, and the web UI. The box configures all the common metrics that the gmetad and gmond daemons should gather and configures the web UI. Because the configuration settings are stored in variables, you can configure whatever metrics you like, including how you want the master to communicate with the nodes.
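To make the install step concrete, here is a minimal sketch of how the box might pick the right packages per distribution. The package names are assumptions based on the usual Debian/Ubuntu and RHEL/CentOS (EPEL) naming; verify them against your distro. It prints the install command as a dry run rather than executing it.

```shell
#!/bin/sh
# Sketch of the meta box install step. Package names are assumptions:
#   Debian/Ubuntu: gmetad, ganglia-monitor (gmond), rrdtool, ganglia-webfrontend
#   RHEL/CentOS + EPEL: ganglia-gmetad, ganglia-gmond, rrdtool, ganglia-web
detect_install_cmd() {
  case "$1" in
    debian|ubuntu)
      echo "apt-get install -y gmetad ganglia-monitor rrdtool ganglia-webfrontend" ;;
    centos|rhel)
      echo "yum install -y ganglia-gmetad ganglia-gmond rrdtool ganglia-web" ;;
    *)
      echo "unsupported distro: $1" >&2
      return 1 ;;
  esac
}

# Dry run: print the command for a given distro; pipe to sh to actually install.
detect_install_cmd debian
```

In a real box event, you would feed `detect_install_cmd` the `ID` value from `/etc/os-release` instead of a hard-coded distro name.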
The gmond daemon announces metrics over UDP and serves XML metric dumps over TCP, so the box opens both TCP and UDP ports to let the master’s gmetad daemon poll the gmond instances on connecting nodes. You only need to do this if the kernel firewall is enabled.
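As a rough illustration, the firewall step could look like the following. It assumes Ganglia’s stock ports (8649 for gmond, 8651/8652 for gmetad’s query interfaces) and prints the iptables rules as a dry run; pipe the output to sh as root to apply them, and only if the kernel firewall is enabled.

```shell
#!/bin/sh
# Dry-run sketch: print iptables rules for Ganglia's default ports.
# Assumption: stock ports -- 8649 udp/tcp for gmond metric traffic and
# XML dumps, 8651 and 8652 tcp for gmetad's interactive query ports.
ganglia_port_rules() {
  for spec in udp/8649 tcp/8649 tcp/8651 tcp/8652; do
    # spec is proto/port; split it into the two iptables arguments
    printf 'iptables -A INPUT -p %s --dport %s -j ACCEPT\n' \
      "${spec%/*}" "${spec#*/}"
  done
}

ganglia_port_rules
```

If your distribution uses firewalld or ufw instead of raw iptables, the equivalent allow rules apply to the same ports.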
The box starts Ganglia monitoring by starting the daemons and the web front end services.
Ganglia node box
This box connects to the master monitoring node over a binding.
Like the meta box, it installs the monitoring service and the gmond daemon to gather data from the node and expose the metrics over TCP. The box configures the gmond daemon to track all the common metrics. Then it opens the TCP and UDP ports over which gmond transmits metric packets. Finally, it starts the Ganglia service on the node. It connects to the master node at deploy time using the binding information configured in the file variable.
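For reference, a node’s gmond configuration in unicast mode looks roughly like this sketch of /etc/ganglia/gmond.conf. MASTER_ADDRESS is a hypothetical placeholder for the address the ElasticBox binding supplies at deploy time, and the cluster name must match the data_source entry in the master’s gmetad configuration.

```
cluster {
  name = "my-cluster"        /* must match the gmetad data_source name */
}
udp_send_channel {
  host = MASTER_ADDRESS      /* filled in from the binding at deploy time */
  port = 8649
  ttl = 1
}
tcp_accept_channel {
  port = 8649                /* the master polls metric XML over TCP here */
}
```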
Typical Ganglia deployments
How does a typical deployment work? As an operations team member, you deploy the meta monitoring node first so all other instances can bind and transmit metrics to it.
Next, share the monitoring node box in the service catalog with your engineering and operations team colleagues.
Here my colleague deployed a node and connected to the master.
I can now monitor it from the master node UI.
The best way to consume Ganglia is to stack the monitoring node box in any application box so you can monitor the application nodes that others deploy.
Hope you found the walkthrough of the Ganglia monitoring service box useful. If you’re interested in these boxes or in setting up monitoring for deployments through ElasticBox, shoot me an email. I would love to help and hear your use cases.