
Monitoring RabbitMQ with Prometheus and Grafana

Monitoring is one of the most important parts of any production setup. Good monitoring helps you detect issues before they impact your systems and, eventually, your users.
Prometheus is an open-source time-series data store. It works on a pull-based model where you expose an endpoint from which Prometheus can pull metrics. We can use the prometheus_rabbitmq_exporter plugin to expose an /api/metrics endpoint in the context of the RabbitMQ management API. For installation and setup instructions, you can follow this article on the RabbitMQ website. Once the setup is done, read on to learn which metrics you should monitor and how to do that using Prometheus.
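Once the plugin is enabled, Prometheus needs a scrape job pointing at the management API's /api/metrics endpoint. A minimal prometheus.yml sketch might look like the following; the job names, hostnames, and ports are assumptions based on common defaults (15672 for the management API, 9100 for node_exporter, which provides the node_* system metrics used below), so adapt them to your cluster:

# prometheus.yml (sketch) -- hostnames and ports below are assumptions
scrape_configs:
  - job_name: 'rabbitmq'
    metrics_path: /api/metrics              # endpoint exposed by prometheus_rabbitmq_exporter
    static_configs:
      - targets:
          - 'rabbit-cluster-node-1:15672'   # assumed management API host:port
          - 'rabbit-cluster-node-2:15672'
  - job_name: 'node'
    static_configs:
      - targets:
          - 'rabbit-cluster-node-1:9100'    # assumed node_exporter host:port
          - 'rabbit-cluster-node-2:9100'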
First we will look at system-level metrics, and then at queue-level metrics:
  • Node Load Average: This metric indicates the average CPU load on a node. It should stay below the number of CPU cores available on that node; set up an alert if it goes higher (a sample alerting rule for this is sketched after this list). The query to set up the graph for this is:
node_load1{instance=~"rabbit-cluster-node-.*"}
  • Node Used Memory: This metric indicates the amount of memory used on each node. Set up an alert if it goes higher than the memory watermark, which is generally 0.4 * total node memory. The query to set up this graph is:
node_memory_MemUsed{instance=~"rabbit-cluster-node-.*"}
  • Queue Depth: This metric indicates how many messages are currently waiting to be consumed in a queue. If this number grows too large, it can cause performance issues due to increased memory usage. The queries to set up the graphs for this are:
sum(rabbitmq_queue_messages{queue=~"$queue", policy=~"$policy", durable=~"$durable", vhost=~"$vhost"})
sum(rabbitmq_queue_messages_ram{queue=~"$queue", policy=~"$policy", durable=~"$durable", vhost=~"$vhost"})
sum(rabbitmq_queue_messages_persistent{queue=~"$queue", policy=~"$policy", durable=~"$durable", vhost=~"$vhost"})
The second and third queries give the number of messages held in memory and the number of persistent messages stored on disk for any queue, respectively.
  • Number of Consumers for a Queue: Set up an alert if the number of consumers for any queue drops below a predefined threshold (see the queue-level alerting rules sketched after this list). The query to set up the graph for this metric is:
sum(rabbitmq_queue_consumers{queue=~"$queue", policy=~"$policy", durable=~"$durable", vhost=~"$vhost"})
  • Queue Consumer Utilisation: Consumer utilisation is the proportion of time that a queue's consumers could take new messages. It ranges from 0 to 100%. If a queue has a consumer utilisation of 100%, it is able to push out messages as fast as it can and never has to wait on consumers. Set up an alert if this drops below a predefined threshold for any queue. The query to set up the graph for this metric is:
max(rabbitmq_queue_consumer_utilisation{queue=~"$queue", policy=~"$policy", durable=~"$durable", vhost=~"$vhost"})
  • Queue Memory: This metric indicates the amount of memory consumed by a queue. Set up an alert if it breaches a predefined threshold for any queue. The query to set up the graph for this metric is:
max(rabbitmq_queue_memory{queue=~"$queue", policy=~"$policy", durable=~"$durable", vhost=~"$vhost"})
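For the node-level thresholds described above (load average above the CPU core count, used memory above the watermark), a minimal alerting-rule sketch in the Prometheus 2.x YAML format could look like the following. The core count, the node_memory_MemTotal metric, and the rule durations are assumptions; adjust them to your nodes:

# node-alerts.yml (sketch) -- thresholds and metric names are assumptions
groups:
  - name: rabbitmq-node-alerts
    rules:
      - alert: RabbitNodeHighLoad
        # assumes 8-core nodes; replace 8 with the actual core count
        expr: node_load1{instance=~"rabbit-cluster-node-.*"} > 8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Load average is above the CPU core count on {{ $labels.instance }}"
      - alert: RabbitNodeMemoryAboveWatermark
        # watermark taken as 0.4 * total node memory, as described above;
        # node_memory_MemTotal is assumed to be exported alongside node_memory_MemUsed
        expr: >
          node_memory_MemUsed{instance=~"rabbit-cluster-node-.*"}
          > 0.4 * node_memory_MemTotal{instance=~"rabbit-cluster-node-.*"}
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Used memory is above the memory watermark on {{ $labels.instance }}"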
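The queue-level alerts (queue depth, consumer count, consumer utilisation, and queue memory) can be sketched in the same way. All thresholds below are placeholders, and the utilisation rule assumes the metric is exported as a 0-1 ratio:

# queue-alerts.yml (sketch) -- all thresholds below are placeholders
groups:
  - name: rabbitmq-queue-alerts
    rules:
      - alert: QueueDepthTooHigh
        # placeholder threshold of 10000 waiting messages per queue
        expr: sum by (queue, vhost) (rabbitmq_queue_messages) > 10000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Queue {{ $labels.queue }} has more than 10000 messages waiting"
      - alert: QueueHasNoConsumers
        expr: sum by (queue, vhost) (rabbitmq_queue_consumers) < 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Queue {{ $labels.queue }} has no active consumers"
      - alert: LowConsumerUtilisation
        # assumes a 0-1 ratio; 0.5 is a placeholder threshold
        expr: max by (queue, vhost) (rabbitmq_queue_consumer_utilisation) < 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Consumer utilisation on queue {{ $labels.queue }} is below 50%"
      - alert: QueueMemoryTooHigh
        # placeholder threshold of 256 MiB per queue
        expr: max by (queue, vhost) (rabbitmq_queue_memory) > 256 * 1024 * 1024
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Queue {{ $labels.queue }} is using more than 256 MiB of memory"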
