Service Instance Metrics
This page describes the metrics used to monitor the state of an a9s Messaging Service Instance. For further information on how to monitor an a9s Service Instance, see the Set up Monitoring section of the Application Developer's documentation.
Metrics
The following is a list of the RabbitMQ metric patterns that are streamed to a colocated Prometheus endpoint. The metrics are divided into the following groups:
- Node Metrics
- Queue Metrics
- Ghost Queues
- Port Checks
- General
Starting with a9s Messaging 4.0, the metrics include all available RabbitMQ metrics for the three endpoints: Node, Queue, and General (cluster-wide metrics).
For more information about all available metrics for a9s Messaging 4.0 see the RabbitMQ Monitoring documentation and the RabbitMQ HTTP API documentation.
Node Metrics
These metrics provide detailed insight into the state of nodes. Most of the metrics represent point-in-time absolute values. Some, such as the rate metrics, represent activity over a recent period of time; these are most useful when compared to their previous values and historical mean/percentile values.
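The comparison against historical values described above can be sketched in a few lines. This is a minimal illustration, not part of a9s Messaging: the function name `is_unusual`, the sample history, and the 95th-percentile threshold are assumptions chosen for the example.

```python
import statistics

def is_unusual(current, history, percentile=95):
    """Return True if `current` exceeds the given percentile of `history`.

    `history` is a list of earlier samples of the same metric.
    With fewer than two samples there is nothing to compare against.
    """
    if len(history) < 2:
        return False
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cut_points = statistics.quantiles(history, n=100, method="inclusive")
    return current > cut_points[percentile - 1]

# Hypothetical earlier samples of a rate metric such as mem_used_details.rate.
history = [10.0, 12.0, 11.0, 9.0, 13.0, 10.5, 11.5, 12.5]
print("historical mean:", statistics.mean(history))
print("40.0 unusual?", is_unusual(40.0, history))
print("11.0 unusual?", is_unusual(11.0, history))
```

A percentile threshold is only one possible choice; comparing against the mean plus a tolerance works in the same way.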
Documentation source:
- RabbitMQ Monitoring Node Documentation
- RabbitMQ HTTP API Documentation
- The metric descriptions in this documentation might change between versions.
Group Id | Type | Description |
---|---|---|
disk_free | Integer | Disk free space in bytes. |
disk_free_details.rate | Float | Rate of the disk_free metric. |
disk_free_limit | Integer | Point at which the disk alarm will go off. |
fd_total | Integer | File descriptors available. |
fd_used | Integer | Used file descriptors. |
io_read_avg_time | Integer | Average wall time (milliseconds) for each disk read operation in the last statistics interval. |
io_read_avg_time_details.rate | Float | Rate of the io_read_avg_time metric. |
io_read_bytes | Integer | Total number of bytes read from disk by the persister. |
io_read_bytes_details.rate | Float | Rate of the io_read_bytes metric. |
io_read_count | Integer | Total number of read operations by the persister. |
io_read_count_details.rate | Float | Rate of the io_read_count metric. |
io_sync_avg_time | Integer | Average wall time (milliseconds) for each fsync() operation in the last statistics interval. |
io_sync_avg_time_details.rate | Float | Rate of the io_sync_avg_time metric. |
io_write_avg_time | Integer | Average wall time (milliseconds) for each disk write operation in the last statistics interval. |
io_write_avg_time_details.rate | Float | Rate of the io_write_avg_time metric. |
mem_limit | Integer | Point at which the memory alarm will go off. |
mem_used | Integer | Memory used in bytes. |
mem_used_details.rate | Float | Rate of the mem_used metric. |
partitioned | Integer | Indicates whether this node sees any partitions. The value is 1 when the number of partitions is greater than 0, otherwise it is 0. |
proc_total | Integer | Maximum number of Erlang processes. |
proc_used | Integer | Number of Erlang processes in use. |
sockets_total | Integer | File descriptors available for use as sockets. |
sockets_used | Integer | File descriptors used as sockets. |
uptime | Integer | Time since the Erlang VM started, in milliseconds. |
Metrics
*.rabbitmq.*.*.*.*.node.disk_free
*.rabbitmq.*.*.*.*.node.disk_free_details.rate
*.rabbitmq.*.*.*.*.node.disk_free_limit
*.rabbitmq.*.*.*.*.node.fd_total
*.rabbitmq.*.*.*.*.node.fd_used
*.rabbitmq.*.*.*.*.node.io_read_avg_time
*.rabbitmq.*.*.*.*.node.io_read_avg_time_details.rate
*.rabbitmq.*.*.*.*.node.io_read_bytes
*.rabbitmq.*.*.*.*.node.io_read_bytes_details.rate
*.rabbitmq.*.*.*.*.node.io_read_count
*.rabbitmq.*.*.*.*.node.io_read_count_details.rate
*.rabbitmq.*.*.*.*.node.io_sync_avg_time
*.rabbitmq.*.*.*.*.node.io_sync_avg_time_details.rate
*.rabbitmq.*.*.*.*.node.io_write_avg_time
*.rabbitmq.*.*.*.*.node.io_write_avg_time_details.rate
*.rabbitmq.*.*.*.*.node.mem_limit
*.rabbitmq.*.*.*.*.node.mem_used
*.rabbitmq.*.*.*.*.node.mem_used_details.rate
*.rabbitmq.*.*.*.*.node.partitioned
*.rabbitmq.*.*.*.*.node.proc_total
*.rabbitmq.*.*.*.*.node.proc_used
*.rabbitmq.*.*.*.*.node.sockets_total
*.rabbitmq.*.*.*.*.node.sockets_used
*.rabbitmq.*.*.*.*.node.uptime
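As an illustration of how the limit and usage metrics above relate to each other, the following sketch derives a few headroom figures for one node. The function name `node_headroom` and the sample values are hypothetical, and the sketch assumes that all limit and total values are non-zero.

```python
def node_headroom(node):
    """Derive simple headroom figures from one node's metric values.

    `node` maps metric names from the table above to their latest values.
    """
    return {
        # Bytes of free disk left before the disk alarm goes off.
        "disk_bytes_above_limit": node["disk_free"] - node["disk_free_limit"],
        # Fraction of the memory alarm threshold that is already used.
        "mem_used_ratio": node["mem_used"] / node["mem_limit"],
        "fd_used_ratio": node["fd_used"] / node["fd_total"],
        "sockets_used_ratio": node["sockets_used"] / node["sockets_total"],
        "proc_used_ratio": node["proc_used"] / node["proc_total"],
    }

# Hypothetical sample values for a single node.
sample = {
    "disk_free": 5_000_000_000, "disk_free_limit": 50_000_000,
    "mem_used": 2_000_000_000, "mem_limit": 4_000_000_000,
    "fd_used": 256, "fd_total": 1024,
    "sockets_used": 100, "sockets_total": 800,
    "proc_used": 500, "proc_total": 1000,
}
for name, value in node_headroom(sample).items():
    print(name, value)
```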
Queue Metrics
These metrics provide insight into the state of queues. They expose information about each individual queue available on the system.
Documentation source:
- RabbitMQ List Queues Documentation
- RabbitMQ Monitoring Node Documentation
- RabbitMQ HTTP API Documentation
- The metric descriptions in this documentation might change between versions.
Group Id | Type | Description |
---|---|---|
consumers | Integer | Consumers on a queue. |
memory | Integer | Queue's memory. |
messages | Integer | Sum of ready and unacknowledged messages - total queue depth. |
messages_details.rate | Float | Rate of the messages metric. |
messages_persistent | Integer | Persistent messages in the queue. |
messages_ram | Integer | Ready and unacknowledged messages stored in memory. |
messages_ready | Integer | Messages ready to be delivered to consumers. |
messages_ready_details.rate | Float | Rate of the messages_ready metric. |
messages_ready_ram | Integer | Ready messages stored in memory. |
messages_unacknowledged | Integer | Messages delivered to consumers but not yet acknowledged. |
messages_unacknowledged_details.rate | Float | Rate of the messages_unacknowledged metric. |
messages_unacknowledged_ram | Integer | Unacknowledged messages stored in memory. |
message_bytes | Integer | Size in bytes of ready and unacknowledged messages. |
message_bytes_persistent | Integer | Size in bytes of persistent messages. |
message_bytes_ram | Integer | Size of ready and unacknowledged messages stored in memory. |
message_bytes_ready | Integer | Size in bytes of ready messages. |
message_bytes_unacknowledged | Integer | Size in bytes of all unacknowledged messages. |
backing_queue_status.avg_ack_egress_rate | Integer | Average rate of leaving unacknowledged messages. |
backing_queue_status.avg_ack_ingress_rate | Integer | Average rate of arriving unacknowledged messages. |
backing_queue_status.avg_egress_rate | Integer | Average egress rate. |
backing_queue_status.avg_ingress_rate | Integer | Average ingress rate. |
backing_queue_status.len | Integer | Total queue length. |
backing_queue_status.next_seq_id | Integer | Next sequence ID. |
backing_queue_status.q1 | Integer | Number of messages in the internal queue Q1. |
backing_queue_status.q2 | Integer | Number of messages in the internal queue Q2. |
backing_queue_status.q3 | Integer | Number of messages in the internal queue Q3. |
backing_queue_status.q4 | Integer | Number of messages in the internal queue Q4. |
Metrics
*.rabbitmq.*.*.*.*.queue.<queue-name>.consumers
*.rabbitmq.*.*.*.*.queue.<queue-name>.memory
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_details.rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_persistent
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_ram
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_ready
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_ready_details.rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_ready_ram
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_unacknowledged
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_unacknowledged_details.rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.messages_unacknowledged_ram
*.rabbitmq.*.*.*.*.queue.<queue-name>.message_bytes
*.rabbitmq.*.*.*.*.queue.<queue-name>.message_bytes_persistent
*.rabbitmq.*.*.*.*.queue.<queue-name>.message_bytes_ram
*.rabbitmq.*.*.*.*.queue.<queue-name>.message_bytes_ready
*.rabbitmq.*.*.*.*.queue.<queue-name>.message_bytes_unacknowledged
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.avg_ack_egress_rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.avg_ack_ingress_rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.avg_egress_rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.avg_ingress_rate
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.len
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.next_seq_id
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.q1
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.q2
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.q3
*.rabbitmq.*.*.*.*.queue.<queue-name>.backing_queue_status.q4
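To show how the queue patterns above are structured, the following sketch splits a concrete metric path into the queue name and the metric name. The function name `parse_queue_metric` and the example path segments are hypothetical; the sketch only assumes the layout shown in the patterns, that is, four segments between rabbitmq and queue.

```python
def parse_queue_metric(path):
    """Split a queue metric path into (queue_name, metric_name).

    Returns None if the path does not follow the queue pattern
    <prefix>.rabbitmq.<4 segments>.queue.<queue-name>.<metric>.
    """
    parts = path.split(".")
    try:
        start = parts.index("rabbitmq")
    except ValueError:
        return None
    queue_idx = start + 5  # skip "rabbitmq" plus the four wildcard segments
    if len(parts) < queue_idx + 3 or parts[queue_idx] != "queue":
        return None
    # The metric name may itself contain periods, e.g. messages_details.rate.
    return parts[queue_idx + 1], ".".join(parts[queue_idx + 2:])

# Hypothetical concrete path; the segments a, b, c, d stand in for the wildcards.
print(parse_queue_metric("x.rabbitmq.a.b.c.d.queue.orders_created.messages_details.rate"))
```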
The backing_queue_status provides information about the state of a queue in RabbitMQ, using various metrics related to the queue's internal state. These metrics are only present if they are reported in the RabbitMQ queue information.
Please be aware that the periods (.) in the queue's name are replaced with underscores (_) in the queue metrics. This is necessary since a period is reserved as the path separator for Graphite metric names.
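The name replacement described above can be sketched as follows. The function names and the example prefix are hypothetical and only illustrate the rule; they are not the exporter's actual implementation.

```python
def sanitize_queue_name(queue_name):
    """Replace periods with underscores, as described for queue metrics."""
    return queue_name.replace(".", "_")

def queue_metric_path(prefix, queue_name, metric):
    """Build a queue metric path from a prefix, a queue name, and a metric name.

    `prefix` stands in for the leading segments matched by the wildcards.
    """
    return f"{prefix}.queue.{sanitize_queue_name(queue_name)}.{metric}"

print(sanitize_queue_name("orders.created.v1"))
print(queue_metric_path("x.rabbitmq.a.b.c.d", "orders.created", "messages"))
```

One consequence of this rule is that two queues whose names differ only in periods versus underscores, for example orders.created and orders_created, map to the same metric name.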
backing_queue_status Deprecation
As of RabbitMQ 3.13, the backing_queue_status metric is deprecated, and it is removed with a9s Messaging 4. Therefore, as of a9s Messaging 4.X, the backing_queue_status metrics are no longer available.
Ghost Queues
This metric supports the detection of ghost queues. Ghost queues are a known RabbitMQ bug/artefact, so ideally this metric should not appear.
*.rabbitmq.*.*.*.*.<ghost_queue>.<ghost-queue-name>
Please be aware that the periods (.) in the queue's name are replaced with underscores (_) in the queue metrics. This is necessary since a period is reserved as the path separator for Graphite metric names.
Port Checks
These metrics show whether the ports of the a9s RabbitMQ services are open.
Group Id | Type | Description |
---|---|---|
amqp_port_open | Integer (0 - Closed / 1 - Open) | Whether the AMQP port is open. |
amqp_tls_port_open | Integer (0 - Closed / 1 - Open) | Whether the AMQP TLS port is open. |
epmd_port_open | Integer (0 - Closed / 1 - Open) | Whether the epmd port is open. |
http_api_port_open | Integer (0 - Closed / 1 - Open) | Whether the HTTP API port is open. |
mqtt_client_port_open | Integer (0 - Closed / 1 - Open) | Whether the MQTT client port is open. |
mqtt_client_tls_port_open | Integer (0 - Closed / 1 - Open) | Whether the MQTT client TLS port is open. |
stomp_port_open | Integer (0 - Closed / 1 - Open) | Whether the STOMP port is open. |
stomp_tls_port_open | Integer (0 - Closed / 1 - Open) | Whether the STOMP TLS port is open. |
web_mqtt_port_open | Integer (0 - Closed / 1 - Open) | Whether the Web MQTT port is open. |
web_stomp_port_open | Integer (0 - Closed / 1 - Open) | Whether the Web STOMP port is open. |
Metrics
*.rabbitmq.*.*.*.*.port_checks.amqp_port_open
*.rabbitmq.*.*.*.*.port_checks.amqp_tls_port_open
*.rabbitmq.*.*.*.*.port_checks.epmd_port_open
*.rabbitmq.*.*.*.*.port_checks.http_api_port_open
*.rabbitmq.*.*.*.*.port_checks.mqtt_client_port_open
*.rabbitmq.*.*.*.*.port_checks.mqtt_client_tls_port_open
*.rabbitmq.*.*.*.*.port_checks.stomp_port_open
*.rabbitmq.*.*.*.*.port_checks.stomp_tls_port_open
*.rabbitmq.*.*.*.*.port_checks.web_mqtt_port_open
*.rabbitmq.*.*.*.*.port_checks.web_stomp_port_open
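Since each port check is reported as 0 (closed) or 1 (open), a consumer of these metrics can list the closed ports directly. The following is a minimal sketch; the function name `closed_ports` and the sample values are hypothetical.

```python
def closed_ports(port_checks):
    """Return the sorted names of all port checks that report 0 (closed)."""
    return sorted(name for name, value in port_checks.items() if value == 0)

# Hypothetical latest values of some port check metrics.
sample_checks = {
    "amqp_port_open": 1,
    "amqp_tls_port_open": 0,
    "epmd_port_open": 1,
    "http_api_port_open": 1,
    "stomp_port_open": 0,
}
print(closed_ports(sample_checks))
```

Whether a closed port is a problem depends on which protocols are expected to be enabled on the Service Instance.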
General
These metrics provide insight into general information about a9s Messaging.
Documentation source:
- RabbitMQ Monitoring Cluster-Wide-Metrics Documentation
- RabbitMQ HTTP API Documentation
- The metric descriptions in this documentation might change between versions.
Group Id | Type | Description |
---|---|---|
connections | Integer | Total number of connections. |
collect_time | Integer (ms) | The time it took to collect the server status information. |
metric_fetch_status | Integer (0 - success / 1 - fail) | Status of the last server status collection operation. |
Metrics
*.rabbitmq.*.*.*.*.connections
*.rabbitmq.*.*.*.*.collect_time
*.rabbitmq.*.*.*.*.metric_fetch_status
For a9s Messaging Service Instances which are still using RabbitMQ < 4.0, only Legacy Syslog format is supported:
["hostname1:port1", "hostname2:port2"]