Getting metrics from a9s-pg

Metrics and intervals

To avoid consuming all of the CPU, memory, and network resources assigned to your a9s-pg deployment, only a limited set of metrics is collected.

  • Metrics group: replication
    • Interval: 10 seconds
    • Metrics:
      • wal_receiver_count
      • ha.master
      • ha.standby
      • ha.wal_dir_size
  • Metrics group: database size
    • Interval: 5 minutes
    • Metrics:
      • database.*.pg_database_size
  • Metrics group: system
    • Interval: 10 seconds
    • Metrics:
      • disk_size
      • disk_used
      • load_one_min
      • load_five_min
      • load_fifteen_min
      • mem_total
      • mem_used
      • mem_free
      • mem_shared
      • mem_buffers
      • mem_cached
      • swap_total
      • swap_used
      • swap_free
      • cpu.user
      • cpu.priority
      • cpu.system
      • cpu.idle
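
As a rough illustration of what these intervals mean for the receiving side, the sketch below computes how many data points a single metric produces per day in each group. The names `INTERVALS` and `samples_per_day` are hypothetical and chosen only for this example; the interval values are the ones listed above.

```python
# Collection intervals in seconds, copied from the metrics groups above.
INTERVALS = {
    "replication": 10,
    "database size": 5 * 60,
    "system": 10,
}

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds: int) -> int:
    """Return how many samples one metric produces per day at this interval."""
    return SECONDS_PER_DAY // interval_seconds


for group, interval in INTERVALS.items():
    print(f"{group}: {samples_per_day(interval)} samples per metric per day")
```

A metric collected every 10 seconds yields 8640 samples per day, while one collected every 5 minutes yields 288.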

Setting up a9s-pg with metrics

To stream replication, database size, and system metrics from your a9s-pg deployment, you must provide the following variables (for instance via your IaaS config):

Variable                          Example value
iaas.a9s_pg.graphite_endpoints    [ 192.168.50.1:5050 ]
iaas.a9s_pg.metrics_prefix        matrix

You should also ensure that a9s-pg can connect to the Graphite endpoint you have specified.
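
One way to check connectivity is a quick probe run from a host in the same network as the a9s-pg nodes. The following is a minimal sketch, assuming the endpoint accepts plain TCP connections; the helper name `can_connect` is hypothetical and not part of a9s-pg.

```python
import socket


def can_connect(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to 'host:port' can be opened.

    Assumption: the Graphite endpoint listens on TCP. A malformed
    endpoint string or a failed connection both yield False.
    """
    host, _, port = endpoint.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        return False
```

For the example value above, `can_connect("192.168.50.1:5050")` returns `True` only if the endpoint is reachable within the timeout.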

Resource considerations

Adding Logstash to your a9s-pg deployment takes up additional memory and CPU time. Because of the size of the JVM in which Logstash runs, it is advisable to add at least 2 GB more RAM to each of your a9s-pg nodes. CPU usage is low most of the time, but it spikes around the collection intervals. This is worth bearing in mind when diagnosing intermittent performance issues.

Disable Logstash metrics

To deploy a9s-pg without the Logstash metrics, apply the following ops file during the a9s-pg deployment: ops/a9s-pg-without-logstash-metrics.yml.