a9s PostgreSQL Performance Considerations

Caching

PostgreSQL uses a combination of techniques to keep data often used in memory. PostgreSQL has its "own" memory segment and it also relies on the OS's caching to avoid disk I/O.

This blog post gives a good, brief overview: Understanding caching in PostgreSQL - An in-depth guide

This blog post introduces several PostgreSQL modules useful for system and query performance analysis (with bad English and it is slightly outdated though): Caching in PostgreSQL

Monitoring

Several modules exist that make performance monitoring easier in PostgreSQL.

pg_autoexplain

The auto_explain module provides a means for logging execution plans of slow
statements automatically, without having to run EXPLAIN by hand. This is
especially helpful for tracking down un-optimized queries in large applications.

Website

What is an execution plan? When computing a query PostgreSQL creates execution plans and chooses the "quickest" amongst them. See the documentation for EXPLAIN for more information. Also, this chapter might be relevant Chapter 61. How the Planner Uses Statistics.

pg_stat_statements

The pg_stat_statements module provides a means for tracking execution statistics
of all SQL statements executed by a server.

Website

The pg_stat_statements module is certainly interesting for our Logstash plugin because it will allow us to monitor PostgreSQL service instance and e.g. give a list of the "ten slowest queries" and some hints why they are so slow. See the blogpost Understanding caching in PostgreSQL - An in-depth guide for a usage example.

pg_buffercache

The pg_buffercache module provides a way to view PostgreSQL's internal cache. It could be interesting for our Logstash plugin.

pgfincore

The pgfincore is similar to pg_buffercache but instead works on OS caches and thus we can figure out which of PostgreSQL's pages are currently in memory.

pg_prewarm

The pg_prewarm module provides a convenient way to load relation data into
either the operating system buffer cache or the PostgreSQL buffer cache.

Website

pg_hibernator

This PostgreSQL extension is a set-it-and-forget-it solution to save and restore
the PostgreSQL shared-buffers contents, across PostgreSQL server restarts.

Website

The module stores "block identifiers" on disk and loads those blocks into PostgreSQL's shared buffer after a restart. This means that "not a lot" of additional permanent storage space is required.

PostgreSQL configuration

shared_buffers

In this blogpost Tuning shared_buffers and wal_buffers Robert Haas (a PostgreSQL contributor) argues why the configuration option shared_buffers should be kept between 25% and 40% of the host's total RAM. Note that the blogpost is from 2012 so when he says "not more than 8GB" that should be regarded as outdated.

He also gives an explanation on how wal_buffers should be configured.

effective_cache_size

The configuration option effective_cache_size is used by PostgreSQL for its execution plans. It uses the value of effective_cache_size to determine whether to do a table scan or an index scan. It should reflect the size of shared_buffers plus the OS's page cache. Default: 4GB.

See this page for more information.

Further reading

a9s PostgreSQL custom parameters

Documentation of which values can be changed within a9s PostgreSQL.

Inside PostgreSQL buffers

Greg Smith describes the internals of data organisation and caching for PostgreSQL 8.3 which is still applicable today in Inside PostgreSQL buffers

Understand cache hit rate and index usage

Craig Kertiens writes about how to determine cache hit rates (which could be exposed to Logstash) and index usage: http://www.craigkerstiens.com/2012/10/01/understanding-postgres-performance/