Version: 31.1.0

PostgreSQL Stuck on Pre-Start

Description

Sometimes the primary goes down (during an update, for example) and the two standby nodes disagree about which of them should be promoted. The pre-start script then waits for a primary that is never promoted.

Identifying the Issue

When this happens, the following log line appears repeatedly on the hanging node:

Log file: /var/vcap/sys/log/postgresql-ha/pre-start.stderr.log

1224: Tue Jun 11 14:43:00 UTC 2019 - [info][ctl_utils.sh:192][get_master] - curl result: ###{"message":"Master node unavailable","status":"waiting","data":"Waiting for a standby to be promoted"}###

It is normal for this message to appear a few times, but a node waiting on pre-start should not stay in this state for more than a few minutes.
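To tell a stuck node apart from one that is merely waiting briefly, it can help to count how often the waiting message has been logged. A minimal sketch (the log path is the one from this article; the helper name is made up):

```shell
# Count how many times a pre-start log reports that it is waiting for a
# standby to be promoted. The function name is hypothetical.
count_waiting_lines() {
  grep -c 'Waiting for a standby to be promoted' "$1"
}

# Example usage against the log file mentioned above:
# count_waiting_lines /var/vcap/sys/log/postgresql-ha/pre-start.stderr.log
```

A count that keeps growing over several minutes indicates the node is stuck rather than briefly waiting.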

When running the status debug script (/var/vcap/jobs/postgresql-ha/bin/debug/status.sh), this is what the cluster state looks like:

{
  "message": "Node status",
  "status": "succeeded",
  "data": {
    "id": "bc6fd987-a0b0-42fe-a0be-2180881ba5f4",
    "node_index": 0,
    "node_ip": "10.244.20.10",
    "bootstrap": true,
    "firstboot": false,
    "status": "no_response",
    "blocked": true,
    "latest_checkpoint": "2/5A000028",
    "mode": "shutdown",
    "master_in_majority_partition": false,
    "upstream_node_id": "bc6fd987-a0b0-42fe-a0be-2180881ba5f4",
    "failover_in_progress": false,
    "processes": {
      "postgresql": "unknown",
      "repmgrd": "unknown"
    },
    "pg_config": {
      "max_connections": "1024"
    }
  }
}
{
  "message": "Node status",
  "status": "succeeded",
  "data": {
    "id": "835493aa-9312-4792-b2a8-1d62b6e5f6f8",
    "node_index": 1,
    "node_ip": "10.244.21.10",
    "bootstrap": false,
    "firstboot": false,
    "status": "running",
    "blocked": false,
    "latest_checkpoint": "2/5A000028",
    "mode": "standby",
    "master_in_majority_partition": false,
    "upstream_node_id": "unknown",
    "failover_in_progress": false,
    "processes": {
      "postgresql": "running",
      "repmgrd": "starting"
    },
    "pg_config": {
      "max_connections": "1024"
    }
  }
}
{
  "message": "Node status",
  "status": "succeeded",
  "data": {
    "id": "ab3ff693-4645-4ddd-8f7c-294e0028871b",
    "node_index": 2,
    "node_ip": "10.244.22.10",
    "bootstrap": false,
    "firstboot": false,
    "status": "running",
    "blocked": false,
    "latest_checkpoint": "2/5A000028",
    "mode": "standby",
    "master_in_majority_partition": false,
    "upstream_node_id": "unknown",
    "failover_in_progress": false,
    "processes": {
      "postgresql": "running",
      "repmgrd": "starting"
    },
    "pg_config": {
      "max_connections": "1024"
    }
  }
}

From there, you can see that the primary node (the node being updated) has its main processes in an unknown state, while both standby nodes have the PostgreSQL process running but repmgrd perpetually trying to start.
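To get a quick overview without reading each JSON block, you can extract the mode field from the status output. A minimal sketch using only grep and sed (the function name is made up; adjust the pattern if your status.sh output differs):

```shell
# Read status.sh output on stdin and print the "mode" reported by each node,
# one per line (e.g. shutdown / standby / standby). Hypothetical helper.
status_modes() {
  grep -o '"mode": *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Example usage:
# /var/vcap/jobs/postgresql-ha/bin/debug/status.sh | status_modes
```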

When investigating the logs of the standby nodes, you should see the following lines:

Log file: /var/vcap/sys/log/postgresql-ha/repmgrd.err.log

On standby node 2:

[2019-06-11 09:26:43] [NOTICE] node 3 is the best candidate for new master, attempting to follow...
[2019-06-11 09:26:44] [NOTICE] node 2 now following new upstream node 3
[2019-06-11 09:26:44] [WARNING] Unable to create event record: ERROR: cannot execute INSERT in a read-only transaction

And on standby node 3:

[2019-06-11 09:26:44] [NOTICE] node 2 is the best candidate for new master, attempting to follow...
[2019-06-11 09:26:45] [NOTICE] node 3 now following new upstream node 2
[2019-06-11 09:26:45] [WARNING] Unable to create event record: ERROR: cannot execute INSERT in a read-only transaction

Note that node 2 thinks node 3 is the best candidate, and node 3 thinks node 2 is the best candidate. They both try to follow each other, but they cannot update their repmgr records because they are both standby nodes and therefore read-only.

Recovering Cluster

Identifying Which Node to Promote to Primary

Check which standby node has the latest checkpoint. Connect to the local PostgreSQL instance on each standby node and execute the following query:

SELECT pg_last_wal_receive_lsn();

Note that because both nodes are standbys, we use pg_last_wal_receive_lsn. To run the equivalent check on a primary, use pg_current_wal_lsn instead.

Choose the node with the latest WAL LSN. Usually both nodes are at the same point; in that case, either node can be used for the following steps.
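LSNs such as 2/5A000028 are two hexadecimal numbers separated by a slash, so they cannot be compared reliably as plain strings. One way to compare them is to convert each LSN to a single integer first; a minimal sketch (the helper names are made up):

```shell
# Convert an LSN of the form HIGH/LOW (both hexadecimal) to a single integer.
lsn_to_int() {
  local high="${1%%/*}" low="${1##*/}"
  echo $(( 16#$high * 4294967296 + 16#$low ))  # high word weighted by 2^32
}

# Print whichever of two LSNs is further ahead.
newer_lsn() {
  if (( $(lsn_to_int "$1") >= $(lsn_to_int "$2") )); then
    echo "$1"
  else
    echo "$2"
  fi
}
```

For example, newer_lsn 2/5A000028 2/5A000030 prints 2/5A000030.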

Be aware of standby nodes with broken replication: promoting a standby node that is too far behind the current primary risks data loss. In that case, follow the procedure described in Replication Lag, and as always, make a copy of the current data directory on all nodes.

Promoting Node Manually

The promotion script checks for replication lag, and the promotion fails if the lag exceeds the configured limit. You can check the replication lag as the vcap user:

source /var/vcap/jobs/postgresql-ha/helpers/postgresql_utils.sh
check_replication_lag

As the vcap user, execute:

/var/vcap/jobs/postgresql-ha/bin/promote_command.sh master

If the promotion fails because the replication lag is too large, force the promotion:

/var/vcap/jobs/postgresql-ha/bin/promote_command.sh force-master

Keep an eye on the cluster status while executing the script; you can check it with /var/vcap/jobs/postgresql-ha/bin/debug/status.sh.
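To avoid re-running status.sh by hand, a small polling helper can wait until a node reports the expected mode. A minimal sketch, assuming the promoted node eventually reports mode "master" in the status output (verify the exact value in your deployment; the function name is made up):

```shell
# Poll a status command until its output contains the expected "mode" value,
# retrying up to a fixed number of times with a one-second pause in between.
wait_for_mode() {
  local expected="$1" tries="$2"
  shift 2
  local i
  for (( i = 0; i < tries; i++ )); do
    if "$@" | grep -q "\"mode\": *\"$expected\""; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Example usage:
# wait_for_mode master 60 /var/vcap/jobs/postgresql-ha/bin/debug/status.sh
```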

Wait until the script finishes; after that, your cluster should be able to proceed with the update. If the update task was canceled, you can trigger it again.