Skip to main content
Version: Develop

Forking and Migration

This document describes options to migrate data from an older a9s PostgreSQL Service Instance using an older PostgreSQL version to an a9s PostgreSQL Service Instance with a higher version, from the Application Developer's perspective.

Dump and Restore From Origin Instance to New Instance

This feature is only available for migrating to a9s PostgreSQL 11 and newer.

If you have an existing a9s PostgreSQL Service Instance, you can create a copy of its data by creating a new a9s Service Instance with the custom parameter copy_from.

The custom parameter copy_from requires in the default case an object with the following keys:

  • host: The hostname of the existing instance.
  • username: The username of the existing instance.
  • password: The password of the existing instance.

The values for the required keys can be derived from any service binding (either cf service-key or from cf env APP_NAME when bound to an a9s Service Instance).

When the parameter role_privileges with CREATEDB has been used and there are additional databases next to the main database, the copy_from parameter requires the key main_database that specifies what is the main database on the origin a9s Service Instance. The main database name can also be derived from the binding information in the field name.

caution

Be aware that when using the additional databases, you must take responsibility for handling manually created databases during the migration once only the main_database is fully migrated.

An example call to create a new a9s Service Instance and initiate the database with data from another a9s Service Instance looks the following way:

cf create-service a9s-postgresql postgresql-single-small new-instance-name -c '{"copy_from": {"host": "rgd25db46-psql-master-alias.node.dc1.consul.dsf2", "username": "a9s9d374017060a7d4573bc61baf714500f4d948631", "password": "a9se2fdcf291672e4555293b13c4b7119ba2f77312f"} }'

The create-service call will validate whether the given credentials are correct. In the case of multiple databases, the presence of a value for the main_database will also be validated.

The databases postgres, template0 and template will not be copied over.

Behind the scenes, the new a9s Service Instance will copy over the data using pg_dump and pg_restore calls. The pg_dump call receives the command line arguments -O -x --quote-all-identifiers --lock-wait-timeout=300s --verbose --format=custom.

If an error happens during the time the a9s Service Instance copies the data from the old a9s Service Instance to the new a9s Service Instance, Cloud Foundry will just report back the instance failed. To receive more information why the copy process failed, it is recommended to specify a syslog endpoint via custom parameter to receive the logs from the copy process. The copy process reports progress and error lines with the text copy_from included.

The credentials stored from the origin a9s Service Instance on the new a9s Service Instance will disappear the next time the instance gets updated.

Migrating to a9s PostgreSQL 15 and Above

With PostgreSQL 15, the creation permission was removed from the public schema for regular users, and the owner was changed to the pg_database_owner role. Therefore, users under the cfuser role can no longer create tables or other objects on the public schema. Due to this change, a new schema called shared was introduced to ensure the same behavior as before.

When copy_from is used to migrate from an older version of PostgreSQL to version 15 or greater, all data from the public schema will be automatically migrated to the new shared schema. This is done by renaming the restored schema public to shared.

Migration Matrix

This feature is only supported when used to copy over data from a9s Service Instances that have either an older or the same major version as the new a9s Service Instance.

This means that migration paths such as 11 -> 13, 13 -> 15, or 11 -> 11 are supported, while backwards migration paths such as 13 -> 11 or 11 -> 10 are not supported by this feature. This is due to limitations in PostgreSQL's pg_dump utility.

The migration matrix below provides a visual guide to the supported migration paths:

PostgreSQL 10 (target)PostgreSQL 11 (target)PostgreSQL 13 (target)PostgreSQL 15 (target)PostgreSQL 17 (target)
PostgreSQL 10 (source)
PostgreSQL 11 (source)
PostgreSQL 13 (source)
PostgreSQL 15 (source)
PostgreSQL 17 (source)

Caveats

  • It is recommended to try out the copy_from procedure and to validate all expected data has been restored before doing the real switch.
  • If possible, this feature should be executed at a time when the database is not under much load.
  • This feature has been tested with example feature sets for the following supported extensions: 'postgis', 'uuid-ossp', 'ltree','pgcrypto', 'citext', 'vector'. Your experience may differ when using advanced features of these extensions.
  • All objects in all databases being copied MUST be owned by the cfuser role.
caution

When migrating data you might need more diskspace than the old a9s PostgreSQL Service Instance currently requires due to WAL files occupying additional space. For this reason, we recommend always copying to a single instance, and then updating from a single instance to a cluster one.

You need to pay special attention to this if you are using continuous archiving, since uploading WAL files takes a long time compared to WAL files being consumed by replication. In addition, the amount of newly created WAL files in the target instance during the migration may cause a temporary spike in resource usage, as the recycling of WAL files has to be performed at a later point in time.

Keeping Track of the Migration Progress

info

This monitoring approach is given for informational purposes only. Please keep in mind that the total processing time and the progress' pace might fluctuate depending on various factors, such as:

  • Your environment.
  • Your Service Plan.
  • The size of the records.
  • The presence of indices and constraints.

You can estimate your current progress by making use of the metric_pg_database_size on the target a9s PostgreSQL Service Instance and by observing its available metrics, which allows you to extrapolate in order to have an approximate processing time.

For example, let us assume that you have a database that is 100GB, and you decide to monitor after the progress after 10 minutes and notice that the progress is currently at 120MB, after 20 mins 250MB; with this you can initially estimate a duration for the whole process by comparing the current size to the known size of the source a9s PostgreSQL Service Instance's original database, with the possibility of recalculating when it slows down due to expensive transactions (e.g REINDEX).

Create a Fork of a Service Instance

Currently, a9s PostgreSQL supports two approaches to migrate data from a running Service Instance to a new Service Instance. Specifically, it supports the Download Backup and Apply Manually approach described below, and the copy-from feature explained in Migration.

Download Backup and Apply Manually

This procedure of forking a Service Instance involves creating a backup of a Service Instance, modifying the backup bit and restoring it to a different Service Instance.

caution

This method should only be used to restore the data on the same PostgreSQL version as the source service. For migrating data to a different version, see Dump and Restore From Origin Instance to New Instance.

Having two Service Instances is a prerequisite for the process:

$ cf s
Output
Getting services in org my_org / space my_space as user@domain.com...

name service plan bound apps last operation
psql1 a9s-postgresql15 postgresql-replica-small create succeeded
psql2 a9s-postgresql15 postgresql-replica-small create succeeded

Having a service key for each of the Service Instances comes in handy later when we need info about the database and credentials to connect to the database. Create one for each Service Instance using e.g.:

cf create-service-key psql1 mykey.

You can then have a look at the keys contents:

$ cf service-key psql1 mykey
Output
Getting key mykey for service instance psql1 as user@domain.com...

{
"host": "pcsd85301a-psql-master-alias.node.dc1.consul",
"hosts": [
"pcsd85301a-pg-0.node.dc1.consul",
"pcsd85301a-pg-1.node.dc1.consul",
"pcsd85301a-pg-2.node.dc1.consul"
],
"name": "pcsd85301a",
"password": "a9spass",
"port": 5432,
"uri": "postgres://a9suser:a9spass@pcsd85301a-psql-master-alias.node.dc1.consul:5432/pcsd85301a",
"username": "a9suser"
}
$ cf service-key psql2 mykey
Output
Getting key mykey for service instance psql2 as user@domain.com...

{
"host": "pcsd59c00c-psql-master-alias.node.dc1.consul",
"hosts": [
"pcsd59c00c-pg-0.node.dc1.consul",
"pcsd59c00c-pg-0.node.dc1.consul",
"pcsd59c00c-pg-0.node.dc1.consul"
],
"name": "pcsd59c00c",
"password": "a9spass",
"port": 5432,
"uri": "postgres://a9suser:a9spass@pcsd59c00c-psql-master-alias.node.dc1.consul:5432/pcsd59c00c",
"username": "a9suser"
}

We need the database names for later use as well as the credentials for psql2. The database name is the trailing part of the URI. For psql1 the database name is pcsd85301a and for psql2 the database name is pcsd59c00c

Additional prerequisites regarding command line tools:

  • BASH (some shell)
  • cat
  • sed
  • openssl
  • psql (the command line client for psql)

Open the Service Dashboard of the Service Instance you want to fork. We use psql1 for this example. You can find the Service Dashboard URL like this:

$ cf service psql1
Output
Showing info of service psql1 in org my_org / space my_space as user@domain.com...

name: psql1
service: a9s-postgresql15
bound apps:
tags:
plan: postgresql-replica-small
description: Dedicated PostgreSQL Service Instances and clusters powered by the anynines Service Framework
documentation:
dashboard: https://a9s-postgresql-dashboard.de.a9s.eu/service-instances/271ceaa0-eded-43ba-a305-aaecab326079

[...]

Make sure you set a encryption password for the backups using the Service Instance dashboard. Create a backup using the Service Dashboard. Download the backup to your local machine. The filename will be something like racsd92baee-1522222422893. Decrypt the backup and write its contents to a file:

cat racsd92baee-1522222422893 | openssl enc -aes256 -md md5 -d -pass 'pass:mytopsecretpassword' | gunzip -c > backup.sql

Replace the database name of the original Service Instance in the backup with the database name of the other Service Instance:

cat backup.sql | sed -e 's/pcsd85301a/pcsd59c00c/g' > backup.fork.sql

Create a tunnel to the admin interface as explained here of the PostgreSQL Service Instance that will be the fork of the original Service Instance. For this example, we will use psql1 as the original Service Instance and psql2 as the fork Service Instance.

Let us assume the following setup for a tunnel to psql2:

cf ssh someapp -L 127.0.0.1:5432:racsd92baee.service.dc1.a9ssvc:5432

Finally, we can restore the data:

psql -h 127.0.0.1 -p 5432 -U a9s27ae5c9bfee2937a8dc04b1d7d9b9c9690c97704 pcsd59c00c < backup.fork.sql

Don't get confused if you see error messages. The backup contains data that we lack the necessary permissions to restore. In the end our data gets restored and that's our primary goal.