Disaster Recovery
Currently, a9s PostgreSQL, a9s Messaging and a9s KeyValue are the only Data Services that support performing a disaster recovery into a newly created Service Instance. All other a9s Data Services require some manual interaction, which is not specified and may vary from Data Service to Data Service.
Trying to apply the information contained in this document to an a9s Data Service aside from a9s PostgreSQL can leave your Service Instances in an unrecoverable state.
This document helps you to create a fork of an existing Service Instance, based on the last saved state. This procedure can also be used to recover the last saved state of a deleted Service Instance.
For an Application Developer to be able to perform a disaster recovery, both the a9s Backup Manager and the Service Instances need to be prepared. In other words, your Platform Operator must configure the a9s Backup Manager, and a fresh and compatible Service Instance must be ready for use.
As long as the disaster recovery store is reachable, the steps described in this document can be used to recover the last saved state of a given Service Instance, even if the whole system is not available on the primary site.
To learn more about configuration techniques for the disaster recovery store, consult the Platform Operator documentation for Disaster Recovery.
This document describes the process using the a9s Public API v1, but it can also be done using the a9s Cloud Foundry CLI plugin for disaster recovery.
Overview
From the perspective of the Application Developer, a disaster recovery can be summarized as the following series of steps:
- Procuring the Backup GUID
- Procuring the encryption key
- Performing the operation
This document will walk you through these general steps for both the a9s Cloud Foundry CLI plugin for disaster recovery and the a9s Public API v1.
Recommendations
We strongly recommend saving the information described here, specifically the Backup GUID and the encryption key, in a secure location. We recommend you do this as soon as your Service Instance is created or has a major operation performed on it (e.g., after updates, recreation, or changes).
The reason for this is to be prepared in case of a total disaster, such as the entire Service Instance being lost. Having the information mentioned above ensures that you can recover the last saved state of said Service Instance.
Note: The a9s Backup Manager does not provide the encryption key, as this must be set by you.
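The recommendation above can be sketched as a small shell helper that appends a labelled record to a file only the current user can read. This is a minimal sketch, not part of the a9s tooling: the function name save_recovery_info, the record layout, and the use of a local file are all assumptions made for illustration. A dedicated secrets manager is usually a better home for the encryption key.

```shell
#!/bin/sh
# Hypothetical helper (not part of the a9s tooling): append one labelled
# record to a private notes file.
# Usage: save_recovery_info <file> <label> <backup-guid> <encryption-key>
save_recovery_info() {
  file=$1 label=$2 guid=$3 key=$4
  # Refuse to write incomplete records.
  if [ -z "$file" ] || [ -z "$label" ] || [ -z "$guid" ] || [ -z "$key" ]; then
    echo "usage: save_recovery_info <file> <label> <backup-guid> <encryption-key>" >&2
    return 1
  fi
  (
    umask 077   # a newly created file is readable by the owner only
    printf '%s\tbackup_guid=%s\tencryption_key=%s\n' "$label" "$guid" "$key" >> "$file"
  )
  # Tighten permissions in case the file already existed.
  chmod 600 "$file"
}
```

For example, `save_recovery_info ~/dr-notes.txt original "<backup_guid>" "<encryption_key>"` records the values of the original Service Instance; using a different label for the new Service Instance keeps the two sets of values apart.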
Downloading and Installing the a9s Cloud Foundry CLI Plugin
You can get the plugin from our public S3 bucket. The plugin is available for the following operating systems:
- Linux (sha:493bd71f62baea006f9de8ac1df6a4e5382c3a3a3e95303482ce9072a13491fc)
- MacOS (sha:3616fdf1dadd8452e439a3d49796cfb1b8f085c1e041adce08ae70e69ed164e5)
- Windows (sha:d609d4bb70eade48141e145153ac1a59b6ab9e8a9b0d075706d1103063d9fbd3)
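Before installing, it may be worth comparing the downloaded binary against the checksum listed above. The following is a minimal sketch that assumes the listed values are SHA-256 digests (they are 64 hexadecimal characters long) and that either sha256sum or shasum is available; the function name verify_sha256 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
  file=$1 expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
  else
    # macOS ships shasum instead of sha256sum.
    actual=$(shasum -a 256 "$file" | cut -d ' ' -f 1)
  fi
  if [ "$actual" = "$expected" ]; then
    echo "checksum OK: $file"
  else
    echo "checksum MISMATCH: $file" >&2
    return 1
  fi
}
```

On Linux, for instance, `verify_sha256 <source-path>/cf-disaster-recovery-plugin 493bd71f62baea006f9de8ac1df6a4e5382c3a3a3e95303482ce9072a13491fc` returns a non-zero status if the file does not match the published value.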
Once you have downloaded the binary file you can install it into the Cloud Foundry CLI with the following command:
cf install-plugin <source-path>/cf-disaster-recovery-plugin -f
You may notice that we are using the -f flag in the command. This is necessary because the plugin is not signed by Cloud Foundry; the -f flag is therefore needed to ignore the unsigned certificate when installing the binary to the CLI.
Prepare the Service Instance
When performing disaster recovery, it is necessary to have some information about the original Service Instance holding the data to be recovered, as we need to locate the correct backup collection and, if necessary, create a fresh Service Instance with the same settings:
Key | Description |
---|---|
Dashboard URL | The URL to the Dashboard of the Service Instance. Used to retrieve the Backup GUID. |
Backup GUID | The GUID which the a9s Backup Manager is using to refer to backups of a given Service Instance. |
Backup Encryption Key | The encryption key used to encrypt the backup. |
Service Name | The name of the service used to create the Service Instance. |
Plan Name | The name of the plan used to create the Service Instance. |
Custom Service Parameters | Custom parameters set when creating or updating the Service Instance. |
As mentioned before, we strongly recommend retrieving this information as soon as you create or update your Service Instance, especially when updating the encryption key, and storing it in a secure place. This way you will be able to recover from more catastrophic scenarios.
If you are using Cloud Foundry, most of the information can be retrieved using the Cloud Foundry CLI with a simple cf service <service-instance-name>, except:
- The Service Instance's encryption key
- The Service Instance's Backup GUID
Note the following caveats:
- The new Service Instance has a completely different Backup GUID. This means that any backup that is created in the new Service Instance will have a different Backup GUID. Therefore, it is imperative to properly distinguish them during the process and when saving them for future reference.
- The new Service Instance will not have a user-defined encryption key until you set one yourself. This encryption key will not affect the current disaster recovery, but it is important to note it down. It is imperative to properly distinguish both encryption keys during the process and when saving them for future reference.
Setting the Backup's Encryption Key
In order to perform a disaster recovery without help from a Platform Operator (assuming the disaster recovery has been properly configured), the Service Instance's encryption key must be set. This can be done through different methods:
- as a parameter of the request when retrieving the Backup GUID
  - via the a9s Cloud Foundry CLI plugin for disaster recovery
  - via the a9s Public API v1
- as a parameter when updating the Service Instance's backups' configuration via the a9s Public API v1
If the default encryption key is in use or the encryption key is unknown, the only way to retrieve it is via the a9s Backup Manager database with the help of your Platform Operator. For more information, please check the Platform Operator Disaster Recovery documentation.
Keep in mind that setting a new encryption key will overwrite and replace any previously existing encryption key. This new encryption key is not retroactively applied, so any prior backups cannot be decrypted with it.
It is worth noting that while the a9s Backup Manager can change the encryption key, it does not return it, thus it is necessary to save it manually, in a secure place.
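Since the key has to be chosen and stored by you, one option is to generate it from a random source rather than inventing it by hand. The sketch below prints a random 64-character hexadecimal string. It assumes that the a9s Backup Manager accepts such a string as an encryption key, which this document does not confirm, so check the key requirements of your installation first. The function name generate_encryption_key is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a random 64-character hexadecimal string.
# Assumption: the target system accepts a hex string of this length as a key.
generate_encryption_key() {
  # 32 bytes from the kernel's random source, rendered as lowercase hex;
  # tr strips the spaces and line breaks that od inserts.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}
```

The output can be captured with `key=$(generate_encryption_key)` and stored in a secure place before it is passed to any of the methods listed above.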
Getting the Backup GUID
Both the a9s Public API v1 and the a9s Cloud Foundry CLI plugin for disaster recovery can be used to retrieve the Backup GUID.
The Backup GUID provided by both the a9s Public API v1 and the a9s Cloud Foundry CLI plugin for disaster recovery differs from the Service Instance GUID provided by Cloud Foundry.
Getting the Backup GUID Using the a9s Public API v1 Directly
Retrieving all the necessary data using the a9s Public API v1 is only possible if the Service Instance still exists. For this reason, we recommend performing these steps directly after creating the Service Instance, updating it, or updating the encryption key, and storing the information in a safe location.
Authentication
To use the a9s Public API v1, you need to be authenticated and authorized. You will also need to create a cookie, which will store your authenticated session. The necessary steps to do so are explained in the a9s Public API documentation.
After the authentication step, you will have the bearer_token and url environment variables and the cookie file called session.cookie.
Endpoint
POST /v1/instances/:instance_id/disaster-recovery/prepare
Parameters
Key | Type | Description |
---|---|---|
encryption_key | String | Optional. The key which the a9s Backup Manager uses to encrypt backups. |
Response
Key | Description |
---|---|
backup_guid | The GUID which the a9s Backup Manager is using to refer to backups. |
cURL Example
curl -X POST --cookie session.cookie --cookie-jar session.cookie --location \
--header "Content-Type: application/json" \
--header "Authorization: ${bearer_token}" \
"${url}/disaster-recovery/prepare" \
--data '{"encryption_key":"<encryption_key>"}'
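To store the returned Backup GUID, the response of the request above can be piped through a small filter. This is a minimal sketch that assumes the response body is a flat JSON object with a plain string value for backup_guid, as the Response table suggests; the exact response shape is not shown in this document. The function name extract_backup_guid is hypothetical, and a JSON-aware tool such as jq is more robust where it is available.

```shell
#!/bin/sh
# Hypothetical helper: read a JSON response from standard input and print the
# value of "backup_guid". Assumes a flat object with a plain string value.
# Prints nothing if the key is not found.
extract_backup_guid() {
  # Join lines first so a pretty-printed response is handled as well.
  tr -d '\n' |
    sed -n 's/.*"backup_guid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Appending `| extract_backup_guid` to the cURL command prints only the GUID, which can then be saved together with the encryption key.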
Prepare Disaster Recovery via the CF CLI Plugin
You can prepare a disaster recovery backup by running the following command:
cf prepare-disaster-recovery <service-instance-name> [<encryption-key>]
where an already existing Service Instance is used as a parameter, with the encryption-key being optional.
After execution, the CF CLI will provide you with all the necessary information for the disaster recovery backup. This response contains information such as the Backup GUID of the disaster recovery backup.
While the encryption-key is an optional parameter, keep in mind that if you do not set it here, you must already know the current encryption key, as the Backup Manager will not return it.
Perform the Disaster Recovery
To be able to perform a disaster recovery, you must first create a new Service Instance to host the data that you want to recover. The new Service Instance must be created with the same parameters and with the same or a bigger plan than the old Service Instance to avoid unexpected behavior when restoring the data (e.g., a persistent disk that cannot store all of the data). This information is gathered as described in the Prepare the Service Instance section.
Cloud Foundry Example:
cf create-service <service-name> <plan-name> my_psql_service -c <custom-service-parameters>
E.g.:
cf create-service a9s-postgresql13 postgresql-replica-small my_psql_service -c '{"continuous_archiving": "enabled"}'
After that, a disaster recovery can be triggered, with the additional information shown below:
Key | Required | Description |
---|---|---|
Backup GUID | True | The GUID which the a9s Backup Manager is using to refer to backups. |
Backup Encryption Key | True | The encryption key used to decrypt the Backup. |
If a Service Instance is configured with continuous archiving it can only be recovered to a Service Instance configured with continuous archiving.
You can check the status of the disaster recovery task using the a9s Dashboard of the Service Instance recovering the data. After the task finishes successfully, the Service Instance is ready for use.
After the restore finishes, it is necessary to create a new service binding to the new Service Instance.
Perform the Disaster Recovery Using the API Directly
Create new Service Instance
First, create a new Service Instance, as shown in the example below:
cf create-service a9s-postgresql13 postgresql-replica-small my_recovered_psql_service -c '{"continuous_archiving": "enabled"}'
Authentication
To use the API, you need to be authenticated and authorized. You will also need to create a cookie, which will store your authenticated session. The necessary steps to do so are explained in the a9s Public API documentation.
After the authentication step, you will have the bearer_token and url environment variables and the cookie file called session.cookie.
You have to use the a9s Service Dashboard URL of the new Service Instance for the url variable, as the API call to perform the disaster recovery is made against the new Service Instance.
Endpoint
POST /v1/instances/:instance_id/disaster-recovery/perform
Parameters
Name | Type | Description |
---|---|---|
backup_guid | string | The GUID which the a9s Backup Manager is using to refer to backups. |
encryption_key | string | The encryption key that was used to encrypt the backups. |
Perform Disaster Recovery with cURL
To perform a disaster recovery via cURL, the request must be executed against the full url of the new Service Instance, as described in the example below:
curl -X POST --cookie session.cookie --cookie-jar session.cookie --location \
--header "Content-Type: application/json" \
--header "Authorization: ${bearer_token}" \
"${url}/disaster-recovery/perform" \
--data '{"encryption_key":"<encryption_key>", "backup_guid":"<backup_guid>"}'
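If the encryption key contains double quotes or backslashes, pasting it directly into the --data string produces invalid JSON. The following sketch builds the request body with those two characters escaped and rejects empty values. The function names json_escape and build_perform_payload are hypothetical, and control characters in the key are not handled.

```shell
#!/bin/sh
# Hypothetical helper: escape backslashes and double quotes for use inside a
# JSON string. Control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print the JSON body for the perform request.
# Usage: build_perform_payload <backup-guid> <encryption-key>
build_perform_payload() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "backup GUID and encryption key are both required" >&2
    return 1
  fi
  printf '{"encryption_key":"%s","backup_guid":"%s"}\n' \
    "$(json_escape "$2")" "$(json_escape "$1")"
}
```

The result can be passed to the request above as `--data "$(build_perform_payload "<backup_guid>" "<encryption_key>")"`.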
Perform Disaster Recovery via the CF CLI Plugin
You can restore a disaster recovery backup by running the following command:
cf disaster-recovery-restore <service-instance> <backup-guid> <encryption-key>
where the required parameters come from the prepare step: the Backup GUID from its response and the encryption key that was set. This will restore a disaster recovery backup on an existing Service Instance (not the original Service Instance).