a9s Parachute v1
The a9s Parachute v1 is intended to ensure that certain services are stopped properly in the event of a disk filling up, in order to prevent data loss. For this purpose, it runs on every a9s Data Service Instance. The a9s Parachute v1 monitors both the ephemeral and the persistent disk, and as soon as the disk usage exceeds a configured threshold, it stops the configured services.
The following a9s Data Services use a9s Parachute v1:
- a9s KeyValue
- a9s LogMe2
- a9s Messaging
- a9s MongoDB
- a9s Prometheus
- a9s PostgreSQL
- a9s Redis®*
- a9s Search
Configuration
The a9s Parachute can be configured via the following BOSH properties:
Property | Default Value | Description |
---|---|---|
a9s-parachute.ephemeral.usage_limit | 80 | Limit in percent for the ephemeral disk usage. 0 deactivates the check. |
a9s-parachute.persistent.usage_limit | 80 | Limit in percent for the persistent disk usage. 0 deactivates the check. |
a9s-parachute.services_to_stop | | An array with the names of the monit processes to stop when the usage limit for a disk is reached. If no process names are specified, an error is raised. |
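
The limit semantics described in the table can be illustrated with a minimal shell sketch. It is not the a9s Parachute implementation: the function names `limit_exceeded` and `disk_usage_percent` are hypothetical, and the sketch assumes that a limit is exceeded only when the usage is strictly greater than the configured value.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of a9s Parachute.

# Succeeds if the usage (in percent) exceeds the limit (in percent).
# Assumption: "exceeds" means strictly greater than the limit.
limit_exceeded() {
  local usage="$1" limit="$2"
  # A limit of 0 deactivates the check, so it never triggers.
  [ "$limit" -eq 0 ] && return 1
  [ "$usage" -gt "$limit" ]
}

# Prints the usage in percent of the filesystem that contains the given path,
# taken from the "Capacity" column of the POSIX `df -P` output.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, `limit_exceeded "$(disk_usage_percent /var/vcap/store)" 80 && echo "limit exceeded"` would report a persistent disk above the default limit, assuming the persistent disk is mounted at the usual BOSH location `/var/vcap/store`.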
Observability
To make the a9s Parachute observable, the following lock files are created in the /var/vcap/sys/run/a9s-parachute directory on the VM of the respective node of the service instance when the a9s Parachute is triggered:
a9s-parachute-activated
: Indicates that the a9s Parachute has been triggered and the configured services have been stopped
a9s-parachute-activated-ephemeral
: Indicates that the a9s Parachute for the ephemeral disk has been triggered
a9s-parachute-activated-persistent
: Indicates that the a9s Parachute for the persistent disk has been triggered
Keep in mind that if the a9s Parachute has not been triggered, the files above do not exist and the /var/vcap/sys/run/a9s-parachute directory is empty.
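As a quick manual check on a node, the lock files can be inspected with a small shell function. This is only a sketch: the function name `parachute_status`, its output format, and its exit codes are hypothetical, while the directory and file names are the ones listed above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports which a9s Parachute lock files exist.
# Returns 0 if at least one lock file is present, 1 otherwise.
parachute_status() {
  # The directory can be overridden, e.g. for testing.
  local dir="${1:-/var/vcap/sys/run/a9s-parachute}"
  local triggered=1
  local name
  for name in a9s-parachute-activated \
              a9s-parachute-activated-ephemeral \
              a9s-parachute-activated-persistent; do
    if [ -e "$dir/$name" ]; then
      echo "present: $name"
      triggered=0
    fi
  done
  # No lock file found: the a9s Parachute has not been triggered.
  [ "$triggered" -eq 0 ] || echo "not triggered"
  return "$triggered"
}
```

Calling `parachute_status` without arguments checks the default directory. A missing directory is treated the same as an empty one.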
There are multiple options to monitor the a9s Parachute:
- You can receive a9s Parachute related metrics via a9s Logstash. For more information, see a9s Parachute Metrics.
- You can fetch a9s Parachute related metrics via the API v1
- a9s Parachute can be monitored via the a9s Service Dashboard
Manually Restart a Service Instance’s Node
To manually restart the processes of a service instance's node and re-enable the a9s Parachute so that it stops the configured processes again when a disk fills up, the following steps are necessary:
- SSH into the affected node of the service instance via
bosh -d <deployment_name> ssh <instance_name>
- Become root via
sudo -i
- Remove all a9s Parachute lock files via
rm -rf /var/vcap/sys/run/a9s-parachute
- Enable the a9s Parachute checks via
monit start ephemeral
and
monit start persistent
- Start the stopped processes via
monit start <process_name>
Keep in mind that the a9s Parachute doesn't clear up disk space, so you have to free up the disk space manually.
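The on-node part of the steps above (everything after becoming root) can be sketched as a single shell function. The wrapper `recover_from_parachute` and its `DRY_RUN` switch are hypothetical and only serve as illustration. The `rm` and `monit` commands are the ones from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual recovery steps.
# Intended to be run as root on the affected node, after disk space was freed.
# Arguments: the names of the monit processes that were stopped.
recover_from_parachute() {
  if [ "$#" -eq 0 ]; then
    echo "usage: recover_from_parachute <process_name>..." >&2
    return 2
  fi

  # With DRY_RUN=1 the commands are printed instead of executed.
  local run=""
  [ "${DRY_RUN:-0}" = "1" ] && run="echo"

  # Remove all a9s Parachute lock files.
  $run rm -rf /var/vcap/sys/run/a9s-parachute || return 1

  # Re-enable the a9s Parachute checks for both disks.
  $run monit start ephemeral || return 1
  $run monit start persistent || return 1

  # Start the processes that the a9s Parachute stopped.
  local process
  for process in "$@"; do
    $run monit start "$process" || return 1
  done
}
```

Running `DRY_RUN=1 recover_from_parachute <process_name>` first prints the commands without executing them, which makes it easy to review them before touching the node. Checking the disks with `df -h` beforehand helps confirm that enough space has been freed.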
Resource Considerations
For more information, see a9s Parachute Resources Considerations.