Version: Develop

a9s KeyValue Resources Considerations

This document describes concerns and limitations to be considered when allocating resources for your a9s KeyValue plans.

Persistence

An a9s KeyValue Service Instance can be configured with RDB persistence enabled, which means that the Valkey in-memory data is written to the persistent disk.

Backing up an a9s KeyValue Service Instance also works by saving the content to the persistent storage. For this reason, your persistent disk must be larger than your memory. Keep in mind that a9s-parachute stops the processes when persistent disk usage reaches 80% (the default threshold).

Reserved Memory

a9s KeyValue configures the maximum amount of memory Valkey can use on a service instance, leaving a reserved amount for side processes (a9s Logstash, a9s Consul, a9s Backup Agent, and the OS).

maxmemory is configured as follows:

system_reserved_memory = min[10% of memory, 2GB]

total_reserved_memory = system_reserved_memory + 256MB for the Consul agent + 256MB for the a9s Backup Agent + 512MB for a9s Logstash

maxmemory = total_memory - total_reserved_memory
maxmemory = max[150MB, maxmemory]
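
The calculation above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code a9s KeyValue actually runs. The function name valkey_maxmemory is hypothetical, and the sketch assumes binary units (1GB = 1024MB), which the formula itself does not specify.

```python
MB = 1024 * 1024  # assumption: binary units
GB = 1024 * MB


def valkey_maxmemory(total_memory: int) -> int:
    """Return the maxmemory value (in bytes) for a VM with total_memory bytes."""
    # system_reserved_memory = min[10% of memory, 2GB]
    system_reserved = min(total_memory // 10, 2 * GB)
    # Fixed reservations: Consul agent, a9s Backup Agent, a9s Logstash
    total_reserved = system_reserved + 256 * MB + 256 * MB + 512 * MB
    # maxmemory never drops below 150MB
    return max(150 * MB, total_memory - total_reserved)


if __name__ == "__main__":
    for size_gb in (1, 10, 32):
        print(size_gb, "GB VM ->", valkey_maxmemory(size_gb * GB) // MB, "MB maxmemory")
```

On a 10GB VM this leaves 8GB for Valkey (1GB system reserve plus 1GB for the side processes). On a 1GB VM the reservations exceed the total memory, so the 150MB floor applies.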

Reserved Disk Space

a9s KeyValue can be configured to use snapshots to persist data on the disk. If this is the case, then some disk space needs to be reserved.

Snapshots usually require disk space according to the formula (1.2 * used memory) * 2. The (1.2 * used memory) part reflects the relation between the amount of memory used by Valkey and the size of the snapshot file produced by the snapshot operation. The * 2 part reflects how Valkey produces the snapshot file: it keeps the existing snapshot file and temporarily creates a second file during the process, so twice the disk space must be available. For example, if an a9s KeyValue instance uses 1GB of memory, the snapshot functionality needs around 2.4GB of disk space.
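
As a rough sizing aid, the formula can be written as a small Python helper. Both function names are hypothetical. The second helper is an assumption of this sketch: it combines the estimate with the default 80% a9s-parachute threshold mentioned in the Persistence section and ignores any other data stored on the disk.

```python
def snapshot_disk_estimate(used_memory_gb: float) -> float:
    """Estimate the disk space (in GB) needed for snapshots."""
    snapshot_size = 1.2 * used_memory_gb  # snapshot file size relative to used memory
    return snapshot_size * 2              # old snapshot plus the temporary new one


def fits_below_threshold(used_memory_gb: float, disk_gb: float,
                         threshold: float = 0.8) -> bool:
    """Check whether the estimate stays within threshold * disk size.

    Assumption: nothing else occupies the persistent disk.
    """
    return snapshot_disk_estimate(used_memory_gb) <= threshold * disk_gb


if __name__ == "__main__":
    print(snapshot_disk_estimate(1.0))       # estimate for 1GB of used memory
    print(fits_below_threshold(1.0, 4.0))    # 4GB disk
    print(fits_below_threshold(1.0, 2.5))    # 2.5GB disk
```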

Note

Please note that even for non-persistent a9s KeyValue Service Instances, some reserved disk space is still required. This is because a temporary snapshot is stored on disk during the backup process, and Valkey also uses disk space for replication operations.

Sentinel port

The Sentinel port is currently exposed in the service binding and is hardcoded to 6357 + 20000. The templates in template-uploader-errand do not set the port based on the SPI value.

Cluster Deployment Update Strategy

When updating a cluster service instance, BOSH updates one node at a time and only proceeds to the next node once the updated node is UP, running, and synchronized with the cluster.

If the cluster service instance is healthy, meaning all nodes are UP, running, and synchronized, the process proceeds smoothly.

If the cluster service instance is not healthy, the update waits for the cluster to become healthy. If that does not happen within the configurable timeout (cluster-update-node-timeout), the deployment behaves as configured in the property stop-cluster-update-on-failure.

The property stop-cluster-update-on-failure (default value: true) configures whether the instance update is aborted when replication does not work correctly during the update. Setting this property to false causes the replication error to be ignored and the deployment to continue. This will most probably cause data loss.

The property cluster-update-node-timeout (default value: 10) configures the maximum amount of time (in minutes) that the update of a single Valkey VM in a cluster instance can take. The post-start script reads this value, and if the runtime exceeds the configured amount, it either fails or continues, depending on the value of stop-cluster-update-on-failure.
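
The interaction of the two properties can be summarized as a small decision function. This is a simplified Python model of the behavior described above, not the actual post-start script. The names Outcome and node_update_outcome are hypothetical, and the parameter synced_after_minutes stands in for whatever health check the script performs.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PROCEED = "proceed with the next node"
    ABORT = "abort the instance update"
    CONTINUE_IGNORING_ERROR = "continue despite the replication error"


def node_update_outcome(synced_after_minutes: Optional[float],
                        cluster_update_node_timeout: int = 10,
                        stop_cluster_update_on_failure: bool = True) -> Outcome:
    """Model the result of updating a single node.

    synced_after_minutes: minutes until the node is UP, running, and
    synchronized, or None if it never gets there.
    """
    in_time = (synced_after_minutes is not None
               and synced_after_minutes <= cluster_update_node_timeout)
    if in_time:
        return Outcome.PROCEED
    # Timeout exceeded: the second property decides what happens next.
    if stop_cluster_update_on_failure:
        return Outcome.ABORT
    return Outcome.CONTINUE_IGNORING_ERROR  # most probably causes data loss


if __name__ == "__main__":
    print(node_update_outcome(3))
    print(node_update_outcome(None))
    print(node_update_outcome(15, stop_cluster_update_on_failure=False))
```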

Disable use_dns_addresses

The BOSH property use_dns_addresses should always be disabled when using a9s KeyValue. When use_dns_addresses is set to true, BOSH uses DNS hostnames instead of IP addresses in the service configuration. One known side effect of use_dns_addresses: true is downtime when updating the a9s KeyValue deployment. There may be further side effects, so we recommend disabling this property when using a9s KeyValue.

User Credentials Management

The user access control is implemented via Valkey Access Control List (ACL) to provide a unique set of credentials (username and password) for each service binding/key, for both valkey and sentinel users.

Because Valkey does not replicate ACL rules, the creation or deletion of service bindings/keys must be executed against all nodes of the cluster. To ensure consistency in the cluster, it is only possible to create or delete a service binding/key if all nodes in the deployment are healthy.
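
The all-nodes-or-nothing rule can be illustrated with a small in-memory model in Python. This is a sketch only: Node, create_binding_credentials, and delete_binding_credentials are hypothetical names, and the dictionary stands in for the ACL state of each Valkey node. In a real deployment each step would correspond to an ACL command (such as ACL SETUSER or ACL DELUSER) sent to every node.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    healthy: bool = True
    acl_users: Dict[str, str] = field(default_factory=dict)  # username -> password


def _require_all_healthy(nodes: List[Node]) -> None:
    unhealthy = [n.name for n in nodes if not n.healthy]
    if unhealthy:
        raise RuntimeError(f"unhealthy nodes, refusing to change ACLs: {unhealthy}")


def create_binding_credentials(nodes: List[Node], username: str, password: str) -> None:
    """Create the same ACL user on every node, or on none of them."""
    _require_all_healthy(nodes)  # checked before any node is modified
    for node in nodes:
        node.acl_users[username] = password


def delete_binding_credentials(nodes: List[Node], username: str) -> None:
    """Remove the ACL user from every node, or from none of them."""
    _require_all_healthy(nodes)
    for node in nodes:
        node.acl_users.pop(username, None)


if __name__ == "__main__":
    cluster = [Node("node-0"), Node("node-1"), Node("node-2")]
    create_binding_credentials(cluster, "binding-user", "secret")
    print([sorted(n.acl_users) for n in cluster])
```

Checking the health of all nodes before touching any of them is what keeps the ACL rules identical across the cluster in this model.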

Plan Upgrade Restrictions

We recommend preventing users from attempting to upgrade any single instance to a cluster instance. This can be achieved by limiting the migration paths in your manifest. Our example manifest already takes this into consideration.

For more information on migration paths see our Service Plans documentation.

As an alternative, we encourage using our migration feature.