a9s KeyValue Resources Considerations
This document describes concerns and limitations to be considered when allocating resources for your a9s KeyValue plans.
Persistence
It is possible to configure an a9s KeyValue Service Instance with RDB persistence enabled, which means that the Valkey data in memory is written to the persistent disk.
Saving the content to persistent storage is the approach used when backing up an a9s KeyValue service instance. For this reason, your persistent disk size must be larger than your memory size. Keep in mind that a9s-parachute will stop the processes when persistent disk usage reaches 80% (default).
Reserved Memory
a9s KeyValue configures the maximum amount of memory Valkey can use on a service instance, leaving a reserved amount for side processes (a9s Logstash, a9s Consul, a9s Backup Agent, and the OS).
maxmemory is configured as follows:
system_reserved_memory = min[10% of memory, 2GB]
total_reserved_memory = system_reserved_memory + 256MB for the consul agent +
256MB for the a9s Backup Agent + 512MB for a9s Logstash
maxmemory = total_memory - total_reserved_memory
maxmemory = max[150MB, maxmemory]
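The calculation above can be sketched in Python as follows. This is a minimal illustration, not the code a9s KeyValue actually runs: the function name, working in MB, and treating 1GB as 1024MB are assumptions made for the example.

```python
# Sketch of the maxmemory formula above. All values are in MB.
# Assumption: 1GB = 1024MB; the function name is hypothetical.

CONSUL_AGENT_MB = 256
BACKUP_AGENT_MB = 256
LOGSTASH_MB = 512
MIN_MAXMEMORY_MB = 150


def valkey_maxmemory_mb(total_memory_mb: float) -> float:
    """Return the maxmemory value (in MB) for a VM with the given total memory."""
    # system_reserved_memory = min[10% of memory, 2GB]
    system_reserved = min(0.10 * total_memory_mb, 2 * 1024)
    # Add the fixed reservations for the side processes.
    total_reserved = (
        system_reserved + CONSUL_AGENT_MB + BACKUP_AGENT_MB + LOGSTASH_MB
    )
    # maxmemory never drops below 150MB.
    return max(MIN_MAXMEMORY_MB, total_memory_mb - total_reserved)


if __name__ == "__main__":
    for total in (1024, 8192, 32768):
        print(f"{total} MB total -> {valkey_maxmemory_mb(total):.1f} MB maxmemory")
```

Note that on small VMs the fixed reservations (1024MB in total) dominate, so the 150MB floor applies; plans with little memory leave very little room for Valkey data.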
Reserved Disk Space
a9s KeyValue can be configured to use snapshots to persist data on the disk. If this is the case, then some disk space needs to be reserved.
Snapshots usually require an amount of disk space following the formula (1.2 * used memory) * 2.
The (1.2 * used memory) part of the formula reflects the relation between the amount of memory used by Valkey and the size of the snapshot file produced by the snapshot operation. The * 2 part reflects the way Valkey produces the snapshot file: it keeps the existing old snapshot file and temporarily creates another file during the process. Therefore, it requires double the disk space to be available.
For example, if an a9s KeyValue instance uses 1GB of memory, the snapshot functionality would allocate around 2.4GB of disk space.
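As a quick way to apply the formula when sizing a plan, the estimate can be written as a small Python helper. This is only a sketch of the rule of thumb above; the function name is hypothetical, and real snapshot sizes depend on the data stored.

```python
# Sketch of the snapshot disk space estimate: (1.2 * used memory) * 2.
# The function name is hypothetical; values are in GB.


def snapshot_disk_space_gb(used_memory_gb: float) -> float:
    """Estimate the disk space (in GB) needed while a snapshot is written."""
    # Approximate size of one snapshot file relative to the memory in use.
    snapshot_file_size = 1.2 * used_memory_gb
    # The old snapshot is kept while the new one is written, so two files coexist.
    return snapshot_file_size * 2


if __name__ == "__main__":
    print(f"{snapshot_disk_space_gb(1):.1f} GB")  # the 1GB example from the text
```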
Sentinel port
The Sentinel port is currently exposed in the service binding. The Sentinel port is currently hardcoded to 6357 + 20000. The templates in template-uploader-errand do not set the port based on the SPI value.
Cluster Deployment Update Strategy
When updating a cluster service instance, BOSH will update one node at a time and only proceed with updating the next node when the node that has been updated is up, running, and synchronized with the cluster.
If the cluster service instance is healthy, meaning all nodes are up, running, and synchronized, the process will proceed smoothly.
If the cluster service instance is not healthy, the process will fail because the update waits until the cluster is healthy. If that does not happen within the configurable timeout (cluster-update-node-timeout), the deployment will behave as configured in the property stop-cluster-update-on-failure.
The property stop-cluster-update-on-failure (default value: true) configures whether the instance update should be aborted when replication does not work correctly during the update. Setting this property to false causes the replication error to be ignored and the deployment to continue. This will most probably cause data loss.
The property cluster-update-node-timeout (default value: 10) configures the maximum amount of time (in minutes) that the update of a single Valkey VM in a cluster instance can take. The post-start script reads this value and, if the runtime exceeds the configured amount, it consults the property described above and either fails or continues (based on the configuration).
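The interplay of the two properties can be summarized with a small decision function. This is a simplified model of the behavior described above, not the actual post-start script: the function, the enum, and the way health is represented (minutes until the node became healthy, or None if it never did) are assumptions made for illustration.

```python
# Simplified model of how cluster-update-node-timeout and
# stop-cluster-update-on-failure interact. All names are hypothetical.
from enum import Enum
from typing import Optional


class Outcome(Enum):
    PROCEED = "proceed with the next node"
    ABORT = "abort the update"
    CONTINUE_IGNORING_ERROR = "continue and ignore the replication error"


def node_update_outcome(
    minutes_until_healthy: Optional[float],
    cluster_update_node_timeout: float = 10,
    stop_cluster_update_on_failure: bool = True,
) -> Outcome:
    """Decide what happens after updating a single node.

    minutes_until_healthy is None when the node never became healthy.
    """
    became_healthy_in_time = (
        minutes_until_healthy is not None
        and minutes_until_healthy <= cluster_update_node_timeout
    )
    if became_healthy_in_time:
        return Outcome.PROCEED
    # Timeout exceeded: the second property decides what happens next.
    if stop_cluster_update_on_failure:
        return Outcome.ABORT
    return Outcome.CONTINUE_IGNORING_ERROR  # most probably causes data loss


if __name__ == "__main__":
    print(node_update_outcome(5))
    print(node_update_outcome(None))
    print(node_update_outcome(None, stop_cluster_update_on_failure=False))
```

With the defaults, a node that does not become healthy within 10 minutes aborts the update, which is the safe choice given the data loss risk of continuing.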
Disable use_dns_addresses
The BOSH property use_dns_addresses should always be disabled when using a9s KeyValue. When use_dns_addresses is set to true, BOSH will use DNS hostnames instead of IP addresses in the service configuration. Valkey Sentinel 3, 4, and 5 do not support this kind of configuration, and although Valkey Sentinel 6.2 supports it, it will require further changes in the a9s KeyValue BOSH release. One known side effect of using use_dns_addresses: true is downtime when updating the a9s KeyValue deployment, but there may be further side effects, so we recommend disabling this property when using a9s KeyValue.
User Credentials Management
User access control is implemented via the Valkey Access Control List (ACL) to provide a unique set of credentials (username and password) for each service binding/key, for both valkey and sentinel users.
Because Valkey does not replicate ACL rules, the creation or deletion of service bindings/keys must be executed against all nodes of the cluster. To ensure consistency in the cluster, it is only possible to create or delete a service binding/key if all nodes in the deployment are healthy.
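The all-or-nothing rule can be illustrated with a small in-memory model. This sketch does not talk to Valkey and is not the a9s implementation: the Node class and both functions are hypothetical and only show why the health check comes before any node is modified.

```python
# In-memory model of creating/deleting credentials on every node.
# Node, create_binding, and delete_binding are hypothetical names.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    healthy: bool = True
    acl_users: Dict[str, str] = field(default_factory=dict)


def _require_all_healthy(nodes: List[Node]) -> None:
    """Refuse to change credentials unless every node is healthy."""
    unhealthy = [node.name for node in nodes if not node.healthy]
    if unhealthy:
        raise RuntimeError(f"unhealthy nodes: {', '.join(unhealthy)}")


def create_binding(nodes: List[Node], username: str, password: str) -> None:
    # Check first so that no node is changed when the cluster is unhealthy.
    _require_all_healthy(nodes)
    for node in nodes:
        # ACL rules are not replicated, so each node gets the user explicitly.
        node.acl_users[username] = password


def delete_binding(nodes: List[Node], username: str) -> None:
    _require_all_healthy(nodes)
    for node in nodes:
        node.acl_users.pop(username, None)


if __name__ == "__main__":
    cluster = [Node("node-0"), Node("node-1"), Node("node-2")]
    create_binding(cluster, "binding-user", "secret")
    print([sorted(node.acl_users) for node in cluster])
```

Checking health before touching any node keeps the credential sets identical across the cluster; a partial write would leave a user that works on some nodes and fails on others.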
Plan Upgrade Restrictions
We recommend preventing users from attempting to upgrade any single instance to a cluster instance. This can be achieved by limiting the migration paths in your manifest. Our example manifest already takes this into consideration.
For more information on migration paths see our Service Plans documentation.
As a solution, we encourage using our migration feature.