
a9s MongoDB Resources Considerations

This document describes concerns and limitations to be considered when allocating resources for your a9s MongoDB plans.

Replication Lag

The secondary nodes replicate data from the primary. The primary node stores operations that have not yet been replicated in the oplog on disk. For this reason, disk usage can grow faster than expected until a lagging secondary replicates the outstanding data.

The maximum size of the oplog is 5% of the free disk space, with a lower bound of 990 MB and an upper bound of 50 GB, calculated when the MongoDB process starts. However, newer MongoDB versions can grow the oplog beyond this limit to avoid deleting the majority commit point.

For example, with a 5 GB persistent disk for a9s MongoDB, 5% of the disk is only about 256 MB, so the 990 MB lower bound applies: the oplog can fill up to 990 MB of the storage in the event of a disaster on a secondary node.
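
As a rough sketch of that sizing rule, the following Python snippet computes the default oplog size for a given amount of free disk space (assuming, for simplicity, that the persistent disk is entirely free):

```python
def default_oplog_size_mb(free_disk_mb: float) -> float:
    """Default oplog size: 5% of free disk space,
    clamped to the [990 MB, 50 GB] range."""
    return min(max(0.05 * free_disk_mb, 990), 50 * 1024)

# A 5 GB persistent disk: 5% is only about 256 MB, so the
# 990 MB lower bound applies.
print(default_oplog_size_mb(5 * 1024))  # 990.0
```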

The oplog tries to retain 24 hours of operations while waiting for the failed secondary to come back up. If the secondary stays down longer than that, it falls too far behind to catch up and must be manually cleaned up and brought back into the cluster.
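
To see how far a secondary is lagging behind, you can compare the optimes reported by the replSetGetStatus command. Below is a minimal sketch using pymongo; the connection string is a placeholder for your instance's actual credentials:

```python
from pymongo import MongoClient

# Placeholder connection string; use your instance's credentials.
client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")

status = client.admin.command("replSetGetStatus")
primary = next(m for m in status["members"] if m["stateStr"] == "PRIMARY")

# Replication lag is the gap between the primary's last applied
# operation and each secondary's last applied operation.
for member in status["members"]:
    if member["stateStr"] == "SECONDARY":
        lag = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
        print(f"{member['name']}: {lag:.0f}s behind the primary")
```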

Max Connections

Every a9s MongoDB instance configures the MongoDB process with maxIncomingConnections: 65536, and the MongoDB process represents each connection with a file descriptor. This means that the maximum number of open files directly limits the effective maxIncomingConnections. a9s MongoDB instances set the maximum number of open files to 64000 (ulimit -n 64000) and the maximum number of processes to 32000 (ulimit -u 32000).

Both values are set with ulimit and can interfere with MongoDB's behavior when a large number of connections is in use. Keep in mind that the number of open connections also has an impact on memory consumption.
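
To check how close an instance is to its connection limit, you can read the connections section of the serverStatus command. A minimal sketch with pymongo (the connection string is again a placeholder):

```python
from pymongo import MongoClient

# Placeholder connection string; use your instance's credentials.
client = MongoClient("mongodb://localhost:27017/")

conns = client.admin.command("serverStatus")["connections"]
# 'current' counts open connections; 'available' is the remaining
# headroom up to the effective limit (maxIncomingConnections / ulimit).
print(f"current={conns['current']} available={conns['available']}")
```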

Memory Cache Limits

By default, MongoDB's WiredTiger internal cache uses up to max(50% of (total memory − 1 GB), 256 MB). However, WiredTiger also uses the filesystem cache, which can consume all the available free memory.
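
As a sketch of that rule, the following computes the default WiredTiger internal cache size for a given amount of total memory:

```python
def default_wiredtiger_cache_gb(total_memory_gb: float) -> float:
    """Default WiredTiger internal cache size:
    max(50% of (total memory - 1 GB), 256 MB)."""
    return max(0.5 * (total_memory_gb - 1), 0.25)

# On a node with 4 GB of memory, the internal cache defaults to 1.5 GB;
# the filesystem cache can still consume the remaining free memory.
print(default_wiredtiger_cache_gb(4))  # 1.5
```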

Vendor Limitations

There was a known bug in MongoDB v5.0.x that prevented successful restores of backups. As of this time, the fix has not been backported to all v5.0.x versions.

Therefore, we implemented a workaround that makes backups and restores possible: it creates a role named dummyRole in the dummy database. This role has read-only access to the dummy database.

This role is created after the first successful deployment of a single-node or cluster instance.
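
To verify that the workaround role exists on your instance, you can look it up with the rolesInfo command. A minimal sketch with pymongo (the connection string is a placeholder; the dummy database and dummyRole names are those described above):

```python
from pymongo import MongoClient

# Placeholder connection string; use your instance's credentials.
client = MongoClient("mongodb://localhost:27017/")

# rolesInfo returns the role definition if the workaround role
# exists in the dummy database.
info = client["dummy"].command("rolesInfo", "dummyRole", showPrivileges=True)
for role in info["roles"]:
    print(role["role"], role["db"], role["privileges"])
```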