a9s MongoDB Resources Considerations
This document describes concerns and limitations to be considered when allocating resources for your a9s MongoDB plans.
Replication Lag
The secondary nodes replicate data from the primary. The primary node keeps data that has not yet been replicated in the oplog, which is stored on disk. For this reason, when a secondary is unavailable, disk usage can grow faster than expected until that secondary catches up on the missing data.
The maximum size of the oplog is 5% of the free disk space, with a lower bound of 990MB and an upper bound of 50GB, calculated when the MongoDB process starts. However, newer MongoDB versions can grow the oplog beyond this limit to avoid deleting the majority commit point.
For example, with a 5GB persistent disk for a9s MongoDB, the oplog can fill up to 990MB of the storage in the event of a disaster on a secondary node.
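The sizing rule above can be sketched as a small calculation. This is a minimal illustration, not MongoDB's actual implementation: the function name is hypothetical, and it assumes binary units (1MB = 1024 * 1024 bytes), which the text does not specify.

```python
# Hypothetical helper illustrating the oplog sizing rule described above.
# Assumption: MB and GB are binary units; the document does not say which.
MB = 1024 ** 2
GB = 1024 ** 3

OPLOG_MIN_BYTES = 990 * MB  # lower bound from the text
OPLOG_MAX_BYTES = 50 * GB   # upper bound from the text


def default_oplog_size_bytes(free_disk_bytes: int) -> int:
    """Return 5% of the free disk space, clamped to [990MB, 50GB]."""
    five_percent = free_disk_bytes * 5 // 100
    return max(OPLOG_MIN_BYTES, min(five_percent, OPLOG_MAX_BYTES))


# 5% of 5GB is only 256MB, so the 990MB lower bound applies.
small_disk = default_oplog_size_bytes(5 * GB)
# 5% of 100GB is 5GB, which lies between the two bounds.
medium_disk = default_oplog_size_bytes(100 * GB)
# 5% of 2000GB is 100GB, so the 50GB upper bound applies.
large_disk = default_oplog_size_bytes(2000 * GB)
```

On small disks the lower bound dominates: with 5GB of storage, the oplog alone can take roughly a fifth of the disk.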
The oplog tries to hold 24 hours of operations until the secondary node comes back up. After that, the secondary node needs to be manually cleaned up and brought back into the cluster.
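To get a feeling for whether a given oplog size can cover a 24-hour outage, a rough estimate can help. The sketch below assumes that oplog growth is approximately equal to a steady write rate in bytes per second. That is a simplification, and both the function name and the example rate are hypothetical.

```python
# Rough estimate only: assumes the oplog grows at a constant rate, which
# real workloads rarely do. All names here are hypothetical.
SECONDS_PER_HOUR = 3600


def oplog_window_hours(oplog_size_bytes: int, write_rate_bytes_per_s: float) -> float:
    """Estimate how many hours of operations fit into the oplog."""
    if write_rate_bytes_per_s <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_bytes / write_rate_bytes_per_s / SECONDS_PER_HOUR


def covers_outage(oplog_size_bytes: int, write_rate_bytes_per_s: float,
                  outage_hours: float = 24.0) -> bool:
    """Return True if the estimated window is at least as long as the outage."""
    return oplog_window_hours(oplog_size_bytes, write_rate_bytes_per_s) >= outage_hours


# Example: a 990MB oplog with an assumed steady 50KB/s of oplog writes.
window = oplog_window_hours(990 * 1024 ** 2, 50 * 1024)
```

If the estimated window is shorter than the expected downtime of a secondary, a larger disk may be needed.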
Max Connections
Every a9s MongoDB instance configures the MongoDB process with maxIncomingConnections: 65536, and the MongoDB process represents each connection with a file descriptor. This means that the maximum number of open files directly limits how many connections can actually be used. a9s MongoDB instances configure a maximum of 64000 open files (ulimit -n 64000) and a maximum of 32000 processes (ulimit -u 32000). Both values are set with ulimit and can affect MongoDB behavior when it handles a large number of connections. Keep in mind that the number of open connections also has an impact on memory consumption.
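The relationship between the two limits can be sketched as follows. This is an illustration under stated assumptions, not MongoDB's internal logic: the function names and the reserved_descriptors parameter are hypothetical, and the resource module is only available on Unix-like systems.

```python
import resource  # Unix-only standard library module

MAX_INCOMING_CONNECTIONS = 65536  # value configured by a9s MongoDB
OPEN_FILES_LIMIT = 64000          # ulimit -n 64000


def connection_ceiling(max_incoming: int = MAX_INCOMING_CONNECTIONS,
                       open_files: int = OPEN_FILES_LIMIT,
                       reserved_descriptors: int = 0) -> int:
    """Upper bound on connections when each one needs a file descriptor.

    reserved_descriptors is a hypothetical allowance for descriptors that
    the process needs for data files, logs, and similar resources.
    """
    available = max(0, open_files - reserved_descriptors)
    return min(max_incoming, available)


def read_soft_limits() -> dict:
    """Read the soft limits of the current process (like ulimit -n and -u)."""
    open_files, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    processes, _ = resource.getrlimit(resource.RLIMIT_NPROC)
    return {"open_files": open_files, "processes": processes}


# With the documented values, the open files limit is the tighter bound.
ceiling = connection_ceiling()
```

Because 64000 is lower than 65536, the open files limit is reached before maxIncomingConnections in this sketch, and any descriptors used for other purposes lower the ceiling further.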
Memory Cache Limits
By default, MongoDB uses up to max[50% of (total_memory - 1GB), 256MB] of memory for its internal cache. However, its storage engine WiredTiger also uses the filesystem cache, which can use all the available free memory.
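The default cache formula above can be worked through with a short calculation. This is a minimal sketch with a hypothetical function name, and it assumes binary units (1GB = 1024^3 bytes). It covers only the internal cache, not the filesystem cache.

```python
# Hypothetical helper for the formula max[50% of (total_memory - 1GB), 256MB].
# Assumption: MB and GB are binary units.
MB = 1024 ** 2
GB = 1024 ** 3


def default_cache_size_bytes(total_memory_bytes: int) -> int:
    """Return the default internal cache limit for a given amount of RAM."""
    half_of_remaining = (total_memory_bytes - 1 * GB) // 2
    return max(half_of_remaining, 256 * MB)


# 4GB of RAM: 50% of (4GB - 1GB) = 1.5GB
cache_4gb = default_cache_size_bytes(4 * GB)
# 1GB of RAM: 50% of 0 is below the floor, so 256MB applies
cache_1gb = default_cache_size_bytes(1 * GB)
# 16GB of RAM: 50% of (16GB - 1GB) = 7.5GB
cache_16gb = default_cache_size_bytes(16 * GB)
```

The memory left after this limit is not necessarily idle, since the filesystem cache and open connections consume memory as well.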
Vendor Limitations
There is a known bug in MongoDB v5.0.x that prevents successful restores of backups. At the time of writing, the fix has not been backported to all v5.0.x versions.
Therefore, we implemented a workaround that makes backups and restores possible. It creates a role named dummyRole in the database dummy. This role has read-only access to the dummy database.
This role gets created after the first successful deployment of a single or cluster instance.