Version: Develop

Known Issues

Concerns with a9s Consul Cluster: Instability in high scale environments

In large environments, a9s Consul clusters containing thousands of nodes and Consul client agents may become severely unstable because of frequent node disconnections and high network traffic.

In such cases, nodes repeatedly disconnect and re-sync, which causes spikes in network load. High volumes of DNS queries can compound the problem, particularly when Service Instances perform frequent DNS lookups to resolve addresses.

Large environments showing these symptoms are likely running an unsustainable number of Consul client agents. The official Consul documentation recommends limiting the number of Consul client agents to 5000 to ensure resiliency. This limitation is outside the reach of the a9s Data Service Framework.

Recommendation

Because it is currently not possible to implement multiple Consul Data Centers, the Platform Operator is advised to split the environment once the cluster reaches ~5000 agents, by creating a new environment with its own Consul domain.
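
To notice when an environment is approaching that size, the node count can be checked periodically. The following is a minimal sketch, not part of the a9s Data Service Framework: it assumes a Consul agent whose HTTP API is reachable at `http://127.0.0.1:8500` without an ACL token, and it uses the number of nodes returned by Consul's `/v1/catalog/nodes` endpoint as an approximation of the agent count. The function names and the 80% warning threshold are hypothetical choices made for illustration.

```python
import json
import urllib.request

AGENT_LIMIT = 5000   # approximate agent count at which a split is recommended
WARN_RATIO = 0.8     # hypothetical early-warning threshold (80% of the limit)


def fetch_catalog_nodes(base_url="http://127.0.0.1:8500"):
    """Return the list of nodes registered in the Consul catalog.

    Assumes the HTTP API is reachable at base_url without an ACL token.
    """
    url = f"{base_url}/v1/catalog/nodes"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def classify_cluster_size(node_count, limit=AGENT_LIMIT, warn_ratio=WARN_RATIO):
    """Map a node count to 'ok', 'warn', or 'split'."""
    if node_count >= limit:
        return "split"   # limit reached: plan a new environment
    if node_count >= limit * warn_ratio:
        return "warn"    # approaching the limit
    return "ok"
```

A scheduled job could call `classify_cluster_size(len(fetch_catalog_nodes()))` and alert the Platform Operator once the result is no longer `"ok"`. If ACLs are enabled, the request would also need a token, which this sketch leaves out.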