a9s Consul
This section describes the operation of the a9s Consul component.
a9s Consul, as the name suggests, provides a Consul deployment that offers service discovery via DNS for all internal a9s components, including data service instances. In this way, a9s components can easily communicate with each other regardless of IP addresses, which also makes them independent of infrastructure failover and load balancing mechanisms. To make the DNS service discovery accessible to external sources, such as Cloud Foundry, a9s Consul uses Dnsmasq.
Overview
The default setup consists of 3 Consul server nodes, a colocated Consul agent on each framework component, and a colocated Consul agent on each data service instance node. Together, they form the Consul cluster, which can only be joined with the appropriate credentials.
All a9s components use the service discovery layer of their local Consul agent to find the services they need to communicate with.
For security reasons, none of the Consul agents offer any service discovery functionality on a public interface. External access to the service discovery layer, for example, to use the service bindings, is explained in more detail in Service Discovery Without a Local Consul Agent.
Service Discovery Without a Local Consul Agent
The DNS service discovery layer can also be used without a local Consul agent if required. For this purpose, the default setup also includes 3 Dnsmasq nodes that use their local Consul agents to provide the service discovery layer for external parties.
Dnsmasq is intentionally not colocated on the Consul server nodes. This allows DNS service discovery for external parties to be scaled by adding Dnsmasq nodes, without having to scale the Consul servers, which could otherwise have a negative impact on performance. This way, the Consul servers can focus on their actual job: managing the Consul cluster.
For more information on using the DNS service discovery interface, see the official Consul DNS Interface documentation.
NOTE: Whenever possible, use a local Consul agent that is connected to the Consul cluster.
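As a minimal sketch of the Dnsmasq-based lookup described above, the following Python function resolves a service hostname through the system resolver. It assumes the host has been configured to forward Consul lookups to one of the Dnsmasq nodes; the function name `resolve_service` and the injectable `resolver` parameter are illustrative and not part of a9s Consul.

```python
import socket


def resolve_service(hostname, port, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a DNS name resolves to.

    `resolver` defaults to the system resolver, so the host running this
    code is assumed to forward Consul lookups to a Dnsmasq node (or to a
    local Consul agent).
    """
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0]
    # is the IP address for both IPv4 and IPv6 results.
    return sorted({info[4][0] for info in infos})
```

A call such as `resolve_service("example-instance.service.dc1.consul", 5432)` would then return the addresses currently registered for that name. The hostname here is a placeholder that follows Consul's general `<service>.service.<datacenter>.<domain>` pattern; the actual names are taken from the service binding.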
Data Service Instances
The a9s Data Service instances use a9s Consul to provide the following infrastructure and data service independent functionality:
- Service Bindings: It is generally considered bad practice to hardcode an IP address, especially in a microservice architecture. Since data service instances can move throughout the system, it becomes challenging to keep the application configurations up to date. Therefore, a9s service bindings use Consul hostnames, which can be resolved using Consul's DNS service discovery layer. This means that application developers don't have to update their service bindings if, for example, the IP address of an instance node changes or another instance node has to take over due to a failover.
- Load Balancing: Since multiple nodes of the same instance are often running simultaneously, a strategy is needed for evenly balancing traffic across all healthy nodes of the instance while handling changes in health and in the cluster state. Inside the a9s Data Services, this is achieved by Consul's load balancing layer, so the instances are independent of infrastructure failover and load balancing mechanisms.
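The two points above can be sketched from the application's side as follows. This is an illustration under assumptions, not the a9s client implementation: the credential keys `host` and `port` are hypothetical, and the code simply resolves the binding's hostname at connect time and tries the returned nodes in the order the DNS layer supplied them.

```python
import socket


def candidate_addresses(credentials, resolver=socket.getaddrinfo):
    """Resolve the binding's hostname into a list of (ip, port) pairs."""
    # "host" and "port" are assumed key names, not the documented a9s schema.
    host, port = credentials["host"], int(credentials["port"])
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    # dict.fromkeys de-duplicates while keeping the resolver's order, so any
    # balancing applied on the DNS side is preserved.
    return list(dict.fromkeys((info[4][0], port) for info in infos))


def connect_to_instance(credentials, timeout=3.0, resolver=socket.getaddrinfo):
    """Try each resolved node in turn and return the first open socket."""
    last_error = None
    for address in candidate_addresses(credentials, resolver=resolver):
        try:
            return socket.create_connection(address, timeout=timeout)
        except OSError as err:
            last_error = err  # node unreachable, fall through to the next one
    raise ConnectionError(
        f"no node reachable for {credentials['host']}"
    ) from last_error
```

Because the hostname is resolved on every connection attempt instead of being cached as an IP address, a failover or an address change is picked up on the next connect without touching the binding.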
Additional Documentation
FAQ
Why Not Use a Proxy?
Although proxies like Nginx or HAProxy can solve the issue of IP propagation, they also introduce a layer of complexity. For a service such as PostgreSQL, drivers expect only one node to contact, which makes a single proxy a single point of failure and removes most of the advantages of high availability. Another idea is to deploy multiple proxies in a hot-standby setup, which adds even more complexity, as an IaaS-specific failover needs to be implemented.
Why Not Use Static IP Addresses?
Although it sounds tempting to use the IP addresses of service instances in service bindings, there is one major downside to that. Depending on the IaaS, it might not be possible to resurrect VMs with static IP addresses if the compute node hosting them is currently down. The a9s Data Services Framework (a9s DSF) aims to support all network types. This includes dynamic networks in which crashed VMs get new IP addresses when they are restarted. In such a case, it would be necessary to recreate all service bindings that contain IP addresses.
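The difference can be illustrated with a small simulation. It uses an in-memory dictionary as a stand-in for the DNS service discovery layer; the addresses, the hostname, and the `host` key are all hypothetical.

```python
def resolve(dns_records, binding_host):
    """Return the address an application would dial for a binding host.

    `dns_records` is an in-memory stand-in for the DNS service discovery
    layer; an IP literal is not in it and is therefore returned unchanged.
    """
    return dns_records.get(binding_host, binding_host)


# Two hypothetical bindings issued for the same instance node.
static_binding = {"host": "10.0.0.5"}
hostname_binding = {"host": "example-instance.service.dc1.consul"}

dns_records = {"example-instance.service.dc1.consul": "10.0.0.5"}

# The VM crashes and is recreated on a dynamic network with a new address.
# Only the DNS record changes; the bindings stay as they were issued.
dns_records["example-instance.service.dc1.consul"] = "10.0.0.9"

stale = resolve(dns_records, static_binding["host"])
current = resolve(dns_records, hostname_binding["host"])
print(stale, current)  # prints "10.0.0.5 10.0.0.9"
```

The binding that stores the IP address still points at the old, now unused address and would have to be recreated, while the hostname-based binding follows the instance to its new address.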