In today’s world, millions of users access the internet every day. Content on the web is hosted on web servers, which are responsible for handling the load from clients.
However, there is an upper limit to the number of users a single server can handle. If we configure multiple web servers to get around this, we face a few challenges:
- First, each web server has a different IP address, so a user who wants to connect would have to remember all of them. That is not practical, since some websites have thousands of servers.
- Second, not all web servers are equally busy at all times, so the user would have to check the traffic on the different web servers before connecting. Again, this is not a time-efficient way of managing things.
To tackle these issues, we use a load balancer.
What’s a load balancer?
In a typical load-balanced architecture, the client does not connect to the web server directly. The client first connects to the load balancer, whose task is to keep track of the backend servers and forward the client to a suitable web server.
This approach solves both challenges. First, since the load balancer is responsible for managing the traffic, the user does not need to care about which server is busy.
Second, the user only needs the IP address of the load balancer. The load balancer forwards the request to a web server behind the scenes, so the user never has to know the individual server addresses.
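To make the idea concrete, here is a minimal Python sketch of a load balancer that hands requests to its backends in turn (round robin). It is only an illustration of the concept, not how HAProxy is implemented; the class name `RoundRobinBalancer` and the backend addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: clients call route(); the backends stay hidden."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        # cycle() yields the backends in order, forever.
        self._next_backend = cycle(self._backends)

    def route(self, request):
        """Pick the next backend in turn and return (backend, request)."""
        backend = next(self._next_backend)
        return backend, request


# Hypothetical backend addresses, used only for illustration.
balancer = RoundRobinBalancer(["192.0.2.10:80", "192.0.2.11:80"])
for path in ["/", "/", "/", "/"]:
    backend, _ = balancer.route(path)
    print(f"GET {path} -> {backend}")
```

The client only ever talks to `balancer`; which backend answers is decided inside `route()`.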
Let’s automate this process
HAProxy is one of the programs that lets us create a reverse proxy or load balancer, so it is the tool I will use today.
First, let’s write down the steps for the automation:
- Download and install the HAProxy software
- Set up the HAProxy configuration file
- Adjust the firewall rules that might block the connection between the load balancer and the web servers
- Start the HAProxy service
I will set up the reverse proxy on the controller node.
Before starting, we need the binding ports for the load balancer (frontend) and the web servers (backend). These are taken as input from the user.
1. Install the HAProxy software
2. Create the configuration file
The config file needs to contain the binding ports and the IP addresses of the web servers. I will be using two web servers today.
The number of servers can change over time, so it is not feasible to edit the file by hand every time.
Instead, we add the IP addresses of the web servers to the inventory file and loop over them in the configuration file. The benefit is that we only have to change the inventory file when we want to update the set of web servers.
Because the template module is used to copy the file to the required location, the content of the file is parsed first. Here we use a for loop that iterates over <backend_servers>, which is defined in the inventory file.
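The effect of that loop can be sketched in plain Python. The playbook itself relies on the template module; the function below only mimics how one `server` line is produced per inventory entry. The function name, the section names `main` and `app`, the `balance roundrobin` line, and the addresses are all assumptions made for illustration.

```python
def render_haproxy_config(frontend_port, backend_port, backend_servers):
    """Build a minimal HAProxy config with one `server` line per backend."""
    lines = [
        "frontend main",
        f"    bind *:{frontend_port}",
        "    default_backend app",
        "",
        "backend app",
        "    balance roundrobin",
    ]
    # This loop plays the role of the template's for loop over the inventory.
    for index, ip in enumerate(backend_servers, start=1):
        lines.append(f"    server app{index} {ip}:{backend_port} check")
    return "\n".join(lines) + "\n"


# Hypothetical values; in the playbook they would come from the user
# input (ports) and from the inventory file (server addresses).
print(render_haproxy_config(8080, 80, ["192.0.2.10", "192.0.2.11"]))
```

Adding a third address to the list produces a third `server` line without touching the rest of the file, which is the same benefit the inventory-driven template gives.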
Once the playbook runs, the configuration file is updated.
On each of the two web servers I have created a file.
In webserver A
In webserver B
Now, when we connect to the load balancer, we are connected to one of these two servers.
Let’s check this
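We can see that the web server we are connected to changes from request to request, because the load balancer spreads incoming requests across the backends. If the configuration uses HAProxy’s default round-robin algorithm, the servers simply take turns instead of being picked by measured load.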
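The same check can be scripted. The sketch below requests a URL several times and counts how often each distinct response body comes back; it assumes each web server returns a different page, as in the setup above. The function names are hypothetical, and no real address is contacted unless you call `sample_backends` yourself.

```python
from collections import Counter
from urllib.request import urlopen


def fetch_body(url):
    """Return the response body of a GET request as stripped text."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace").strip()


def sample_backends(url, attempts, fetch=fetch_body):
    """Request `url` several times and count each distinct response body."""
    return Counter(fetch(url) for _ in range(attempts))


# Example call (replace the placeholders with your load balancer's address):
#   sample_backends("http://<load-balancer-ip>:<frontend-port>/", 10)
```

With two backends behind a round-robin load balancer, the counts for the two pages should come out roughly equal.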
Today we automated the creation of a reverse proxy. We dived into how things work behind the scenes and saw why this type of server is needed in today’s world. The concept would be underutilized without automation, since real-world setups usually involve a large number of servers.