HAProxy Load Balancer

Load balancing is a common way to scale web applications horizontally across multiple hosts while giving users a single point of access to the service. It aims to optimize resource usage, maximize throughput, minimize response time, and avoid overloading any single resource. HAProxy is one of the most popular open source load balancers, offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited to very high traffic websites and is often used to improve the reliability and performance of web services in multi-server configurations.

Prerequisites

  • 2 virtual/physical servers for the load-balancers
  • 2 virtual/physical servers to be used as web servers
  • 4 IP addresses, one for each server
  • 1 virtual IP address (VIP) shared by the two load-balancers
  • The two load-balancers and the VIP need to be in the same network segment

Server configuration

Host details:
  • Load Balancer 1: LB1, IP: 192.168.0.101
  • Load Balancer 2: LB2, IP: 192.168.0.102
  • Web Server 1: httpd1, IP: 192.168.0.103
  • Web Server 2: httpd2, IP: 192.168.0.104
  • VIP/Shared IP address: 192.168.0.100


The shared (virtual) IP address is no problem as long as you’re in your own LAN where you can assign IP addresses as you like. However, if you want to use this setup with public IP addresses, you need to find a host where you can rent two servers (the load balancer nodes) in the same subnet; you can then use a free IP address in this subnet for the virtual IP address.

HAProxy installation

HAProxy can be installed from the default base repository using yum, the default package manager. On LB1 and LB2, execute the following command:

yum install -y haproxy

HAProxy Configuration

Open the /etc/haproxy/haproxy.cfg file in any editor, replace the line
“frontend main *:5000” with “frontend main *:80”, and comment out the line “use_backend static if url_static”.
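
As a rough sketch, the same two edits can be scripted with sed. The helper name patch_frontend is made up for this example, and the patterns assume the stock haproxy.cfg in which both lines appear as quoted above; check the result before restarting anything.

```shell
# Hypothetical helper: apply the two edits described above to a config file.
patch_frontend() {
    cfg="$1"
    cp "$cfg" "$cfg.bak"    # keep a backup of the original file
    # 1st expression: change the frontend port from 5000 to 80
    # 2nd expression: comment out the "use_backend static" rule
    sed -e 's/^\(frontend[[:space:]]*main[[:space:]]*\)\*:5000/\1*:80/' \
        -e 's/^\([[:space:]]*\)\(use_backend[[:space:]]*static[[:space:]]*if[[:space:]]*url_static\)/\1#\2/' \
        "$cfg.bak" > "$cfg"
}
# Usage on LB1 and LB2: patch_frontend /etc/haproxy/haproxy.cfg
```

The original file is kept next to the edited one with a .bak suffix, so the change is easy to undo.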

HAProxy Algorithms

Round Robin: This algorithm is the most commonly implemented. It works by using each server behind the load balancer in turn, according to their weights. It’s also probably the smoothest and fairest algorithm, as the servers’ processing time stays equally distributed. As a dynamic algorithm, Round Robin allows server weights to be adjusted on the fly.

Static Round Robin: Similar to Round Robin, each server is used in turns per their weights. Unlike Round Robin though, changing server weight on the fly is not an option. There are, however, no design limitations as far as the number of servers is concerned. When a server goes up, it will always be immediately reintroduced into the farm once the full map is recomputed.

Least Connections: With this algorithm, the server with the lowest number of connections receives the connection. This type of load balancing is recommended when very long sessions are expected, such as LDAP, SQL, TSE, etc. It’s not, however, well-suited for protocols using short sessions such as HTTP. This algorithm is also dynamic like Round Robin.

Source: This algorithm hashes the source IP and divides the hash by the total weight of running servers to pick a server. The same client IP always reaches the same server as long as no server goes down or up. If the hash result changes because the number of running servers changes, clients are directed to a different server. This algorithm is generally used in TCP mode where cookies cannot be inserted. It’s also static by default.

URI: This algorithm hashes either the left part of the URI, or the whole URI and divides the hash value by the total weight of running servers. The same URI is always directed to the same server as long as no servers go up or down. It’s also a static algorithm and works the same way as the Source algorithm.

URL Parameter: This static algorithm can only be used on an HTTP backend. The URL parameter that’s specified is looked up in the query string of each HTTP GET request. If the parameter that’s found is followed by an equal sign and value, the value is hashed and divided by the total weight of running servers.
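
The hash-based algorithms (Source, URI, URL Parameter) share one pattern: hash a key, then map the hash onto the running servers. The sketch below only illustrates that idea. pick_server is a hypothetical helper, cksum stands in for HAProxy’s internal hash function, and every server is assumed to have weight 1, so the result is simply an index into the server list.

```shell
# Sketch only: cksum is a stand-in, not the hash HAProxy really uses.
pick_server() {
    key="$1"            # client IP, URI, or URL parameter value
    total_weight="$2"   # sum of the weights of the running servers
    hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
    echo $(( hash % total_weight ))
}

pick_server 203.0.113.7 2   # two servers running
pick_server 203.0.113.7 2   # same key, same server set: same index again
pick_server 203.0.113.7 3   # a third server came up: the index may change
```

This is why a client keeps its server only while the set of running servers stays the same.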

Open the /etc/haproxy/haproxy.cfg file again, comment out the lines starting with “server app”, and add the following lines:

server httpd1 192.168.0.103:80 check
server httpd2 192.168.0.104:80 check
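
After this change the backend section should look roughly like the fragment below. It assumes the stock configuration, where the backend is named app and uses balance roundrobin; any of the algorithms described above can be selected with the balance keyword instead.

```
backend app
    balance     roundrobin      # or: static-rr, leastconn, source, uri, url_param NAME
    # the stock "server app..." lines are commented out here
    server  httpd1 192.168.0.103:80 check
    server  httpd2 192.168.0.104:80 check
```

The check keyword enables health checks, so a web server that stops answering is taken out of rotation.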

Once the configuration is done, enable HAProxy at system boot and start it. If HAProxy is already running, use systemctl restart haproxy instead so that the new configuration is loaded.

systemctl enable haproxy
systemctl start haproxy
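
It can help to validate the file before (re)starting the service, since haproxy -c -f checks a configuration without starting the proxy. A minimal sketch follows; reload_if_valid is a hypothetical helper name, not part of HAProxy.

```shell
# Restart HAProxy only if the configuration file passes validation.
reload_if_valid() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    if haproxy -c -f "$cfg"; then
        systemctl restart haproxy
    else
        echo "not restarting: $cfg failed validation" >&2
        return 1
    fi
}
# Usage: reload_if_valid
```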

Now open /etc/firewalld/services/haproxy.xml file and paste the following lines:

<?xml version="1.0" encoding="utf-8"?>
<service>
<short>HAProxy</short>
<description>HAProxy load-balancer</description>
<port protocol="tcp" port="80"/>
</service>

We need to assign correct SELinux context and file permissions to the haproxy.xml file:

cd /etc/firewalld/services
restorecon haproxy.xml
chmod 640 haproxy.xml

Next update the firewall configuration:

firewall-cmd --permanent --add-service=haproxy
firewall-cmd --reload

Keepalived installation

HAProxy now listens on port 80 on every address of the host (*:80), which includes the virtual IP address 192.168.0.100 whenever the host holds it. Something still has to assign that IP address to LB1 or LB2 and move it when the active node fails. This is done by keepalived, which we install on both load balancers as follows:

yum install -y keepalived

Keepalived Configuration

If you want LB1 to be the active/master load balancer, use the following configuration on LB1.

Create the /etc/keepalived/keepalived.conf file and paste the following lines:

vrrp_script chk_haproxy {
    script "killall -0 haproxy"   # check the haproxy process
    interval 2                    # every 2 seconds
    weight 2                      # add 2 points if OK
}

vrrp_instance VI_1 {
    interface eth0                # interface to monitor
    state MASTER                  # MASTER on LB1, BACKUP on LB2
    virtual_router_id 51
    priority 101                  # 101 on LB1, 100 on LB2
    virtual_ipaddress {
        192.168.0.100             # virtual ip address
    }
    track_script {
        chk_haproxy
    }
}
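
The priority and weight values work together: while the chk_haproxy script succeeds, its weight is added to the node’s priority, and the node with the highest resulting value holds the VIP. The arithmetic can be sketched as follows, using the values from the configuration comments (101 on LB1, 100 on LB2, weight 2); effective_priority is a hypothetical helper written only for this illustration.

```shell
# Effective VRRP priority: base priority plus the script weight while the check passes.
effective_priority() {
    base="$1"
    weight="$2"
    haproxy_ok="$3"   # 1 if "killall -0 haproxy" succeeds, 0 otherwise
    if [ "$haproxy_ok" -eq 1 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

effective_priority 101 2 1   # LB1, HAProxy running -> 103
effective_priority 100 2 1   # LB2, HAProxy running -> 102
effective_priority 101 2 0   # LB1, HAProxy stopped -> 101, now below LB2's 102
```

So LB1 wins while both nodes are healthy, and LB2 takes over the VIP as soon as HAProxy stops on LB1, even if the LB1 machine itself stays up.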

It is important that LB1 uses priority 101 in the configuration file, which is higher than LB2’s priority and makes LB1 the master.

Use the following commands to enable the keepalived service at boot and start it:

systemctl enable keepalived
systemctl start keepalived

Next, confirm that the VIP is present on LB1: running ip addr sh eth0 there should list 192.168.0.100 among the addresses of eth0.

Now we do almost the same on LB2. There is one small but important difference: in /etc/keepalived/keepalived.conf we use state BACKUP and priority 100 instead of state MASTER and priority 101, which makes LB2 the passive/slave load balancer.
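
If you prefer not to retype the file, LB2’s configuration can be derived from LB1’s. The sketch below assumes the LB1 file shown earlier; make_backup_conf and the file name lb1.conf are made up for this example.

```shell
# Hypothetical helper: turn LB1's keepalived configuration (on stdin) into LB2's.
make_backup_conf() {
    sed -e 's/state MASTER/state BACKUP/' \
        -e 's/priority 101/priority 100/'
}
# Usage on LB2, with lb1.conf being a copy of LB1's /etc/keepalived/keepalived.conf:
# make_backup_conf < lb1.conf > /etc/keepalived/keepalived.conf
```

Everything else, including virtual_router_id and the virtual IP address, has to stay identical on both nodes.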

As LB2 is the passive load balancer, it should not hold the virtual IP address as long as LB1 is up. You can verify this with the following command, whose output on LB2 should not contain 192.168.0.100:

ip addr sh eth0
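
To make the check scriptable, the output can be filtered for the VIP. This is a small sketch; holds_vip is a hypothetical helper that reads ip addr output on stdin.

```shell
# Succeeds if the given address appears as an inet entry in "ip addr" output on stdin.
holds_vip() {
    grep -qF "inet $1/"
}
# Usage:
# ip addr sh eth0 | holds_vip 192.168.0.100 && echo "VIP present" || echo "VIP absent"
```

With both load balancers running, this should report the VIP as present on LB1 and absent on LB2.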

Preparing The Backend Web Servers

Install Apache on both web servers, httpd1 and httpd2, and create a separate index page on each to identify the server.
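
One possible way to do this is sketched below. It assumes the same yum/systemctl environment as the load balancers and Apache’s default document root /var/www/html; index_page and setup_web_server are hypothetical helper names.

```shell
# Hypothetical helper: print a one-line index page naming the server.
index_page() {
    printf 'Served by %s\n' "$1"
}

# Run on each web server with its own name (httpd1 or httpd2).
setup_web_server() {
    yum install -y httpd
    index_page "$1" > /var/www/html/index.html
    systemctl enable httpd
    systemctl start httpd
}
# Usage on httpd1: setup_web_server httpd1
```

If firewalld is active on the web servers, port 80 also has to be opened there, for example with firewall-cmd --permanent --add-service=http followed by firewall-cmd --reload.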

You can now make HTTP requests to the virtual IP address 192.168.0.100 (or to any domain/hostname that is pointing to the virtual IP address), and you should get content from the backend web servers.

You can test the high-availability/failover capabilities by switching off one backend web server; the load balancer should then send all requests to the remaining backend web server. Afterwards, switch off the active load balancer (LB1); LB2 should take over immediately. When LB1 comes back up, it will take over the master role again.
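
A quick way to watch the distribution during these tests is to request the VIP several times and count the distinct answers. The sketch assumes curl is installed and that each web server’s index page is a single, distinct line; tally and probe_vip are hypothetical helper names.

```shell
# Count identical lines on stdin.
tally() {
    sort | uniq -c
}

# Request the VIP n times and tally which backend answered.
probe_vip() {
    n="$1"
    i=0
    while [ "$i" -lt "$n" ]; do
        curl -s http://192.168.0.100/
        i=$(( i + 1 ))
    done | tally
}
# Usage: probe_vip 10
```

With round robin and both web servers up, the two counts should be roughly equal; with one web server switched off, only the other should appear.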

Monitoring HAProxy

To access the (password-protected) HAProxy statistics page, add a listen section with the options stats enable and stats auth admin:password to the HAProxy configuration, replacing admin:password with credentials of your own:

listen stats :1936
    stats enable
    stats hide-version
    stats refresh 30s
    stats show-node
    stats auth admin:password
    stats uri /haproxy?stats
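
The same statistics are available in CSV form by appending ;csv to the stats URI, which is handy for scripts. The sketch below assumes that the stats section above is active and that port 1936 is reachable (the firewalld service defined earlier only opens port 80); show_status is a hypothetical helper that prints the status column for each proxy/server pair.

```shell
# Print "proxy/server: status" for every row of HAProxy's CSV statistics on stdin.
show_status() {
    awk -F, '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i; next }
        col     { print $1 "/" $2 ": " $col }
    '
}
# Usage:
# curl -s -u admin:password 'http://192.168.0.100:1936/haproxy?stats;csv' | show_status
```

The column is looked up by name in the header line rather than by position, so the helper does not depend on the exact CSV layout of a given HAProxy version.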

Author Rajesh
Written on June 8th, 2017