Woblag: A bunch of Linux/Moodle/OpenSource stuff.: IPVS


Friday, October 28, 2011

Deploying a loadbalanced Apache front end

This particular configuration is being used to support a large Moodle installation.
The front end comprises:
1 load balancer
3 webservers
1 virtual IP on the load balancer

Prerequisites:
Fully built and configured CentOS Apache-based webservers
One clean CentOS server to act as loadbalancer

IP addresses
Webserver 1 - 192.168.100.1
Webserver 2 - 192.168.100.2
Webserver 3 - 192.168.100.3
Loadbalancer - 192.168.100.5
Virtual IP - 192.168.100.10

On Each Webserver

1.         Open Terminal Interface
2.         Create a test file on each webserver for the load balancer's health check:
echo foo > /var/www/html/test.html

3.        Create a unique index page on each webserver so you can confirm that requests are being spread across all of them. These files will be deleted later on.
On each webserver (replace X with the webserver's number):
echo "This is Webserver X" > /var/www/html/index.html

4.         Create a loopback alias that carries the virtual IP on each webserver:
nano /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=192.168.100.10
NETMASK=255.255.255.255
ONBOOT=yes
NAME=loopback

5.        Configure the kernel so the webserver does not answer ARP requests for the virtual IP held on its loopback alias:
nano /etc/sysctl.conf
Add or modify the following entries:
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2

6.        Reload modified kernel parameters and bring up loopback:
sysctl -p
ifup lo:0
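
To confirm that the entries from step 5 are live after sysctl -p, the output of sysctl -a can be run through a small check. This is only a minimal sketch: arp_settings_ok is a hypothetical helper name, and it assumes the interface is eth0 as in the steps above.

```shell
# Hypothetical helper: read "key = value" lines (the format printed by
# sysctl -a) on stdin and succeed only if all four ARP settings from
# step 5 are present with the expected values.
arp_settings_ok() {
    awk -F' *= *' '
        $1 == "net.ipv4.conf.all.arp_ignore"    && $2 == 1 { a = 1 }
        $1 == "net.ipv4.conf.eth0.arp_ignore"   && $2 == 1 { b = 1 }
        $1 == "net.ipv4.conf.all.arp_announce"  && $2 == 2 { c = 1 }
        $1 == "net.ipv4.conf.eth0.arp_announce" && $2 == 2 { d = 1 }
        END { exit !(a && b && c && d) }
    '
}

# Example (run on each webserver):
# sysctl -a 2>/dev/null | arp_settings_ok && echo "ARP settings applied"
```

Running ip addr show lo afterwards should also list the virtual IP on the loopback interface.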

On the Loadbalancer

1.        Open command line
2.        Install necessary packages
yum install -y heartbeat heartbeat-ldirectord ipvsadm

3.         Add ldirectord to the autostart list and remove heartbeat from it:
chkconfig --add ldirectord
chkconfig --del heartbeat
4.     Configure the kernel to allow IP forwarding:
nano /etc/sysctl.conf
Find the following parameter:
net.ipv4.ip_forward = 0
Change it to:
net.ipv4.ip_forward = 1
5.     Reload modified kernel parameter:
sysctl -p

6.        Configure an alias on the Ethernet interface for the virtual IP (replace the HWADDR value with the MAC address of your own eth0):
nano /etc/sysconfig/network-scripts/ifcfg-eth0:0
DEVICE=eth0:0
BOOTPROTO=none
ONBOOT=yes
HWADDR=3a:5d:23:ad:67:47
NETMASK=255.255.255.0
IPADDR=192.168.100.10
GATEWAY=192.168.100.1
TYPE=Ethernet

7.        Create the ldirectord configuration file:
nano /etc/ha.d/ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
quiescent=no
virtual=192.168.100.10:80
               real=192.168.100.1:80 gate
               real=192.168.100.2:80 gate
               real=192.168.100.3:80 gate
               service=http
               request="test.html"
               receive="foo"
               scheduler=wlc
               protocol=tcp
               checktype=negotiate
8.         Restart the ldirectord service:
service ldirectord restart
9.     Test the configuration:
ipvsadm -l
Expected output:
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  yoursite.com:http wlc
  -> web03:http                   Route   1      412        0
  -> web02:http                   Route   1      411        0
  -> web01:http                   Route   1      411        0
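
For a quick scripted check, the real servers in that table can be counted. A minimal sketch: count_real_servers is a hypothetical helper, and it assumes nothing beyond the line layout of the ipvsadm listing (real servers start with "->", as does the column header).

```shell
# Hypothetical helper: count real-server lines in "ipvsadm -l" output
# read from stdin. Lines start with "->"; the column header is skipped.
count_real_servers() {
    awk '$1 == "->" && $2 != "RemoteAddress:Port" { n++ } END { print n + 0 }'
}

# Example (as root on the load balancer; -n prints numeric addresses):
# ipvsadm -ln | count_real_servers
```

With the three webservers from this setup healthy, the count should be 3. A lower number suggests that ldirectord has removed a server because its health check failed.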

That should be it! Your DNS may need to be configured to point your external IP to the Virtual IP.
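
If a webserver is missing from the ipvsadm table, the check that ldirectord performs (request test.html, expect foo) can be approximated by hand. This is a rough sketch, assuming curl is installed on the load balancer; body_matches is a hypothetical helper and not part of ldirectord.

```shell
# Hypothetical helper: succeed if the response body read from stdin
# contains the expected string, treated as a fixed string.
body_matches() {
    grep -qF -- "$1"
}

# Example against each webserver's real IP:
# for ip in 192.168.100.1 192.168.100.2 192.168.100.3; do
#     curl -s "http://$ip/test.html" | body_matches foo \
#         && echo "$ip OK" || echo "$ip FAILED"
# done
```

A FAILED line points at the webserver itself (Apache down, test.html missing) rather than at the load balancer configuration.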

-n

Wednesday, March 9, 2011

Choosing the right scheduling algorithm for a Linux Based Load Balancer - Part II

This is a continuation of an earlier post:
Choosing the right scheduling algorithm for a Linux Based Load Balancer

This is the current scenario:

Two identical webservers, mirrored hardware and software specifications.
The only real variable is the physical location of the second webserver: it is on the same physical network but in a different building.
Network traffic may or may not be an issue.

Based on some initial tests (scale testing is almost irrelevant for these algorithms), I have decided to implement Weighted Least-Connections. It seems the most appropriate at this point.

Weighted Least-Connections (default)

Distributes more requests to servers with fewer active connections relative to their capacities.
Capacity is indicated by a user-assigned weight, which is then adjusted upward or
downward by dynamic load information. The addition of weighting makes this algorithm
ideal when the real server pool contains hardware of varying capacity.
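
The selection rule can be illustrated with a simplified sketch that picks the server with the lowest ratio of active connections to weight. pick_wlc is a hypothetical helper, and the sketch ignores inactive connections, which the real IPVS scheduler also takes into account, so treat it as an illustration of the idea rather than the kernel's exact behaviour.

```shell
# Hypothetical helper: read "name weight active_connections" lines on
# stdin and print the name with the lowest active/weight ratio.
# Servers with weight 0 are skipped; ties go to the first server listed.
pick_wlc() {
    awk '
        $2 > 0 {
            ratio = $3 / $2
            if (best == "" || ratio < min) { best = $1; min = ratio }
        }
        END { print best }
    '
}

# Example with two servers, the second weighted three times higher:
# printf 'web01 1 100\nweb02 3 150\n' | pick_wlc
```

With two identical webservers and equal weights, as in my scenario, this reduces to plain least-connection.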

-n

Wednesday, February 23, 2011

Choosing the right scheduling algorithm for a Linux Based Load Balancer

I'm currently doing some research into choosing the right scheduling algorithm for a Linux-based load balancer for a dual front end Moodle installation.
The actual load balancer implementation will be detailed later.

This post outlines the scheduling algorithms available in IPVS and which would be the best option for my particular scenario. The assumption in my case is that both webservers are identical, from hardware specifications to OS to PHP code.

From the CentOS Documentation on IPVS Scheduling Algorithms:

Round-Robin Scheduling
Distributes each request sequentially around the pool of real servers. Using this algorithm,
all the real servers are treated as equals without regard to capacity or load. This scheduling
model resembles round-robin DNS but is more granular due to the fact that it is
network-connection based and not host-based. LVS round-robin scheduling also does not suffer the
imbalances caused by cached DNS queries.

Weighted Round-Robin Scheduling
Distributes each request sequentially around the pool of real servers but gives more jobs to
servers with greater capacity. Capacity is indicated by a user-assigned weight factor, which
is then adjusted upward or downward by dynamic load information.
Weighted round-robin scheduling is a preferred choice if there are significant differences in
the capacity of real servers in the pool. However, if the request load varies dramatically, the
more heavily weighted server may answer more than its share of requests.

Least-Connection
Distributes more requests to real servers with fewer active connections. Because it keeps
track of live connections to the real servers through the IPVS table, least-connection is a
type of dynamic scheduling algorithm, making it a better choice if there is a high degree of
variation in the request load. It is best suited for a real server pool where each member
node has roughly the same capacity. If a group of servers have different capabilities,
weighted least-connection scheduling is a better choice.

Weighted Least-Connections (default)
Distributes more requests to servers with fewer active connections relative to their capacities.
Capacity is indicated by a user-assigned weight, which is then adjusted upward or
downward by dynamic load information. The addition of weighting makes this algorithm
ideal when the real server pool contains hardware of varying capacity.

Locality-Based Least-Connection Scheduling
Distributes more requests to servers with fewer active connections relative to their destination
IPs. This algorithm is designed for use in a proxy-cache server cluster. It routes the
packets for an IP address to the server for that address unless that server is above its capacity
and has a server in its half load, in which case it assigns the IP address to the least
loaded real server.

Locality-Based Least-Connection Scheduling with Replication Scheduling
Distributes more requests to servers with fewer active connections relative to their destination
IPs. This algorithm is also designed for use in a proxy-cache server cluster. It differs
from Locality-Based Least-Connection Scheduling by mapping the target IP address to a
subset of real server nodes. Requests are then routed to the server in this subset with the
lowest number of connections. If all the nodes for the destination IP are above capacity, it
replicates a new server for that destination IP address by adding the real server with the
least connections from the overall pool of real servers to the subset of real servers for that
destination IP. The most loaded node is then dropped from the real server subset to prevent
over-replication.

Destination Hash Scheduling
Distributes requests to the pool of real servers by looking up the destination IP in a static
hash table. This algorithm is designed for use in a proxy-cache server cluster.

Source Hash Scheduling
Distributes requests to the pool of real servers by looking up the source IP in a static hash
table. This algorithm is designed for LVS routers with multiple firewalls.
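
To make the difference between the connection-counting and the hash-based algorithms concrete, source hashing can be sketched as a fixed mapping from client IP to server. This is only an illustration: pick_by_source_hash is a hypothetical helper, and cksum stands in for whatever hash function IPVS uses internally.

```shell
# Hypothetical helper: map a source IP ($1) to one of the servers given
# as the remaining arguments. The same IP always maps to the same server
# as long as the server list does not change.
pick_by_source_hash() {
    ip="$1"; shift
    [ "$#" -gt 0 ] || return 1
    hash=$(printf '%s' "$ip" | cksum | cut -d' ' -f1)
    index=$(( hash % $# ))
    i=0
    for server in "$@"; do
        if [ "$i" -eq "$index" ]; then
            printf '%s\n' "$server"
            return 0
        fi
        i=$(( i + 1 ))
    done
}

# Example:
# pick_by_source_hash 10.0.0.7 web01 web02
```

Unlike the least-connection variants, the result does not depend on current load, which is why a busy client can keep landing on an already busy server.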

I'm currently doing some testing with one or two of the more viable options and will follow up with my choice (and why I chose it).

Part II here.
-n
