Frances Guerrero

Editor at Eduport

  • February 16, 2024
  • 5 min read

Kubernetes Networking: Internal Load Balancing vs External Load Balancing

Kubernetes Networking: Load Balancing

Load balancing is a technique used to distribute incoming network traffic across multiple servers to improve responsiveness, reliability, and scalability of applications. It involves directing requests from clients to one or more available servers based on various factors such as server load, response time, and availability.

On the Internet, load balancing is often employed to divide network traffic among several servers. This reduces the strain on each server, so each one can respond faster and overall latency drops.

Load balancing is essential for most Internet applications to function properly. Imagine a grocery store with 8 checkout lines, only one of which is open. Every customer must join the same line, so it takes a long time for each customer to finish paying for their groceries. Now imagine that the store instead opens all 8 checkout lines. In this case, the wait time for customers is about 8 times shorter (depending on factors like how much food each customer is buying).

Load balancing essentially accomplishes the same thing. By dividing user requests among multiple servers, it cuts user wait time substantially.
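To put rough numbers on the checkout analogy, here is a minimal Python sketch. It assumes 80 customers who all arrive at once, a fixed 2 minutes per customer, and customers dealt out to the open lanes in turn; none of these figures come from a real store, and the function names are made up for illustration.

```python
def completion_times(num_customers, num_lanes, service_minutes=2.0):
    """Return the minute at which each customer finishes paying.

    Assumes everyone arrives at minute 0 and customers are dealt out
    to the open lanes in turn.
    """
    lane_free_at = [0.0] * num_lanes  # minute at which each lane clears its queue so far
    finished = []
    for customer in range(num_customers):
        lane = customer % num_lanes
        lane_free_at[lane] += service_minutes
        finished.append(lane_free_at[lane])
    return finished


def summarize(num_customers, num_lanes):
    """Report when the last customer is done and the average finish time."""
    times = completion_times(num_customers, num_lanes)
    return {"last_done": max(times), "average_done": sum(times) / len(times)}


if __name__ == "__main__":
    one_lane = summarize(80, 1)
    eight_lanes = summarize(80, 8)
    print("1 lane: ", one_lane)
    print("8 lanes:", eight_lanes)
    print("speedup for the last customer:",
          one_lane["last_done"] / eight_lanes["last_done"])
```

Under these assumptions the last customer is done after 20 minutes instead of 160, exactly 8 times sooner, while the average customer finishes after 11 minutes instead of 81, a little over 7 times sooner. That gap is why "about 8 times" is the honest way to put it.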

How Does Load Balancing Work?

Load balancing works by directing incoming network traffic to the most appropriate server based on certain criteria.

The process typically involves 3 components:

  • Load Balancer: A load balancer sits between the user and the application servers. Its primary responsibility is to monitor the traffic coming from users and forward it to the best-suited server. There are two main types of load balancers: hardware-based and software-based. Hardware-based load balancers are dedicated appliances that perform all the necessary functions, whereas software-based load balancers run on top of existing infrastructure.
  • Application Servers: These are the actual servers that host the applications. They receive traffic from the load balancer and serve the requested content to users.
  • Users: End users access the applications hosted on the application servers via the internet or an intranet.

When a user sends a request to access an application, the request reaches the load balancer first.

The load balancer then evaluates the request and determines which application server is best equipped to handle it. Based on factors like server capacity, usage rate, and response time, the load balancer selects the optimal server and forwards the request to it.
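One way such a selection could look in code is sketched below. It is only an illustration: the `Server` fields, the `pick_server` function, and the rule itself (lowest utilization first, ties broken by response time) are assumptions, not how any particular load balancer scores its servers.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # All fields are hypothetical stand-ins for metrics a balancer might track.
    name: str
    capacity: int            # maximum concurrent requests
    active_requests: int     # requests currently in flight
    avg_response_ms: float   # recent average response time


def pick_server(servers):
    """Pick the least-utilized server, breaking ties by response time."""
    candidates = [s for s in servers if s.active_requests < s.capacity]
    if not candidates:
        raise RuntimeError("no server has spare capacity")
    return min(
        candidates,
        key=lambda s: (s.active_requests / s.capacity, s.avg_response_ms),
    )


if __name__ == "__main__":
    pool = [
        Server("app-1", capacity=100, active_requests=80, avg_response_ms=40.0),
        Server("app-2", capacity=100, active_requests=20, avg_response_ms=95.0),
        Server("app-3", capacity=50, active_requests=10, avg_response_ms=30.0),
    ]
    chosen = pick_server(pool)
    chosen.active_requests += 1  # the forwarded request is now in flight
    print(chosen.name)
```

Here app-2 and app-3 are both 20% utilized, so the faster response time decides in favor of app-3. A real balancer would usually refresh these numbers continuously from health checks and live connection counts.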

Once the server processes the request, it sends its response to the load balancer, which then passes it on to the user.
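The full round trip can be sketched in a few lines of Python. This is a toy, in-process model: the `AppServer` and `LoadBalancer` classes are hypothetical, and it uses a simple round-robin rotation that skips unavailable servers, which is just one possible policy.

```python
from itertools import cycle


class AppServer:
    """Hypothetical application server that answers requests directly."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        return f"{self.name} served {request}"


class LoadBalancer:
    """Sits between the user and the servers; rotates through them in order."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(self.servers)

    def handle(self, request):
        # Try each server at most once per request, skipping unhealthy ones.
        for _ in range(len(self.servers)):
            server = next(self._rotation)
            if server.healthy:
                response = server.handle(request)  # forward the request to the server
                return response                    # relay the response to the user
        raise RuntimeError("no healthy application server available")


if __name__ == "__main__":
    balancer = LoadBalancer([
        AppServer("app-1"),
        AppServer("app-2", healthy=False),
        AppServer("app-3"),
    ])
    for path in ["/home", "/cart", "/checkout"]:
        print(balancer.handle(path))
```

The user only ever talks to `balancer.handle`; which server did the work is visible only in the response text. In this run app-2 is marked unavailable, so the three requests land on app-1, app-3, and app-1 again.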