Thursday, 25 June 2026

AWS EC2: Application Load Balancer

An Application Load Balancer (ALB) is a fully managed AWS service that automatically distributes incoming HTTP and HTTPS traffic across multiple backend targets.

It operates at the Application Layer (Layer 7) of the Open Systems Interconnection (OSI) model.

Key Features:

Content-Based Routing: Routes traffic based on URL paths (/api vs /images) or hostnames (for example, different subdomains of example.com).
Container Support: Integrates directly with Amazon ECS and EKS using dynamic port mapping.
Advanced Protocols: Native support for modern protocols like HTTP/2, gRPC, and WebSockets.
Security Integration: Features built-in HTTPS/TLS termination and integrates directly with AWS WAF for web security.

How Components Work Together

Listener: Evaluates connection requests from clients using protocols and ports you configure.
Rules: Determines how the load balancer routes requests to its registered targets.
Target Group: Groups backend resources (like EC2 instances, containers, or IP addresses) that receive the traffic.
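The way a listener, its rules and target groups fit together can be sketched in Terraform. This is a minimal sketch, not a complete setup: the variables `alb_arn` and `vpc_id`, the resource names, the ports and the `/api/*` pattern are all assumptions made for illustration.

```hcl
# Hypothetical inputs; these names do not come from the notes above.
variable "alb_arn" { type = string }
variable "vpc_id" { type = string }

# Target groups: one for the web tier, one for the API tier.
resource "aws_lb_target_group" "web" {
  name     = "web"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id
}

resource "aws_lb_target_group" "api" {
  name     = "api"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id
}

# Listener: accepts client connections on HTTP port 80.
resource "aws_lb_listener" "http" {
  load_balancer_arn = var.alb_arn
  port              = 80
  protocol          = "HTTP"

  # Default action: anything no rule matches goes to the web target group.
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

# Rule: requests whose path matches /api/* go to the API target group.
resource "aws_lb_listener_rule" "api" {
  listener_arn = aws_lb_listener.http.arn
  priority     = 10

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }

  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }
}
```

A host-based rule looks the same, with a `host_header` block in place of `path_pattern` inside `condition`.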

How do ALB health checks keep applications online?

An Application Load Balancer (ALB) keeps your application online by continuously monitoring the health of your backend targets and dynamically redirecting traffic away from failing nodes.

1. Automatic Traffic Redirection

The ALB sends periodic health check requests to every registered target. If a target fails the configured number of consecutive checks (the unhealthy threshold), the ALB marks it as unhealthy and stops routing new requests to it. Traffic is rerouted to the remaining healthy nodes, so users see little or no disruption.

2. Auto Scaling Integration

When paired with an Auto Scaling Group (ASG) that is configured to use ELB health checks, ALB health checks can trigger the automatic replacement of broken instances.

The Problem: An EC2 instance might be running (healthy at the hardware level), but the web server inside it has crashed (unhealthy at the application level).
The Solution: The ASG sees that the instance is failing the ALB's application health checks, terminates that specific broken instance and launches a fresh, working one.
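In Terraform, this behaviour depends on the ASG's health check type. The sketch below assumes a launch template, private subnets and a target group already exist; the three variables, the group sizes and the 300-second grace period are illustrative assumptions.

```hcl
# Hypothetical inputs; these names are assumptions for illustration.
variable "launch_template_id" { type = string }
variable "private_subnet_ids" { type = list(string) }
variable "target_group_arn" { type = string }

resource "aws_autoscaling_group" "app" {
  name                = "app"
  min_size            = 2
  max_size            = 4
  vpc_zone_identifier = var.private_subnet_ids

  # Instances launched by the ASG are registered with this target group.
  target_group_arns = [var.target_group_arn]

  # "ELB" makes the ASG act on load balancer health checks as well as
  # the EC2 status checks it always uses.
  health_check_type = "ELB"

  # Seconds to wait after launch before health check failures count,
  # so a slow-booting instance is not replaced prematurely.
  health_check_grace_period = 300

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}
```

With the default `health_check_type` of `"EC2"`, an instance whose web server has crashed would keep running, because the instance itself still passes its status checks.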

3. Graceful Recovery

When an unhealthy instance recovers, or when a new instance is launched, the ALB does not send traffic to it immediately. The target starts in the initial state and must pass the configured number of consecutive health checks (the healthy threshold). Only then does the ALB introduce it back into the traffic rotation.

How to Configure an ALB Health Check

You configure health checks inside the Target Group settings using these parameters:

Parameter            What it does                                                   Recommended Setting
=========            ============                                                   ===================
Health Check Path    The URL endpoint the ALB hits (e.g., /health or /index.html).  /health
Healthy Threshold    Consecutive successes needed to mark a target as healthy.      3
Unhealthy Threshold  Consecutive failures needed to mark a target as unhealthy.     2
Timeout              How long the ALB waits for a response before failing.          5 seconds
Interval             The time between individual health check pings.                30 seconds
Success Codes        The HTTP status codes that prove the app is working.           200 (or 200-399)
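The same parameters map onto the `health_check` block of a Terraform target group. This is a sketch using the recommended values from the table; the variable `vpc_id`, the group name and port 8080 are assumptions.

```hcl
# Hypothetical input; the name is an assumption for illustration.
variable "vpc_id" { type = string }

resource "aws_lb_target_group" "app" {
  name     = "app"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    path                = "/health" # Health Check Path
    healthy_threshold   = 3         # consecutive successes to become healthy
    unhealthy_threshold = 2         # consecutive failures to become unhealthy
    timeout             = 5         # seconds to wait for a response
    interval            = 30        # seconds between checks
    matcher             = "200"     # Success Codes; a range such as "200-399" also works
  }
}
```

With a 30-second interval and an unhealthy threshold of 2, a crashed target is detected after roughly a minute; lowering the interval shortens that window at the cost of more health check traffic.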

When you configure an ALB, you do not select an Availability Zone (AZ) directly; instead, you must select at least two subnets in different Availability Zones to ensure high availability.

How it works: AWS places a load balancer node in each of the specified subnets.
The AZ link: Because each subnet belongs to exactly one AZ, this fundamentally binds the ALB's nodes to those corresponding Availability Zones.
Changing subnets: You can change the enabled subnets after creation by editing the load balancer's subnets (network mapping) in the EC2 Console.

Public ALB

Binding an ALB to public subnets makes it a public (internet-facing) load balancer.

When you create an internet-facing ALB, AWS requires you to select public subnets so the ALB nodes can receive a public IP address and route traffic from the internet.

Key Characteristics:

Public DNS: The ALB receives a public DNS name that resolves to public IP addresses.
Internet Gateway: The selected public subnets must have a route to an Internet Gateway (IGW) in their route tables.
Target Routing: Even though the ALB is public, it can still route traffic to EC2 instances living in private subnets.

An internet-facing ALB routes traffic directly to the individual backend targets (such as EC2 instances or IP addresses), not to the private subnets themselves.

How Routing Works

Target Group Config: You configure the ALB to route traffic to a Target Group.
Direct Node Communication: The ALB nodes in the public subnets communicate directly with the private IP addresses of your backend nodes.
Cross-Subnet Traffic: AWS handles this routing internally via the VPC router, allowing the public ALB to securely traverse into private subnets.

Configuration Checklist

VPC: Both the public subnets (where the ALB lives) and the private subnets (where the nodes live) must be in the same VPC.
Security Groups: The private instances must have a security group that allows inbound traffic from the ALB's security group.
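The security group part of this checklist can be sketched as two groups, where the instances' group names the ALB's group as its traffic source. The variable `vpc_id`, the group names and the application port 8080 are assumptions for illustration.

```hcl
# Hypothetical input; the name is an assumption for illustration.
variable "vpc_id" { type = string }

# Security group attached to the internet-facing ALB.
resource "aws_security_group" "alb" {
  name   = "alb"
  vpc_id = var.vpc_id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Security group attached to the instances in the private subnets.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = var.vpc_id

  ingress {
    description     = "Application traffic from the ALB only"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Referencing the ALB's security group rather than a CIDR range keeps the rule valid as the ALB's node IP addresses change.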

Private ALB

An internal (private) ALB routes traffic in the exact same way as a public ALB, but it is only accessible within your VPC or connected networks.

It routes traffic directly to individual backend targets, not to subnets.

Key Characteristics

Private Subnets: You deploy the ALB nodes into private subnets.
DNS: The ALB receives a publicly resolvable DNS name, but it resolves exclusively to private IP addresses.
No Internet Access: It cannot receive any traffic from the public internet because it lacks a public IP.

Common Use Cases

Internal Microservices: Routing traffic from a public-facing web tier to a private backend API tier.
Hybrid Networks: Routing traffic coming from an on-premises data centre via AWS Direct Connect or a VPN.
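In Terraform the only difference from an internet-facing ALB is the `internal` flag and the choice of subnets. A minimal sketch, where the two variables and the name `backend-api` are assumptions:

```hcl
# Hypothetical inputs; these names are assumptions for illustration.
variable "private_subnet_ids" { type = list(string) }
variable "alb_security_group_id" { type = string }

resource "aws_lb" "backend" {
  # Load balancer names must not begin with "internal-".
  name               = "backend-api"
  load_balancer_type = "application"

  # true selects the internal scheme: the nodes get private IPs only.
  internal = true

  subnets         = var.private_subnet_ids
  security_groups = [var.alb_security_group_id]
}
```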

Setting Up ALB in AWS Console

The basic building blocks of an AWS load balancer are listeners and target groups.

To create an Application Load Balancer, go to EC2 >> Load balancers >> Create load balancer, then select the load balancer type (click Create under Application Load Balancer).

Here we can set:

Basic configuration

Name
Scheme (cannot be changed after the load balancer is created)

Internet-facing. An internet-facing load balancer routes requests from clients over the internet to targets. Requires a public subnet.
Internal. An internal load balancer routes requests from clients to targets using private IP addresses.

IP address type. Select the type of IP addresses that your subnets use.

IPv4. Recommended for internal load balancers.
Dualstack. Includes IPv4 and IPv6 addresses.

Network mapping. The load balancer routes traffic to targets in the selected subnets, and in accordance with your IP address settings.

VPC. Virtual private cloud for your targets. If the load balancer is internet-facing, only VPCs with an internet gateway are enabled for selection. The selected VPC cannot be changed after the load balancer is created. As a VPC is region-specific, so is an Application Load Balancer.
Mappings. Once a VPC is selected, its Availability Zones are listed here and are selectable. Select at least two Availability Zones and one subnet per zone. The load balancer routes traffic to targets in these Availability Zones only. Availability Zones that are not supported by the load balancer or the VPC are not available for selection. We should select all AZs that we listed in the Auto Scaling group (if we use one).

Security groups. A security group is a set of firewall rules that control the traffic to your load balancer. We can select up to 10 security groups.

If our application is listening for HTTP requests on port 80 we should select a security group with:

Inbound rule: accept HTTP/TCP traffic on port 80 with source Anywhere-IPv4
Outbound rule: allow all traffic for all protocols and port ranges to custom destination 0.0.0.0/0

Listeners and routing. A listener is a process that checks for connection requests using the port and protocol you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets.

Add listener

Protocol e.g. HTTP
Port e.g. 80. This is the client-facing port and it does not need to be the same as the port of the attached target group. E.g. the LB can listen on port 80 and forward traffic to target group port 8080.
Default action: Forward to (select a target group)
Add listener tags

Add-on services - optional

AWS Global Accelerator

Tags - optional
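The console settings above have direct Terraform counterparts. The sketch below mirrors them for an internet-facing ALB with one HTTP listener; the three variables, the name `web-alb` and the tag are assumptions, and the target group is presumed to exist already.

```hcl
# Hypothetical inputs; these names are assumptions for illustration.
variable "public_subnet_ids" { type = list(string) } # at least two, in different AZs
variable "alb_security_group_id" { type = string }
variable "target_group_arn" { type = string }

resource "aws_lb" "web" {
  # Basic configuration
  name               = "web-alb"
  load_balancer_type = "application"
  internal           = false # scheme: internet-facing
  ip_address_type    = "ipv4"

  # Network mapping: one subnet per enabled Availability Zone
  subnets = var.public_subnet_ids

  # Security groups
  security_groups = [var.alb_security_group_id]

  # Tags - optional
  tags = {
    Name = "web-alb"
  }
}

# Listeners and routing: listen on HTTP 80 and forward to the target group.
# The target group may listen on a different port, e.g. 8080.
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = var.target_group_arn
  }
}
```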

More info on Scheme, from AWS documentation:

When you create a load balancer, you must choose whether to make it an internal load balancer or an internet-facing load balancer.

The nodes of an internet-facing load balancer have public IP addresses.

The nodes of an internal load balancer have only private IP addresses.

Both internet-facing and internal load balancers route requests to your targets using private IP addresses. Therefore, your targets do not need public IP addresses to receive requests from an internal or an internet-facing load balancer.

More info on how ALB routes traffic to multiple Availability Zones (and about what Load Balancer Nodes are):

When you enable an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone.
The nodes for your load balancer distribute requests from clients to registered targets. When cross-zone load balancing is enabled, each load balancer node distributes traffic across the registered targets in all enabled Availability Zones. When cross-zone load balancing is disabled, each load balancer node distributes traffic only across the registered targets in its Availability Zone.
Before a client sends a request to your load balancer, it resolves the load balancer's domain name using a Domain Name System (DNS) server. The DNS entry is controlled by Amazon, because your load balancers are in the amazonaws.com domain. The Amazon DNS servers return one or more IP addresses to the client. These are the IP addresses of the load balancer nodes for your load balancer.
As traffic to your application changes over time, Elastic Load Balancing scales your load balancer and updates the DNS entry. The DNS entry also specifies the time-to-live (TTL) of 60 seconds. This helps ensure that the IP addresses can be remapped quickly in response to changing traffic.

The client determines which IP address to use to send requests to the load balancer. The load balancer node that receives the request selects a healthy registered target and sends the request to the target using its private IP address.

With Application Load Balancers, the load balancer node that receives the request uses the following process:

1) Evaluates the listener rules in priority order to determine which rule to apply.

2) Selects a target from the target group for the rule action, using the routing algorithm configured for the target group. The default routing algorithm is round robin. Routing is performed independently for each target group, even when a target is registered with multiple target groups.
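The routing algorithm mentioned in step 2 is a target group setting. As a sketch, assuming a hypothetical `vpc_id` variable and an application on port 8080, switching from the default round robin to least outstanding requests looks like this in Terraform:

```hcl
# Hypothetical input; the name is an assumption for illustration.
variable "vpc_id" { type = string }

resource "aws_lb_target_group" "app" {
  name     = "app"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # Default is "round_robin". "least_outstanding_requests" sends each new
  # request to the target with the fewest requests still in progress.
  load_balancing_algorithm_type = "least_outstanding_requests"
}
```

Least outstanding requests tends to help when request durations vary widely, since round robin can keep sending work to a target that is already busy with slow requests.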

For further info: How Elastic Load Balancing works - Elastic Load Balancing

ALB nodes use elastic network interfaces (see Elastic network interfaces - Amazon Elastic Compute Cloud); on an internet-facing load balancer these have public IP addresses:

At least one ENI is created and attached to the balancer in each availability zone where the balancer is deployed (except NLB, which should only have one per AZ). Over the life of the balancer, new ENIs will appear and old ones will disappear, as the balancer scales horizontally (number of nodes) and/or vertically (capacity of underlying hardware), all of which is handled transparently by the infrastructure. Even though you can tag them, the tagging will become stale over time.

Source: amazon web services - AWS - Affect Load Balancer's tags to its Network Interfaces (ENI) - Stack Overflow

You can determine the IP addresses associated with an internal load balancer or an internet-facing load balancer by resolving the DNS name of the load balancer. These are the IP addresses where the clients should send the requests that are destined for the load balancer. However, Classic Load Balancers and Application Load Balancers use the private IP addresses associated with their elastic network interfaces as the source IP address for requests forwarded to your web servers.

Source: Find the IP address used by a load balancer to forward traffic to web servers

The load balancer routes requests to the targets in a target group and performs health checks on them. A target group is a logical set of targets that receive the requests the load balancer forwards. These targets can be, for example, EC2 instances created either manually or through an Auto Scaling group.

How to create a Target Group used by Load Balancer listeners? (This applies to any type of Load Balancer)

EC2 >> Target groups >> Create target group

Step 1: Specify group details

Here we can set:

Basic configuration. Settings in this section cannot be changed after the target group is created.

Target type

Instances

Supports load balancing to instances within a specific VPC.
Facilitates the use of Amazon EC2 Auto Scaling to manage and scale your EC2 capacity.

IP addresses

Supports load balancing to VPC and on-premises resources.
Facilitates routing to multiple IP addresses and network interfaces on the same instance.
Offers flexibility with microservice based architectures, simplifying inter-application communication.
Supports IPv6 targets, enabling end-to-end IPv6 communication, and IPv4-to-IPv6 NAT.

Lambda function

Facilitates routing to a single Lambda function.
Accessible to Application Load Balancers only.

Application Load Balancer

Offers the flexibility for a Network Load Balancer to accept and route TCP requests within a specific VPC.
Facilitates using static IP addresses and PrivateLink with an Application Load Balancer.

Target group name
Protocol:Port e.g. If our application is accepting HTTP requests on port 8080 this would be HTTP:8080
VPC - VPC with the instances that you want to include in the target group.
Protocol version

HTTP1. Send requests to targets using HTTP/1.1. Supported when the request protocol is HTTP/1.1 or HTTP/2.
HTTP2. Send requests to targets using HTTP/2. Supported when the request protocol is HTTP/2 or gRPC, but gRPC-specific features are not available.
gRPC. Send requests to targets using gRPC. Supported when the request protocol is gRPC.

Health checks. The associated load balancer periodically sends requests, per the settings below, to the registered targets to test their status.

Health check protocol

HTTP
HTTPS

Health check path. Use the default path of "/" to ping the root, or specify a custom path if preferred.
Advanced health check settings

Port. The port the load balancer uses when performing health checks on targets. The default is the port on which each target receives traffic from the load balancer, but you can specify a different port.

Traffic port
Override

Healthy threshold. The number of consecutive health check successes required before considering an unhealthy target healthy.
Unhealthy threshold. The number of consecutive health check failures required before considering a target unhealthy.
Timeout. The amount of time, in seconds, during which no response means a failed health check.
Interval. The approximate amount of time between health checks of an individual target.
Success codes. The HTTP codes to use when checking for a successful response from a target. You can specify multiple values (for example, "200,202") or a range of values (for example, "200-299").

Attributes
Tags - optional

Step 2: Register targets

This is an optional step to create a target group. However, to ensure that your load balancer routes traffic to this target group you must register your targets.
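When instances are not managed by an Auto Scaling group, registering targets can be sketched in Terraform with one attachment per instance. The variables `target_group_arn` and `instance_ids` and port 8080 are assumptions for illustration.

```hcl
# Hypothetical inputs; these names are assumptions for illustration.
variable "target_group_arn" { type = string }
variable "instance_ids" { type = list(string) }

# One attachment per EC2 instance ID.
resource "aws_lb_target_group_attachment" "app" {
  for_each = toset(var.instance_ids)

  target_group_arn = var.target_group_arn
  target_id        = each.value # instance ID for target type "instance"
  port             = 8080       # port the application listens on
}
```

`for_each` needs the instance IDs to be known at plan time, so this form suits IDs passed in as input rather than IDs of instances created in the same apply.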

After the load balancer is created, it stays in the provisioning state for several minutes before becoming active. After that, we can use its DNS name to test it.

If we paste its DNS name into a browser before registering any targets in the target group associated with the load balancer, we'll get the error 503 Service Temporarily Unavailable.

If we've registered targets and are getting the error 504 Gateway Time-out, we should first check that the security groups (firewalls) for our EC2 instances are set up correctly (inbound rule, source IP range or source security group), as this error usually indicates that traffic from the load balancer is not allowed in.

The Terraform AWS provider can provision all of these resources:

Application Load Balancer: aws_lb | Resources | hashicorp/aws | Terraform Registry
Listener: aws_lb_listener | Resources | hashicorp/aws | Terraform Registry
Target Group: aws_lb_target_group | Resources | hashicorp/aws | Terraform Registry

How is AWS Application Load Balancing usually implemented?

Let's say our application runs on three EC2 instances. Two are in the same region, e.g. us-west-2, but in separate Availability Zones, e.g. us-west-2a and us-west-2b. The third EC2 instance is in eu-central-1, in Availability Zone eu-central-1a.

VPC is region-specific but can span multiple availability zones (AZ).

Subnet is an IP address range within VPC.

VPC can have public and private subnets.

VPC can be divided into multiple subnets but each subnet is AZ-specific.

AZ can have multiple subnets.

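So the two us-west-2 instances can belong to the same VPC but, as they are in different AZs, each of them belongs to a different subnet. The eu-central-1 instance cannot share that VPC, because a VPC is region-specific; it needs its own VPC and, as an ALB is also regional, its own load balancer.

An internet-facing load balancer must be placed in public subnets of the VPC, as clients communicate with it via the internet (public network).

The load balancer is not associated directly with EC2 instances but with subnets: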

resource "aws_lb" "test" {
  subnets = ["subnet-0001", "subnet-0002"]
  ...
}

A target group is associated with a VPC:

resource "aws_lb_target_group" "test" {
  vpc_id = var.vpc_id
  ...
}

Difference between ALB and NLB (Network Load Balancer)

An Application Load Balancer (ALB) and a Network Load Balancer (NLB) serve different purposes based on the layer of the network they operate on and the type of traffic they handle.

The core difference is that an ALB understands application-level traffic (Layer 7) like HTTP/HTTPS headers, while an NLB handles low-level network traffic (Layer 4) like TCP/UDP packets at extreme speeds.

Direct Comparison Matrix

Feature           Application Load Balancer (ALB)                            Network Load Balancer (NLB)
=======           ===============================                            ===========================
OSI Layer         Layer 7 (Application)                                      Layer 4 (Transport)
Protocols         HTTP, HTTPS, HTTP/2, gRPC, WebSockets                      TCP, UDP, TLS
IP Addresses      Dynamic IPs (change automatically; requires a DNS name)    Static IPs (can assign an Elastic IP per AZ)
Routing Features  Advanced (path, host, query parameters, headers)           Basic (port and IP protocol routing only)
Performance       Optimized for complex web apps (request-level processing)  Optimized for ultra-low latency (millions of requests/sec)

Key Technical Differences

1. Smart Routing vs. Raw Speed

ALB (Smart): Can read the contents of your HTTP requests. It can route traffic bound for one hostname under example.com to an API server cluster, and traffic for another hostname to a storage cluster.
NLB (Fast): Does not look inside the data packet. It simply looks at the target port and forwards the packet. This keeps the added latency very low.

2. IP Addresses and DNS

ALB: Scales out dynamically by adding or removing nodes. This causes its underlying IP addresses to change frequently. You must always point your domain name to the ALB's DNS name, never to a static IP.
NLB: Gives you a Static IP address per Availability Zone. You can also assign your own Elastic IP addresses. This is critical if your corporate clients need to whitelist specific, unchanging IPs in their firewalls.

3. Client IP Preservation

ALB: Terminates the connection and makes a new one to your backend instances. The backend sees the ALB's private IP. To find the real user's IP, your code must read the X-Forwarded-For HTTP header.
NLB: With client IP preservation enabled (the default for instance targets), passes the original source address through to your backend server. Your backend instances see the original source IP address of the client natively, without needing extra headers.
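The static IP option from point 2 can be sketched in Terraform with `subnet_mapping` blocks that pair each subnet with an Elastic IP. The variable `public_subnet_ids` and the name `static-ip-nlb` are assumptions, and the `domain` argument on `aws_eip` assumes a recent AWS provider version (older versions used `vpc = true`).

```hcl
# Hypothetical input; the name is an assumption for illustration.
variable "public_subnet_ids" { type = list(string) }

# One Elastic IP per subnet, keyed by subnet ID.
resource "aws_eip" "nlb" {
  for_each = toset(var.public_subnet_ids)
  domain   = "vpc"
}

resource "aws_lb" "static" {
  name               = "static-ip-nlb"
  load_balancer_type = "network"

  # subnet_mapping replaces the plain "subnets" argument when each
  # subnet needs a specific Elastic IP.
  dynamic "subnet_mapping" {
    for_each = aws_eip.nlb

    content {
      subnet_id     = subnet_mapping.key
      allocation_id = subnet_mapping.value.id
    }
  }
}
```

Clients can then allowlist the Elastic IP addresses, which stay fixed for as long as the allocations exist.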

When to Choose Which?

Choose an ALB if you are building:

Standard web applications and microservices.
Containerized apps (ECS/EKS) requiring path-based or host-based routing.
Applications requiring tight integration with AWS Web Application Firewall (WAF).

Choose an NLB if you are building:

Non-HTTP applications (e.g., gaming servers, SFTP, MQTT, database clusters).
Architectures requiring fixed, static IP addresses or Elastic IPs.
High-frequency financial applications where sub-millisecond network latency is a hard requirement.

Which alerts should typically be set for AWS ALB?

To keep your applications highly available, you should set up Amazon CloudWatch alarms for a mix of availability, performance, and target health metrics.

The most critical metrics to monitor for an AWS ALB are grouped by priority below:

1. High Priority (Critical Infrastructure Impact)

UnHealthyHostCount (Per Target Group)

What it means: The number of backend instances failing health checks.

Alert Threshold: > 0 (or > 1 for larger clusters).

Why it matters: Signals that your servers are crashing or cannot handle traffic.

HTTPCode_Target_5XX_Count

What it means: The number of 5xx server error codes generated by your backend application.

Alert Threshold: Depends on baseline traffic, typically > 5 failures within a 1-minute to 5-minute window.

Why it matters: Indicates server crashes, database connection timeouts, or unhandled exceptions in your application code.

HTTPCode_ELB_5XX_Count

What it means: The number of 5xx errors generated directly by the ALB itself (not your servers).

Alert Threshold: > 0.

Why it matters: Usually means the ALB cannot find any healthy hosts, or it is experiencing a configuration mismatch (e.g., bad TLS handshake with the target).
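Two of these high-priority alarms can be sketched in Terraform as follows. The three variables are assumptions: the ARN suffixes correspond to the `arn_suffix` attributes of the load balancer and target group resources, and `alarm_topic_arn` stands for whatever SNS topic pages the on-call engineer.

```hcl
# Hypothetical inputs; these names are assumptions for illustration.
variable "alb_arn_suffix" { type = string }
variable "target_group_arn_suffix" { type = string }
variable "alarm_topic_arn" { type = string }

# Fires when any target in the group is failing health checks.
resource "aws_cloudwatch_metric_alarm" "unhealthy_hosts" {
  alarm_name          = "alb-unhealthy-hosts"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "UnHealthyHostCount"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  alarm_actions       = [var.alarm_topic_arn]

  dimensions = {
    LoadBalancer = var.alb_arn_suffix
    TargetGroup  = var.target_group_arn_suffix
  }
}

# Fires when the ALB itself returns any 5xx response.
resource "aws_cloudwatch_metric_alarm" "elb_5xx" {
  alarm_name          = "alb-elb-5xx"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "HTTPCode_ELB_5XX_Count"
  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  alarm_actions       = [var.alarm_topic_arn]

  # The metric is only emitted when errors occur, so missing data
  # should count as "no errors" rather than as insufficient data.
  treat_missing_data = "notBreaching"

  dimensions = {
    LoadBalancer = var.alb_arn_suffix
  }
}
```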

2. Medium Priority (Performance & User Experience)

TargetResponseTime

What it means: The time elapsed (in seconds) from when the ALB sent the request to the target until the target started responding.

Alert Threshold: Use the p95 or p99 statistic. Alert if it exceeds your application’s maximum acceptable latency (e.g., > 2.0 seconds).

Why it matters: Users are experiencing severe application slowdowns, likely due to high CPU/memory usage on your instances.

RejectedConnectionCount

What it means: The load balancer is rejecting connections because it has reached its maximum capacity.

Alert Threshold: > 0.

Why it matters: Your application is getting sudden traffic spikes and the ALB cannot scale fast enough, or backend targets are failing to keep up.
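A percentile-based latency alarm uses `extended_statistic` instead of `statistic`. In this sketch the variables and the 2-second threshold are assumptions; the threshold should be replaced with the application's own latency limit.

```hcl
# Hypothetical inputs; these names are assumptions for illustration.
variable "alb_arn_suffix" { type = string }
variable "alarm_topic_arn" { type = string }

# Fires when the p95 target response time stays above 2 seconds
# for a full 5-minute period.
resource "aws_cloudwatch_metric_alarm" "slow_targets" {
  alarm_name          = "alb-target-latency-p95"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "TargetResponseTime"
  extended_statistic  = "p95"
  period              = 300
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 2.0
  alarm_actions       = [var.alarm_topic_arn]

  dimensions = {
    LoadBalancer = var.alb_arn_suffix
  }
}
```

A percentile is used rather than the average because a small share of very slow requests can hide behind a healthy-looking mean.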

3. Low Priority (Anomalies & Security)

HTTPCode_Target_4XX_Count

What it means: The number of 4xx client errors (like 404 Not Found or 401 Unauthorized) returned by backend targets.

Alert Threshold: A significant spike above your standard baseline.

Why it matters: A sudden surge might indicate a broken frontend deployment, a bad API update, or a malicious entity scanning your network for vulnerabilities.

Summary Checklist for CloudWatch Alarms

Metric Name                Statistic  Recommended Period  Suggested Threshold        Action
===========                =========  ==================  ===================        ======
UnHealthyHostCount         Maximum    1 Minute            > 0                        Page/On-Call
HTTPCode_ELB_5XX_Count     Sum        1 Minute            > 0                        Page/On-Call
HTTPCode_Target_5XX_Count  Sum        5 Minutes           > 10 (or > 1% of traffic)  Ticket/Slack
TargetResponseTime         p95        5 Minutes           > [Your Limit]             Ticket/Slack

Resources:

How Elastic Load Balancing works - Elastic Load Balancing
