An Application Load Balancer (ALB) keeps your application online by continuously monitoring the health of your backend targets and dynamically redirecting traffic away from failing nodes.
The ALB sends periodic health check requests to every registered target. If a target fails the configured number of consecutive checks (the unhealthy threshold), the ALB marks it as unhealthy and stops routing new requests to it. Traffic continues to flow to the remaining healthy targets, so users normally see no downtime.
When paired with an Auto Scaling Group (ASG) whose health check type includes ELB, ALB health checks can trigger the automatic replacement of broken instances.
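The ASG pairing described above could be expressed in Terraform roughly as follows. This is a minimal standalone sketch: the variable names, resource names, and sizes are hypothetical, and the launch template and target group are assumed to be defined elsewhere.

```hcl
# Hypothetical inputs; replace with values from your own configuration.
variable "launch_template_id" { type = string }
variable "private_subnet_ids" { type = list(string) }
variable "target_group_arn" { type = string }

resource "aws_autoscaling_group" "app" {
  name                = "app-asg"
  min_size            = 2
  max_size            = 4
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids

  # Register launched instances with the ALB target group.
  target_group_arns = [var.target_group_arn]

  # Let the ASG consider the load balancer's health checks, not only
  # EC2 status checks, when deciding whether to replace an instance.
  health_check_type         = "ELB"
  health_check_grace_period = 300 # seconds after launch before checks count

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }
}
```

The grace period gives a new instance time to boot before a failed health check can cause it to be replaced; 300 seconds is only a placeholder value.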
When an unhealthy instance recovers, or when a new instance is launched, the ALB does not send traffic to it immediately. The target starts in an initial state and must pass a number of consecutive health checks. Only when it reaches the healthy threshold does the ALB bring it back into the traffic rotation.
You configure health checks inside the Target Group settings using these parameters:
Path: the URL path the ALB requests on each target (e.g., /health or /index.html). Example: /health
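As an illustration of these settings, a target group with an explicit health check might look like the sketch below. The names, port, and numeric values are assumptions chosen for the example, not recommendations from the original text.

```hcl
variable "vpc_id" { type = string } # hypothetical input

resource "aws_lb_target_group" "app" {
  name     = "app-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    protocol            = "HTTP"
    path                = "/health"
    port                = "traffic-port" # same port that receives traffic
    interval            = 30             # seconds between checks
    timeout             = 5              # seconds without a response = failure
    healthy_threshold   = 3              # consecutive successes to become healthy
    unhealthy_threshold = 2              # consecutive failures to become unhealthy
    matcher             = "200"          # HTTP codes treated as success
  }
}
```

With these example values a failing target is marked unhealthy after roughly two intervals, and a recovering target returns to rotation after roughly three.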
When you configure an ALB, you do not select an Availability Zone (AZ) directly; instead, you must select at least two subnets in different Availability Zones to ensure high availability.
Whether an ALB is public (internet-facing) or internal is determined by its scheme; an internet-facing ALB must be attached to public subnets.
When you create an internet-facing ALB, AWS requires you to select public subnets so the ALB nodes can receive a public IP address and route traffic from the internet.
An internal (private) ALB routes traffic in the exact same way as a public ALB, but it is only accessible within your VPC or connected networks.
In both cases the ALB routes traffic directly to individual backend targets, not to subnets.
The basic building blocks of an AWS load balancer are listeners and target groups.
To create an Application Load Balancer, go to EC2 >> Load balancers >> Create load balancer >> Select load balancer type (click on Create under Application Load Balancer).
Here we can set:
- Basic configuration
  - Name
  - Scheme (cannot be changed after the load balancer is created)
    - Internet-facing. An internet-facing load balancer routes requests from clients over the internet to targets. Requires a public subnet.
    - Internal. An internal load balancer routes requests from clients to targets using private IP addresses.
  - IP address type. Select the type of IP addresses that your subnets use.
    - IPv4. Recommended for internal load balancers.
    - Dualstack. Includes IPv4 and IPv6 addresses.
- Network mapping. The load balancer routes traffic to targets in the selected subnets, and in accordance with your IP address settings.
  - VPC. Virtual private cloud for your targets. If balancer is internet-facing, only VPCs with an internet gateway are enabled for selection. The selected VPC cannot be changed after the load balancer is created. As VPC is region-specific so is Application Load Balancer.
  - Mappings. Once VPC is selected, its availability zones are listed here and are selectable. Select at least two Availability Zones and one subnet per zone. The load balancer routes traffic to targets in these Availability Zones only. Availability Zones that are not supported by the load balancer or the VPC are not available for selection. We should select all AZs that we listed in the Auto scaling group (if we used it).
- Security groups. A security group is a set of firewall rules that control the traffic to your load balancer. We can select up to 10 security groups.
  - If our application is listening for HTTP requests on port 80 we should select a security group with:
    - Inbound rule: accept HTTP/TCP traffic on port 80 with source Anywhere-IPv4
    - Outbound rule: allow all traffic for all protocols and port ranges to custom destination 0.0.0.0/0
- Listeners and routing. A listener is a process that checks for connection requests using the port and protocol you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets.
  - Add listener
    - Protocol e.g. HTTP
    - Port e.g. 80. This is a public facing port and it does not need to be the same as the port from the attached target group. E.g. LB can listen on port 80 and forward traffic to target group port 8080.
    - Default action: Forward to (select a target group)
  - Add listener tags
- Add-on services - optional
- Tags - optional
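The console settings listed above map fairly directly onto Terraform resources. The following standalone sketch assumes an existing VPC, at least two public subnets in different AZs, and an existing target group; all variable and resource names are hypothetical.

```hcl
variable "vpc_id" { type = string }
variable "public_subnet_ids" { type = list(string) } # two or more, different AZs
variable "target_group_arn" { type = string }

# Security group: HTTP in from anywhere (IPv4), all traffic out.
resource "aws_security_group" "alb" {
  name   = "alb-sg"
  vpc_id = var.vpc_id

  ingress {
    description = "HTTP from anywhere (IPv4)"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Basic configuration and network mapping.
resource "aws_lb" "app" {
  name               = "app-alb"
  load_balancer_type = "application"
  internal           = false # false = internet-facing scheme
  ip_address_type    = "ipv4"
  subnets            = var.public_subnet_ids
  security_groups    = [aws_security_group.alb.id]
}

# Listener on port 80 whose default action forwards to the target group.
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.app.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = var.target_group_arn
  }
}
```

The listener port (80) is independent of the target group port, so the target group referenced here can still receive traffic on 8080.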
More info on Scheme, from AWS documentation:
When you create a load balancer, you must choose whether to make it an internal load balancer or an internet-facing load balancer.
The nodes of an internet-facing load balancer have public IP addresses.
The nodes of an internal load balancer have only private IP addresses.
Both internet-facing and internal load balancers route requests to your targets using private IP addresses. Therefore, your targets do not need public IP addresses to receive requests from an internal or an internet-facing load balancer.
More info on how ALB routes traffic to multiple Availability Zones (and about what Load Balancer Nodes are):
When you enable an Availability Zone for your load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone.
The nodes for your load balancer distribute requests from clients to registered targets. When cross-zone load balancing is enabled, each load balancer node distributes traffic across the registered targets in all enabled Availability Zones. When cross-zone load balancing is disabled, each load balancer node distributes traffic only across the registered targets in its Availability Zone.
Before a client sends a request to your load balancer, it resolves the load balancer's domain name using a Domain Name System (DNS) server. The DNS entry is controlled by Amazon, because your load balancers are in the amazonaws.com domain. The Amazon DNS servers return one or more IP addresses to the client. These are the IP addresses of the load balancer nodes for your load balancer.
As traffic to your application changes over time, Elastic Load Balancing scales your load balancer and updates the DNS entry. The DNS entry also specifies the time-to-live (TTL) of 60 seconds. This helps ensure that the IP addresses can be remapped quickly in response to changing traffic.
The client determines which IP address to use to send requests to the load balancer. The load balancer node that receives the request selects a healthy registered target and sends the request to the target using its private IP address.
With Application Load Balancers, the load balancer node that receives the request uses the following process:
1) Evaluates the listener rules in priority order to determine which rule to apply.
2) Selects a target from the target group for the rule action, using the routing algorithm configured for the target group. The default routing algorithm is round robin. Routing is performed independently for each target group, even when a target is registered with multiple target groups.
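Both steps have a counterpart in Terraform: rule priority on the listener rule, and the routing algorithm on the target group. The sketch below is illustrative only; the listener is assumed to exist already, and the "/api/*" path pattern, names, and priority value are made up for the example.

```hcl
variable "listener_arn" { type = string }
variable "vpc_id" { type = string }

resource "aws_lb_target_group" "api" {
  name     = "api-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # "round_robin" is the default; "least_outstanding_requests" is an alternative.
  load_balancing_algorithm_type = "round_robin"
}

resource "aws_lb_listener_rule" "api" {
  listener_arn = var.listener_arn
  priority     = 10 # rules are evaluated from the lowest number upwards

  condition {
    path_pattern {
      values = ["/api/*"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}
```

Requests that match no rule fall through to the listener's default action.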
For further info: How Elastic Load Balancing works - Elastic Load Balancing
ALB nodes use elastic network interfaces (Elastic network interfaces - Amazon Elastic Compute Cloud); for an internet-facing load balancer these have public IP addresses:
At least one ENI is created and attached to the balancer in each availability zone where the balancer is deployed (except NLB, which should only have one per AZ). Over the life of the balancer, new ENIs will appear and old ones will disappear, as the balancer scales horizontally (number of nodes) and/or vertically (capacity of underlying hardware), all of which is handled transparently by the infrastructure. Even though you can tag them, the tagging will become stale over time.
Source: amazon web services - AWS - Affect Load Balancer's tags to its Network Interfaces (ENI) - Stack Overflow
You can determine the IP addresses associated with an internal load balancer or an internet-facing load balancer by resolving the DNS name of the load balancer. These are the IP addresses where the clients should send the requests that are destined for the load balancer. However, Classic Load Balancers and Application Load Balancers use the private IP addresses associated with their elastic network interfaces as the source IP address for requests forwarded to your web servers.
Source: Find the IP address used by a load balancer to forward traffic to web servers
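One way to list those private IP addresses from Terraform is to look up the load balancer's network interfaces by their description. The sketch below rests on an assumption: that the ENIs carry a description of the form "ELB " followed by the load balancer's ARN suffix. This is a commonly observed convention rather than something stated in the text above, so verify it in your account before relying on it. The variable and output names are hypothetical.

```hcl
# Hypothetical input: the arn_suffix attribute of an aws_lb resource.
variable "alb_arn_suffix" { type = string }

# Assumption: ELB-managed ENIs are described as "ELB <arn_suffix>".
data "aws_network_interfaces" "alb" {
  filter {
    name   = "description"
    values = ["ELB ${var.alb_arn_suffix}"]
  }
}

# Read each matching interface to get its private IP.
data "aws_network_interface" "alb" {
  for_each = toset(data.aws_network_interfaces.alb.ids)
  id       = each.value
}

output "alb_private_ips" {
  value = [for eni in data.aws_network_interface.alb : eni.private_ip]
}
```

Because the load balancer replaces its ENIs as it scales, the output is only a snapshot and should not be hard-coded into firewall rules.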
A load balancer routes requests to the targets in a target group and performs health checks on those targets. A target group is a logical set of targets that a listener forwards requests to. These targets can be, for example, EC2 instances created either manually or through an Auto Scaling group.
How to create a Target Group used by Load Balancer listeners? (This applies to any type of Load Balancer)
EC2 >> Target groups >> Create target group
Step 1: Specify group details
Here we can set:
- Basic configuration. Settings in this section cannot be changed after the target group is created.
  - Target type
    - Instances
      - Supports load balancing to instances within a specific VPC.
      - Facilitates the use of Amazon EC2 Auto Scaling to manage and scale your EC2 capacity.
    - IP addresses
      - Supports load balancing to VPC and on-premises resources.
      - Facilitates routing to multiple IP addresses and network interfaces on the same instance.
      - Offers flexibility with microservice based architectures, simplifying inter-application communication.
      - Supports IPv6 targets, enabling end-to-end IPv6 communication, and IPv4-to-IPv6 NAT.
    - Lambda function
      - Facilitates routing to a single Lambda function.
      - Accessible to Application Load Balancers only.
    - Application Load Balancer
      - Offers the flexibility for a Network Load Balancer to accept and route TCP requests within a specific VPC.
      - Facilitates using static IP addresses and PrivateLink with an Application Load Balancer.
  - Target group name
  - Protocol:Port e.g. If our application is accepting HTTP requests on port 8080 this would be HTTP:8080
  - VPC - VPC with the instances that you want to include in the target group.
  - Protocol version
    - HTTP1. Send requests to targets using HTTP/1.1. Supported when the request protocol is HTTP/1.1 or HTTP/2.
    - HTTP2. Send requests to targets using HTTP/2. Supported when the request protocol is HTTP/2 or gRPC, but gRPC-specific features are not available.
    - gRPC. Send requests to targets using gRPC. Supported when the request protocol is gRPC.
- Health checks. The associated load balancer periodically sends requests, per the settings below, to the registered targets to test their status.
  - Health check protocol
  - Health check path. Use the default path of "/" to ping the root, or specify a custom path if preferred.
  - Advanced health check settings
    - Port. The port the load balancer uses when performing health checks on targets. The default is the port on which each target receives traffic from the load balancer, but you can specify a different port.
    - Healthy threshold. The number of consecutive health check successes required before considering an unhealthy target healthy.
    - Unhealthy threshold. The number of consecutive health check failures required before considering a target unhealthy.
    - Timeout. The amount of time, in seconds, during which no response means a failed health check.
    - Interval. The approximate amount of time between health checks of an individual target.
    - Success codes. The HTTP codes to use when checking for a successful response from a target. You can specify multiple values (for example, "200,202") or a range of values (for example, "200-299").
- Attributes
- Tags - optional
Step 2: Register targets
This is an optional step to create a target group. However, to ensure that your load balancer routes traffic to this target group you must register your targets.
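In Terraform, Step 1 and Step 2 correspond to a target group resource and one attachment per target. This is a hedged standalone sketch: the instance IDs are supplied through a hypothetical variable, and the names and port are placeholders.

```hcl
variable "vpc_id" { type = string }
variable "instance_ids" { type = set(string) } # hypothetical EC2 instance IDs

# Step 1: group details (target type, protocol:port, VPC, protocol version).
resource "aws_lb_target_group" "app" {
  name             = "app-tg"
  target_type      = "instance"
  port             = 8080
  protocol         = "HTTP"
  protocol_version = "HTTP1"
  vpc_id           = var.vpc_id
}

# Step 2: register each instance as a target.
resource "aws_lb_target_group_attachment" "app" {
  for_each         = var.instance_ids
  target_group_arn = aws_lb_target_group.app.arn
  target_id        = each.value
  port             = 8080
}
```

When an Auto Scaling group manages the instances, registration is usually left to the group instead of explicit attachments.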
After the load balancer is created, it stays in the provisioning state for several minutes before it becomes active. After this, we can use its DNS name to see what it is doing.
If we copy its DNS name and paste it into a browser before registering any targets in the associated target group, we get error 503 Service Temporarily Unavailable.
If we have registered targets and get error 504 Gateway Time-out, we should first check that the security groups (firewalls) for our EC2 instances are set up correctly (inbound rule, source IP range), as this error usually indicates that inbound traffic from the load balancer is not allowed.
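A common way to rule out the 504 cause described above is to allow inbound traffic on the application port from the load balancer's security group. The sketch below assumes the application listens on port 8080 and that the ALB's security group ID is available as a variable; both are assumptions made for illustration.

```hcl
variable "vpc_id" { type = string }
variable "alb_security_group_id" { type = string }

resource "aws_security_group" "app_instances" {
  name   = "app-instances-sg"
  vpc_id = var.vpc_id

  # Accept application traffic and health checks only from the ALB.
  ingress {
    description     = "App port from the load balancer"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [var.alb_security_group_id]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Referencing the ALB's security group rather than an IP range keeps the rule valid when the load balancer's node addresses change.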
The Terraform AWS provider can provision all of these resources.
How is AWS Application Load Balancing usually implemented?
Let's say we have our application running on 3 EC2 instances where 2 are in the same region e.g. us-west-2 but in separate availability zones e.g. us-west-2a and us-west-2b. Third EC2 instance is in eu-central-1, in availability zone eu-central-1a.
VPC is region-specific but can span multiple availability zones (AZ).
Subnet is an IP address range within VPC.
VPC can have public and private subnets.
VPC can be divided into multiple subnets but each subnet is AZ-specific.
AZ can have multiple subnets.
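So, the two EC2 instances in us-west-2 can belong to the same VPC but, as they are in different AZs, each of them belongs to a different subnet. The third instance is in another region, so it must be in a different VPC and would normally be served by its own load balancer.
An internet-facing load balancer must be in public subnets of the VPC, as clients communicate with it via the internet (public network).
The load balancer is not associated directly with EC2 instances but with subnets: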
resource "aws_lb" "test" {
  subnets = ["subnet-0001", "subnet-0002"]
  ...
}
Target group is associated with VPC:
resource "aws_alb_target_group" "test" {
  vpc_id = var.vpc_id
  ...
}
Difference between ALB and NLB (Network Load Balancer)
An Application Load Balancer (ALB) and a Network Load Balancer (NLB) serve different purposes based on the layer of the network they operate on and the type of traffic they handle.
The core difference is that an ALB understands application-level traffic (Layer 7) like HTTP/HTTPS headers, while an NLB handles low-level network traffic (Layer 4) like TCP/UDP packets at extreme speeds.
Direct Comparison Matrix
Feature           Application Load Balancer (ALB)                           Network Load Balancer (NLB)
================  ========================================================  ==========================================================
OSI Layer         Layer 7 (Application)                                     Layer 4 (Transport)
Protocols         HTTP, HTTPS, HTTP/2, gRPC, WebSockets                     TCP, UDP, TLS
IP Addresses      Dynamic IPs (Changes automatically; requires a DNS name)  Static IPs (Can assign an Elastic IP per AZ)
Routing Features  Advanced (Path, Host, Query parameters, Headers)          Basic (Port and IP protocol routing only)
Performance       Optimized for complex web apps (request-level routing)    Optimized for ultra-low latency (millions of requests/sec)
Key Technical Differences
1. Smart Routing vs. Raw Speed
- ALB (Smart): Can read the contents of your HTTP requests. It can route requests for one URL path on example.com to an API server cluster, and requests for another path on the same host to a storage cluster.
- NLB (Fast): Does not look inside the data packet. It simply looks at the target port and forwards the packet instantly. This results in ultra-low latency (measured in milliseconds).
2. IP Addresses and DNS
- ALB: Scales out dynamically by adding or removing nodes. This causes its underlying IP addresses to change frequently. You must always point your domain name to the ALB's DNS Name, never to a static IP.
- NLB: Gives you a Static IP address per Availability Zone. You can also assign your own Elastic IP addresses. This is critical if your corporate clients need to whitelist specific, unchanging IPs in their firewalls.
3. Client IP Preservation
- ALB: Terminates the connection and makes a new one to your backend instances. The backend sees the ALB's private IP. To find the real user's IP, your code must read the X-Forwarded-For HTTP header.
- NLB: With client IP preservation enabled (the default for instance targets), the NLB passes the client's source IP address through to your backend server. Your backend instances see the original source IP address of the client natively, without needing extra headers.
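The IP addressing difference from point 2 shows up clearly in Terraform: an NLB can be pinned to Elastic IPs through subnet mappings, while an ALB is normally reached through a DNS alias record. The sketch below is illustrative and standalone; the hosted zone, the domain name app.example.com, and all variable names are hypothetical, and the `domain = "vpc"` argument assumes a recent AWS provider version.

```hcl
variable "public_subnet_ids" { type = list(string) }
variable "zone_id" { type = string }      # hypothetical Route 53 hosted zone
variable "alb_dns_name" { type = string } # dns_name attribute of the ALB
variable "alb_zone_id" { type = string }  # zone_id attribute of the ALB

# One Elastic IP per subnet (and therefore per AZ) for the NLB.
resource "aws_eip" "nlb" {
  count  = length(var.public_subnet_ids)
  domain = "vpc"
}

resource "aws_lb" "nlb" {
  name               = "app-nlb"
  load_balancer_type = "network"

  # Pair each subnet with its own Elastic IP.
  dynamic "subnet_mapping" {
    for_each = range(length(var.public_subnet_ids))
    content {
      subnet_id     = var.public_subnet_ids[subnet_mapping.value]
      allocation_id = aws_eip.nlb[subnet_mapping.value].id
    }
  }
}

# For an ALB, point the domain at its DNS name with an alias record.
resource "aws_route53_record" "app" {
  zone_id = var.zone_id
  name    = "app.example.com"
  type    = "A"

  alias {
    name                   = var.alb_dns_name
    zone_id                = var.alb_zone_id
    evaluate_target_health = true
  }
}
```

The alias record follows the ALB's changing node addresses automatically, which is why a fixed IP is never needed on the ALB side.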
When to Choose Which?
Choose an ALB if you are building:
- Standard web applications and microservices.
- Containerized apps (ECS/EKS) requiring path-based or host-based routing.
- Applications requiring tight integration with AWS Web Application Firewall (WAF).
Choose an NLB if you are building:
- Non-HTTP applications (e.g., gaming servers, SFTP, MQTT, database clusters).
- Architectures requiring fixed, static IP addresses or Elastic IPs.
- High-frequency financial applications where sub-millisecond network latency is a hard requirement.
Which alerts should typically be set for AWS ALB?
To keep your applications highly available, you should set up Amazon CloudWatch alarms for a mix of availability, performance, and target health metrics.
The most critical metrics to monitor for an AWS ALB are grouped by priority below:
1. High Priority (Critical Infrastructure Impact)
UnHealthyHostCount (Per Target Group)
What it means: The number of backend instances failing health checks (see Target Group Metrics).
Alert Threshold: > 0 (or > 1 for larger clusters).
Why it matters: Signals that your servers are crashing or cannot handle traffic.
HTTPCode_Target_5XX_Count
What it means: The number of 5xx server error codes generated by your backend application (see ALB Metrics).
Alert Threshold: Depends on baseline traffic, typically > 5 failures within a 1-minute to 5-minute window.
Why it matters: Indicates server crashes, database connection timeouts, or unhandled exceptions in your application code.
HTTPCode_ELB_5XX_Count
What it means: The number of 5xx errors generated directly by the ALB itself, not your servers (see ALB Metrics).
Alert Threshold: > 0.
Why it matters: Usually means the ALB cannot find any healthy hosts, or it is experiencing a configuration mismatch (e.g., bad TLS handshake with the target).
2. Medium Priority (Performance & User Experience)
TargetResponseTime
What it means: The time elapsed (in seconds) from when the ALB sent the request to the target until the target started responding (see ALB Metrics).
Alert Threshold: Use the p95 or p99 statistic. Alert if it exceeds your application’s maximum acceptable latency (e.g., > 2.0 seconds).
Why it matters: Users are experiencing severe application slowdowns, likely due to high CPU/memory usage on your instances.
RejectedConnectionCount
What it means: The load balancer is rejecting connections because it has reached its maximum number of connections (see ALB Metrics).
Alert Threshold: > 0.
Why it matters: Your application is getting sudden traffic spikes and the ALB cannot scale fast enough, or backend targets are failing to keep up.
3. Low Priority (Anomalies & Security)
HTTPCode_Target_4XX_Count
What it means: The number of 4xx client errors (like 404 Not Found or 401 Unauthorized) returned by backend targets (see ALB Metrics).
Alert Threshold: A significant spike above your standard baseline.
Why it matters: A sudden surge might indicate a broken frontend deployment, a bad API update, or a malicious entity scanning your network for vulnerabilities.
Summary Checklist for CloudWatch Alarms
Metric Name                Statistic  Recommended Period  Suggested Threshold        Action
=========================  =========  ==================  =========================  ============
UnHealthyHostCount         Maximum    1 Minute            > 0                        Page/On-Call
HTTPCode_ELB_5XX_Count     Sum        1 Minute            > 0                        Page/On-Call
HTTPCode_Target_5XX_Count  Sum        5 Minutes           > 10 (or > 1% of traffic)  Ticket/Slack
TargetResponseTime         p95        5 Minutes           > [Your Limit]             Ticket/Slack
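Three rows of the checklist could be written as Terraform alarms along the following lines. This is a sketch under stated assumptions: the SNS topic, the ARN suffix variables, the alarm names, and the 2.0 second latency limit are placeholders to adapt.

```hcl
variable "alb_arn_suffix" { type = string }          # arn_suffix of the aws_lb
variable "target_group_arn_suffix" { type = string } # arn_suffix of the target group
variable "sns_topic_arn" { type = string }           # hypothetical notification topic

resource "aws_cloudwatch_metric_alarm" "unhealthy_hosts" {
  alarm_name          = "alb-unhealthy-hosts"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "UnHealthyHostCount"
  statistic           = "Maximum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  alarm_actions       = [var.sns_topic_arn]

  dimensions = {
    LoadBalancer = var.alb_arn_suffix
    TargetGroup  = var.target_group_arn_suffix
  }
}

resource "aws_cloudwatch_metric_alarm" "elb_5xx" {
  alarm_name          = "alb-elb-5xx"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "HTTPCode_ELB_5XX_Count"
  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 0
  treat_missing_data  = "notBreaching" # no data points means no errors
  alarm_actions       = [var.sns_topic_arn]

  dimensions = {
    LoadBalancer = var.alb_arn_suffix
  }
}

resource "aws_cloudwatch_metric_alarm" "latency_p95" {
  alarm_name          = "alb-target-latency-p95"
  namespace           = "AWS/ApplicationELB"
  metric_name         = "TargetResponseTime"
  extended_statistic  = "p95" # percentiles use extended_statistic, not statistic
  period              = 300
  evaluation_periods  = 1
  comparison_operator = "GreaterThanThreshold"
  threshold           = 2.0 # seconds; replace with your own limit
  alarm_actions       = [var.sns_topic_arn]

  dimensions = {
    LoadBalancer = var.alb_arn_suffix
  }
}
```

Setting `treat_missing_data` on the error-count alarm matters because the metric is only reported when errors occur.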
Resources: