Kubernetes Metrics Server is a foundational component required by several other critical cluster modules and tools:
1. Horizontal Pod Autoscaler (HPA)
- Purpose: HPA scales the number of pod replicas up or down based on observed CPU and memory utilization (see the sample manifest after this list).
- Dependency: HPA reads live pod resource usage from the Metrics API (metrics.k8s.io), which is served by the Metrics Server.
2. Vertical Pod Autoscaler (VPA)
- Purpose: While HPA adds more pods, the Vertical Pod Autoscaler (VPA) adjusts the CPU and memory requests/limits of existing pods.
- Dependency: VPA relies on Metrics Server for the real-time resource data it uses to recommend or apply these resource changes.
3. Native CLI Observability (kubectl top)
- Purpose: Provides ad-hoc commands for debugging and spot-checking resource usage from the terminal.
- Dependency: Both kubectl top pods and kubectl top nodes query the Metrics API directly. Without the Metrics Server, these commands return an error.
4. Kubernetes Dashboard
- Purpose: A web-based UI for managing and troubleshooting clusters.
- Dependency: The Kubernetes Dashboard uses Metrics Server to display resource usage graphs and live statistics for nodes and pods.
5. Third-Party Monitoring Tools & Adapters
- Custom Metrics Adapters: Some adapters that bridge external sources (like CloudWatch or Datadog) to Kubernetes may use the standard Metrics API for fallback or basic resource data.
- Resource Management Tools: Operational tools such as Goldilocks, which suggests "just right" resource requests, often depend on the baseline metrics provided by this server.
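To make the HPA dependency concrete, here is a minimal sketch of an HPA that scales on CPU utilization reported through the Metrics API. The Deployment name my-app and the thresholds are placeholders, not values from this post:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # placeholder Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU usage exceeds 70%

If the Metrics Server is not installed, an HPA like this will surface FailedGetResourceMetric events and cannot make scaling decisions.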
Key Distinction
While the Metrics Server is essential for these control loops (HPA, VPA) and tools, it is not a replacement for a full observability stack such as Prometheus. It stores only a short-term, in-memory snapshot of resource usage and provides no historical data.
How to install the Metrics Server as an EKS Community Add-on to enable these features
In March 2025, AWS introduced a new catalog of community add-ons that includes the Metrics Server. This allows you to manage it directly through EKS-native tools like any other AWS-managed add-on (e.g., VPC CNI or CoreDNS).
Method 1: Using the AWS Management Console
The easiest way to install it is through the EKS console:
- Navigate to your EKS cluster in the AWS Console.
- Select the Add-ons tab and click Get more add-ons.
- Scroll down to the Community add-ons section.
- Find Metrics Server, select it, and click Next.
- Choose the desired version (usually the latest recommended) and click Create.
Method 2: Using the AWS CLI
You can also install the community add-on via the command line:
aws eks create-addon \
  --cluster-name <YOUR_CLUSTER_NAME> \
  --addon-name metrics-server
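You can also watch the installation status from the CLI; the describe-addon call below returns CREATING while the add-on is being installed and ACTIVE once it is ready:

aws eks describe-addon \
  --cluster-name <YOUR_CLUSTER_NAME> \
  --addon-name metrics-server \
  --query 'addon.status'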
Verification
Once the installation status moves to Active, verify that the pods are running in the kube-system namespace:
kubectl get deployment metrics-server -n kube-system
Finally, test that the Metrics API is responding:
kubectl top nodes
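If kubectl top returns data, you are done. You can also confirm that the Metrics API itself is registered and healthy:

kubectl get apiservice v1beta1.metrics.k8s.io

The AVAILABLE column should show True once the server is serving metrics.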
Note: If you are using AWS Fargate, you may need to update the containerPort from 10250 to 10251 in the deployment configuration to ensure compatibility with Fargate's networking constraints.
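As a rough sketch of that Fargate change, assuming the add-on's configuration schema follows the upstream metrics-server Helm chart (which exposes a containerPort value; check the schema for your version with aws eks describe-addon-configuration before relying on it):

aws eks update-addon \
  --cluster-name <YOUR_CLUSTER_NAME> \
  --addon-name metrics-server \
  --configuration-values '{"containerPort": 10251}'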
Metrics Server Configuration
To configure custom resource limits for the Metrics Server EKS community add-on, you can use Configuration Values during installation or update. This is essential for high-pod-count clusters where the default allocation may lead to OOMKilled errors.
1. Scaling Recommendations
The Metrics Server's resource consumption scales linearly with your cluster's size. Baseline recommendations include:
- CPU: Approximately 1 millicore per node in the cluster.
- Memory: Approximately 2 MB of memory per node.
- Large Clusters: If your cluster exceeds 100 nodes, it is recommended to double these defaults and monitor performance.
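For example, applying these rules of thumb to a 150-node cluster works out to roughly 150 millicores of CPU and about 300 MB of memory, so a 100m CPU request would likely need to be raised for a cluster of that size.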
2. How to Apply Custom Limits
You can provide a JSON or YAML configuration block via the AWS EKS Add-ons API.
Via AWS CLI
Use the --configuration-values flag to pass your resource overrides:
aws eks create-addon \
  --cluster-name <YOUR_CLUSTER_NAME> \
  --addon-name metrics-server \
  --configuration-values '{
    "resources": {
      "requests": { "cpu": "100m", "memory": "200Mi" },
      "limits": { "cpu": "200m", "memory": "500Mi" }
    }
  }'
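If the add-on is already installed, the same flag works with update-addon; the file name below is only a placeholder for the JSON shown above:

aws eks update-addon \
  --cluster-name <YOUR_CLUSTER_NAME> \
  --addon-name metrics-server \
  --configuration-values file://metrics-server-values.json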
Via AWS Console
- Go to the Add-ons tab in your EKS cluster.
- Click Edit on the metrics-server add-on.
- Expand the Optional configuration settings.
- Paste the JSON configuration into the Configuration values text box.
3. Critical Configuration for High Traffic
In addition to resource limits, you may want to adjust the scraping frequency to make HPA more responsive.
- Metric Resolution: The --metric-resolution flag controls how often the kubelets are scraped. For faster scaling decisions, set a shorter interval (for example --metric-resolution=15s) via the container arguments in the same configuration block (see the sketch after this list); shorter intervals put more load on the kubelets.
- High Availability: The community add-on defaults to 2 replicas to prevent downtime during scaling events.
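Assuming the add-on exposes the upstream metrics-server Helm chart values (such as args and replicas), which you can confirm with aws eks describe-addon-configuration, a sketch of that args override would be:

aws eks update-addon \
  --cluster-name <YOUR_CLUSTER_NAME> \
  --addon-name metrics-server \
  --configuration-values '{
    "args": ["--metric-resolution=15s"]
  }'

The replica count can be pinned explicitly through the same block if you do not want to rely on the default.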
