Wednesday, 28 January 2026

How to test Redis connectivity




We first need to check if Redis DNS name resolves:

% nslookup redis.example.com

Server: 192.168.1.1
Address: 192.168.1.1#53

Non-authoritative answer:
redis.example.com canonical name = example-prod-redis-serverless-kg5o59.serverless.use2.cache.amazonaws.com.
example-prod-redis-serverless-kg5o59.serverless.use2.cache.amazonaws.com canonical name = default.example-prod-redis-serverless-kg5o59.serverless.use2.cache.amazonaws.com.
Name: default.example-prod-redis-serverless-kg5o59.serverless.use2.cache.amazonaws.com
Address: 10.0.3.74
...


Let's try to make a TCP connection:

% nc -vz redis.example.com 6379

nc: connectx to redis.example.com port 6379 (tcp) failed: Operation timed out

After adding the remote client's IP address to the inbound rules of the Redis security group (firewall):

% nc -vz redis.example.com 6379

Connection to redis.example.com port 6379 [tcp/*] succeeded!

 
Let's now install the Redis client so we can try to connect to the server:

% brew install redis
==> Fetching downloads for: redis
✔︎ Bottle Manifest redis (8.4.0)                  Downloaded   10.9KB/ 10.9KB
✔︎ Bottle Manifest ca-certificates (2025-12-02)   Downloaded    2.0KB/  2.0KB
✔︎ Bottle ca-certificates (2025-12-02)            Downloaded  131.8KB/131.8KB
✔︎ Bottle redis (8.4.0)                           Downloaded    1.2MB/  1.2MB
==> Installing redis dependency: ca-certificates
==> Pouring ca-certificates--2025-12-02.all.bottle.tar.gz
==> Regenerating CA certificate bundle from keychain, this may take a while...
🍺  /opt/homebrew/Cellar/ca-certificates/2025-12-02: 4 files, 236.4KB
==> Pouring redis--8.4.0.arm64_sequoia.bottle.tar.gz
==> Caveats
To start redis now and restart at login:
  brew services start redis
Or, if you don't want/need a background service you can just run:
  /opt/homebrew/opt/redis/bin/redis-server /opt/homebrew/etc/redis.conf
==> Summary
🍺  /opt/homebrew/Cellar/redis/8.4.0: 15 files, 3MB
==> Running `brew cleanup redis`...
...
==> Caveats
==> redis
To start redis now and restart at login:
  brew services start redis
Or, if you don't want/need a background service you can just run:
  /opt/homebrew/opt/redis/bin/redis-server /opt/homebrew/etc/redis.conf


Let's connect to the Redis server, execute the ping command (expected response is PONG), and also get some information about the server:


% redis-cli \
  --tls \
  -h redis.example.com \
  -p 6379

redis.example.com:6379> ping
PONG
redis.example.com:6379> info server 
# Server
redis_version:7.1
redis_mode:cluster
os:Amazon ElastiCache
arch_bits:64
run_id:0
redis.example.com:6379> 
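
For a scripted health check, the same test can be run non-interactively (a minimal sketch, assuming the same TLS endpoint as above):

% redis-cli --tls -h redis.example.com -p 6379 ping
PONG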


---

Sunday, 25 January 2026

Strategies for AWS EKS Cluster Kubernetes Version Upgrade


This article explores AWS EKS cluster upgrade strategies end-to-end (control plane + nodes) and where node upgrade strategies fit inside them. We assume a single cluster; if there are multiple clusters, one per environment (e.g. dev, stage, prod), additional approaches should also be considered.


General Upgrade Order


Upgrade order is critical in EKS. A proper upgrade cadence explicitly states the order:

  1. Non-prod clusters first
  2. Control plane
  3. Managed add-ons
    1. VPC CNI
    2. CoreDNS
    3. kube-proxy
  4. Worker nodes
  5. Platform controllers
    1. Ingress
    2. Autoscalers
    3. Observability

Example:

Upgrade order: dev → staging → prod
Control plane → add-ons → nodes → workloads
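
Before and after each step it is useful to verify which versions are actually running. A minimal sketch of such a check, assuming the cluster is named my-cluster (adjust the name and add-on as needed):

# Control plane Kubernetes version
aws eks describe-cluster --name my-cluster --query cluster.version --output text

# Node (kubelet) versions, to spot skew against the control plane
kubectl get nodes

# Managed add-on versions
aws eks list-addons --cluster-name my-cluster
aws eks describe-addon --cluster-name my-cluster --addon-name vpc-cni \
  --query addon.addonVersion --output text


---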

Wednesday, 14 January 2026

Elasticsearch Data Streams





In Elasticsearch, a data stream is an abstraction layer designed to simplify the management of continuously generated time-series data, such as logs, metrics, and events. 

Key Characteristics


  • Time-Series Focus: Every document indexed into a data stream must contain a @timestamp field, which is used to organize and query the data.
  • Append-Only Design: Data streams are optimized for use cases where data is rarely updated or deleted. We cannot send standard update or delete requests directly to the stream; these must be performed via _update_by_query or directed at specific backing indices.
  • Unified Interface: Users interact with a single named resource (the data stream name) for both indexing and searching, even though the data is physically spread across multiple underlying indices. 
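
As a quick illustration of that unified interface, here is a minimal sketch that creates a data stream via an index template and indexes a document into it (the names logs-myapp-template and logs-myapp-default are just examples):

PUT _index_template/logs-myapp-template
{
  "index_patterns": ["logs-myapp-*"],
  "data_stream": { },
  "priority": 200,
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" },
        "message": { "type": "text" }
      }
    }
  }
}

# The first document indexed into a matching name auto-creates the data stream
POST logs-myapp-default/_doc
{
  "@timestamp": "2026-01-14T10:00:00Z",
  "message": "hello data stream"
}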

Architecture: Backing Indices


A data stream consists of one or more hidden, auto-generated backing indices: 
  • Write Index: The most recently created backing index. All new documents are automatically routed here.
  • Rollover: When the write index reaches a specific size or age, Elasticsearch automatically creates a new backing index (rollover) and sets it as the new write index.
  • Search: Search requests sent to the data stream are automatically routed to all of its backing indices to return a complete result set. 
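
The backing indices and the current write index can be inspected, and a rollover can also be triggered manually; a short sketch, reusing the example stream from above:

# List backing indices; the most recent one is the current write index
GET _data_stream/logs-myapp-default

# Trigger a rollover manually (normally this happens automatically)
POST logs-myapp-default/_rollover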

Automated Management


Data streams rely on two primary automation tools: 
  • Index Templates: These define the stream's structure, including field mappings and settings, and must include a data_stream object to enable the feature.
  • Lifecycle Management (ILM/DSL): Tools like Index Lifecycle Management (ILM) or the newer Data Stream Lifecycle automate tasks like moving old indices to cheaper hardware (hot/warm/cold tiers) and eventually deleting them based on retention policies. 

When to Use


  • Ideal for: Logs, events, performance metrics, and security traces.
  • Avoid for: Use cases requiring frequent updates to existing records (like a product catalog) or data that lacks a timestamp.

How does data stream know when to rollover?


Data streams are typically managed by:
  • Index Lifecycle Management (ILM)
  • Data Stream Lifecycle (DSL) - newer concept

In cluster settings, data_streams.lifecycle.poll_interval defines how often Elasticsearch goes over each data stream, checks whether it is eligible for a rollover, and performs the rollover if it is.

To find this interval value, check the output of 

GET _cluster/settings

By default, the GET _cluster/settings command only returns settings that have been manually overridden, so if we are using default values, we need to add ?include_defaults=true.

The default interval value is 5 minutes, which can be verified by checking the cluster's default settings:

GET _cluster/settings?include_defaults=true&filter_path=defaults.data_streams.lifecycle.poll_interval

Output:

{
  "defaults": {
    "data_streams": {
      "lifecycle": {
        "poll_interval": "5m"
      }
    }
  }
}


On each poll, Elasticsearch rolls over the write index of a data stream if it fulfills the conditions defined by cluster.lifecycle.default.rollover. If we are using default cluster settings, we can check its default value:

GET _cluster/settings?include_defaults=true&filter_path=defaults.cluster.lifecycle

Output:

{
  "defaults": {
    "cluster": {
      "lifecycle": {
        "default": {
          "rollover": "max_age=auto,max_primary_shard_size=50gb,min_docs=1,max_primary_shard_docs=200000000"
        }
      }
    }
  }
}


max_age=auto: The maximum age of the write index before rollover is computed dynamically from the data stream's retention (explained below); with our 90-day retention this works out to 7 days, which is why our indices are rolling over every week.
max_primary_shard_size=50gb: Prevents shards from becoming too large and slow.
min_docs=1: Avoids rolling over an empty write index.
max_primary_shard_docs=200000000: A built-in limit to maintain search performance, even if the 50 GB size hasn't been reached yet.

In our case max_age=auto which means Elasticsearch is using a dynamic rollover strategy based on our retention period. If we look at https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/admin/indices/rollover/RolloverConfiguration.java#L174-L195 we can see the comment:

    /**
     * When max_age is auto we’ll use the following retention dependent heuristics to compute the value of max_age:
     * - If retention is null aka infinite (default), max_age will be 30 days
     * - If retention is less than or equal to 1 day, max_age will be 1 hour
     * - If retention is less than or equal to 14 days, max_age will be 1 day
     * - If retention is less than or equal to 90 days, max_age will be 7 days
     * - If retention is greater than 90 days, max_age will be 30 days
     */


So, the max age of a backing index before rollover depends on how long we want to keep data in our data stream overall. For example, if retention is 90 days, Elasticsearch will perform a rollover and create a new backing index every 7 days.

Instead of a single fixed value for every data stream, auto adjusts the rollover age to ensure that indices aren't kept too long or rolled over too frequently for their specific retention settings.

max_age=auto is a "smart" setting designed to prevent "small index bloat" while ensuring data is deleted on time. It ensures our max_age is always a fraction of our total retention so that we have several backing indices to delete sequentially as they expire.


Data Stream Lifecycle (DSL)


This is a streamlined, automated alternative to the older Index Lifecycle Management (ILM). 

While ILM focuses on "how" data is stored (tiers, hardware, merging), the lifecycle block focuses on "what" happens to the data based on business needs, primarily focusing on retention and automated optimization.
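
Retention for an existing data stream can be set or inspected directly through the lifecycle API; a minimal sketch, again assuming a stream named logs-myapp-default:

PUT _data_stream/logs-myapp-default/_lifecycle
{
  "data_retention": "30d"
}

GET _data_stream/logs-myapp-default/_lifecycle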


How to find out if data stream is managed by Index Lifecycle Management (ILM) or Data Stream Lifecycle (DSL)?


Get the data stream's details and look at template, lifecycle, next_generation_managed_by and prefer_ilm attributes. Example:

GET _data_stream/ilm-history-7

Output snippet:

      "template": "ilm-history-7",
      "lifecycle": {
        "enabled": true,
        "data_retention": "90d",
        "effective_retention": "90d",
        "retention_determined_by": "data_stream_configuration"
      },
      "next_generation_managed_by": "Data stream lifecycle",
      "prefer_ilm": true,


lifecycle block in our data stream's index template refers to the Data Stream Lifecycle (DSL). 


Inside that lifecycle block, we typically see these children:
  • enabled (Boolean):
    • Interpretation: Determines if Elasticsearch should actively manage this data stream using DSL.
    • Behavior: When set to true, Elasticsearch automatically handles rollover (based on cluster defaults) and deletion (based on our retention settings). If this is missing but other attributes are present, it often defaults to true.
  • data_retention (String):
    • Interpretation: The minimum amount of time Elasticsearch is guaranteed to store our data.
    • Format: Uses time units like 90d (90 days), 30m (30 minutes), or 1h (1 hour).
    • Behavior: This period is calculated starting from the moment a backing index is rolled over (it becomes "read-only"), not from its creation date.
  • effective_retention
    • This is the final calculated value that Elasticsearch actually uses to delete data. 
    • What it represents: It is the minimum amount of time our data is guaranteed to stay in the cluster after an index has rolled over.
    • Why it might differ from our setting: We might set data_retention: "90d", but the cluster might have a global "max retention" or "default retention" policy that overrides our specific request
  • retention_determined_by
    • This attribute identifies the source of the effective_retention value. Common values include: 
      • data_stream_configuration: The retention is coming directly from the data_retention we set in our index template or data stream.
      • default_retention: We didn't specify a retention period, so Elasticsearch is using the cluster-wide default (e.g., data_streams.lifecycle.retention.default).
      • max_retention: We tried to set a very long retention (e.g., 1 year), but a cluster admin has capped all streams at a lower value (e.g., 90 days) using data_streams.lifecycle.retention.max
  • downsampling (Object/Array):
    • Interpretation: Configures the automatic reduction of time-series data resolution over time.
    • Behavior: It defines when (e.g., after 7 days) and how (e.g., aggregate 1-minute metrics into 1-hour blocks) data should be condensed to save storage space while keeping historical trends searchable.

Elasticsearch determines the final retention value using this priority: 
  • If a Max Retention is set on the cluster and our setting exceeds it, Max Retention wins.
  • If we have configured Data Retention on the stream, it is used (as long as it's under the max).
  • If we have not configured anything, the Default Retention for the cluster is used.
  • If no defaults or maxes exist and we haven't set a value, retention is Infinite.
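
The cluster-wide default and max values mentioned above are ordinary cluster settings; a sketch of how they could be set (the values are just examples):

PUT _cluster/settings
{
  "persistent": {
    "data_streams.lifecycle.retention.default": "30d",
    "data_streams.lifecycle.retention.max": "90d"
  }
}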

prefer_ilm setting is a transition flag used when a data stream has both an Index Lifecycle Management (ILM) policy and a Data Stream Lifecycle (DSL) configuration. It tells Elasticsearch which of the two management systems should take control of the data stream. Value options are:

  • true: Elasticsearch will use the ILM policy defined in index template setting index.lifecycle.name. It will ignore the lifecycle block (DSL). Use this if you need granular control over shard allocation, force merging, or specific rollover ages that DSL doesn't offer.
  • false (or unset): Elasticsearch will prioritize the Data Stream Lifecycle (DSL) block. It will ignore the ILM policy for rollover and retention. This is the default behavior in modern 2026 clusters to encourage the use of the simpler DSL. 



What if data retention is set both in lifecycle (DSL) and in ILM associated to the index template used for data stream?


If we see retention-related settings in both the lifecycle block and the settings block of an index template, the lifecycle block takes precedence because it is the native configuration for the Data Stream Lifecycle (DSL). This is the modern way to manage data streams. When the lifecycle block is present and enabled: true, Elasticsearch ignores any traditional ILM "Delete" phase settings. It manages the retention of the data stream indices exclusively through the DSL background process.


If a data stream has both a lifecycle block and an ILM policy in its index template, like:

"settings": {
  "index.lifecycle.name": "my-ilm-policy"
}

...then:
  • The lifecycle block wins: Elasticsearch will prioritize the Data Stream Lifecycle (DSL) for retention and rollover.
  • The ILM policy is ignored: We will often see a warning in the logs or the "Explain" API indicating that the ILM policy is being bypassed because DSL is active.

If we have a custom setting in the settings block (like a metadata field or a legacy retention setting such as index.lifecycle.retention), it is ignored by the lifecycle logic: DSL only looks at the lifecycle object. Any other setting is treated as a static index setting and will not trigger the deletion of indices.


How do we associate ILM policy with data stream?


Associating an ILM policy with a data stream requires configuring the data stream's backing index template. Because your current template uses the newer Data Stream Lifecycle (DSL), you must also explicitly tell Elasticsearch to favor ILM.

1. Update the Index Template

To associate a policy, you must add the policy name to the index settings within the template that matches your data stream.

Using the API:

PUT _index_template/<your-template-name>
{
  "index_patterns": ["ilm-history-7*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "your-ilm-policy-name",
      "index.lifecycle.prefer_ilm": true
    }
  },
  "data_stream": { }
}

index.lifecycle.name: Specifies which ILM policy to use.
index.lifecycle.prefer_ilm: true: This is critical if your template still has a lifecycle block. It forces Elasticsearch to use the ILM policy instead of DSL.

2. Apply to Existing Backing Indices

Updating a template only affects future indices created by the data stream. To apply the policy to the 14 indices already in your stream, you must update their settings directly: 


PUT .ds-ilm-history-7-*/_settings
{
  "index": {
    "lifecycle": {
      "name": "your-ilm-policy-name",
      "prefer_ilm": true
    }
  }
}


Note: Use a wildcard like .ds-ilm-history-7-* to target all existing backing indices at once.


If you are moving back to ILM because you need a specific max_age (e.g., rollover every 1 day instead of 7), ensure your new ILM policy has the rollover action defined in its Hot phase. Once applied, the ilm-history-7 stream will immediately begin following the custom timings in your ILM policy instead of the cluster-wide DSL defaults.

What if index template has lifecycle attribute but no index.lifecycle.name?


If an index template contains a lifecycle block, it is configured to use Data Stream Lifecycle (DSL).

If you want to associate a specific ILM policy (to gain granular control over rollover max_age, for example) while this block exists, you must handle the conflict as follows:

1. DSL vs. ILM Precedence

The presence of "lifecycle": { "enabled": true } tells Elasticsearch to ignore traditional ILM. To force the use of an ILM policy instead, you must add index.lifecycle.prefer_ilm: true to the settings block.
Without that setting, the lifecycle block will "win," and your ILM policy will be ignored.

2. How to associate the ILM Policy

To properly link an ILM policy to this specific template, you should update it to look like this:


{
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "your-ilm-policy-name",  // 1. Point to your ILM policy
          "prefer_ilm": true              // 2. Tell ES to favor ILM over the DSL block
        },
        "number_of_shards": "1",
        "auto_expand_replicas": "0-1",
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_hot"
            }
          }
        }
      }
    },
    "mappings": { ... },
    "lifecycle": {              // This block can remain, but will be ignored 
      "enabled": true,          // because prefer_ilm is true above.
      "data_retention": "90d"
    }
  }
}


Key Interpretations

  • The lifecycle block at the root: This governs retention (90 days) and rollover (defaulted to auto, which usually means 7 days or 30 days depending on retention).
  • The settings.index block: This is where you define the ILM link.
  • Conflict Resolution: If you don't add prefer_ilm: true, Elasticsearch 2026 defaults to using the lifecycle block. Your data stream will continue rolling over every 7–30 days based on the auto logic, even if you put an ILM policy name in the settings.

Recommendation

If you want to use an ILM policy, the cleanest approach is to remove the lifecycle block from the template entirely and only use index.lifecycle.name in the settings. This eliminates any ambiguity for the orchestration engine.


How to fix a Shard Explosion?


Shard explosion happens when there are more than ~20 shards per GB of heap (Elasticsearch node heap, i.e. the JVM heap allocated to the Elasticsearch process, not the Kubernetes node's total memory).
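
A quick way to compare shard counts against available heap is the _cat API (a sketch; these are standard _cat columns):

# Shards per node
GET _cat/allocation?v&h=node,shards,disk.indices,disk.used

# JVM heap per node (the ~20 shards per GB guideline applies to heap.max)
GET _cat/nodes?v&h=name,heap.current,heap.percent,heap.max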

Force merge does not reduce the number of backing indices in a data stream. 

A force merge (_forcemerge) acts on the segments within the shards of the backing indices, not on the backing indices themselves.
Here is the breakdown of what force merge does and how to reduce backing indices:

What Force Merge Does

  • Merges Segments: It reduces the number of Lucene segments in each shard, ideally to one, which improves search performance.
  • Cleans Deleted Docs: It permanently removes (expunges) documents that were soft-deleted, freeing up disk space.
  • Targets Shards: The operation is performed on the shards of one or more indices (or data stream backing indices). 
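
A force merge is typically run only against backing indices that have already rolled over (read-only), not against the current write index; a minimal sketch with an example backing index name:

POST /.ds-logs-myapp-default-2026.01.14-000001/_forcemerge?max_num_segments=1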

What Reduces Backing Indices


To reduce the number of backing indices for a data stream, you must use other strategies:
  • Rollover (ILM): Index Lifecycle Management (ILM) creates new backing indices and can automatically delete old ones based on age or size.
  • Data Stream Lifecycle: This automates the deletion of backing indices that are older than the defined retention period.
  • Shrink API: While not typically used to combine multiple daily indices into one, it can be used to reduce the primary shard count of a specific, read-only backing index.
  • Delete Index API: You can manually delete older backing indices (except the current write index). 

Summary:

  • Force Merge = Reduces segments inside shards (better search, less disk space).
  • Rollover/Delete = Reduces the total number of backing indices (fewer indices overall).


---

Wednesday, 7 January 2026

Kubernetes Scheduling

 


Pod scheduling is controlled by the scheduling constraints section of the pod spec, which can be found in the Kubernetes manifest (YAML) for resources like:
  • Deployment
  • StatefulSet
  • Pod
  • DaemonSet
  • Job/CronJob

Kubernetes scheduling mechanisms:
  • Tolerations
  • Node Selectors
  • Node Affinity
  • Pod Affinity/Anti-Affinity
  • Taints (node-side)
  • Priority and Preemption
  • Topology Spread Constraints
  • Resource Requests/Limits
  • Custom Schedulers
  • Runtime Class


Example:

    tolerations:
      - key: "karpenter/elastic"
        operator: "Exists"
        effect: "NoSchedule"
    nodeSelector:
      karpenter-node-pool: elastic
      node.kubernetes.io/instance-type: m7g.large
      karpenter.sh/capacity-type: "on-demand"


Tolerations


Tolerations specify which node taints the pod can tolerate.

tolerations:
  - key: "karpenter/elastic"
    operator: "Exists"
    effect: "NoSchedule"

Allows the pod to be scheduled on nodes with the taint karpenter/elastic:NoSchedule.
Without this toleration, the pod would be repelled from those nodes.
operator: "Exists" means it tolerates the taint regardless of its value.

Karpenter applies the taint karpenter/elastic:NoSchedule to nodes in the "elastic" pool. This taint acts as a gatekeeping mechanism - it says: "Only pods that explicitly tolerate this taint can schedule here". By default, most pods CANNOT schedule on these nodes (they lack the toleration). Our pod explicitly opts in with the toleration, saying "I'm allowed on elastic nodes".
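
To confirm which taints are actually applied to the nodes (including the Karpenter pool taint), something like this can be used:

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'

# or, for a single node:
kubectl describe node <node-name> | grep -i taints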

Why This Pattern?

This is actually a common workload isolation strategy:

Regular pods (no toleration) 
  ↓
  ❌ BLOCKED from elastic nodes
  ✅ Schedule on general-purpose nodes

Elastic workload pods (with toleration)
  ↓  
  ✅ CAN schedule on elastic nodes
  ✅ Can also schedule elsewhere (unless nodeSelector restricts)

Real-World Use Case:

# Elastic nodes are tainted to reserve them for specific workloads
# General traffic shouldn't land here accidentally

# Your pod says: "I'm an elastic workload, let me in"
tolerations:
  - key: "karpenter/elastic"
    operator: "Exists"
    effect: "NoSchedule"

# PLUS you add nodeSelector to say: "And I ONLY want elastic nodes"
nodeSelector:
  karpenter-node-pool: elastic


The Karpenter Perspective

Karpenter knows the node state, so the taint isn't about node health; it's about reserving capacity for specific workloads. This prevents:
  • Accidental scheduling of non-elastic workloads
  • Resource contention
  • Cost inefficiency (elastic nodes might be expensive/specialized)

Think of it like a VIP section: the velvet rope (taint) keeps everyone out except those with a pass (toleration).


Node Selector


nodeSelector:
  karpenter-node-pool: elastic
  node.kubernetes.io/instance-type: m7g.large
  karpenter.sh/capacity-type: "on-demand"

Requires the pod to run only on nodes matching ALL these labels:
  • Must be in the "elastic" Karpenter node pool
  • Must be an AWS m7g.large instance (ARM-based Graviton3)
  • Must be on-demand (not spot instances; karpenter.sh/capacity-type can also have value "spot")

What This Means

This pod is configured to run on dedicated elastic infrastructure managed by Karpenter (a Kubernetes node autoscaler), specifically targeting:
  • ARM-based instances (m7g = Graviton)
  • On-demand capacity (predictable, no interruptions)
  • A specific node pool for workload isolation

This is common for workloads that need consistent performance or have specific architecture requirements.

Node Affinity


More flexible than nodeSelector with support for soft/hard requirements:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:  # Hard requirement
      nodeSelectorTerms:
      - matchExpressions:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["m7g.large", "m7g.xlarge"]
    preferredDuringSchedulingIgnoredDuringExecution:  # Soft preference
    - weight: 100
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a"]


Pod Affinity/Anti-Affinity


Schedule pods based on what other pods are running:

affinity:
  podAffinity:  # Schedule NEAR certain pods
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: cache
      topologyKey: kubernetes.io/hostname
      
  podAntiAffinity:  # Schedule AWAY from certain pods
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: my-app
        topologyKey: topology.kubernetes.io/zone


Taints (node-side)


Complement to tolerations, applied to nodes:

kubectl taint nodes node1 dedicated=gpu:NoSchedule


Priority and Preemption


Control which pods get scheduled first and can evict lower-priority pods:

priorityClassName: high-priority
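
The referenced priority class has to exist in the cluster; a minimal sketch of a hypothetical high-priority class:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000          # higher value = scheduled first, may preempt lower-priority pods
globalDefault: false
description: "For critical workloads"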


Topology Spread Constraints


Distribute pods evenly across zones, nodes, or other topology domains:

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: my-app


Resource Requests/Limits


Influence scheduling based on available resources:

resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"


Custom Schedulers


You can even specify a completely different scheduler:

schedulerName: my-custom-scheduler


Runtime Class


For specialized container runtimes (like gVisor, Kata Containers):

runtimeClassName: gvisor

Each mechanism serves different use cases—nodeSelector is simple but rigid, while affinity rules and topology constraints offer much more flexibility for complex scheduling requirements.


Useful kubectl commands

 


To get the list of all nodes (machines, e.g. EC2 instances in an AWS EKS cluster) in the cluster:

kubectl get nodes

Output columns:
  • NAME e.g. ip-10-2-12-73.us-east-1.compute.internal
  • STATUS e.g. Ready
  • ROLES e.g. <none>
  • AGE e.g. 1d
  • VERSION e.g. v1.32.9-eks-ecaa3a6


kubectl get nodes -L node.kubernetes.io/instance-type,topology.kubernetes.io/zone,karpenter.sh/capacity-type

Output columns:
  • NAME e.g. ip-10-2-12-73.us-east-1.compute.internal
  • STATUS e.g. Ready
  • ROLES e.g. <none>
  • AGE e.g. 1d
  • VERSION e.g. v1.32.9-eks-ecaa3a6
  • INSTANCE-TYPE e.g. m7g.2xlarge
  • ZONE e.g. us-east-2a
  • CAPACITY-TYPE e.g. on-demand

kubectl get nodes -o wide

Output columns:
  • NAME e.g. ip-10-2-12-73.us-east-1.compute.internal
  • STATUS e.g. Ready
  • ROLES e.g. <none>
  • AGE e.g. 1d
  • VERSION e.g. v1.32.9-eks-ecaa3a6
  • INTERNAL-IP e.g. 10.2.12.73
  • EXTERNAL-IP e.g. <none> 
  • OS-IMAGE e.g. Amazon Linux 2023.9.20251208
  • KERNEL-VERSION e.g. 6.1.158-180.294.amzn2023.aarch64 or 6.1.132-147.221.amzn2023.x86_64
  • CONTAINER-RUNTIME e.g.  containerd://2.1.5

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}'

Output:

ip-10-2-12-73.us-east-2.compute.internal aws:///us-east-2a/i-039a9aaafa975358f
ip-10-2-13-147.us-east-2.compute.internal aws:///us-east-2a/i-0627bbbb3c15da009
...


List all pods (by default the output is grouped by NAMESPACE):

kubectl get pods -A -o wide 

Output columns:
  • NAMESPACE
  • NAME                                            
  • READY   (X/Y means X out of Y containers are ready)
  • STATUS
  • RESTARTS
  • AGE
  • IP
  • NODE
  • NOMINATED NODE
  • READINESS GATES

List all pods and sort them by NODE:

kubectl get pods -A -o wide --sort-by=.spec.nodeName 
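
List all pods running on a specific node (using a node name from kubectl get nodes):

kubectl get pods -A -o wide --field-selector spec.nodeName=ip-10-2-12-73.us-east-1.compute.internal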


To get a list of all namespaces in the cluster:

kubectl get ns

Output columns:
  • NAME e.g. default
  • STATUS e.g. Active
  • AGE e.g. 244d
...

Kubernetes DaemonSet

 


A Kubernetes DaemonSet is a workload resource that ensures a specific pod runs on all (or selected) nodes in a cluster. It's commonly used for deploying node-level services like log collectors, monitoring agents, or network plugins.

Example:

Elastic Agents are Elastic's unified data shippers, typically used in a k8s cluster to collect container logs, Kubernetes metrics, and node-level metrics, and to ship all of that data to Elasticsearch. They are deployed in the cluster as a DaemonSet.

We can use a DaemonSet to run a copy of a pod on every node, or we can use node affinity or selector rules to run it on only certain nodes.
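
A minimal sketch of a DaemonSet manifest (a hypothetical log collector, not an official Elastic Agent manifest):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: logging
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      tolerations:
        # also run on control-plane nodes
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: log-collector
          image: example/log-collector:1.0   # hypothetical image
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log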


What is the difference between ReplicaSet and DaemonSet?

ReplicaSets ensure a specific number of identical pods run for scaling stateless apps (e.g., web servers), while DaemonSets guarantee one pod runs on every (or a subset of) node(s) for node-specific tasks like logging or monitoring. The key difference is quantity versus location: ReplicaSets focus on maintaining pod count for availability, whereas DaemonSets ensure pod presence on each node for system-level services. 


ReplicaSet

  • Purpose: Maintain a stable set of replica pods for stateless applications, ensuring high availability and scalability.
  • Scaling: Scales pods up or down based on the replicas field you define in the manifest.
  • Use Case: Running web frontends, APIs, or any application needing multiple identical instances.
  • Behavior: If a pod dies, it creates a new one to meet the replica count; if a node fails, it tries to reschedule elsewhere. 


DaemonSet

  • Purpose: Run a single copy of a pod on every (or specific) node in the cluster for node-specific tasks.
  • Scaling: Automatically adds a pod when a new node joins the cluster and removes it when a node leaves.
  • Use Case: Logging agents (Fluentd, Elastic Agent), monitoring agents (Prometheus node-exporter), or storage daemons.
  • Behavior: Ensures that a particular service runs locally on each machine for local data collection or management. 


References:

DaemonSet | Kubernetes

DevOps Interview: Replica sets vs Daemon sets - DEV Community

Monday, 5 January 2026

Kubernetes ReplicaSets

 


A ReplicaSet is a Kubernetes object that ensures a specified number of identical pod replicas are running at any given time. It's a fundamental component for maintaining application availability and scalability.

Key Functions


A ReplicaSet continuously monitors your pods and takes action if the actual number differs from the desired number:
  • If pods crash or are deleted, it creates new ones to replace them
  • If there are too many pods, it terminates the excess ones
  • This self-healing behavior keeps your application running reliably

How It Works


You define a ReplicaSet with three main components:
  • Selector: Labels used to identify which pods belong to this ReplicaSet
  • Replicas: The desired number of pod copies
  • Pod template: The specification for creating new pods when needed

Example:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-app-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: nginx:1.14

This ReplicaSet ensures 3 nginx pods are always running with the label app: my-app.



What is the relation between replicaset and deployment?


While ReplicaSets are still used internally by Kubernetes, you typically don't create them directly. Instead, you use Deployments, which manage ReplicaSets for you and provide additional features like rolling updates, rollbacks, and version history. Deployments are the recommended way to manage replicated applications in Kubernetes.

A Deployment is a higher-level Kubernetes object that manages ReplicaSets for you. Think of it as a wrapper that adds intelligent update capabilities on top of ReplicaSets.


The Relationship

When you create a Deployment, Kubernetes automatically creates a ReplicaSet underneath it. The Deployment controls this ReplicaSet to maintain your desired number of pods.
The key difference becomes apparent when you update your application:

With just a ReplicaSet: If you want to update your application (like changing the container image), you'd need to manually delete the old ReplicaSet and create a new one. This causes downtime.

With a Deployment: When you update the pod template, the Deployment intelligently manages the transition by:
  1. Creating a new ReplicaSet with the updated pod specification
  2. Gradually scaling up the new ReplicaSet while scaling down the old one
  3. Keeping both ReplicaSets around for rollback capability

Visual Example

Deployment: my-app
    │
    ├── ReplicaSet: my-app-abc123 (old version, scaled to 0)
    │
    └── ReplicaSet: my-app-xyz789 (current version, 3 replicas)
        ├── Pod: my-app-xyz789-1
        ├── Pod: my-app-xyz789-2
        └── Pod: my-app-xyz789-3

What Deployment Adds

  • Rolling updates: Gradual replacement of old pods with new ones
  • Rollback: Easy reversion to previous versions
  • Update strategies: Control how updates happen (RollingUpdate, Recreate)
  • Revision history: Track changes over time
  • Pause/resume: Control when updates are applied
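
These capabilities are exposed through kubectl rollout; for example (assuming a Deployment named my-app):

# Show the revision history kept via old ReplicaSets
kubectl rollout history deployment/my-app

# Roll back to the previous revision, or to a specific one
kubectl rollout undo deployment/my-app
kubectl rollout undo deployment/my-app --to-revision=2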

In Practice


You almost always use Deployments rather than creating ReplicaSets directly. The Deployment handles the ReplicaSet management complexity, and you just declare what you want your application to look like.



Why are there so many old ReplicaSets in my cluster?


% kubectl get replicaset  -n staging | grep my-app-staging
my-app-staging-545f64d64b                   0         0         0       46d
my-app-staging-57897b6496                   0         0         0       131d
my-app-staging-57979c66c                    0         0         0       186d
my-app-staging-5974fd86f9                   0         0         0       185d
my-app-staging-5f9fbdf898                   0         0         0       173d
my-app-staging-6754c468                     0         0         0       185d
my-app-staging-695469db68                   0         0         0       144d
my-app-staging-6cd7d77789                   0         0         0       185d
my-app-staging-6f5cc5b6cf                   0         0         0       185d
my-app-staging-76f5bf4bf5                   0         0         0       186d
my-app-staging-d9c7895d9                    5         5         5       27d


This is completely normal Kubernetes behaviour — and the reason is almost always revisionHistoryLimit (or the default value of it).


Why there are many old ReplicaSets

Every time a Deployment’s pod template changes, Kubernetes:
  • Creates a new ReplicaSet
  • Scales the old one down to 0
  • Keeps the old ReplicaSet for rollback purposes

Examples of changes that trigger a new ReplicaSet:
  • Image tag change
  • Env var change
  • ConfigMap checksum change
  • Resource requests/limits change
  • Annotation change on the pod template
  • Helm re-deploy with different values

👉 Over time, this naturally accumulates ReplicaSets like the ones you see.

Why they are so old (46–186 days)

Because Kubernetes never deletes them automatically unless told to.

By default:

revisionHistoryLimit: 10


If:
  • revisionHistoryLimit is unset
  • or explicitly set high
  • or Helm chart doesn’t define it

...Kubernetes will happily keep up to that many old ReplicaSets around indefinitely (they are never deleted based on age, only trimmed by count).

Your output strongly suggests:
  • Multiple deployments over months
  • Each one created a new ReplicaSet
  • All old ones were scaled to 0
  • Only the newest one is active:
            my-app-staging-d9c7895d9   5   5   5   27d   ← current

Why this is not usually a problem

Old ReplicaSets with 0 replicas:
  • Consume almost no resources
  • Do not schedule pods
  • Exist mainly as metadata
So:
  • Performance impact → negligible
  • Scheduler impact → none

When it does become a problem

It’s worth cleaning up if:
  • You have hundreds or thousands of old ReplicaSets
  • kubectl get rs becomes noisy
  • GitOps / audits become painful
  • You accidentally roll back to a very old revision
  • etcd size is a concern (rare, but real at scale)

The right fix: limit revision history

Set this on the Deployment:

spec:
  revisionHistoryLimit: 3


or (Helm values example):

revisionHistoryLimit: 3


This means:
  • Kubernetes keeps only the last 3 old ReplicaSets
  • Older ones are automatically deleted
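
The setting can also be applied to an existing Deployment without editing the full manifest, for example:

kubectl patch deployment my-app-staging -n staging \
  --type merge -p '{"spec":{"revisionHistoryLimit":3}}'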

Typical values:
  • 2–3 for staging
  • 5–10 for prod (depending on rollback needs)

How to confirm current setting

kubectl get deployment my-app-staging -n staging -o yaml | grep revisionHistoryLimit


If nothing shows → default behaviour applies.

Can you safely delete the old ones manually?

Yes — as long as they’re at 0 replicas.

Example:

kubectl delete rs my-app-staging-545f64d64b -n staging


But:
  • This is a one-off cleanup
  • Without fixing revisionHistoryLimit, they’ll come back

TL;DR

  • Many old ReplicaSets is normal
  • They exist for rollback history
  • They’re old because Kubernetes keeps them indefinitely
  • Fix it with revisionHistoryLimit
  • Manual deletion is safe but not a long-term solution