
Thursday, 2 April 2026

Kubernetes StatefulSet

 

In Kubernetes, a StatefulSet is a specialized workload API object designed to manage stateful applications. Unlike standard Deployments, where Pods are interchangeable "cattle," StatefulSets treat Pods as unique "pets" with a persistent identity that is maintained even if they are rescheduled or restarted.

Key Features

  • Stable Network Identity: Each Pod is assigned a unique, ordinal index (e.g., web-0, web-1) and a corresponding stable DNS name through a Headless Service.
  • Stable Storage: Through volumeClaimTemplates, each Pod gets its own PersistentVolumeClaim, which binds to its own PersistentVolume. If a Pod dies, its replacement keeps the same identity and remounts the same storage.
  • Ordered Deployment: By default, Pods are created sequentially from 0 to N-1. Kubernetes waits until the previous Pod is "Running and Ready" before starting the next one.
  • Ordered Termination: Scaling down removes Pods in reverse order, starting from the highest ordinal (e.g., web-2 is removed before web-1). Deleting the StatefulSet itself does not guarantee this order, so scale it down to 0 first if an ordered shutdown matters.
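
The ordering behaviour described above corresponds to the default podManagementPolicy, OrderedReady. As a minimal sketch, assuming an application whose replicas do not depend on each other's startup order, the field can be set to Parallel so that Pods are created and removed without waiting on one another. The name web, the Headless Service it refers to, and the nginx image are placeholders chosen to match the web-0/web-1 examples, not part of any real setup.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                       # hypothetical name; Pods become web-0, web-1, web-2
spec:
  serviceName: web                # assumes a Headless Service named "web" already exists
  replicas: 3
  podManagementPolicy: Parallel   # default is OrderedReady (one Pod at a time)
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27         # placeholder image for illustration
```

Pods still keep their ordinal names and stable identities with Parallel; only the wait-for-the-previous-Pod behaviour during scaling is relaxed.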


When to Use StatefulSets


StatefulSets are the standard choice for applications that require persistent data and stable, unique identities, such as:
  • Databases: Systems like MySQL, PostgreSQL, MongoDB, and Cassandra.
  • Distributed Systems: Tools like ZooKeeper, Kafka, and Elasticsearch that need a quorum or leader election.
  • Clustered Applications: Any software where instances need to know each other's specific addresses to sync data.

Comparison: StatefulSet vs. Deployment


Feature         StatefulSet                                  Deployment
-------         -----------                                  ----------
Pod Identity    Unique and stable (ordinal names)            Randomly generated and ephemeral
Storage         Dedicated volume per Pod (via template)      Typically shared or transient
Network         Fixed DNS per Pod (via Headless Service)     Usually one load-balanced Service for the whole set
Scaling         Sequential (0, then 1, then 2...)            Parallel (multiple Pods at once)


Best Practices

  • Use Headless Services: Always pair our StatefulSet with a Service that has clusterIP: None to ensure Pods are individually addressable.
  • Persistent Storage: Ensure our StorageClass is correctly configured for dynamic provisioning so that each Pod gets its own disk automatically.
  • Manual Data Sync: Note that while Kubernetes manages the infrastructure, we are still responsible for configuring internal application logic such as data replication or primary/replica sync.
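
As a rough illustration of the storage recommendation above, here is a sketch of a StorageClass set up for dynamic provisioning. The name fast-ssd is hypothetical, and the provisioner shown (the AWS EBS CSI driver) is only an assumption; the right value depends on where the cluster runs.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                          # hypothetical name
provisioner: ebs.csi.aws.com              # assumption: AWS EBS CSI driver; substitute the cluster's own provisioner
volumeBindingMode: WaitForFirstConsumer   # provision the disk only once the Pod has been scheduled
allowVolumeExpansion: true                # lets existing PVCs be grown later
reclaimPolicy: Delete                     # use Retain if the disk should outlive its PVC
```

A volumeClaimTemplates entry would then select this class by adding storageClassName: fast-ssd to its spec. If the field is omitted, the cluster's default StorageClass is used.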


YAML manifest example for a basic MySQL StatefulSet


Below is a basic YAML manifest for a MySQL StatefulSet. It includes a Headless Service for network identity and a volumeClaimTemplates entry that automatically provisions separate storage for each replica.

apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - port: 3306
    name: mysql
  clusterIP: None # Defines this as a Headless Service
  selector:
    app: mysql
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: "mysql"
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password" # Use Secrets in production!
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi


Why this works:
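  • Stable DNS: Each Pod gets a predictable name within the namespace: mysql-0.mysql, mysql-1.mysql, etc. With the default cluster domain, the fully qualified form is mysql-0.mysql.<namespace>.svc.cluster.local.
  • Unique Storage: Kubernetes creates three separate PersistentVolumeClaims (mysql-data-mysql-0, mysql-data-mysql-1, mysql-data-mysql-2). mysql-0 always mounts mysql-data-mysql-0, even after it is restarted or rescheduled.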
  • Ordered Startup: Pods launch one after another (0, then 1, then 2), which is critical for forming database clusters.
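
To make the stable DNS point concrete, the following is a sketch of a throwaway client Pod that targets one specific replica by name. It assumes the Pod runs in the same namespace as the StatefulSet and reuses the demo root password from the manifest above. The Pod name mysql-client is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-client          # hypothetical name
spec:
  restartPolicy: Never        # run the query once and stop
  containers:
  - name: client
    image: mysql:8.0          # reused only for its mysql command-line client
    command: ["mysql"]
    args:
    - "-h"
    - "mysql-0.mysql"         # <pod-name>.<headless-service-name>
    - "-uroot"
    - "-ppassword"            # demo password from the manifest; use a Secret in practice
    - "-e"
    - "SELECT @@hostname"     # asks the server which host answered
```

Pointing the -h argument at mysql-1.mysql instead should reach the second replica, which is the property clustered applications rely on when they address their peers.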


To use a Kubernetes Secret (like mysql-secret) instead of hardcoding passwords, we need to create a Secret object and then reference it in our StatefulSet. This is the standard practice for distributing credentials securely in Kubernetes.

1. Create the Secret

We can define our password in a YAML file. Note that values in the data field must be base64 encoded.

apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
data:
  # 'password' encoded in base64 is 'cGFzc3dvcmQ='
  root-password: cGFzc3dvcmQ=


Alternatively, we can use stringData to provide the password in plain text; Kubernetes will handle the encoding for us when we apply it:

apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
stringData:
  root-password: "our-secure-password"

2. Update the StatefulSet

Modify the env section of our MySQL container to use valueFrom and secretKeyRef. This tells the Pod to pull the value of MYSQL_ROOT_PASSWORD from the secret we just created.

      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret   # Name of our Secret object
              key: root-password   # The specific key inside the Secret


Key Considerations

  • Initialization Only: For MySQL, the MYSQL_ROOT_PASSWORD environment variable is typically only used during the first-time initialization of the data directory. Changing the Secret later will not automatically update the root password in an existing database.
  • Security: Ensure our cluster has encryption at rest enabled for Secrets to truly protect sensitive data.
  • Alternative for Multiple Variables: If we have many credentials (user, password, DB name), we can use envFrom to map every key in a Secret to an environment variable of the same name at once.
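
The envFrom approach mentioned above might look like the following sketch. The Secret name mysql-credentials and the database and user values are hypothetical. The key names are environment variables that the official MySQL image reads during first-time initialization.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-credentials      # hypothetical Secret name
type: Opaque
stringData:
  # Each key becomes an environment variable, so it must be a valid variable name.
  MYSQL_ROOT_PASSWORD: "our-secure-password"
  MYSQL_DATABASE: "appdb"      # hypothetical database name
  MYSQL_USER: "appuser"        # hypothetical user name
  MYSQL_PASSWORD: "another-secure-password"
---
# Fragment of the StatefulSet Pod template (spec.template.spec), not a complete object.
containers:
- name: mysql
  image: mysql:8.0
  envFrom:
  - secretRef:
      name: mysql-credentials  # imports every key of the Secret as an env variable
```

Compared with secretKeyRef, this keeps the container spec short, at the cost of exposing every key in the Secret to the container.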

Changing Storage Spec


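In most Kubernetes versions, spec.volumeClaimTemplates is immutable on an existing StatefulSet, so an attempt to edit the storage size there is rejected. Even when the StatefulSet is recreated with a larger template, PVCs that already exist are not resized. If the goal is to fix an existing CrashLoopBackOff caused by a full disk, we still need to expand the current PVC(s), which requires a StorageClass with allowVolumeExpansion: true, or recreate the PVCs and the StatefulSet so that the new size takes effect. Recreating a PVC can discard its data, depending on the reclaim policy, so back up first.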
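
As a minimal sketch of expanding an existing claim, assuming the MySQL example from earlier (so the first replica's PVC is named mysql-data-mysql-0) and a StorageClass that permits expansion, a small patch file can raise the requested size. The file name pvc-resize-patch.yaml and the 5Gi target are illustrative.

```yaml
# pvc-resize-patch.yaml (hypothetical file name)
# Raises the storage request of one existing PVC; the size can only grow, not shrink.
spec:
  resources:
    requests:
      storage: 5Gi   # illustrative new size, up from the original 1Gi
```

This could be applied with kubectl patch pvc mysql-data-mysql-0 --patch-file pvc-resize-patch.yaml and repeated for mysql-data-mysql-1 and mysql-data-mysql-2. Whether the filesystem grows while the Pod is running or only after a restart depends on the storage driver.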
