In Kubernetes, a StatefulSet is a specialized workload API object designed to manage stateful applications. Unlike standard Deployments, where Pods are interchangeable "cattle," StatefulSets treat Pods as unique "pets" with a persistent identity that is maintained even if they are rescheduled or restarted.
Key Features
- Stable Network Identity: Each Pod is assigned a unique, ordinal index (e.g., web-0, web-1) and a corresponding stable DNS name through a Headless Service.
- Stable Storage: With volumeClaimTemplates, each Pod gets its own PersistentVolumeClaim, which binds to a dedicated PersistentVolume. If a Pod dies, the replacement Pod with the same identity reattaches to the same claim and therefore sees the same data.
- Ordered Deployment: By default, Pods are created sequentially from 0 to N-1. Kubernetes waits until the previous Pod is Running and Ready before starting the next one.
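- Ordered Termination: Scaling down removes Pods in reverse order, starting from the highest ordinal (e.g., web-2 is removed before web-1). Deleting the StatefulSet itself gives no ordering guarantee, so scale it down to 0 first if Pods must terminate in order.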
When to Use StatefulSets
StatefulSets are the standard choice for applications that require consistent data and unique identities, such as:
- Databases: Systems like MySQL, PostgreSQL, MongoDB, and Cassandra.
- Distributed Systems: Tools like ZooKeeper, Kafka, and Elasticsearch that need a quorum or leader election.
- Clustered Applications: Any software where instances need to know each other’s specific addresses to sync data.
Comparison: StatefulSet vs. Deployment
| Feature      | StatefulSet                                         | Deployment                                          |
|--------------|-----------------------------------------------------|-----------------------------------------------------|
| Pod Identity | Unique and stable (ordinal names)                   | Randomly generated and ephemeral                    |
| Storage      | Dedicated volume per Pod (via volumeClaimTemplates) | Typically shared or ephemeral                       |
| Network      | Stable DNS name per Pod (via Headless Service)      | Usually one Service load-balancing across all Pods  |
| Scaling      | Sequential by default (0, then 1, then 2...)        | Parallel (multiple Pods at once)                    |
Best Practices
- Use Headless Services: Always pair our StatefulSet with a Service that has clusterIP: None to ensure Pods are individually addressable.
- Persistent Storage: Ensure our StorageClass is correctly configured for dynamic provisioning so that each Pod gets its own disk automatically.
- Manual Data Sync: Note that while Kubernetes manages the infrastructure, we are still responsible for configuring internal application logic like data replication or primary/replica sync.
YAML manifest example for a basic MySQL StatefulSet
Below is a standard YAML manifest for a MySQL StatefulSet. It includes a Headless Service for network identity and a volumeClaimTemplate to automatically provision unique storage for each replica.
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - port: 3306
    name: mysql
  clusterIP: None # Defines this as a Headless Service
  selector:
    app: mysql
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: "mysql"
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password" # Use Secrets in production!
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
Why this works:
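- Stable DNS: Each Pod gets a predictable name that resolves within the namespace: mysql-0.mysql, mysql-1.mysql, etc. The fully qualified form is mysql-0.mysql.<namespace>.svc.cluster.local, assuming the default cluster domain.
- Unique Storage: Kubernetes creates three separate PersistentVolumeClaims (mysql-data-mysql-0, mysql-data-mysql-1, mysql-data-mysql-2). mysql-0 always mounts mysql-data-mysql-0, even after it is restarted or rescheduled to another node.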
- Ordered Startup: Pods launch one after another (0, then 1, then 2), which is critical for forming database clusters.
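To illustrate how the stable DNS names can be used, a client workload in the same cluster could point at one specific replica. The fragment below is a minimal sketch, assuming the StatefulSet runs in the default namespace and the cluster uses the default cluster.local domain. The container name, image, and the DB_HOST / DB_PORT variable names are hypothetical placeholders, not part of the manifest above.

```yaml
# Fragment of a client Pod spec (hypothetical application container).
containers:
- name: app                       # hypothetical client container
  image: example.com/app:latest   # placeholder image, not from the manifest above
  env:
  - name: DB_HOST                 # hypothetical variable read by the client
    # Pattern: <pod-name>.<headless-service>.<namespace>.svc.<cluster-domain>
    value: mysql-0.mysql.default.svc.cluster.local
  - name: DB_PORT                 # hypothetical variable read by the client
    value: "3306"
```

Because the name is tied to the ordinal rather than to a Pod IP, the client keeps resolving to the replacement Pod after mysql-0 is rescheduled.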
To use a Kubernetes Secret (like mysql-secret) instead of hardcoding passwords, we need to create a Secret object and then reference it in our StatefulSet. This is the standard practice for distributing credentials securely in Kubernetes.
1. Create the Secret
We can define our password in a YAML file. Note that values in the data field must be base64 encoded.
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
data:
  # 'password' encoded in base64 is 'cGFzc3dvcmQ='
  root-password: cGFzc3dvcmQ=
Alternatively, we can use stringData to provide the password in plain text; Kubernetes will handle the encoding for us when we apply it:
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
stringData:
  root-password: "our-secure-password"
2. Update the StatefulSet
Modify the env section of our MySQL container to use valueFrom and secretKeyRef. This tells the Pod to pull the value of MYSQL_ROOT_PASSWORD from the secret we just created.
containers:
- name: mysql
  image: mysql:8.0
  env:
  - name: MYSQL_ROOT_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysql-secret # Name of our Secret object
        key: root-password # The specific key inside the Secret
Key Considerations
- Initialization Only: For MySQL, the MYSQL_ROOT_PASSWORD environment variable is typically only used during the first-time initialization of the data directory. Changing the Secret later will not automatically update the root password in an existing database.
- Security: Ensure our cluster has encryption at rest enabled for Secrets to truly protect sensitive data.
- Alternative for Multiple Variables: If we have many credentials (user, password, DB name), we can use envFrom to map all keys in a Secret to environment variables at once.
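The envFrom approach mentioned above could look roughly like the sketch below. With envFrom, each key in the Secret becomes the name of an environment variable, so the keys are written here as the variable names the official mysql image reads. The Secret name mysql-env and all values are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-env                 # hypothetical Secret name
type: Opaque
stringData:
  # Each key becomes an environment variable with the same name.
  MYSQL_ROOT_PASSWORD: "our-secure-password"
  MYSQL_DATABASE: "appdb"         # hypothetical database name
  MYSQL_USER: "appuser"           # hypothetical user
  MYSQL_PASSWORD: "another-secure-password"
---
# Fragment of the StatefulSet Pod template (not a complete manifest).
containers:
- name: mysql
  image: mysql:8.0
  envFrom:
  - secretRef:
      name: mysql-env             # imports every key of the Secret as an env var
```

A key such as root-password from the earlier Secret is not a conventional environment variable name, which is why this sketch uses a separate Secret with uppercase keys.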
Changing Storage Spec
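On most Kubernetes versions, spec.volumeClaimTemplates is immutable: the API server rejects edits to it on an existing StatefulSet. Even when a StatefulSet is recreated with a larger template, PVCs that already exist are not resized. If the goal is to fix an existing CrashLoopBackOff caused by a full disk, we still need to expand the current PVC(s) directly (which requires allowVolumeExpansion: true on their StorageClass), or recreate the PVCs and the StatefulSet so the new size takes effect. Deleting a PVC can discard its data depending on the reclaim policy, so back up first.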
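As a rough sketch of the expansion path, assuming the claims were created from the mysql-data template in the manifest above and that the storage provisioner supports expansion, the request on one existing claim could be raised as shown below. The 5Gi size is a hypothetical value, and the StorageClass is shown only as a fragment because its name and provisioner are cluster-specific.

```yaml
# Fragment of the StorageClass backing the claims (other fields omitted).
# Without this flag, requests to grow a PVC are rejected.
allowVolumeExpansion: true
---
# Existing claim for Pod mysql-0; only the storage request changes.
# PVC names follow <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data-mysql-0
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 5Gi                # hypothetical new size; must exceed the current 1Gi
```

The same change would have to be repeated for mysql-data-mysql-1 and mysql-data-mysql-2. Depending on the storage driver, the filesystem may only grow after the Pod using the claim is restarted.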