Installing ClickHouse on an AWS EKS cluster using Terraform and the Altinity Helm charts typically involves two stages:
- Installing the Altinity ClickHouse Operator
- Deploying a ClickHouse Installation (CHI)
The Altinity Helm repository is located at https://helm.altinity.com.
Prerequisites
Ensure your Terraform environment is configured with the following providers:
- aws: To manage EKS and underlying infrastructure.
- kubernetes: To interact with the EKS cluster.
- helm: To install the operator.
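A minimal provider wiring for these three could look like the sketch below. The region and cluster name are placeholders, the data sources assume the EKS cluster already exists, and the helm block uses provider 2.x syntax:

```hcl
terraform {
  required_providers {
    aws        = { source = "hashicorp/aws" }
    kubernetes = { source = "hashicorp/kubernetes" }
    helm       = { source = "hashicorp/helm" }
  }
}

provider "aws" {
  region = "us-east-1" # placeholder
}

# Assumes an existing EKS cluster named "my-eks-cluster"
data "aws_eks_cluster" "this" {
  name = "my-eks-cluster"
}

data "aws_eks_cluster_auth" "this" {
  name = "my-eks-cluster"
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}
```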
Terraform Configuration
The following example uses the helm_release resource to install the operator and the kubernetes_manifest resource to deploy the actual ClickHouse cluster.
Step A: Install the Altinity Operator
The operator is the "brain" that manages ClickHouse instances on Kubernetes.
resource "helm_release" "clickhouse_operator" {
  name             = "clickhouse-operator"
  repository       = "https://helm.altinity.com"
  chart            = "altinity-clickhouse-operator"
  namespace        = "clickhouse-operator"
  create_namespace = true

  # Optional: Enable metrics for Prometheus
  set {
    name  = "metrics.enabled"
    value = "true"
  }
}
Step B: Deploy a ClickHouse Cluster (CHI)
Once the operator is running, you define your ClickHouse cluster as a Custom Resource (a ClickHouseInstallation, defined by the operator's CRD). In Terraform, you use kubernetes_manifest.
resource "kubernetes_manifest" "clickhouse_cluster" {
  depends_on = [helm_release.clickhouse_operator]

  manifest = {
    apiVersion = "clickhouse.altinity.com/v1"
    kind       = "ClickHouseInstallation"
    metadata = {
      name      = "simple-clickhouse"
      namespace = "clickhouse-operator"
    }
    spec = {
      configuration = {
        clusters = [
          {
            name = "cluster1"
            layout = {
              shardsCount   = 1
              replicasCount = 1
            }
          }
        ]
      }
    }
  }
}
Production Considerations for EKS
When running ClickHouse on EKS, you should consider storage and networking:
- Storage Class: Use AWS gp3 volumes for a good balance of price and performance. You can specify a volumeClaimTemplate in your kubernetes_manifest.
- Node Affinity: It is recommended to run ClickHouse on specific node groups (e.g., using i3 or r5 instances) to ensure it doesn't compete with other workloads for IOPS.
- Zookeeper/Keeper: For multi-node shards or replicas, you will need a Zookeeper cluster or the ClickHouse Keeper (also available via Altinity charts).
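As a sketch of the storage recommendation above, a volumeClaimTemplate inside the CHI spec can pin the claim to a gp3 StorageClass. The class name "gp3" and the template name are assumptions — use whatever StorageClass exists in your cluster:

```hcl
# Inside the "spec" map of the kubernetes_manifest shown earlier
templates = {
  volumeClaimTemplates = [
    {
      name = "data-volume"
      spec = {
        storageClassName = "gp3" # assumed StorageClass name
        accessModes      = ["ReadWriteOnce"]
        resources = {
          requests = {
            storage = "100Gi"
          }
        }
      }
    }
  ]
}
```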
EKS Module
Altinity maintains a dedicated Terraform EKS ClickHouse module that automates the entire VPC, EKS, and ClickHouse setup if you prefer a pre-packaged solution.
How to view the ClickHouse Installation Configuration?
% kubectl get chi -n clickhouse -o yaml
apiVersion: v1
items:
- apiVersion: clickhouse.altinity.com/v1
  kind: ClickHouseInstallation
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"clickhouse.altinity.com/v1","kind":"ClickHouseInstallation","metadata":....}
    creationTimestamp: "2025-01-28T14:35:46Z"
    finalizers:
    - finalizer.clickhouseinstallation.altinity.com
    generation: 12
    name: clickhouse
    namespace: clickhouse
    resourceVersion: "67251031"
    uid: 9fxxxx1-81e7-429b-9cf7-ffxxxxxxef
  spec:
    configuration:
      clusters:
      - layout:
          replicasCount: 1
          shardsCount: 1
        name: ch
        templates:
          dataVolumeClaimTemplate: ch-data
          podTemplate: ch-pod
          serviceTemplate: ch-svc
      users:
        admin/grants/query: GRANT ALL ON *.*
        admin/networks/ip: 0.0.0.0/0
        admin/password: my-admin-password
        # (or admin/password_sha256_hex: my-admin-password-in-sha256)
        admin/profile: xxxx
        admin/quota: xxxxx
        admin/settings/enable_http_compression: 1
        default/k8s_secret_password_sha256_hex: <namespace/secretName/key>
        default/profile: default
        default/quota: default
    templates:
      podTemplates:
      - name: ch-pod
        spec:
          containers:
          - image: altinity/clickhouse-server:24.8.14.10544.altinitystable
            name: clickhouse
          - args:
            - server
            env:
            - name: LOG_LEVEL
              value: info
            - name: API_LISTEN
              value: 0.0.0.0:7171
            - name: API_CREATE_INTEGRATION_TABLES
              value: "true"
            - name: REMOTE_STORAGE
              value: s3
            - name: BACKUPS_TO_KEEP_REMOTE
              value: "2"
            - name: S3_BUCKET
              value: my-clickhouse-backups
            - name: S3_REGION
              value: us-east-1
            - name: CLICKHOUSE_HOST
              value: localhost
            - name: CLICKHOUSE_USERNAME
              value: xxxxx
            - name: CLICKHOUSE_PASSWORD
              value: xxxx
            image: altinity/clickhouse-backup:latest
            imagePullPolicy: IfNotPresent
            name: clickhouse-backup
          serviceAccountName: clickhouse-backup
          tolerations:
          - effect: NoSchedule
            key: karpenter/clickhouse
            operator: Exists
      serviceTemplates:
      - metadata:
          annotations:
            service.beta.kubernetes.io/aws-load-balancer-ip-address-type: ipv4
            service.beta.kubernetes.io/aws-load-balancer-name: my-clickhouse-nlb
            service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
            service.beta.kubernetes.io/aws-load-balancer-scheme: internal
            service.beta.kubernetes.io/aws-load-balancer-type: nlb
          name: clickhouse
        name: ch-svc
        spec:
          ports:
          - name: http
            port: 8123
            targetPort: 8123
          - name: native
            port: 9000
            targetPort: 9000
          type: LoadBalancer
      volumeClaimTemplates:
      - name: ch-data
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi
  status:
    chop-commit: 9abcd12
    chop-date: 2025-01-24T08:40:12
    chop-ip: 10.x.x.x
    chop-version: 0.25.5
    clusters: 1
    endpoint: clickhouse-clickhouse.clickhouse.svc.cluster.local
    fqdns:
    - chi-clickhouse-ch-0-0.clickhouse.svc.cluster.local
    hosts: 1
    hostsWithTablesCreated:
    - chi-clickhouse-ch-0-0.clickhouse.svc.cluster.local
    pods:
    - chi-clickhouse-ch-0-0-0
    shards: 1
    status: Completed
    taskID: auto-1xxxxd2-5ba4-4c3a-9daa-baxxxxx850
    taskIDsCompleted:
    - auto-1fxxxxxd2-5ba4-4c3a-9daa-baxxxxxx50
    ...
    - auto-bbxxxx6-31e3-4a4c-b04b-e5xxxxxx91
    taskIDsStarted:
    - auto-31xxxxx37-492f-4109-b515-4axxxxxx6c8
    ...
    - auto-b8xxxxx7-0396-41e0-b5d1-95xxxxd48
kind: List
metadata:
  resourceVersion: ""
The users section shows user configuration in the form USER_NAME/ATTRIBUTE. In the example above there are two users: admin and default.
USER_NAME/password is a plain-text password. This is very convenient for debugging, but usually a security "no-no" for production, especially for the admin or default user!
USER_NAME/password_sha256_hex is a SHA256-hashed password.
USER_NAME/k8s_secret_password_sha256_hex: <namespace/SECRET_NAME/KEY_NAME> indicates that the USER_NAME ClickHouse user is secured using a Kubernetes Secret. In the example above, it maps the default user's password to a specific Secret.
- USER_NAME/k8s_secret_password_sha256_hex: specifies that the password for USER_NAME should be read from a Kubernetes Secret as a SHA256 hex string.
- <namespace/SECRET_NAME/KEY_NAME>: the reference to the Secret itself, structured as namespace/SECRET_NAME/KEY_NAME.
- Purpose: allows secure, GitOps-friendly password management, keeping plain-text passwords out of Kubernetes manifests.
- Implementation: the operator reads the Secret, hashes the password if necessary, and writes the result into /etc/clickhouse-server/users.d/chop-generated-users.xml inside the ClickHouse pod. If you have External Secrets installed, the Secret is likely being pulled from AWS Secrets Manager.
- Alternative: you can also use k8s_secret_env_password_sha256_hex to load the password via an environment variable.
In short, the syntax USER_NAME/k8s_secret_password_sha256_hex is a pointer: it tells the operator to look in a specific Secret to find the password hash for the USER_NAME user.
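To create such a Secret yourself, you can hash a password with sha256sum and store the hex digest. In this sketch, "changeme" is a placeholder password, not a value from the cluster:

```shell
# Hash a placeholder password ("changeme") to a SHA256 hex string
HASH=$(echo -n "changeme" | sha256sum | awk '{print $1}')
echo "$HASH"   # 64 hex characters
```

You could then store it with something like `kubectl create secret generic clickhouse-credentials -n clickhouse --from-literal=password_sha256_hex="$HASH"` and reference it as `default/k8s_secret_password_sha256_hex: clickhouse/clickhouse-credentials/password_sha256_hex` (the secret and key names here are hypothetical).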
To read the secret value (note that for the *_sha256_hex variants this is the password hash, not the plain-text password):
% kubectl get secret <SECRET_NAME> \
-n <NAMESPACE> \
-o jsonpath="{.data.<KEY_NAME>}" \
| base64 -d
NAMESPACE is usually clickhouse.
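The base64 -d step is needed because Kubernetes stores Secret data base64-encoded. A quick local round trip (with an arbitrary sample value) illustrates it:

```shell
# Kubernetes stores Secret values base64-encoded; decoding recovers the raw value
ENCODED=$(echo -n "0a1b2c3d" | base64)
echo "$ENCODED"
echo "$ENCODED" | base64 -d
```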
How to check ClickHouse health?
Since ClickHouse is running in our cluster, the best way to verify it's "working fine" is to move beyond just checking the Pod status and actually query the database engine itself.
Here is a step-by-step approach to verify health, connectivity, and data integrity.
1. The "Internal" Health Check
The quickest way is to execute a command directly inside the pod using the clickhouse-client. This bypasses networking issues and tells you if the engine is responsive. Run this command:
kubectl exec -it chi-clickhouse-ch-0-0 \
-n clickhouse \
-- clickhouse-client --query "SELECT version(), uptime()"
chi-clickhouse-ch-0-0 is the name of the pod; depending on the installation it may also look like chi-clickhouse-ch-0-0-0.
If this returns data, it means ClickHouse is successfully reading from its system tables on the EBS volume.
What to look for: It should return the version string and the number of seconds the server has been up. If this fails, the DB engine itself is hung.
If the default user has a password set, or the default user has been disabled, the output may show an error similar to this:
% kubectl exec -it chi-clickhouse-ch-0-0-0 -n clickhouse -- clickhouse-client --query "SELECT version(), uptime()"
Defaulted container "clickhouse" out of: clickhouse, clickhouse-backup
Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.
If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.
. (AUTHENTICATION_FAILED)
command terminated with exit code 4
Seeing an AUTHENTICATION_FAILED error instead of a Connection Refused error is actually a positive result for this check:
- Networking works: Your kubectl exec reached the pod.
- Process is alive: The ClickHouse server is running and actively rejecting bad logins.
- Storage is mounted: ClickHouse can't check credentials if it can't read its config files from disk.
If we know the ClickHouse credentials, we can perform the health check:
% kubectl exec -it chi-clickhouse-ch-0-0-0 -n clickhouse -- clickhouse-client --user USER --password PASS --query "SELECT version(), uptime(), name FROM system.clusters"
Defaulted container "clickhouse" out of: clickhouse, clickhouse-backup
24.8.14.10544.altinitystable 501389 all-clusters
24.8.14.10544.altinitystable 501389 all-replicated
24.8.14.10544.altinitystable 501389 all-sharded
24.8.14.10544.altinitystable 501389 ch
24.8.14.10544.altinitystable 501389 default
The output above is exactly what we wanted to see. The database is responsive, healthy, and has an uptime of ~5.8 days (501,389 seconds). The version 24.8.14.10544.altinitystable indicates we are on a very recent, stable Altinity build.
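The uptime arithmetic can be checked quickly (501,389 seconds at 86,400 seconds per day):

```shell
# Convert the reported uptime (501389 seconds) to days
awk 'BEGIN { printf "%.1f\n", 501389 / 86400 }'
# → 5.8
```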
2. Check Replication and Disk Health
Since you are using the Altinity Operator, ClickHouse is likely managing data across one or more disks. You want to ensure the system tables report no errors. Run this to check that the disks are mounted and have free space:
% kubectl exec -it chi-clickhouse-ch-0-0-0 \
-n clickhouse \
-- clickhouse-client --user USER --password PASS \
--query "SELECT name, path, formatReadableSize(free_space) AS free, formatReadableSize(total_space) AS total FROM system.disks"
Defaulted container "clickhouse" out of: clickhouse, clickhouse-backup
default /var/lib/clickhouse/ 89.60 GiB 95.80 GiB
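As a sanity check on the numbers above, free space as a percentage of total (a common rule of thumb is to keep it above roughly 10%):

```shell
# Free-space percentage from the system.disks figures above (89.60 of 95.80 GiB free)
awk 'BEGIN { printf "%.0f%%\n", 89.60 / 95.80 * 100 }'
# → 94%
```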
If you have multiple replicas (e.g., a ch-0-1 pod), check for replication lag:
% kubectl exec -it chi-clickhouse-ch-0-0-0 \
-n clickhouse \
-- clickhouse-client --user USER --password PASS \
--query "SELECT type, last_exception, num_tries FROM system.replication_queue WHERE last_exception != ''"
Defaulted container "clickhouse" out of: clickhouse, clickhouse-backup
Result: this should ideally be empty (as above). If you see exceptions here, your nodes aren't syncing correctly.
3. Verify the "Operator" View
The Altinity Operator provides a "Status" field in its Custom Resource that summarizes the health of the entire installation.
% kubectl get chi -n clickhouse
NAME         CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE
clickhouse   1          1       Completed                     123d
What to look for: the STATUS column should say Completed. If it says InProgress or Error, the Operator is struggling to configure the cluster.
4. Check the Backup (Safety Net)
Since you saw clickhouse-backup pods earlier, verify that the last backup actually succeeded. This is your "point of no return" check before the upgrade.
kubectl logs -n clickhouse -l job-name=clickhouse-backup-cron-<TIMESTAMP>
(Replace <TIMESTAMP> with one of the strings from your previous get all output, e.g., 29543400).
Look for: Done, Success, or Upload finished.
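A small helper that scans log text on stdin for those success markers; the marker strings are taken from the list above, so adapt them to your clickhouse-backup version:

```shell
# Return OK if the log stream contains a success marker, FAILED otherwise
check_backup_log() {
  if grep -Eq "Done|Success|Upload finished"; then
    echo "OK"
  else
    echo "FAILED"
  fi
}

# Example on a sample log line:
echo "2025/01/28 14:35:46 Upload finished" | check_backup_log
# → OK
```

In practice you would pipe the real logs through it, e.g. `kubectl logs -n clickhouse -l job-name=clickhouse-backup-cron-<TIMESTAMP> | check_backup_log`.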
5. Check the Status of All Replicas
To be absolutely sure the cluster is "Green" before you start the EKS upgrade, run this to check the status of all replicas in the cluster:
% kubectl exec -it chi-clickhouse-ch-0-0-0 -n clickhouse -- clickhouse-client --user USER --password PASS --query "SELECT replica_path, is_leader, is_readonly, future_parts FROM system.replicas"
is_readonly: Should be 0. If it's 1, the node can't write data (usually a Zookeeper issue).
is_leader: One of your replicas should be 1.
Summary Checklist
Test       Command            Good Result
Ping       SELECT 1           1
Uptime     SELECT uptime()    > 0
Storage    system.disks       Free space > 10%
Operator   kubectl get chi    Completed
