To install an Elasticsearch cluster on an existing k8s cluster, we can use the ECK Helm charts from the "https://helm.elastic.co" repository. These charts need to be installed in a particular order (a sketch of the install sequence follows the prerequisites below):
- eck-operator-crds
- eck-operator
- eck-elasticsearch
- eck-kibana
- eck-fleet-server
- eck-agent
- eck-apm-server
Prerequisites
- AWS EKS cluster with addons
- User (or role) with the associated access policy arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy
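Once the prerequisites are satisfied, the install sequence might look like this (a minimal sketch; the elastic-system namespace, the release names, and the use of a values file per chart are assumptions, while the chart names are the ones listed above):
# Add the Elastic Helm repository (one-time step)
helm repo add elastic https://helm.elastic.co
helm repo update
# Install the charts in the order listed above, into a shared namespace
# (each install would normally also pass -f <values-file> with our configuration)
helm install elastic-operator-crds elastic/eck-operator-crds -n elastic-system --create-namespace
helm install elastic-operator elastic/eck-operator -n elastic-system
helm install elasticsearch elastic/eck-elasticsearch -n elastic-system
helm install kibana elastic/eck-kibana -n elastic-system
helm install fleet-server elastic/eck-fleet-server -n elastic-system
helm install agent elastic/eck-agent -n elastic-system
helm install apm-server elastic/eck-apm-server -n elastic-system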
eck-operator-crds
ECK relies on a set of Custom Resource Definitions (CRDs) to define how applications are deployed. CRDs are global (not namespace-specific) resources, shared across the entire Kubernetes cluster, so installing them requires specific permissions (e.g. in AWS EKS, the installer might need a role with the associated policy arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy).
This chart installs Elastic Custom Resource Definitions (CRDs).
To list all CRDs installed by the "eck-operator-crds" Helm chart:
% kubectl get crd | grep elastic
agents.agent.k8s.elastic.co                            2025-05-15T11:37:50Z
apmservers.apm.k8s.elastic.co                          2025-05-15T11:37:50Z
beats.beat.k8s.elastic.co                              2025-05-15T11:37:50Z
elasticmapsservers.maps.k8s.elastic.co                 2025-05-15T11:37:50Z
elasticsearchautoscalers.autoscaling.k8s.elastic.co    2025-05-15T11:37:50Z
elasticsearches.elasticsearch.k8s.elastic.co           2025-05-15T11:37:50Z
enterprisesearches.enterprisesearch.k8s.elastic.co     2025-05-15T11:37:50Z
kibanas.kibana.k8s.elastic.co                          2025-05-15T11:37:50Z
logstashes.logstash.k8s.elastic.co                     2025-05-15T11:37:50Z
stackconfigpolicies.stackconfigpolicy.k8s.elastic.co   2025-05-15T11:37:50Z
Note that this chart only installs CRDs, not resources of these types! If we deploy only resources of types Agent, APM Server, Elasticsearch and Kibana, the command which lists all resources of Elastic CRD types would return something like this:
% kubectl get elastic -n elastic-system
NAME                                         HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
agent.agent.k8s.elastic.co/my-fleet-agent    green    6           6          9.0.1     28h
agent.agent.k8s.elastic.co/my-fleet-server   green    1           1          9.0.1     2d17h

NAME                                         HEALTH   NODES   VERSION   AGE
apmserver.apm.k8s.elastic.co/my-apm-server   green    1       9.0.1     7h39m

NAME                                                          HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch.elasticsearch.k8s.elastic.co/my-elasticsearch   green    9       9.0.1     Ready   10d

NAME                                     HEALTH   NODES   VERSION   AGE
kibana.kibana.k8s.elastic.co/my-kibana   green    1       9.0.1     10d
eck-operator
Installs the ECK Operator, the official Kubernetes operator from Elastic that helps us deploy and manage (orchestrate) Elastic applications on Kubernetes (a quick way to verify that the operator is running is sketched after this list), including:
- Elasticsearch
- Kibana
- APM Server
- Enterprise Search
- Beats
- Elastic Agent
- Elastic Maps Server
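To verify that the operator is running (a sketch assuming the default installation, where it runs as a StatefulSet named elastic-operator, labelled control-plane=elastic-operator, in the elastic-system namespace):
% kubectl get pods -n elastic-system -l control-plane=elastic-operator
% kubectl logs -n elastic-system statefulset/elastic-operator --tail=20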
eck-elasticsearch
This chart creates an Elasticsearch cluster.
In an Elasticsearch cluster, all Elasticsearch nodes expose the REST API. These nodes are the fundamental building blocks of the cluster and handle data storage, indexing, and search operations. The REST API is exposed on a specific port (usually 9200) and provides a standardized way to interact with the cluster: requests are sent using standard HTTP methods (GET, POST, PUT, DELETE). Through it we can perform searches, retrieve and index documents, add or remove nodes, manage indices, and configure various cluster settings. Because it is plain HTTP, the REST API is accessible to any client that can make HTTP requests, which makes it easy to integrate with various tools, applications, and programming languages.
Elasticsearch clusters have a designated master node that is responsible for managing the cluster state and performing certain critical operations. However, it's not a single, dedicated master node in the traditional sense.
- Master-eligible nodes: All nodes in an Elasticsearch cluster are master-eligible by default, meaning any of them can potentially become the master node.
- Elected master: Only one node is actively the master node at any given time. This node is elected from the master-eligible nodes using a distributed consensus algorithm.
- Role of the master: The master node manages the cluster state, which includes:
  - Creating or deleting indexes.
  - Tracking which nodes are part of the cluster.
  - Allocating shards to different nodes.
  - Updating and propagating the cluster state across the cluster.
- Master node elections: If the current master node fails or becomes unavailable, a new master node is automatically elected from the remaining master-eligible nodes, ensuring that the cluster can continue operating.
- High availability: Using dedicated master nodes and a sufficient number of master-eligible nodes ensures high availability, even if one or more nodes fail.
In essence, while all nodes can potentially be the master, only one is actively managing the cluster at any given time. This ensures that the cluster remains stable and functional, even if the master node fails, as a new one will be elected quickly.
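In the eck-elasticsearch chart, the cluster topology is described through nodeSets. Below is a minimal sketch of a topology with dedicated master nodes and separate data nodes; the nodeSet names, counts, storage size, and storage class are assumptions, and the values are expected to be passed through to the Elasticsearch resource spec:
nodeSets:
  - name: masters
    count: 3
    config:
      # dedicated master-eligible nodes, no data role
      node.roles: ["master"]
  - name: data
    count: 6
    config:
      # data and ingest nodes, not master-eligible
      node.roles: ["data", "ingest"]
    volumeClaimTemplates:
      - metadata:
          name: elasticsearch-data
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: gp3
          resources:
            requests:
              storage: 100Gi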
---
When the Elasticsearch resource is created, a default user named elastic is created automatically, and is assigned the superuser role.
Its password can be retrieved in a Kubernetes secret, whose name is based on the Elasticsearch resource name: <elasticsearch-name>-es-elastic-user.
This user is used for accessing Elasticsearch down the line, e.g. when using the Elasticsearch Terraform provider. Credentials can be stored in AWS Secrets Manager.
---
When this chart is deployed, ECK Operator automatically creates a Kubernetes service object:
- By default, its name follows this format: <elasticsearch_cluster_name>-es-http
- Its roles are:
- Primary access point: It acts as the main endpoint for clients (such as applications, users, or other services) to interact with the Elasticsearch cluster using the REST API.
- Handles authentication and TLS: The service is secured by default with TLS and basic authentication, managed by the ECK operator.
- Traffic distribution: It routes incoming HTTP (REST API) traffic to all Elasticsearch nodes in our cluster, unless we create custom services for more granular routing (for example, to target only data or ingest nodes).
- Its type is ClusterIP, meaning it is accessible only within the Kubernetes cluster (from other pods or nodes that are part of the same cluster) unless otherwise configured
- The service listens on port 9200 (the default Elasticsearch HTTP port) and load-balances requests to the Elasticsearch pods
How to access this service?
Within the Kubernetes cluster we need to use the service DNS name:
https://<elasticsearch_cluster_name>-es-http.<namespace>:9200
For example, if our cluster is named elasticsearch and in the elastic-system namespace:
https://elasticsearch-es-http.elastic-system:9200
We can use the elastic user mentioned above, which ECK created automatically. We must also provide:
- The CA certificate (to trust the service’s TLS certificate)
- The elastic user password (stored in a Kubernetes secret)
Example:
# Name of the Elasticsearch resource and its namespace
NAME=elasticsearch
NAMESPACE=elastic-system

# Extract the certificate so curl can trust the service's TLS certificate
kubectl get secret "$NAME-es-http-certs-public" \
  -n "$NAMESPACE" \
  -o go-template='{{index .data "tls.crt" | base64decode }}' \
  > tls.crt

# Read the elastic user's password from its secret
PW=$(kubectl get secret "$NAME-es-elastic-user" \
  -n "$NAMESPACE" \
  -o go-template='{{.data.elastic | base64decode }}')

# Call the REST API through the service DNS name
curl \
  --cacert tls.crt \
  -u "elastic:$PW" \
  "https://$NAME-es-http.$NAMESPACE:9200/"
To access it from outside the cluster we need to change the service type to LoadBalancer or use an Ingress to expose it externally.
When using LoadBalancer, the service will get an external IP address, and we can access it via https://<external-ip>:9200.
We need to be sure to secure access appropriately, as this exposes our Elasticsearch cluster to external networks.
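A sketch of doing this through the eck-elasticsearch chart values, assuming the http block is passed through to the Elasticsearch spec (the AWS-specific annotation is an assumption and depends on the load balancer controller in use):
http:
  service:
    metadata:
      annotations:
        # e.g. request an internal load balancer from the AWS Load Balancer Controller (assumption)
        service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    spec:
      type: LoadBalancer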
---
Once Elasticsearch is created, we can create an Elastic Serverless Forwarder.
eck-kibana
Installs Kibana.
Creates Kubernetes service object. Its name follows this format: <kibana_name>-kb-http.
To expose this service externally, we can create an Ingress.
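A minimal Ingress sketch for Kibana (the host name, ingress class, and service name are assumptions; since the -kb-http service serves HTTPS by default, the backend-protocol annotation shown is specific to the NGINX ingress controller):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
  namespace: elastic-system
  annotations:
    # the backend Kibana service speaks TLS by default
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
    - host: kibana.mycorp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-kibana-kb-http
                port:
                  number: 5601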
Once Kibana is installed, we can create:
- snapshot repositories. They can be S3-backed (see the sketch after this list).
- Kibana users. Each user is assigned a set of predefined Kibana roles.
- Index lifecycle rules
- Component templates (which reference those lifecycle rules)
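For example, the S3-backed snapshot repository mentioned above can be registered through the Elasticsearch snapshot API. A sketch, reusing tls.crt, $PW, $NAME and $NAMESPACE from the curl example earlier; the repository and bucket names are assumptions, and the Elasticsearch nodes are assumed to already have access to the bucket (e.g. via an IAM role):
curl \
  --cacert tls.crt \
  -u "elastic:$PW" \
  -X PUT "https://$NAME-es-http.$NAMESPACE:9200/_snapshot/my-s3-repo" \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "s3",
    "settings": {
      "bucket": "my-elasticsearch-snapshots"
    }
  }'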
eck-fleet-server
Creates a Kubernetes service object. Its name follows this pattern: <fleet_server_name>-agent-http.
To expose this service externally, we can create a Fleet Server Ingress.
After Fleet Server is created, we can install (a sketch of creating an agent policy via the Fleet API follows this list):
- Fleet Integrations, like:
  - system
  - fleet_server
  - elastic_agent
  - kibana
  - elasticsearch
  - kubernetes
  - apm
  - aws
- Fleet Agent Policies
  - Fleet Server can have its own policy
    - e.g. sys_monitoring can be disabled while monitor_logs and monitor_metrics are enabled
  - Fleet Agents have their own policy
    - e.g. all monitoring types enabled
- Fleet Integration Policies
  - They define:
    - which agent policy should be associated with which integration
    - inputs
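Agent policies like these can be created through Kibana's Fleet API as well as through the UI. A sketch (the Kibana URL and policy name are assumptions; the elastic user credentials from earlier are reused, and the kbn-xsrf header is required by the Kibana API):
curl \
  -u "elastic:$PW" \
  -X POST "https://kibana.mycorp.com/api/fleet/agent_policies" \
  -H 'Content-Type: application/json' \
  -H 'kbn-xsrf: true' \
  -d '{
    "name": "eck-agent",
    "namespace": "default",
    "monitoring_enabled": ["logs", "metrics"]
  }'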
eck-agent
This Helm chart deploys:
- Elastic Agent Custom Resource
- The chart creates one or more Elastic Agent custom resources (Agent), which are Kubernetes objects managed by the ECK operator.
- These resources define how Elastic Agents are deployed, configured, and connected to your Elasticsearch and Kibana instances.
- Associated Kubernetes Resources
- The Agent custom resource triggers the ECK operator to create the necessary Kubernetes resources, such as:
- Pods/DaemonSets: Runs the Elastic Agent containers on your nodes. Elastic Agents are typically deployed as pods (usually via a DaemonSet or Deployment) in a namespace such as kube-system or elastic-agent. These pods execute the Elastic Agent binary, which collects data and communicates with the Fleet Server.
- ConfigMaps/Secrets: Stores configuration and credentials for the agents.
- ServiceAccounts, Roles, RoleBindings: Manages permissions for the agents to interact with the Kubernetes API if needed.
- Fleet Integration (Optional)
- The chart can configure Elastic Agents to enroll with Elastic Fleet, allowing for centralized management of agent policies and integrations.
We can specify the mode of the Agent to use; it should only be set to "fleet" when Fleet Server is enabled. The default value is:
mode: "fleet"
Both `mode: fleet` and `fleetServerEnabled: true` need to be set for Fleet Server to be enabled. By default, the Agent does NOT act as the Fleet Server:
fleetServerEnabled: false
We need to provide a DaemonSet, StatefulSet, or Deployment specification for Agent.
Some Elastic Agent features, such as the Kubernetes integration, require that Agent Pods interact with Kubernetes APIs. This functionality requires specific permissions, which is why Elastic Agent runs under its own service account. Its default value is elastic-agent:
serviceAccount:
name: elastic-agent
We also need to provide a policyID, which determines the Agent Policy this Agent will be enrolled in. The default value is:
policyID: eck-agent
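Putting these pieces together, a minimal sketch of eck-agent chart values for a fleet-managed Agent deployed as a DaemonSet; the resource requests and the referenced resource names are assumptions, and the fleetServerRef/kibanaRef keys follow the Agent custom resource conventions (check the chart's values.yaml for the exact key names):
mode: "fleet"
fleetServerEnabled: false
policyID: eck-agent
serviceAccount:
  name: elastic-agent
# references to the Fleet Server and Kibana resources (names are assumptions)
fleetServerRef:
  name: my-fleet-server
kibanaRef:
  name: my-kibana
# run one Agent pod per node
daemonSet:
  podTemplate:
    spec:
      containers:
        - name: agent
          resources:
            requests:
              cpu: 200m
              memory: 512Mi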
---
Typical Use Cases
- Observability: Collect logs, metrics, and traces from Kubernetes workloads and nodes.
- Security: Use Elastic Agent for security monitoring and data shipping.
- Fleet Management: Centrally manage agent configurations using Elastic Fleet.
Example: What we might see deployed
- An Agent resource in your chosen namespace (e.g., elastic-agent).
- One or more Elastic Agent pods (as a DaemonSet or Deployment, depending on configuration).
- Supporting Kubernetes resources for configuration and permissions.
How to Verify
After installing the chart (e.g., with helm install elastic-agent elastic/eck-agent -n elastic-agent --create-namespace), we can check:
kubectl get agent -n elastic-agent
kubectl get pods -n elastic-agent
If we installed the Helm chart in the elastic-system namespace, we might see output like this:
% kubectl get agent -n elastic-system
NAME              HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
my-agent          green    6           6          9.0.1     47h
my-fleet-server   green    1           1          9.0.1     3d12h
To find all agent pods and their labels:
% kubectl get pods --show-labels -n elastic-system | grep agent
The command above lists all pods with "agent" in their name and/or labels, together with their labels, which include:
- common.k8s.elastic.co/type=agent <-- shows that the pod is running an Agent (of the Elastic CRD type)
- agent.k8s.elastic.co/name=<helm_installation_name>-eck-agent
- agent.k8s.elastic.co/version=<the version set in the version attribute in the chart values>
Now that we know the pods' labels, we can target only the ones we are interested in. For example, to check the status of all Elastic Agent pods:
% kubectl get pods -l common.k8s.elastic.co/type=agent -n elastic-system -o wide
The Elastic Agent pods running in your Kubernetes cluster and the agents listed in Kibana’s Fleet UI are directly related (each pod running correctly in the cluster represents a healthy Agent in Kibana), but they represent different layers of abstraction (Kubernetes vs the Fleet layer). This applies to both the Elastic Fleet Server Agent and regular Elastic Agents.
Agents in Kibana UI (Fleet Layer)
Agents registered with Fleet Server and visible in Kibana’s Fleet > Agents UI. These represent logical agents managed by Fleet, regardless of their deployment method (Kubernetes, VMs, etc.).
Statuses:
- Online: Agent is actively communicating with Fleet Server.
- Offline: Agent has not checked in with Fleet Server recently (default: 2 minutes).
TODO: Why Some Agents Appear Offline in Kibana?
In Kibana UI we can see that each agent has its associated Agent Policy. Elastic Agent policies are central configurations that define what data Elastic Agents should collect, how they should collect it, and where to send it. Each Elastic Agent can only be enrolled in a single policy, which contains a set of integration policies specifying the configuration for each type of data input (such as logs, metrics, security events, etc.)
Elastic Agent policy revisions are version numbers that track changes made to an Elastic Agent policy over time. Each time we update a policy—such as adding or removing integrations, changing configuration settings, or modifying outputs—the policy’s revision number is incremented. This revision number is included in the policy sent to each enrolled Elastic Agent, allowing both Fleet and the agents themselves to know which version of the policy is currently applied.
Who manages policy revisions?
- Fleet (in Kibana) manages policy revisions automatically. When you make any change to an agent policy through the Fleet UI or API, Fleet increments the revision number and distributes the updated policy to all agents enrolled in that policy.
- Users do not manually set or manage the revision number; it is handled by the Fleet management system.
Purpose and Benefits
- Change tracking: The revision number helps track when and how a policy has changed. Each agent reports which policy revision it is using, making it easy to see if agents are up to date.
- Troubleshooting: If agents are not behaving as expected, the revision number can help correlate issues with recent policy changes.
- Auditability: While the revision number itself does not provide a full change history, it signals that a change has occurred. (Note: The Fleet UI does not currently provide a detailed revision history with user attribution)
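Outside the UI, the current revision of a policy can also be read from the Fleet API. A sketch (the Kibana URL and policy ID are assumptions; jq is only used to pull out the revision field):
% curl -s -u "elastic:$PW" "https://kibana.mycorp.com/api/fleet/agent_policies/<policy_id>" | jq '.item.revision'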
Rolling out new policy versions
When we change an Elastic Agent policy (which increments its revision number), the updated policy is rolled out immediately to all agents enrolled in that policy. Fleet distributes the new policy as soon as we save our changes, and agents will attempt to fetch and apply the updated configuration right away.
Immediate distribution:
Any changes to a policy or its integrations are immediately sent to all enrolled agents.
Agent update timing:
Agents regularly check in with Fleet Server. Upon their next check-in, they detect the new policy revision and apply it, usually within a few seconds to a couple of minutes.
Status indication:
While the policy is being applied, the agent status in the Fleet UI may briefly show as "Updating" before returning to "Healthy" once the new policy is active.
No manual intervention required:
We do not need to manually trigger the rollout; it is automatic and managed by Fleet.
Policy changes are automatically and quickly rolled out to all affected Elastic Agents, ensuring configurations stay consistent and up to date across your infrastructure.
eck-apm-server
Creates a Kubernetes service whose name follows this format: <apm_server_name>-apm-http
To expose this service externally, we can create an APM Server Ingress.
...
How to test ECK cluster health?
If we've created a Route53 record that resolves elasticsearch.mycorp.com to the Elasticsearch Load Balancer DNS name, we can use:
% curl -u 'user:pass' -X GET 'https://elasticsearch.mycorp.com:443/_cluster/health?pretty'
{
"cluster_name" : "elasticsearch-prod",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 9,
"number_of_data_nodes" : 9,
"active_primary_shards" : 106,
"active_shards" : 212,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"unassigned_primary_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
---