Monday, 30 June 2025

Introduction to Amazon API Gateway

Amazon API Gateway:

fully managed service to create, publish, maintain, monitor, and secure APIs at any scale

APIs act as the "front door" for applications to access data, business logic, or functionality from our backend services

allows creating:

RESTful APIs

optimized for serverless workloads and HTTP backends using HTTP APIs

they act as triggers for Lambda functions

HTTP APIs are the best choice for building APIs that only require API proxy functionality
Use REST APIs if we need both of the following in a single solution:

API proxy functionality
API management features

WebSocket APIs that enable real-time two-way communication applications

supports:

containerized workloads
serverless workloads
web applications

handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including:

traffic management
CORS support
authorization and access control
throttling
monitoring
API version management

has no minimum fees or startup costs. We pay for the API calls we receive and the amount of data transferred out. With the API Gateway tiered pricing model, we can reduce our cost as our API usage scales.

RESTful APIs

What is the difference between REST API endpoints (apiGateway) and HTTP API endpoints (httpApi)?

The difference between REST API endpoints (apiGateway) and HTTP API endpoints (httpApi) in Amazon API Gateway primarily comes down to features, performance, cost, and use cases.

REST API endpoints (apiGateway):

Older, feature-rich, supports API keys, usage plans, request/response validation, custom authorizers, and more.
More configuration options, but higher latency and cost.
Defined under the provider.apiGateway section and function events: http.

HTTP API endpoints (httpApi):

Newer, simpler, faster, and cheaper.
Supports JWT/Lambda authorizers, CORS, and OIDC, but lacks some advanced REST API features.
Defined under provider.httpApi and function events: httpApi.

GitHub Workflows and AWS

A GitHub workflow can communicate with our AWS resources either directly (via AWS CLI commands) or indirectly (e.g. via the Terraform AWS provider).

Before running AWS CLI commands, deploying AWS infrastructure with Terraform, or interacting with AWS services in any other way, we need to include a step which configures AWS credentials. It ensures that the workflow runner is authenticated with AWS and knows which region to target.

This step should use the configure-aws-credentials action provided by AWS. This action sets up the necessary environment variables so that AWS CLI commands and SDKs can authenticate with AWS services.

The aws-region input sets the default AWS region; in the examples below it is us-east-2 (Ohio). All AWS commands run in later steps will use this region unless overridden.

We can use either IAM user or OIDC (temp token) authentication.

IAM User Authentication

If using IAM user authentication, we can store the user's credentials in dedicated GitHub secrets:

env:
  AWS_ACCOUNT_ID: ${{ secrets.AWS_ACCOUNT_ID }}
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  AWS_REGION: us-east-2

# Define this step before steps which are accessing AWS:
- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v2
  with:
    aws-region: ${{ env.AWS_REGION }}

OpenID Connect (OIDC) Authentication

With this method, the configure-aws-credentials GitHub Action uses GitHub's OpenID Connect (OIDC) support for secure authentication with AWS. It uses the OIDC token provided by GitHub to request temporary AWS credentials from AWS STS, eliminating the need to store long-lived AWS access keys in GitHub Secrets.

Note that we now need to grant the workflow run write permission for id-token:

id-token: write allows the workflow to request OpenID Connect (OIDC) tokens from GitHub. It is required for any workflow that uses OIDC-based authentication, such as assuming an AWS IAM role from GitHub Actions. It enables secure, short-lived authentication to AWS and other cloud providers and reduces the need for static secrets.

env:
  AWS_REGION: us-east-2

permissions:
  id-token: write # aws-actions/configure-aws-credentials (OIDC)

...

- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/github-actions-role
    role-session-name: my-app
    aws-region: ${{ env.AWS_REGION }}

Here's how it works:

GitHub OIDC Provider: GitHub acts as an OIDC provider, issuing signed JWTs (JSON Web Tokens) to workflows that request them.
configure-aws-credentials Action: This action, when invoked in a GitHub Actions workflow, receives the JWT from the OIDC provider.
AWS STS Request: The action then uses the JWT to request temporary security credentials from AWS Security Token Service (STS).
Credential Injection: AWS STS returns temporary credentials (access key ID, secret access key, and session token) which the action injects as environment variables into the workflow's execution environment.
AWS SDKs and CLI: AWS SDKs and the AWS CLI automatically detect and use these environment variables for authenticating with AWS services.

Benefits of using OIDC with configure-aws-credentials:

Enhanced Security: Eliminates the need to store long-lived AWS access keys, reducing the risk of compromise.
Simplified Credential Management: Automatic retrieval and injection of temporary credentials, simplifying workflow setup and maintenance.
Improved Auditing: Provides better traceability of actions performed within AWS, as the identity is linked to the GitHub user or organization.

Before using the action:

Configure an OpenID Connect provider in AWS: We need to establish an OIDC trust relationship between GitHub and our AWS account.
Create an IAM role in AWS: Define the permissions for the role that the configure-aws-credentials action will assume.
Set up the GitHub workflow: Configure the configure-aws-credentials action with the appropriate parameters, such as the AWS region and the IAM role to assume.

In an OpenID Connect (OIDC) authentication scenario, the aws-actions/configure-aws-credentials action creates the following environment variables when assuming a role with temporary credentials: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN. These variables are used by the AWS SDK and CLI to interact with AWS resources.

Here's a breakdown:

AWS_ACCESS_KEY_ID: This environment variable stores the access key ID of the temporary credentials.
AWS_SECRET_ACCESS_KEY: This environment variable stores the secret access key of the temporary credentials.
AWS_SESSION_TOKEN: This environment variable stores the session token associated with the temporary credentials. It must accompany the access key pair on every request made with credentials issued by AWS Security Token Service (STS).

These environment variables are populated by the action after successful authentication with the OIDC provider and assuming the specified IAM role. The action retrieves the temporary credentials from AWS and makes them available to subsequent steps in the workflow.
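A later step can check that these variables exist before calling AWS. The sketch below is a minimal illustration, assuming the step runs a Python script; the helper name missing_aws_credentials is hypothetical and not part of the action.

```python
from typing import List, Mapping

# The three variables the text says the action exports after assuming the role.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def missing_aws_credentials(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

In a workflow step we would pass os.environ to the helper and fail the step if the returned list is not empty.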

Once AWS authentication is done and these environment variables are created, the next steps in the workflow can access our AWS resources, e.g. read secrets from AWS Secrets Manager:

- name: Read secrets from AWS Secrets Manager into environment variables
  uses: aws-actions/aws-secretsmanager-get-secrets@v2
  with:
    secret-ids: |
      my-secret
    parse-json-secrets: true

- name: deploy
  run: |
    echo $AWS_ACCESS_KEY_ID
    echo $AWS_SECRET_ACCESS_KEY
  env:
    MY_KEY: ${{ env.MY_SECRET_MY_KEY }}

This example assumes that in AWS secret my-secret we have a key MY_KEY, set to the secret value we want to fetch and use.
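The MY_SECRET_MY_KEY name in the example follows from the secret name and the JSON key. The helper below approximates that mapping as the example implies it: the two parts are joined with an underscore, upper-cased, and any character other than a letter, digit or underscore is replaced by an underscore. This is an assumption drawn from the example, not the action's actual implementation, and secret_env_name is a hypothetical name.

```python
import re


def secret_env_name(secret_id: str, json_key: str) -> str:
    """Approximate the environment variable name for one JSON key of a secret."""
    raw = f"{secret_id}_{json_key}"
    # Replace anything that is not a letter, digit or underscore, then upper-case.
    return re.sub(r"[^A-Za-z0-9_]", "_", raw).upper()
```

For the secret my-secret and the key MY_KEY this yields MY_SECRET_MY_KEY, the name referenced in the deploy step.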

Friday, 13 June 2025

Introduction to Serverless Framework

Serverless Framework is a tool designed to streamline the development and deployment of serverless applications, including functions and infrastructure, by abstracting away the need to manage servers.

We define desired infrastructure in serverless yaml files and then deploy it by executing:

sls deploy

This command compiles the serverless YAML file into a larger AWS CloudFormation template, which is automatically populated with values from the YAML.

Serverless Yaml Configuration File

A serverless YAML file defines a serverless service. It is a good idea to break up the serverless project into multiple services, each of which is defined by its own serverless YAML file. We don't want to have everything in one big infrastructure stack.

Example:

database, e.g. DynamoDB
REST API, e.g. one which handles the submitted web form and stores data in DynamoDB
front-end website, e.g. a React app stored in an S3 bucket

Services can be deployed in multiple regions. (Multi-region architecture is supported)

serverless.yml example:

service: my-service
frameworkVersion: "3"
useDotenv: true

plugins:
  - serverless-plugin-log-subscription
  - serverless-dotenv-plugin

provider:
  runtime: nodejs14.x
  region: eu-east-1
  memorySize: 512
  timeout: 900
  deploymentBucket:
  vpc:
    securityGroupIds:
      - "sg-0123cf34f6c6354cb"
    subnetIds:
      - "subnet-01a23493f9e755207"
      - "subnet-02b234dbd7d66d33c"
      - "subnet-03c234712e99ae1fb"
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - lambda:InvokeFunction
          Resource: arn:aws:lambda:eu-east-1:123456789099:function:my-database

package:
  patterns:
    - "out/**"
    - "utils.js"
    - "aws-sdk"

functions:
  my-function:
    handler: lambda.handler
    events:
      - schedule:
          name: "my-service-${opt:stage, self:provider.stage}"
          description: "Periodically run my-service lambdas"
          rate: rate(4 hours)
          inputTransformer:
            inputTemplate: '{"Records":[{"EventSource":"aws:rate","EventVersion":"1.0","EventSubscriptionArn":"arn:aws:sns:eu-east-1:{{accountId}}:ExampleTopic","Sns":{"Type":"Notification","MessageId":"95df01b4-1234-5678-9903-4c221d41eb5e","TopicArn":"arn:aws:sns:eu-east-1:123456789012:ExampleTopic","Subject":"example subject","Message":"example message","Timestamp":"1970-01-01T00:00:00.000Z","SignatureVersion":"1","Signature":"EXAMPLE","SigningCertUrl":"EXAMPLE","UnsubscribeUrl":"EXAMPLE","MessageAttributes":{"type":{"Type":"String","Value":"populate_unsyncronised"},"count":{"Type":"Number","Value":"400"}}}}]}'
      - sns:
          arn: arn:aws:sns:us-east-2:123456789099:trigger-my-service
      - http:

custom:
  dotenv:
    dotenvParser: env.loader.js
  logSubscription:
    enabled: true
    destinationArn: ${env:KINESIS_SUBSCRIPTION_STREAM}
    roleArn: ${env:KINESIS_SUBSCRIPTION_ROLE}

service: - name of the service
useDotenv: boolean (true|false)
configValidationMode: error
frameworkVersion: e.g. "3"
provider -

name - provider name e.g. aws
runtime - e.g. nodejs18.x
region e.g. us-east-1
memorySize - the amount of memory allocated to the Lambda function, e.g. 1024 (MB). It is good to check the actual memory usage and adjust this value: downsizing can lower the costs!
timeout: (number) e.g. 60 [seconds] - the maximum time, in seconds, that a function is allowed to run before the platform forcibly terminates it and returns a timeout error. It caps resource usage and prevents runaway executions, which matters most for functions that call external services or perform long-running tasks. For AWS Lambda itself the default is 3 seconds and the maximum is 900 seconds (15 minutes); Serverless Framework applies its own default of 6 seconds when timeout is not set.
httpApi:

apiGateway:

minimumCompressionSize: 1024
shouldStartNameWithService: true
restApiId: ""
restApiRootResourceId: ""

stage: - name of the environment e.g. production;
iamManagedPolicies: a list of ARNs of managed policies to attach to the Lambda execution role, e.g. a policy which allows access to S3 buckets.
lambdaHashingVersion
environment: dictionary of environment variable names and values
vpc

securityGroupIds: list
subnetIds - typically a list of private subnets with NAT gateway.

functions: a dictionary which defines the AWS Lambda functions that are deployed as part of this Serverless service.

<function_name>: string, the name of the function (e.g., my-function). The provisioned Lambda function is named <service_name>-<stage>-<function_name>. Each function entry under functions specifies:

handler - tells Serverless which file and exported function to execute as the Lambda entry point (e.g., lambda.handler)
events - (optional) a list of events that trigger this function, such as:

schedule: for periodic invocation (cron-like jobs)
sns: for invocation via an AWS SNS topic
http

plugins: a list of serverless plugins e.g. serverless-webpack, serverless-esbuild, serverless-offline, serverless-plugin-log-subscription
custom: - section for serverless plugins settings e.g. for esbuild, logSubscription, webpack etc...

example: serverless-plugin-log-subscription plugin has the settings:

logSubscription:

enabled: true
destinationArn: ${env:CLOUDWATCH_LOGS_KINESIS_STREAM}
roleArn: ${env:CLOUDWATCH_LOGS_KINESIS_ROLE}
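The <service_name>-<stage>-<function_name> format described above can be sketched in a few lines, which is handy when a script needs to look up a deployed function. This is a minimal illustration: lambda_function_name is a hypothetical helper, and the length check reflects the 64-character limit AWS places on Lambda function names.

```python
MAX_LAMBDA_NAME_LENGTH = 64  # AWS limit on the length of a Lambda function name


def lambda_function_name(service: str, stage: str, function: str) -> str:
    """Build the deployed name: <service_name>-<stage>-<function_name>."""
    name = f"{service}-{stage}-{function}"
    if len(name) > MAX_LAMBDA_NAME_LENGTH:
        raise ValueError(f"{name!r} exceeds {MAX_LAMBDA_NAME_LENGTH} characters")
    return name
```

For the service my-service, the stage production and the function my-function, the result is my-service-production-my-function.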

Useful Kibana DevTools Queries

To perform a search operation on a specific index:

GET /my_index/_search

By itself (without a request body), it returns the first 10 documents by default. This request is the same as the above one:

GET /my_index/_search
{
  "query": {
    "match_all": {}
  }
}

To get the number of documents in an Elasticsearch index, you can use the _count API or the _stats API.

GET /my_index/_count

This will return a response like:

{
  "count": 12345,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  }
}

To get a certain number of documents, use the size parameter:

GET my_index/_search?size=900

We can also use _cat API:

GET /_cat/count/my_index?v

This will return output like:

epoch      timestamp count
1718012345 10:32:25  12345

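The _stats API returns the document count as part of a larger response:

GET /my_index/_stats

The relevant part of the response:

"indices": {
  "my_index": {
    "primaries": {
      "docs": {
        "count": 12345,
        "deleted": 12
      }
    }
  }
}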

To get the union of all values of some field e.g. channel_type field across all documents in the my_index index, we can use an Elasticsearch terms aggregation:

GET my_index/_search
{
  "size": 0,
  "aggs": {
    "unique_channel_types": {
      "terms": {
        "field": "channel_type.keyword",
        "size": 10000 // increase if you expect many unique values
      }
    }
  }
}

Explanation:

"size": 0: No documents returned, just aggregation results.
"terms": Collects unique values.
"channel_type.keyword": Use .keyword to aggregate on the raw value (not analyzed text).
"size": 10000: Max number of buckets (unique values) to return. Adjust as needed.

Response example:

{
  "aggregations": {
    "unique_channel_types": {
      "buckets": [
        { "key": "email", "doc_count": 456 },
        { "key": "push", "doc_count": 321 },
        { "key": "sms", "doc_count": 123 }
      ]
    }
  }
}

The "key" values in the buckets array are your union of channel_type values.
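When the same query is issued from a script rather than from Dev Tools, the request body and the extraction of the bucket keys can be sketched as below. This is a minimal illustration using plain dictionaries; the helper names terms_aggregation_body and bucket_keys are hypothetical, and sending the request (with any HTTP or Elasticsearch client) is left out.

```python
from typing import Any, Dict, List


def terms_aggregation_body(field: str, agg_name: str, max_buckets: int = 10000) -> Dict[str, Any]:
    """Build a search body that returns only the unique values of a field."""
    return {
        "size": 0,  # no documents, only aggregation results
        "aggs": {agg_name: {"terms": {"field": field, "size": max_buckets}}},
    }


def bucket_keys(response: Dict[str, Any], agg_name: str) -> List[str]:
    """Collect the "key" of every bucket of the named terms aggregation."""
    return [bucket["key"] for bucket in response["aggregations"][agg_name]["buckets"]]
```

Applied to the response example above, bucket_keys(response, "unique_channel_types") returns the list of email, push and sms.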

Let's assume that my_index has a timestamp field which is correctly mapped as a date type. Here it is a root field; if it sits at another path, we need to adjust the query accordingly.

To find the oldest document:

GET my_index/_search
{
  "size": 1,
  "sort": [
    { "timestamp": "asc" }
  ]
}

To find the newest document:

GET my_index/_search
{
  "size": 1,
  "sort": [
    { "timestamp": "desc" }
  ]
}
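The two queries differ only in the sort order, so a script can build both from one helper. The sketch below is a minimal illustration; edge_document_query is a hypothetical name and the field defaults to the timestamp field assumed above.

```python
from typing import Any, Dict


def edge_document_query(field: str = "timestamp", newest: bool = False) -> Dict[str, Any]:
    """Build a search body returning the single oldest (or newest) document."""
    order = "desc" if newest else "asc"
    return {"size": 1, "sort": [{field: order}]}
```

Calling it with newest=True gives the body of the second query; the default gives the first.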

----

Friday, 30 May 2025

Introduction to Elastic Agents

Elastic Agents are unified, lightweight software components developed by Elastic to collect, ship, and (optionally) protect data—including logs, metrics, traces, and security events—from your infrastructure to the Elastic Stack (Elasticsearch, Kibana, etc.)

Elastic Agents are not strictly required components in every Elastic Stack deployment, but they play a crucial role in certain scenarios. Here's an explanation based on use cases:

Key Functions of Elastic Agents (When Elastic Agents Are Required?)

Unified Data Collection:

They provide a single, centralized solution to collect various types of observability and security data from hosts, containers, and Kubernetes clusters (logs, metrics, traces, and security data)
They replace individual Beats (e.g., Filebeat, Metricbeat) for streamlined data ingestion.

Kubernetes Monitoring:

When deployed on Kubernetes (often as a DaemonSet), Elastic Agent runs on every node, collecting:

System metrics (CPU, memory, disk, etc.)
Kubernetes resource metrics (pods, nodes, deployments)
Logs from nodes and containers
Security posture and events

Fleet Management:
- Elastic Agents can be centrally managed using Elastic Fleet, allowing you to configure, update, and monitor all agents and their integrations from a single Kibana interface
- Elastic Agents are required when using Fleet, the centralized management interface in Kibana.
- Fleet allows you to:

Endpoint Security:

Elastic Agents are necessary for endpoint security features such as malware detection, endpoint protection, threat monitoring, host intrusion detection, and Kubernetes Security Posture Management (KSPM).

When Elastic Agents Are Not Required:

Traditional Beats Usage:
- If you are already using specific Beats (e.g., Filebeat, Metricbeat, Heartbeat) for data collection and do not need unified management, Elastic Agents are optional.
- Beats can ship data directly to Elasticsearch or Logstash without requiring Fleet or Elastic Agents.

Direct Data Ingestion:
- If you are ingesting data directly into Elasticsearch via APIs, custom applications, or third-party tools, Elastic Agents are not needed.

Standalone Elastic Stack:
- For use cases focused purely on search, analytics, or visualization where data is ingested manually or through custom integrations, Elastic Agents are unnecessary.

Key Considerations:

Unified Management: Elastic Agents with Fleet simplify large-scale deployments and are recommended for environments with many data sources.
Compatibility: Elastic is gradually consolidating data collection around Elastic Agents, so they are the future-proof choice for managing observability and security data.
Flexibility: You can still mix and match Elastic Agents and Beats, depending on your requirements.

How Elastic Agents Work in Kubernetes

Deployment

Typically deployed as a DaemonSet so that each Kubernetes node runs an agent instance, ensuring complete coverage for data collection.

Leader Election

One agent may be elected as a leader to handle cluster-wide metrics (like Kubernetes events), while others focus on node-specific data.

Data Flow

Data collected by agents is shipped to Elasticsearch for storage and analysis, and visualized in Kibana

---

In summary, Elastic Agents are not mandatory for all Elastic Stack setups, but they are highly beneficial for unified data collection, centralized management, and security monitoring.

How to deploy Elastic stack via Elastic Cloud on Kubernetes (ECK)

Elastic Cloud on Kubernetes (ECK):

Kubernetes operator
Automates the deployment, provisioning, management, and orchestration of Elastic applications on Kubernetes, including:

Elasticsearch
Kibana
APM Server
Beats
Elastic Agent
Elastic Maps Server
Logstash

elastic/cloud-on-k8s: Elastic Cloud on Kubernetes

OperatorHub.io | The registry for Kubernetes Operators

To install an Elasticsearch cluster on an existing Kubernetes cluster, we can use the ECK Helm charts from the "https://helm.elastic.co" repository. These charts need to be installed in a particular order:

eck-operator-crds
eck-operator
eck-elasticsearch
eck-kibana
eck-fleet-server
eck-agent
eck-apm-server
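To keep the order explicit in automation, the chart list can drive the install commands. The sketch below only builds the command strings. It assumes the repository was added under the alias elastic, that each release is named after its chart, and that everything goes into one namespace; real installs normally also pass chart values, which are omitted here. The helper name helm_install_commands is hypothetical.

```python
from typing import List

# Charts in the order given above; later charts depend on earlier ones.
CHARTS_IN_ORDER = [
    "eck-operator-crds",
    "eck-operator",
    "eck-elasticsearch",
    "eck-kibana",
    "eck-fleet-server",
    "eck-agent",
    "eck-apm-server",
]


def helm_install_commands(namespace: str = "elastic-system", repo_alias: str = "elastic") -> List[str]:
    """Return one helm install command per chart, preserving the required order."""
    return [
        f"helm install {chart} {repo_alias}/{chart} -n {namespace}"
        for chart in CHARTS_IN_ORDER
    ]
```

Printing the returned list gives a script that can be reviewed before it is run.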

Install ECK | Elastic Docs

Prerequisites

AWS EKS cluster with addons
User with associated arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy

eck-operator-crds

eck-operator-crds 3.0.0 · elastic/elastic

ECK relies on a set of Custom Resource Definitions (CRDs) to define how applications are deployed. CRDs are global (not namespace-specific) resources, shared across the entire Kubernetes cluster, so installing them requires specific permissions (e.g. in AWS EKS, the installer might have role with this associated policy: arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy).

This chart installs Elastic Custom Resource Definitions (CRDs). When we install ECK (Elastic's operator for managing Elasticsearch, Kibana, Beats, APM, etc. on Kubernetes), it defines several CRDs. The main kinds include (with their CRD Names and Descriptions):

Elasticsearch

elasticsearches.elasticsearch.k8s.elastic.co
Manages Elasticsearch clusters

Kibana

kibanas.kibana.k8s.elastic.co
Manages Kibana instances

ApmServer

apmservers.apm.k8s.elastic.co
Manages APM Servers

Beat

beats.beat.k8s.elastic.co
Manages Beats (Filebeat, Metricbeat, etc.) agents

EnterpriseSearch

enterprisesearches.enterprisesearch.k8s.elastic.co
Manages Enterprise Search instances

To list all CRDs installed by the "eck-operator-crds" Helm chart:

% kubectl get crd | grep elastic
agents.agent.k8s.elastic.co                           2025-05-15T11:37:50Z
apmservers.apm.k8s.elastic.co                         2025-05-15T11:37:50Z
beats.beat.k8s.elastic.co                             2025-05-15T11:37:50Z
elasticmapsservers.maps.k8s.elastic.co                2025-05-15T11:37:50Z
elasticsearchautoscalers.autoscaling.k8s.elastic.co   2025-05-15T11:37:50Z
elasticsearches.elasticsearch.k8s.elastic.co          2025-05-15T11:37:50Z
enterprisesearches.enterprisesearch.k8s.elastic.co    2025-05-15T11:37:50Z
kibanas.kibana.k8s.elastic.co                         2025-05-15T11:37:50Z
logstashes.logstash.k8s.elastic.co                    2025-05-15T11:37:50Z
stackconfigpolicies.stackconfigpolicy.k8s.elastic.co  2025-05-15T11:37:50Z

Note that this chart only installs CRDs, not resources of these types! If we deploy only resources of types Agent, APM Server, Elasticsearch and Kibana, the command which lists all resources of Elastic CRD types would return something like this:

% kubectl get elastic -n elastic-system
NAME                                         HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
agent.agent.k8s.elastic.co/my-fleet-agent    green    6           6          9.0.1     28h
agent.agent.k8s.elastic.co/my-fleet-server   green    1           1          9.0.1     2d17h

NAME                                         HEALTH   NODES   VERSION   AGE
apmserver.apm.k8s.elastic.co/my-apm-server   green    1       9.0.1     7h39m

NAME                                                          HEALTH   NODES   VERSION   PHASE   AGE
elasticsearch.elasticsearch.k8s.elastic.co/my-elasticsearch   green    9       9.0.1     Ready   10d

NAME                                     HEALTH   NODES   VERSION   AGE
kibana.kibana.k8s.elastic.co/my-kibana   green    1       9.0.1     10d

The most common resource managed by the ECK operator is Elasticsearch, which represents an Elasticsearch cluster within your Kubernetes environment. The correct command to list all Elasticsearch clusters managed by the operator is:

% kubectl get elasticsearch -n elastic-system
NAME               HEALTH   NODES   VERSION   PHASE   AGE
my-elasticsearch   green    9       9.0.1     Ready   1m

eck-operator

eck-operator 3.0.0 · elastic/elastic

Installs the ECK operator, the official Kubernetes operator from Elastic, which helps us deploy and manage (orchestrate) Elastic applications on Kubernetes, including:

Elasticsearch
Kibana
APM Server
Enterprise Search
Beats
Elastic Agent
Elastic Maps Server

A namespace elastic-system is created where ECK operator runs.

The "elastic-system" namespace in Elastic Cloud on Kubernetes (ECK) is where the ECK operator itself is deployed. While the operator resides in this namespace, it's recommended to deploy your application workloads (like Elasticsearch and Kibana) in a separate, dedicated namespace, not within elastic-system or the default namespace. elastic-system namespace is used to isolate the ECK operator from accidental deletion and to manage its resources. Deploying workloads in elastic-system could lead to conflicts or accidental deletion of the operator resources. Using a separate namespace provides better organization and isolation.

Monitoring the operator

eck-elasticsearch

This chart creates an Elasticsearch cluster.

In an Elasticsearch cluster, all Elasticsearch nodes provide the REST API. These nodes are the fundamental building blocks of the cluster and handle data storage, indexing, and search operations. The REST API is exposed on a specific port (usually 9200) and provides a standardized way to interact with the cluster. Elasticsearch uses a RESTful API, which means that requests are sent using standard HTTP methods (GET, POST, PUT, DELETE). The REST API allows you to perform tasks like adding or removing nodes, managing indices, and configuring various cluster settings. You can use the REST API to perform searches, retrieve data, and interact with the documents stored in Elasticsearch. The REST API is accessible to any client that can make HTTP requests, allowing us to integrate with various tools, applications, and programming languages.

Elasticsearch clusters have a designated master node that is responsible for managing the cluster state and performing certain critical operations. However, it's not a single, dedicated master node in the traditional sense.

Master-eligible nodes: All nodes in an Elasticsearch cluster are master-eligible by default, meaning any of them can potentially become the master node.

Elected master: Only one node is actively the master node at any given time. This node is elected from the master-eligible nodes using a distributed consensus algorithm.

Role of the master: The master node manages the cluster state, which includes:

Creating or deleting indexes.
Tracking which nodes are part of the cluster.
Allocating shards to different nodes.
Updating and propagating the cluster state across the cluster.

Master node elections: If the current master node fails or becomes unavailable, a new master node is automatically elected from the remaining master-eligible nodes, ensuring that the cluster can continue operating.

High availability: Using dedicated master nodes and a sufficient number of master-eligible nodes ensures high availability, even if one or more nodes fail.

In essence, while all nodes can potentially be the master, only one is actively managing the cluster at any given time. This ensures that the cluster remains stable and functional, even if the master node fails, as a new one will be elected quickly.

---

When the Elasticsearch resource is created, a default user named elastic is created automatically, and is assigned the superuser role.

Its password can be retrieved in a Kubernetes secret, whose name is based on the Elasticsearch resource name: <elasticsearch-name>-es-elastic-user.

This user is used for accessing Elasticsearch down the line e.g. if using Elasticsearch Terraform provider. Credentials can be stored in AWS Secrets Manager.

---

When this chart is deployed, ECK Operator automatically creates a Kubernetes service object:

By default, its name follows this format: <elasticsearch_cluster_name>-es-http
Its roles are:

Primary access point: It acts as the main endpoint for clients (such as applications, users, or other services) to interact with the Elasticsearch cluster using the REST API.
Handles authentication and TLS: The service is secured by default with TLS and basic authentication, managed by the ECK operator.
Traffic distribution: It routes incoming HTTP (REST API) traffic to all Elasticsearch nodes in our cluster, unless we create custom services for more granular routing (for example, to target only data or ingest nodes).

Its type is ClusterIP, meaning it is accessible only within the Kubernetes cluster (from other pods or nodes that are part of the same cluster) unless otherwise configured
The service listens on port 9200 (the default Elasticsearch HTTP port) and load-balances requests to the Elasticsearch pods

How to access this service?

Within the Kubernetes cluster we need to use the service DNS name:

https://<elasticsearch_cluster_name>-es-http.<namespace>:9200

For example, if our cluster is named elasticsearch and in the elastic-system namespace:

https://elasticsearch-es-http.elastic-system:9200
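The naming conventions above (the <elasticsearch_cluster_name>-es-http service and the <elasticsearch-name>-es-elastic-user secret) can be captured in two small helpers so that scripts do not hard-code them. This is a minimal sketch; both function names are hypothetical.

```python
def es_http_url(cluster_name: str, namespace: str, port: int = 9200) -> str:
    """In-cluster URL of the HTTP service that ECK creates for an Elasticsearch cluster."""
    return f"https://{cluster_name}-es-http.{namespace}:{port}"


def elastic_user_secret_name(cluster_name: str) -> str:
    """Name of the Kubernetes secret holding the elastic user's password."""
    return f"{cluster_name}-es-elastic-user"
```

For a cluster named elasticsearch in the elastic-system namespace, es_http_url returns the address shown above.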

To connect, we can use the elastic user mentioned above, which ECK created automatically. The client must provide:

The CA certificate (to trust the service’s TLS certificate)
The elastic user password (stored in a Kubernetes secret)

Example:

NAME=elasticsearch
NAMESPACE=elastic-system

# Save the public CA certificate of the HTTP service to a local file
kubectl get secret "$NAME-es-http-certs-public" \
  -n "$NAMESPACE" \
  -o go-template='{{index .data "tls.crt" | base64decode }}' \
  > tls.crt

# Read the password of the elastic user
PW=$(\
  kubectl get secret "$NAME-es-elastic-user" \
  -n "$NAMESPACE" \
  -o go-template='{{.data.elastic | base64decode }}')

# Call the REST API over TLS with basic authentication
curl \
  --cacert tls.crt \
  -u "elastic:$PW" \
  "https://$NAME-es-http.$NAMESPACE:9200/"
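Kubernetes stores secret values base64-encoded, which is why both go-templates above pipe the value through base64decode. If we instead fetch the whole secret as JSON (for example with -o json) and process it in a script, the decoding step looks roughly like this. It is a minimal sketch; decode_secret_value is a hypothetical helper and the secret is assumed to be already parsed into a dictionary.

```python
import base64
from typing import Any, Dict


def decode_secret_value(secret: Dict[str, Any], key: str) -> str:
    """Decode one entry from the base64-encoded data map of a Kubernetes secret."""
    return base64.b64decode(secret["data"][key]).decode("utf-8")
```

For the elastic user's secret the key is elastic, matching the .data.elastic path used in the shell example.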

To access it from outside the cluster we need to change the service type to LoadBalancer or use an Ingress to expose it externally.

When using LoadBalancer, the service will get an external IP address, and we can access it via https://<external-ip>:9200.

We need to be sure to secure access appropriately, as this exposes our Elasticsearch cluster to external networks.

---

When Elasticsearch is created, we can create Elastic Serverless Forwarder.

eck-kibana

Installs Kibana.

Creates Kubernetes service object. Its name follows this format: <kibana_name>-kb-http.

To expose this service externally, we can create an Ingress.

Now that Kibana is installed, we can create:

snapshot repositories. They can be S3-backed.
Kibana users. Each user is assigned a set of predefined Kibana roles.
Index lifecycle rules
Component templates (which reference those lifecycle rules)

eck-fleet-server

Creates Kubernetes service object. Its name follows this pattern: <fleet_server_name>-agent-http.

To expose this service externally, we can create Fleet Server Ingress.

After Fleet Server is created, we can install:

Fleet Integrations, like:

system
fleet_server
elastic_agent
kibana
elasticsearch
kubernetes
apm
aws

Fleet Agent Policies

Fleet Server can have its own policy

e.g. sys_monitoring can be disabled while monitor_logs and monitor_metrics enabled

Fleet Agents have their own policy

e.g. all monitoring types enabled

Fleet Integration Policies

They define:

which agent policy should be associated with which integration
inputs

eck-agent

eck-agent 0.15.0 · elastic/elastic

This Helm chart deploys:

Elastic Agent Custom Resource

The chart creates one or more Elastic Agent custom resources (Agent), which are Kubernetes objects managed by the ECK operator.
These resources define how Elastic Agents are deployed, configured, and connected to your Elasticsearch and Kibana instances.

Associated Kubernetes Resources

The Agent custom resource triggers the ECK operator to create the necessary Kubernetes resources, such as:

Pods/DaemonSets: Runs the Elastic Agent containers on your nodes. Elastic Agents are typically deployed as pods (usually via a DaemonSet or Deployment) in a namespace such as kube-system or elastic-agent. These pods execute the Elastic Agent binary, which collects data and communicates with the Fleet Server.
ConfigMaps/Secrets: Stores configuration and credentials for the agents.
ServiceAccounts, Roles, RoleBindings: Manages permissions for the agents to interact with the Kubernetes API if needed.

Fleet Integration (Optional)

The chart can configure Elastic Agents to enroll with Elastic Fleet, allowing for centralized management of agent policies and integrations.

We can specify the mode of Agent to use. Only set to "fleet" when Fleet Server is enabled. Default value is:

mode: "fleet"

Both `mode: fleet` and `fleetServerEnabled: true` need to be set for Fleet Server to be enabled. By default, the Agent does NOT act as the Fleet Server:

fleetServerEnabled: false

We need to provide a DaemonSet, StatefulSet, or Deployment specification for Agent.

Some Elastic Agent features, such as the Kubernetes integration, require that Agent Pods interact with Kubernetes APIs. This functionality requires specific permissions. This is why Elastic Agent runs on behalf of its own service account. Its default value is elastic-agent:

serviceAccount:

We also need to provide policyID, which determines the Agent Policy this Agent will be enrolled in. The default value is:

policyID: eck-agent
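The settings discussed above can be collected in a small values file. This is a minimal sketch that uses only the keys and default values mentioned in this section; the file name eck-agent-values.yaml is an arbitrary choice, and a real deployment will likely need further values (for example the DaemonSet or Deployment specification).

```shell
# Write the values discussed above into a local file (the file name is arbitrary).
cat > eck-agent-values.yaml <<'EOF'
# Run the Agent in Fleet-managed mode.
mode: "fleet"
# This Agent is a regular agent, not the Fleet Server.
fleetServerEnabled: false
# Agent Policy this Agent is enrolled in.
policyID: eck-agent
EOF
```

The file can then be passed to helm install with the -f option, for example:

% helm install elastic-agent elastic/eck-agent -n elastic-agent --create-namespace -f eck-agent-values.yaml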

---

Typical Use Cases

Observability: Collect logs, metrics, and traces from Kubernetes workloads and nodes.
Security: Use Elastic Agent for security monitoring and data shipping.
Fleet Management: Centrally manage agent configurations using Elastic Fleet.

Example: What we might see deployed

An Agent resource in your chosen namespace (e.g., elastic-agent).

One or more Elastic Agent pods (as a DaemonSet or Deployment, depending on configuration).

Supporting Kubernetes resources for configuration and permissions.

How to Verify

After installing the chart (e.g., with helm install elastic-agent elastic/eck-agent -n elastic-agent --create-namespace), we can check:

kubectl get agent -n elastic-agent

kubectl get pods -n elastic-agent

If we installed the Helm chart in the elastic-system namespace, we might see output like this:

% kubectl get agent -n elastic-system

NAME              HEALTH   AVAILABLE   EXPECTED   VERSION   AGE
my-agent          green    6           6          9.0.1     47h
my-fleet-server   green    1           1          9.0.1     3d12h

To find all agent pods and their labels:

% kubectl get pods --show-labels -n elastic-system | grep agent

The command above lists all pods that have agent in their name or labels, together with their labels. The labels relevant to us are:

common.k8s.elastic.co/type=agent <-- this shows that the pod is running an Agent (an Elastic CRD type)
agent.k8s.elastic.co/name=<helm_installation_name>-eck-agent
agent.k8s.elastic.co/version=<version set in the version attribute of the chart values>

Now that we know the pods' labels, we can target only the pods we are interested in. For example, to check the status of all Elastic Agent pods:

% kubectl get pods -l common.k8s.elastic.co/type=agent -n elastic-system -o wide
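When there are many agent pods, it can help to print only those that are not running. A minimal sketch: pods_not_running is a hypothetical helper name, and it assumes the default column order of kubectl get pods (NAME, READY, STATUS, ...).

```shell
# Print "<pod name> <status>" for every pod whose STATUS column is not "Running".
# Reads the output of `kubectl get pods --no-headers` on stdin.
pods_not_running() {
  awk '$3 != "Running" { print $1 " " $3 }'
}
```

Possible usage (empty output means that all agent pods are running):

% kubectl get pods -l common.k8s.elastic.co/type=agent -n elastic-system --no-headers | pods_not_running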

The Elastic Agent pods running in your Kubernetes cluster and the agents listed in Kibana’s Fleet UI are directly related (each pod running correctly in the cluster corresponds to a healthy Agent in Kibana), but they represent different layers of abstraction (the Kubernetes layer vs the Fleet layer). This applies both to the Fleet Server agent and to regular Elastic Agents.

Agents in Kibana UI (Fleet Layer)

Agents registered with Fleet Server and visible in Kibana’s Fleet > Agents UI. These represent logical agents managed by Fleet, regardless of their deployment method (Kubernetes, VMs, etc.).

Statuses:

Online: Agent is actively communicating with Fleet Server.
Offline: Agent has not checked in with Fleet Server recently (default: 2 minutes).

TODO: Why Some Agents Appear Offline in Kibana?

In Kibana UI we can see that each agent has its associated Agent Policy. Elastic Agent policies are central configurations that define what data Elastic Agents should collect, how they should collect it, and where to send it. Each Elastic Agent can only be enrolled in a single policy, which contains a set of integration policies specifying the configuration for each type of data input (such as logs, metrics, security events, etc.)

Elastic Agent policy revisions are version numbers that track changes made to an Elastic Agent policy over time. Each time we update a policy—such as adding or removing integrations, changing configuration settings, or modifying outputs—the policy’s revision number is incremented. This revision number is included in the policy sent to each enrolled Elastic Agent, allowing both Fleet and the agents themselves to know which version of the policy is currently applied.

Who manages policy revisions?

Fleet (in Kibana) manages policy revisions automatically. When you make any change to an agent policy through the Fleet UI or API, Fleet increments the revision number and distributes the updated policy to all agents enrolled in that policy.
Users do not manually set or manage the revision number; it is handled by the Fleet management system.

Purpose and Benefits

Change tracking: The revision number helps track when and how a policy has changed. Each agent reports which policy revision it is using, making it easy to see if agents are up to date.
Troubleshooting: If agents are not behaving as expected, the revision number can help correlate issues with recent policy changes.
Auditability: While the revision number itself does not provide a full change history, it signals that a change has occurred. (Note: The Fleet UI does not currently provide a detailed revision history with user attribution)

Rolling out new policy versions

When we change an Elastic Agent policy (which increments its revision number), the updated policy is rolled out immediately to all agents enrolled in that policy. Fleet distributes the new policy as soon as we save our changes, and agents will attempt to fetch and apply the updated configuration right away.

Immediate distribution:

Any changes to a policy or its integrations are immediately sent to all enrolled agents.

Agent update timing:

Agents regularly check in with Fleet Server. Upon their next check-in, they detect the new policy revision and apply it, usually within a few seconds to a couple of minutes.

Status indication:

While the policy is being applied, the agent status in the Fleet UI may briefly show as "Updating" before returning to "Healthy" once the new policy is active.

No manual intervention required:

We do not need to manually trigger the rollout; it is automatic and managed by Fleet.

Policy changes are automatically and quickly rolled out to all affected Elastic Agents, ensuring configurations stay consistent and up to date across your infrastructure.

eck-apm-server

Creates a Kubernetes Service whose name follows this format: <apm_server_name>-apm-http

To expose this service externally, we can create an APM Server Ingress.

...

How to test ECK cluster health?

If we've set in Route53 a record that resolves elasticsearch.mycorp.com into Elasticsearch Load Balancer DNS name, we can use:

% curl -u 'user:pass' -X GET 'https://elasticsearch.mycorp.com:443/_cluster/health?pretty'

{
  "cluster_name" : "elasticsearch-prod",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 9,
  "number_of_data_nodes" : 9,
  "active_primary_shards" : 106,
  "active_shards" : 212,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "unassigned_primary_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
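In scripts, the status field of this response is usually the only part we need. A minimal sketch that extracts it with sed, so no JSON tool is required; cluster_status and is_cluster_green are hypothetical helper names, and the pattern assumes the status value is a lowercase word, as in the response above.

```shell
# Print the value of the "status" field from a _cluster/health JSON response read on stdin.
cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Succeed (exit code 0) only if the response read on stdin reports a green cluster.
is_cluster_green() {
  [ "$(cluster_status)" = "green" ]
}
```

Possible usage:

% curl -s -u 'user:pass' 'https://elasticsearch.mycorp.com:443/_cluster/health' | is_cluster_green && echo "cluster is green"

The _cluster/health endpoint also accepts the wait_for_status and timeout query parameters, which make Elasticsearch hold the request until the requested status is reached or the timeout expires.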

Verify that every instance of each Elastic custom resource is in the green state:

% kubectl get elastic -n elastic-system

Check indices. Verify that each index has health=green and status=open, and that it has 1 primary and 1 replica shard.

Check shards. Verify that each shard has state=STARTED.
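Both checks can be scripted against the _cat/indices and _cat/shards APIs. The sketch below assumes the column selections shown in the usage lines (h=index,health,status,pri,rep for indices and h=index,shard,prirep,state for shards); the helper names are hypothetical. Each helper prints only the entries that fail the check, so empty output means everything is healthy. The primary and replica counts are not checked here.

```shell
# Print the names of indices that are not green or not open.
# Expects `_cat/indices?h=index,health,status,pri,rep` output on stdin.
unhealthy_indices() {
  awk '$2 != "green" || $3 != "open" { print $1 }'
}

# Print "<index>/<shard>/<p|r>" for every shard that is not in the STARTED state.
# Expects `_cat/shards?h=index,shard,prirep,state` output on stdin.
unstarted_shards() {
  awk '$4 != "STARTED" { print $1 "/" $2 "/" $3 }'
}
```

Possible usage:

% curl -s -u 'user:pass' 'https://elasticsearch.mycorp.com:443/_cat/indices?h=index,health,status,pri,rep' | unhealthy_indices

% curl -s -u 'user:pass' 'https://elasticsearch.mycorp.com:443/_cat/shards?h=index,shard,prirep,state' | unstarted_shards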

---

Friday, 16 May 2025

How to use Helm charts

Case study: we want to install Elasticsearch via Helm chart.

From Elastic Stack Helm chart | Elastic Docs we can see that Elastic offers a repository of Helm charts: https://helm.elastic.co.

Inspecting the local Helm repositories

Before adding some repo to our local system, we can check if that repo has already been added:

% helm repo list

NAME      URL
stable    https://charts.helm.sh/stable
bitnami   https://charts.bitnami.com/bitnami

Each entry includes:

NAME: The local alias you’ve given to the repo.
URL: The actual remote chart repository URL.

Adding a new Helm repository

We first need to add the Elastic Helm repository to our local Helm repository list:

% helm repo add elastic https://helm.elastic.co

We can choose an arbitrary local name for the repository we're adding. We used elastic because the repository is provided by Elastic.

The next step is to update the locally cached information about available charts, either for all added chart repositories or only for the one we've just added:

% helm repo update elastic

Hang tight while we grab the latest from your chart repositories...

...Successfully got an update from the "elastic" chart repository

Update Complete. ⎈Happy Helming!⎈

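helm repo update downloads the latest index (index.yaml, the list of available charts and their versions) from the given repo into our local Helm cache. The chart archives themselves are downloaded only when we install or pull a chart.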

To update repo index (fetch latest chart versions) for all local repositories:

% helm repo update

Inspecting charts in a local Helm repository

Let's now list all charts in the elastic repository:

% helm search repo elastic

NAME                            CHART VERSION   APP VERSION   DESCRIPTION
elastic/eck-elasticsearch       0.15.0                        Elasticsearch managed by the ECK operator
elastic/elastic-agent           9.0.1           9.0.1         Elastic-Agent Helm Chart
elastic/elasticsearch           8.5.1           8.5.1         Official Elastic helm chart for Elasticsearch
elastic/apm-attacher            1.1.3                         A Helm chart installing the Elastic APM Kuberne...
elastic/apm-server              8.5.1           8.5.1         Official Elastic helm chart for Elastic APM Server
elastic/eck-agent               0.15.0                        Elastic Agent managed by the ECK operator
elastic/eck-apm-server          0.15.0                        Elastic APM Server managed by the ECK operator
elastic/eck-beats               0.15.0                        Elastic Beats managed by the ECK operator
elastic/eck-enterprise-search   0.15.0                        Elastic Enterprise Search managed by the ECK op...
elastic/eck-fleet-server        0.15.0                        Elastic Fleet Server as an Agent managed by the...
elastic/eck-kibana              0.15.0                        Kibana managed by the ECK operator
elastic/eck-logstash            0.15.0                        Logstash managed by the ECK operator
elastic/eck-operator            3.0.0           3.0.0         Elastic Cloud on Kubernetes (ECK) operator
elastic/eck-operator-crds       3.0.0           3.0.0         ECK operator Custom Resource Definitions
elastic/eck-stack               0.15.0                        Elastic Stack managed by the ECK Operator
elastic/filebeat                8.5.1           8.5.1         Official Elastic helm chart for Filebeat
elastic/kibana                  8.5.1           8.5.1         Official Elastic helm chart for Kibana
elastic/kube-state-metrics      5.30.1          2.15.0        Install kube-state-metrics to generate and expo...
elastic/logstash                8.5.1           8.5.1         Official Elastic helm chart for Logstash
elastic/metricbeat              8.5.1           8.5.1         Official Elastic helm chart for Metricbeat
elastic/pf-host-agent           8.14.3          8.14.3        Hyperscaler software efficiency. For everybody.
elastic/profiling-agent         9.0.0           9.0.0         Hyperscaler software efficiency. For everybody.
elastic/profiling-collector     9.0.0           9.0.0         Universal Profiling. Hyperscaler software effic...
elastic/profiling-symbolizer    9.0.0           9.0.0         Universal Profiling. Hyperscaler software effic...

Another way of checking all charts is to download the index.yaml file from the remote repository. It contains information about ALL versions of ALL charts in the repo:

% curl https://helm.elastic.co/index.yaml

...

  - apiVersion: v2
    appVersion: 8.15.0
    created: "2024-08-08T09:05:09.582088545Z"
    description: 'Universal Profiling. Hyperscaler software efficiency. For everybody. '
    digest: 9f6a78ed179cda2792259ad7c73db32c2753bf5e3317135fca52fbfb48a8063c
    icon: https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt6ec3007768940247/63337a1f4d11fa0cfdb55244/illustration-deployment-3-arrows.png
    kubeVersion: '>= 1.22.0-0'
    urls:
    - https://helm.elastic.co/helm/profiling-symbolizer/profiling-symbolizer-8.15.0.tgz
    version: 8.15.0
  - apiVersion: v2
    appVersion: 8.14.3
    created: "2024-07-11T13:35:06.289007371Z"
    description: 'Universal Profiling. Hyperscaler software efficiency. For everybody. '
    digest: 9d1656e80f9c96c3cf7fa2d0692c7318e34e20ff6ad1da13a6b4dae1c82bc990
    icon: https://static-www.elastic.co/v3/assets/bltefdd0b53724fa2ce/blt6ec3007768940247/63337a1f4d11fa0cfdb55244/illustration-deployment-3-arrows.png
    kubeVersion: '>= 1.22.0-0'
    urls:
    - https://helm.elastic.co/helm/profiling-symbolizer/profiling-symbolizer-8.14.3.tgz
    version: 8.14.3

...

To see only versions of some particular chart e.g. eck-elasticsearch:

% curl -s https://helm.elastic.co/index.yaml | grep eck-elasticsearch

eck-elasticsearch:

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.15.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.14.1.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.14.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.13.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.12.1.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.12.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.11.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.10.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.9.1.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.9.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.8.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.7.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.7.0-SNAPSHOT.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.6.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.4.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.3.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.2.0.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.1.1.tgz

- https://helm.elastic.co/helm/eck-elasticsearch/eck-elasticsearch-0.1.0.tgz

- condition: eck-elasticsearch.enabled

- condition: eck-elasticsearch.enabled

...

As the response is a YAML document, we can use the yq tool to extract exactly what we need:

Name: eck-elasticsearch

Version: 0.15.0

Name: eck-elasticsearch

Version: 0.14.1

Name: eck-elasticsearch

Version: 0.14.0

Name: eck-elasticsearch

Version: 0.13.0

Name: eck-elasticsearch

Version: 0.12.1

...
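The yq command that produced this listing is not shown above. A possible sketch, assuming the Go implementation of yq (mikefarah/yq, version 4) and that index.yaml keeps its charts under a top-level entries map keyed by chart name; chart_versions is a hypothetical helper name, and it prints only the version of each entry rather than the Name/Version pairs.

```shell
# Print every published version of a chart from a downloaded Helm repository index.
# $1: path to index.yaml, $2: chart name (e.g. eck-elasticsearch)
# Assumes mikefarah/yq v4 is installed as `yq`.
chart_versions() {
  yq eval ".entries[\"$2\"][].version" "$1"
}
```

Possible usage:

% curl -s https://helm.elastic.co/index.yaml -o index.yaml

% chart_versions index.yaml eck-elasticsearch

Helm itself can print the same information with helm search repo elastic/eck-elasticsearch --versions.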

Let's say we want to install the elastic/eck-elasticsearch chart. How can we find its default values?

% helm show values elastic/eck-elasticsearch

---

# Default values for eck-elasticsearch.

# This is a YAML-formatted file.

# Overridable names of the Elasticsearch resource.

# By default, this is the Release name set for the chart,

# followed by 'eck-elasticsearch'.

# nameOverride will override the name of the Chart with the name set here,

# so nameOverride: quickstart, would convert to '{{ Release.name }}-quickstart'

# nameOverride: "quickstart"

# fullnameOverride will override both the release name, and the chart name,

# and will name the Elasticsearch resource exactly as specified.

# fullnameOverride: "quickstart"

# Version of Elasticsearch.

version: 9.0.0

# Elasticsearch Docker image to deploy

# image:

# Labels that will be applied to Elasticsearch.

labels: {}

# Annotations that will be applied to Elasticsearch.

annotations: {}

# Settings for configuring Elasticsearch users and roles.

# ref: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-users-and-roles.html

auth: {}

# Settings for configuring stack monitoring.

# ref: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-stack-monitoring.html

monitoring: {}

# metrics:

# elasticsearchRefs:

# - name: monitoring

# namespace: observability

# logs:

# elasticsearchRefs:

# - name: monitoring

# namespace: observability

...

We can save this document into a local yaml file which we can then modify and adjust to our needs:

% helm show values elastic/eck-elasticsearch > eck-elasticsearch-values.yaml

How to find which resources will Helm chart deploy?

We have a few options:

1) Dry-run install the chart and inspect output

% helm install my-fleet-server elastic/eck-fleet-server --dry-run --debug

This will render the templates using default values (or our custom --values file) and print all the generated Kubernetes YAML to stdout.

We need to look for: Deployment, Service, Secret, ConfigMap, Pod or any custom resources (Agent, etc.).
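For a large chart, scanning the rendered YAML by eye gets tedious. A minimal sketch that summarises the rendered manifests by resource kind; count_kinds is a hypothetical helper name, and it counts only top-level kind: lines (nested kind fields are indented and therefore ignored).

```shell
# Summarise rendered manifests read on stdin as "<Kind>=<count>" lines, sorted by kind.
count_kinds() {
  awk '/^kind:/ { print $2 }' | sort | uniq -c | awk '{ print $2 "=" $1 }'
}
```

Possible usage with helm template, which renders the chart locally without installing anything:

% helm template my-fleet-server elastic/eck-fleet-server | count_kinds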

2) Download the chart locally and inspect the templates

% helm pull elastic/eck-fleet-server --untar

% cd eck-fleet-server

Now we can inspect the files under templates/ and values.yaml.

We'll see:

All the resource templates (deployment.yaml, service.yaml, etc.)
Which fields can be customized.

3) Search the Helm chart source code on GitHub

We can inspect:

templates/ folder – actual YAML templates
values.yaml – configurable inputs
Chart.yaml – metadata

Installing Helm chart

To deploy the Helm chart into the Kubernetes cluster using our own values:

% helm install \
    my-elasticsearch \
    elastic/eck-elasticsearch \
    -f eck-elasticsearch-values.yaml \
    -n elastic-system \
    --create-namespace

We chose to deploy it in a custom namespace which we named elastic-system.

By default, helm install installs the chart into the Kubernetes cluster our kubectl is currently configured to use. Helm relies on the kubeconfig file (typically located at ~/.kube/config) to know which cluster to interact with.

helm install:

Reads the kubeconfig file used by kubectl.
Connects to the current Kubernetes context (cluster and namespace).
Installs the Helm chart to that cluster, unless you override the context or namespace.

We can control the target cluster and namespace using the --kube-context and --namespace options:

helm install my-release elastic/eck-elasticsearch --kube-context=my-cluster-context --namespace=elastic-system

To list contexts:

kubectl config get-contexts

To switch context:

kubectl config use-context my-cluster-context

Before we attempt to target a remote Kubernetes cluster, we need to ensure that:

Our ~/.kube/config contains valid credentials and cluster info.
We can interact with it using kubectl (test with kubectl get nodes or kubectl get pods).
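To avoid installing into the wrong cluster by accident, the context check can be folded into a small guard that runs before helm install. A minimal sketch; require_context is a hypothetical helper name, and my-cluster-context is the example context name used above.

```shell
# Succeed only if kubectl's current context matches the expected one.
# $1: expected context name
require_context() {
  expected="$1"
  current="$(kubectl config current-context)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "current context is '$current', expected '$expected'" >&2
    return 1
  fi
}
```

Possible usage:

% require_context my-cluster-context && helm install my-release elastic/eck-elasticsearch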

To remove the repo from the local system:

% helm repo remove elastic

---

Pages

Monday, 30 June 2025

Introduction to Amazon API Gateway

RESTful APIs

Friday, 27 June 2025

GitHub Workflows and AWS

IAM User Authentication

OpenID Connect (OIDC) Authentication

Friday, 13 June 2025

Introduction to Serverless Framework

Serverless Yaml Configuration File

Thursday, 12 June 2025

Useful Kibana DevTools Queries

Friday, 30 May 2025

Introduction to Elastic Agents

Key Functions of Elastic Agents (When Elastic Agents Are Required?)

When Elastic Agents Are Not Required:

Key Considerations:

How Elastic Agents Work in Kubernetes

Deployment

Leader Election

Data Flow

How to deploy Elastic stack via Elastic Cloud on Kubernetes (ECK)

Prerequisites

eck-operator-crds

eck-operator

Monitoring the operator

eck-elasticsearch

eck-kibana

eck-fleet-server

eck-agent

Typical Use Cases

Example: What we might see deployed

How to Verify

Agents in Kibana UI (Fleet Layer)

eck-apm-server

How to test ECK cluster health?

Friday, 16 May 2025

How to use Helm charts

Inspecting the local Helm repositories

Adding a new Helm repository

Inspecting charts in a local Helm repository

How to find which resources will Helm chart deploy?

1) Dry-run install the chart and inspect output

2) Download the chart locally and inspect the templates

3) Search the Helm chart source code on GitHub

Installing Helm chart