Monday, 30 June 2025

Introduction to Amazon API Gateway

Amazon API Gateway:

fully managed service to create, publish, maintain, monitor, and secure APIs at any scale

APIs act as the "front door" for applications to access data, business logic, or functionality from our backend services

allows creating:

RESTful APIs

optimized for serverless workloads and HTTP backends using HTTP APIs

they act as triggers for Lambda functions

HTTP APIs are the best choice for building APIs that only require API proxy functionality
Use REST APIs if our APIs require in a single solution both:

API proxy functionality
API management features

WebSocket APIs that enable real-time two-way communication applications

supports:

containerized workloads
serverless workloads
web applications

handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including:

traffic management
CORS support
authorization and access control
throttling
monitoring
API version management

has no minimum fees or startup costs. We pay for the API calls we receive and the amount of data transferred out and, with the API Gateway tiered pricing model, we can reduce our cost as our API usage scales

RESTful APIs

What is the difference between REST API endpoints (apiGateway) and HTTP API endpoints (httpApi)?

The difference between REST API endpoints (apiGateway) and HTTP API endpoints (httpApi) in Amazon API Gateway primarily comes down to features, performance, cost, and use cases.

REST API endpoints (apiGateway):

Older, feature-rich, supports API keys, usage plans, request/response validation, custom authorizers, and more.
More configuration options, but higher latency and cost.
Defined under the provider.apiGateway section and function events: http.

HTTP API endpoints (httpApi):

Newer, simpler, faster, and cheaper.
Supports JWT/Lambda authorizers, CORS, and OIDC, but lacks some advanced REST API features.
Defined under provider.httpApi and function events: httpApi.

GitHub Workflows and AWS

GitHub workflow can communicate with our AWS resources, directly (via AWS CLI commands) or indirectly (via e.g. Terraform AWS provider).

Before running AWS CLI commands, deploying AWS infrastructure with Terraform, or interacting with AWS services in any way we need to include a step which configures AWS credentials. It ensures that the workflow runner is authenticated with AWS and knows which region to target.

This step should contain configure-aws-credentials action provided by AWS. This action sets up the necessary environment variables so that AWS CLI commands and SDKs can authenticate with AWS services.

aws-region input sets the default AWS region to us-east-2 (Ohio). All AWS commands run in later steps will use this region unless overridden.

We can use either IAM user or OIDC (temp token) authentication.

IAM User Authentication

If using IAM user authentication, we can store user's credentials in a dedicated GitHub secrets:

env:

AWS_ACCOUNT_ID: ${{ secrets.AWS_ACCOUNT_ID }}

AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}

AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

AWS_REGION: us-east-2

// Define this step before steps which are accessing AWS:

- name: Configure AWS Credentials

uses: aws-actions/configure-aws-credentials@v2

with:

aws-region: ${{ env.AWS_REGION }}

OpenID Connect (OIDC) Authentication

In this authentication, configure-aws-credentials GitHub Action uses GitHub's OpenID Connect (OIDC) for secure authentication with AWS. It leverages the OIDC token provided by GitHub to request temporary AWS credentials from AWS STS, eliminating the need to store long-lived AWS access keys in GitHub Secrets.

Note that we now need to grant the workflow run a permissions for write access to the id-token:

id-token: write allows the workflow to request and use OpenID Connect (OIDC) tokens. The write level is required for actions that need to generate or use OIDC tokens to authenticate with external systems. Granting id-token: write is essential for workflows that use OIDC-based authentication, such as securely assuming AWS IAM roles via GitHub Actions. This enables secure, short-lived authentication to AWS and other cloud providers. This permission is a security best practice for modern CI/CD workflows that use OIDC to authenticate with cloud providers, reducing the need for static secrets.

env:

AWS_REGION: us-east-2

permissions:

id-token: write # aws-actions/configure-aws-credentials (OIDC)

...

- name: Configure AWS Credentials

uses: aws-actions/configure-aws-credentials@v4

with:

role-to-assume: arn:aws:iam::123456789012:role/github-actions-role

role-session-name: my-app

aws-region: ${{ env.AWS_REGION }}

Here's how it works:

GitHub OIDC Provider: GitHub acts as an OIDC provider, issuing signed JWTs (JSON Web Tokens) to workflows that request them.
configure-aws-credentials Action: This action, when invoked in a GitHub Actions workflow, receives the JWT from the OIDC provider.
AWS STS Request: The action then uses the JWT to request temporary security credentials from AWS Security Token Service (STS).
Credential Injection: AWS STS returns temporary credentials (access key ID, secret access key, and session token) which the action injects as environment variables into the workflow's execution environment.
AWS SDKs and CLI: AWS SDKs and the AWS CLI automatically detect and use these environment variables for authenticating with AWS services.

Benefits of using OIDC with configure-aws-credentials:

Enhanced Security: Eliminates the need to store long-lived AWS access keys, reducing the risk of compromise.
Simplified Credential Management: Automatic retrieval and injection of temporary credentials, simplifying workflow setup and maintenance.
Improved Auditing: Provides better traceability of actions performed within AWS, as the identity is linked to the GitHub user or organization.

Before using the action:

Configure an OpenID Connect provider in AWS: We need to establish an OIDC trust relationship between GitHub and our AWS account.
Create an IAM role in AWS: Define the permissions for the role that the configure-aws-credentials action will assume.
Set up the GitHub workflow: Configure the configure-aws-credentials action with the appropriate parameters, such as the AWS region and the IAM role to assume.

In an OpenID Connect (OIDC) authentication scenario, the aws-actions/configure-aws-credentials action creates the following environment variables when assuming a role with temporary credentials: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN. These variables are used by the AWS SDK and CLI to interact with AWS resources.

Here's a breakdown:

AWS_ACCESS_KEY_ID: This environment variable stores the access key ID of the temporary credentials.
AWS_SECRET_ACCESS_KEY: This environment variable stores the secret access key of the temporary credentials.
AWS_SESSION_TOKEN: This environment variable stores the session token associated with the temporary credentials, which is required for operations with AWS Security Token Service (STS).

These environment variables are populated by the action after successful authentication with the OIDC provider and assuming the specified IAM role. The action retrieves the temporary credentials from AWS and makes them available to subsequent steps in the workflow.

Once AWS authentication is done and this env variables are created, the next steps in the workflow can access our AWS resources, e.g. read secrets from AWS Secrets Manager:

- name: Read secrets from AWS Secrets Manager into environment variables

uses: aws-actions/aws-secretsmanager-get-secrets@v2

with:

secret-ids: |

my-secret

parse-json-secrets: true

- name: deploy

run: |

echo $AWS_ACCESS_KEY_ID

echo $AWS_SECRET_ACCESS_KEY

env:

MY_KEY: ${{ env.MY_SECRET_MY_KEY }}

This example assumes that in AWS secret my-secret we have a key MY_KEY, set to the secret value we want to fetch and use.

Friday, 13 June 2025

Introduction to Serverless Framework

Serverless Framework is a tool designed to streamline the development and deployment of serverless applications, including functions and infrastructure, by abstracting away the need to manage servers.

We define desired infrastructure in serverless yaml files and then deploy it by executing:

sls deploy

This command parses serverless yaml file into larger AWS CloudFormation template which automatically gets filled with values from the yaml.

The sls deploy command in the Serverless Framework is effectively idempotent at the infrastructure level, but with important nuances:

How it works:

sls deploy packages our service and deploys it via AWS CloudFormation. CloudFormation itself is designed to be idempotent: if we deploy the same stack with the same configuration and code, AWS will detect no changes and will not modify our resources. If there are changes, only those changes are applied.

What this means:

Repeated runs of sls deploy with no changes will not create duplicate resources or apply unnecessary updates.

If we make changes (to code, configuration, or infrastructure), only the differences are deployed.

Side effects in Lambda code: While infrastructure deployment is idempotent, our Lambda functions themselves must be written to handle repeated invocations safely if we want end-to-end idempotency. The deployment command itself does not guarantee idempotency at the application logic level.

Limitations:

If we use sls deploy function (to update a single function without CloudFormation), this command simply swaps out the function code and is also idempotent in the sense that re-uploading the same code does not cause issues.

If we use plugins or custom resources, their behavior may not always be idempotent unless explicitly designed that way.

To conclude:

sls deploy is idempotent for infrastructure: Re-running it with no changes is safe and does not cause duplicate resources or unintended side effects at the CloudFormation level.
Application-level idempotency is our responsibility: Ensure our Lambda functions and integrations handle repeated events if that is a requirement for our use case

Serverless Yaml Configuration File

serverless yaml file defines a serverless service. It is a good idea to break up the serverless project into multiple services, each of which is defined by its own serverless yaml file. We don't want to have everything in one big infrastructure stack.

Example:

database e.g. DynamoDB
Rest API e.g. which handles the submitted web form and stores data in DynamoDB
front-end website which e.g. stores React app website in s3 bucket

Services can be deployed in multiple regions. (Multi-region architecture is supported)

serverless.yml example:

service: my-service

frameworkVersion: "3"

useDotenv: true

plugins:

- serverless-plugin-log-subscription

- serverless-dotenv-plugin

provider:

name: aws

runtime: nodejs14.x

region: eu-east-1

memorySize: 512

timeout: 900

deploymentBucket:

name: my-serverless-deployments

vpc:

securityGroupIds:

- "sg-0123cf34f6c6354cb"

subnetIds:

- "subnet-01a23493f9e755207"

- "subnet-02b234dbd7d66d33c"

- "subnet-03c234712e99ae1fb"

iam:

role:

statements:

- Effect: Allow

Action:

- lambda:InvokeFunction

Resource: arn:aws:lambda:eu-east-1:123456789099:function:my-database

package:

patterns:

- "out/**"

- "utils.js"

- "aws-sdk"

functions:

my-function:

handler: lambda.handler

events:

- schedule:

name: "my-service-${opt:stage, self:provider.stage}"

description: "Periodically run my-service lambdas"

rate: rate(4 hours)

inputTransformer:

inputTemplate: '{"Records":[{"EventSource":"aws:rate","EventVersion":"1.0","EventSubscriptionArn":"arn:aws:sns:eu-east-1:{{accountId}}:ExampleTopic","Sns":{"Type":"Notification","MessageId":"95df01b4-1234-5678-9903-4c221d41eb5e","TopicArn":"arn:aws:sns:eu-east-1:123456789012:ExampleTopic","Subject":"example subject","Message":"example message","Timestamp":"1970-01-01T00:00:00.000Z","SignatureVersion":"1","Signature":"EXAMPLE","SigningCertUrl":"EXAMPLE","UnsubscribeUrl":"EXAMPLE","MessageAttributes":{"type":{"Type":"String","Value":"populate_unsyncronised"},"count":{"Type":"Number","Value":"400"}}}}]}'

- sns:

arn: arn:aws:sns:us-east-2:123456789099:trigger-my-service

- http:

custom:

dotenv:

dotenvParser: env.loader.js

logSubscription:

enabled: true

destinationArn: ${env:KINESIS_SUBSCRIPTION_STREAM}

roleArn: ${env:KINESIS_SUBSCRIPTION_ROLE}

service: - name of the service
useDotenv: boolean (true|false)
configValidationMode: error
frameworkVersion: e.g. "3"
provider -

name - provider name e.g. aws
runtime - e.g. nodejs18.x
region e.g. us-east-1
memorySize - how much memory will have the machine on which Lambda will be running e.g. 1024 (MB). It is good to check the actual memory usage and adjust the required memory size - downsizing can lower the costs!
timeout: (number) e.g. 60 [seconds] - the maximum amount of time, in seconds, that a serverless function (such as an AWS Lambda function) is allowed to run before it is forcibly terminated by the AWS platform. This setting ensures that our function does not run indefinitely. If the function execution exceeds 60 seconds, the serverless platform will automatically stop it and return a timeout error. The timeout property is commonly used to control resource usage and prevent runaway executions. It is especially important for functions that interact with external services or perform long-running tasks. If not specified, most serverless platforms (like AWS Lambda) use a default timeout (for AWS Lambda, the default is 3 seconds, and the maximum is 900 seconds or 15 minutes).
httpApi:

apiGateway:

minimumCompressionSize: 1024
shouldStartNameWithService: true
restApiId: ""
restApiRootResourceId: ""

stage: - name of the environment e.g. production;
iamManagedPolicies: a list of ARNs of policies that will be associated to the Lambda's computing instance e.g. policy which allows access to S3 buckets etc...
lambdaHashingVersion
environment: dictionary of environment variable names and values
vpc

securityGroupIds: list
subnetIds - typically a list of private subnets with NAT gateway.

functions: a dictionary which defines the AWS Lambda functions that are deployed as part of this Serverless service. This is where we define the AWS Lambda functions that our Serverless service will deploy.

<function_name>: string, a logical name of the function (e.g., my-function). This name is used to reference the function within the Serverless Framework and in deployment outputs. A name of the provisioned Lambda function is in format: <service_name>-<stage>-<function_name>. Each function entry under functions specifies:

handler - tells Serverless which file and exported function to execute as the Lambda entry point (e.g., src/fn/lambda.handler which points to handler export in the src/fn/lambda module). Specifies the entry point for the Lambda function. When the function is invoked, AWS Lambda will execute this handler.
events - (optional, array) a list of events that trigger this function

Some triggers:

schedule, scheduled events: for periodic invocation (cron-like jobs)
sns: for invocation via an AWS SNS topic
HTTP endpoints,
S3 events
messages from a Kafka topic in an MSK cluster (msk)

If the array is empty, that means that the function currently has no event sources configured and will not be triggered automatically by any AWS event.

plugins: a list of serverless plugins e.g.

serverless-webpack
serverless-esbuild
serverless-offline [https://www.serverless.com/plugins/serverless-offline, https://github.com/dherault/serverless-offline]

emulates AWS Lambda and API Gateway. It starts an HTTP server that handles the request's lifecycle like APIG does and invokes the handlers.
sls offline --help

serverless-plugin-log-subscription

custom: - section for serverless plugins settings e.g. for esbuild, logSubscription, webpack etc...

example: serverless-plugin-log-subscription plugin has the settings:

logSubscription: {
   enabled: true,
   destinationArn: process.env.SUBSCRIPTION_STREAM,
   roleArn: process.env.SUBSCRIPTION_ROLE,
}

example: serverless-domain-manager - used to define stage-specific domains.

domains: {
production: {
url: "app.api.example.com",
certificateArn: "arn:aws:acm:us-east-2:123456789012:certificate/a8f8f8e2-95fe-4934-abf2-19dc08138f1f",
},
staging: {
url: "app.staging.example.com",
certificateArn: "arn:aws:acm:us-east-2:123456789012:certificate/a32e9708-7aeb-495b-87b1-8532a2592eeb",
},
dev: {

url: "",

certificateArn: ""

},
},

Useful Kibana DevTools Queries

Cluster

To check cluster health:

GET /_cluster/health

GET /_cluster/health?level=shards

The output contains status which can be green, yellow or red.

To check status of each shard:

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason,node

The output shows if shard is primary (p) or replica (r). It also shows the status which can be e.g. STARTED, UNASSIGNED and reason which can be e.g. ALLOCATION_FAILED.

To get memory allocation and consumption per node:

GET /_cat/allocation?v&s=node

The output contains the following columns:

shards (number)
shards.undesired
write_load.forecast
disk.indices.forecast (in Gb or Tb)
disk.indices (in Gb or Tb)
disk.used (in Gb or Tb)
disk.avail (in Gb or Tb)
disk.total (in Gb or Tb)
disk.percent (number, %)
host (IP address)
ip (IP address)
node (node name or UNASSIGNED)
node.role (combination of cdfhilmrstw)

If some shard is not allocated, we can check the reason:

GET /_cluster/allocation/explain

To manually trigger retry of all previously failed shard allocations:

POST /_cluster/reroute?retry_failed=true

To check the progress, check the health of the cluster and:

GET /_cat/recovery/my_index?v

Index

To perform a search operation on a specific index:

GET /my_index/_search

By itself (without a request body), it returns the first 10 documents by default. This request is the same as the above one:

GET /my_index/_search

{

"query": {

"match_all": {}

}

In Kibana's Dev Tools, the query parameter in a GET request refers to the search query that defines which documents we want to retrieve from Elasticsearch. It's part of the request body and specifies the search criteria. The query parameter essentially tells Elasticsearch "find me documents that match these conditions." It's the core part of any search request and determines which documents from our index will be returned in the response.

The query object can contain various types of queries. Common query types:

match_all - Returns all documents:

{

"query": {

"match_all": {}

}

match - Full-text search on a specific field:

{

"query": {

"match": {

"field_name": "search_term"

}

term - Exact term matching:

{

"query": {

"term": {

"status": "active"

}

bool - Combine multiple queries with logical operators:

{

"query": {

"bool": {

"must": [

{"match": {"title": "elasticsearch"}},

{"range": {"date": {"gte": "2023-01-01"}}}

]

}

range - Query for values within a range:

{

"query": {

"range": {

"age": {

"gte": 18,

"lte": 65

}

To get the number of documents in an Elasticsearch index, you can use the _count API or the _stats API.

GET /my_index/_count

This will return a response like:

{

"count": 12345,

"_shards": {

"total": 5,

"successful": 5,

"skipped": 0,

"failed": 0

}

To get a certain number of documents, use size argument:

GET my_index/_search?size=900

We can also use _cat API:

GET /_cat/count/my_index?v

This will return output like:

epoch timestamp count

1718012345 10:32:25 12345

GET /my_index/_stats

"indices": {

"my_index": {

"primaries": {

"docs": {

"count": 12345,

"deleted": 12

}

To get the union of all values of some field e.g. channel_type field across all documents in the my_index index, we can use an Elasticsearch terms aggregation:

GET my_index/_search

{

"size": 0,

"aggs": {

"unique_channel_types": {

"terms": {

"field": "channel_type.keyword",

"size": 10000 // increase if you expect many unique values

}

Explanation:

"size": 0: No documents returned, just aggregation results.
"terms": Collects unique values.
"channel_type.keyword": Use .keyword to aggregate on the raw value (not analyzed text).
"size": 10000: Max number of buckets (unique values) to return. Adjust as needed.

Response example:

{

"aggregations": {

"unique_channel_types": {

"buckets": [

{ "key": "email", "doc_count": 456 },

{ "key": "push", "doc_count": 321 },

{ "key": "sms", "doc_count": 123 }

]

}

The "key" values in the buckets array are your union of channel_type values.

Let's assume that my_index has the timestamp field (as the root field...but it can be at any path in which case we'd need to adjust the query) is correctly mapped as a date type.

To get the oldest document:

GET my_index/_search

{

"size": 1,

"sort": [

{ "@timestamp": "asc" }

]

}

To get the newest document:

GET my_index/_search

{

"size": 1,

"sort": [

{ "@timestamp": "desc" }

]

}

How to get all possible values of some field in all documents added to index in last 24 hours?

We can use Terms Aggregation with Range Query:

GET /my_index/_search

{

"size": 0,

"query": {

"range": {

"@timestamp": {

"gte": "now-24h/h",

"lte": "now"

}

"aggs": {

"unique_values": {

"terms": {

"field": "my_field.keyword",

"size": 10000

}

Check number of documents which are older than N days:

POST my_index/_count

{

"query": {

"range": {

"@timestamp": {

"lt": "now-Nd/d"

}

Delete all documents older than N days:

POST my_index/_delete_by_query?conflicts=proceed&wait_for_completion=false

{

"query": {

"range": {

"@timestamp": {

"lt": "now-Nd/d"

}

The output of the above command is task ID.

To check the task status:

GET _tasks/<task_id>

To check if any delete_by_query task is running and number of docs deleted so far:

GET _tasks?actions=*delete/byquery&detailed=true

Once delete_by_query task is completed: deletion is done, but disk space might not yet be reclaimed. To free disk space, run a forcemerge:

POST my_index/_forcemerge?only_expunge_deletes=true

For a huge shard, consider doing this after several weekly chunks, not after every single one, to reduce I/O spikes.

Check if any forcemerge tasks are running:

GET _tasks?actions=*forcemerge

GET _tasks?actions=*forcemerge&detailed=true

GET _tasks?actions=*forcemerge&detailed=true&group_by=parents

Check number of merges:

GET my_index/_stats?level=shards

----

My Public Notepad

Pages

Monday, 30 June 2025

Introduction to Amazon API Gateway

RESTful APIs

Friday, 27 June 2025

GitHub Workflows and AWS

IAM User Authentication

OpenID Connect (OIDC) Authentication

Friday, 13 June 2025

Introduction to Serverless Framework

Serverless Yaml Configuration File

Thursday, 12 June 2025

Useful Kibana DevTools Queries

Cluster

Index