Friday, 8 August 2025

AWS EKS Cluster Networking





If we select a cluster and go to the Networking tab, we'll see the following settings: 
  • VPC
  • Cluster IP address family
  • Service IPv4 range
  • Subnets
  • Cluster security group
  • Additional security groups
  • API server endpoint access

The Manage drop-down groups them into the following:
  • VPC Resources (Network environment)
    • Subnets
    • Additional security groups - optional
  • Endpoint access (API server endpoint access)
  • Remote networks


We'll describe here the meaning and purpose for each of them.


VPC


Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network that you have defined. This virtual network closely resembles a traditional network that you would operate in your own data center, with the benefits of using the scalable infrastructure of AWS. 

A virtual private cloud (VPC) is a virtual network dedicated to your AWS account. A subnet is a range of IP addresses in your VPC. 

Each Managed Node Group requires you to specify one or more subnets that are defined within the VPC used by the Amazon EKS cluster. Nodes are launched into the subnets that you provide. The size of your subnets determines the number of nodes and pods that you can run within them. 
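As a rough sanity check on subnet sizing, the usable address count of a CIDR can be computed with Python's ipaddress module. This is a sketch; the CIDRs below are made-up examples, and the only AWS-specific fact used is that AWS reserves five addresses in every subnet.

```python
import ipaddress

def usable_ips(cidr: str) -> int:
    """Addresses left for node/pod ENIs after the 5 that AWS reserves
    in every subnet (network, VPC router, DNS, future use, broadcast)."""
    return ipaddress.ip_network(cidr).num_addresses - 5

print(usable_ips("10.0.1.0/24"))  # 251
print(usable_ips("10.0.0.0/19"))  # 8187
```

With the VPC CNI each pod consumes a VPC IP by default, so small subnets fill up faster than the node count alone suggests.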

You can run nodes across multiple AWS Availability Zones by providing multiple subnets, each associated with a different Availability Zone. Nodes are distributed evenly across all of the designated Availability Zones.

If you are using the Kubernetes Cluster Autoscaler and running stateful pods, you should create one Node Group for each availability zone using a single subnet and enable the --balance-similar-node-groups feature in the Cluster Autoscaler.

EKS suggests using Private Subnets for worker nodes.




Cluster IP address family



Select the IP address type that pods and services in your cluster will receive: IPv4 or IPv6.

Amazon EKS does not support dual-stack clusters. However, if your worker nodes have an IPv4 address, Amazon EKS will configure IPv6 pod routing so that pods can communicate with cluster-external IPv4 endpoints.


Service IPv4 range


The IP address range from which cluster services will receive IP addresses. Manually configuring this range can help prevent conflicts between Kubernetes services and other networks peered or connected to your VPC.

Service CIDR is only configurable when choosing IPv4 as your cluster IP address family. With IPv6, the service CIDR will be an auto generated unique local address (ULA) range.
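For illustration, a candidate range can be checked against the IPv6 unique local address space (fc00::/7) with Python's ipaddress module; the fd12:… prefix below is a made-up example, not a value EKS would necessarily generate.

```python
import ipaddress

ULA = ipaddress.ip_network("fc00::/7")  # IPv6 unique local address space

def is_ula(cidr: str) -> bool:
    net = ipaddress.ip_network(cidr)
    return net.version == 6 and net.subnet_of(ULA)

print(is_ula("fd12:3456:789a:1::/108"))  # True - inside fc00::/7
print(is_ula("2001:db8::/32"))           # False - global documentation range
```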


Subnets


Choose the subnets in your VPC where the control plane may place elastic network interfaces (ENIs) to facilitate communication with your cluster. The specified subnets must span at least two availability zones.

To control exactly where the ENIs will be placed, specify only two subnets, each from a different AZ, and Amazon EKS will make cross-account ENIs in those subnets. The Amazon EKS control plane creates up to 4 cross-account ENIs in your VPC for each cluster.

You may choose one set of subnets for the control plane that are specified as part of cluster creation, and a different set of subnets for the worker nodes.

EKS suggests using private subnets for worker nodes.

If you select IPv6 cluster address family, the subnets specified as part of cluster creation must contain an IPv6 CIDR block.

Cluster security group & Additional security groups


Amazon VPC Security groups control communications within the Amazon EKS cluster including between the managed Kubernetes control plane and compute resources in your AWS account such as worker nodes and Fargate pods.

The Cluster Security Group is a unified security group that is used to control communications between the Kubernetes control plane and compute resources on the cluster. The cluster security group is applied by default to the Kubernetes control plane managed by Amazon EKS as well as any managed compute resources created by Amazon EKS. 

EKS automatically creates a cluster security group on cluster creation to facilitate communication between worker nodes and the control plane. The description of this SG is: "EKS created security group applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads". Its name is of the form eks-cluster-sg-<cluster_name>-1234567890, and it's attached to the same VPC that the cluster is in. Its rules are:
  • Inbound: allow all traffic (all protocols and ports) from itself (see https://stackoverflow.com/questions/66917854/aws-security-group-source-of-inbound-rule-same-as-security-group-name)
  • Outbound: allow all IPv4 and IPv6 traffic

Optionally, choose additional security groups to apply to the EKS-managed Elastic Network Interfaces that are created in your control plane subnets. To create a new security group, go to the corresponding page in the VPC console.

Additional cluster security groups control communications from the Kubernetes control plane to compute resources in your account. Worker node security groups are security groups applied to unmanaged worker nodes that control communications from worker nodes to the Kubernetes control plane.


Example:

  • Description: EKS cluster security group
  • Inbound rules:
    • IP version: IPv4
    • Type: HTTPS 
    • Protocol: TCP
    • Port range: 443
    • Source: 192.168.1.0/24
    • Description: Office LAN CIDR (for access via Site-to-Site VPN)
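To check which clients the example rule above would admit, CIDR membership can be evaluated with Python's ipaddress module. A sketch using the example's values:

```python
import ipaddress

# Source CIDR from the example inbound rule (office LAN).
allowed_source = ipaddress.ip_network("192.168.1.0/24")

def rule_permits(client_ip: str, port: int) -> bool:
    """HTTPS-only rule: TCP 443 from the office LAN CIDR."""
    return port == 443 and ipaddress.ip_address(client_ip) in allowed_source

print(rule_permits("192.168.1.42", 443))  # True
print(rule_permits("192.168.2.42", 443))  # False - outside the /24
```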



API server endpoint access


You can limit, or completely disable, public access from the internet to your Kubernetes cluster endpoint.

Amazon EKS creates an endpoint for the managed Kubernetes API server that you use to communicate with your cluster (using Kubernetes management tools such as kubectl). By default, this API server endpoint is public to the internet, and access to the API server is secured using a combination of AWS Identity and Access Management (IAM) and native Kubernetes Role Based Access Control (RBAC).

You can, optionally, limit the CIDR blocks that can access the public endpoint. If you limit access to specific CIDR blocks, then it is recommended that you also enable the private endpoint, or ensure that the CIDR blocks that you specify include the addresses that worker nodes and Fargate pods (if you use them) access the public endpoint from.

You can enable private access to the Kubernetes API server so that all communication between your worker nodes and the API server stays within your VPC. You can limit the IP addresses that can access your API server from the internet, or completely disable internet access to the API server.


Cluster endpoint access

Configure access to the Kubernetes API server endpoint.
  • Public - The cluster endpoint is accessible from outside of your VPC. Worker node traffic will leave your VPC to connect to the endpoint.
  • Public and private - The cluster endpoint is accessible from outside of your VPC. Worker node traffic to the endpoint will stay within your VPC.
  • Private - The cluster endpoint is only accessible through your VPC. Worker node traffic to the endpoint will stay within your VPC.
If we choose Public or Public and private, Advanced settings appear with the option to add/edit sources for the public access endpoint. We can add up to 40 CIDR blocks here.

Public access endpoint sources - Determines the traffic that can reach the Kubernetes API endpoint of this cluster.

Use CIDR notation to specify an IP address range (for example, 203.0.113.5/32).

If connecting from behind a firewall, you'll need the IP address range used by the client computers.

By default, your public endpoint is accessible from anywhere on the internet (0.0.0.0/0).

If you restrict access to your public endpoint using CIDR blocks, it is strongly recommended that you also enable private endpoint access so worker nodes and/or Fargate pods can communicate with the cluster. Without the private endpoint enabled, your public access endpoint CIDR sources must include the egress sources from your VPC. For example, if you have a worker node in a private subnet that communicates with the internet through a NAT Gateway, you will need to add the outbound IP address of the NAT Gateway as part of an allowlisted CIDR block on your public endpoint.
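The NAT Gateway caveat can be checked mechanically: the gateway's egress IP must fall inside at least one allow-listed CIDR. A sketch with hypothetical addresses:

```python
import ipaddress

# Hypothetical values for illustration.
public_access_cidrs = ["203.0.113.0/28", "198.51.100.7/32"]  # endpoint allowlist
nat_gateway_ip = "198.51.100.7"                              # NAT Gateway EIP

nat_covered = any(
    ipaddress.ip_address(nat_gateway_ip) in ipaddress.ip_network(cidr)
    for cidr in public_access_cidrs
)
print(nat_covered)  # True - nodes behind this NAT can reach the public endpoint
```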


---

Friday, 1 August 2025

Introduction to AWS IAM Identity Center




IAM Identity Center (formerly AWS Single Sign-On, or AWS SSO) enables you to centrally manage workforce access to multiple AWS accounts and applications via single sign-on.



IAM Identity Center setup


(1) Confirm your identity source


The identity source is where you administer users and groups, and it is the service that authenticates your users. By default, IAM Identity Center creates an Identity Center directory.

(2) Manage permissions for multiple AWS accounts


Give users and groups access to specific AWS accounts in your organization.

(3) Set up application user and group assignments


Give users and groups access to specific applications configured to work with IAM Identity Center.

(4) Register a delegated administrator


Delegate the ability to manage IAM Identity Center to a member account in your AWS organization.



AWS SSO Authentication


When you run aws sso login, it initiates an authentication flow that communicates with your organization's configured IAM Identity Center instance to obtain temporary AWS credentials for CLI use. This command does not interact with the legacy AWS SSO service, but with the current IAM Identity Center (the new, official name as of July 2022). The authentication process exchanges your SSO credentials for tokens that allow you to use other AWS CLI commands with the associated permissions.

aws sso login --profile my_profile

The profile named in aws sso login --profile my_profile must be defined in your AWS CLI configuration file, specifically in ~/.aws/config (on Linux/macOS) or %USERPROFILE%\.aws\config (on Windows).

To define or create an SSO profile, use the interactive command:

aws configure sso --profile my_profile

This command will prompt you for required details such as the SSO start URL, AWS region, account ID, and role name, and then write them into your ~/.aws/config file.

A typical SSO profile configuration in ~/.aws/config might look like:

[profile my_profile]
sso_session = my-sso
sso_account_id = 123456789012
sso_role_name = AdministratorAccess

[sso-session my-sso]
sso_start_url = https://myorg.awsapps.com/start
sso_region = us-east-1
sso_registration_scopes = sso:account:access


Never define the profile in ~/.aws/credentials; SSO profiles rely on ~/.aws/config.

After defining it, aws sso login --profile my_profile will use the details in ~/.aws/config to initiate login.

The most straightforward method is using aws configure sso with your desired profile name as shown above.

If you already have an SSO session defined, you can reuse it across multiple profiles by referencing the same sso_session.
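Since profiles and sso-sessions are plain INI sections, the layout above can be inspected with Python's configparser; the values below are the same placeholders as in the example config.

```python
import configparser
from io import StringIO

# Same placeholder values as the example ~/.aws/config above.
sample = """\
[profile my_profile]
sso_session = my-sso
sso_account_id = 123456789012
sso_role_name = AdministratorAccess

[sso-session my-sso]
sso_start_url = https://myorg.awsapps.com/start
sso_region = us-east-1
"""

config = configparser.ConfigParser()
config.read_file(StringIO(sample))

# Resolve the profile's shared SSO session, as the CLI does:
session = config["profile my_profile"]["sso_session"]
print(config[f"sso-session {session}"]["sso_start_url"])  # https://myorg.awsapps.com/start
```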
...

Monday, 21 July 2025

AWS Site-to-Site VPN


How to set up a VPN connection between the office router and AWS VPN?

How to set up an IPsec VPN connection between our office router (e.g. Cisco ASA) and the AWS VPN endpoints?

AWS Virtual Private Network solutions establish secure connections between our on-premises networks, remote offices, client devices, and the AWS global network. 

AWS VPN is comprised of two services: AWS Site-to-Site VPN and AWS Client VPN. 

Each service provides a highly-available, managed, and elastic cloud VPN solution to protect our network traffic.

In this article we'll talk about AWS Site-to-Site VPN.


AWS Site-to-Site VPN 


Network diagram:


on-premises LAN: 192.168.0.0/16
-----------------------------------------
 / \                         / \
  |                           |
  |  active tunnel            |  passive (standby) tunnel
  |                           |
 \ /                         \ /
-----------------------------------------
Router 1                     Router 2

VGW - Virtual Private Gateway
VPC: 172.16.0.0/16; route table: 192.168.0.0/16 ---> VGW-xxxx


Can VPC CIDR and LAN CIDR overlap?

A VPN connection consists of two tunnels:
  • active (up and running)
  • passive (down); if the first one goes down, this one takes over

The VPC route table needs to be modified so that traffic destined for 192.168.0.0/16 is routed to VGW-xxxx.
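On the overlap question above: the 192.168.0.0/16 ---> VGW route only works cleanly if the VPC and LAN ranges are disjoint, since an overlapping destination would make routing ambiguous. This can be verified with Python's ipaddress module, using the CIDRs from the diagram:

```python
import ipaddress

vpc = ipaddress.ip_network("172.16.0.0/16")   # VPC CIDR from the diagram
lan = ipaddress.ip_network("192.168.0.0/16")  # on-premises LAN CIDR

# If these overlapped, the route table entry would be ambiguous.
print(vpc.overlaps(lan))  # False - safe to route 192.168.0.0/16 to the VGW

# Counter-example: a LAN subnet inside the LAN range does overlap it.
print(lan.overlaps(ipaddress.ip_network("192.168.10.0/24")))  # True
```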


The AWS Site-to-Site VPN setup consists of three components:

Creating and configuring a Customer Gateway


Customer Gateway is a resource that we create in AWS that represents the (customer) gateway device in our on-premises network.

When we create a customer gateway, we provide information about our device to AWS. We or our network administrator must configure the device to work with the site-to-site VPN connection.


We first need to create a Customer Gateway in AWS. We can do that via AWS console or Terraform provider. 



If we click on Create customer gateway, we'll see this form:



Details

  • Name tag
    • optional
    • Creates a tag with a key of 'Name' and a value that we specify.
    • Value must be 256 characters or less in length.
  • BGP ASN
    • The ASN of our customer gateway device.
    • e.g. 65000
    • Value must be in 1 - 4294967294 range.
    • The Border Gateway Protocol (BGP) Autonomous System Number (ASN) in the range of 1 – 4,294,967,294 is supported. We can use an existing public ASN assigned to our network, with the exception of the following:
      • 7224 - Reserved in all Regions
      • 9059 - Reserved in the eu-west-1 Region
      • 10124 - Reserved in the ap-northeast-1 Region
      • 17943 - Reserved in the ap-southeast-1 Region
    • If we don't have a public ASN, we can use a private ASN in the range of 64,512–65,534 or 4,200,000,000 - 4,294,967,294. The default ASN is 65000.
    • It is required if we want to set up dynamic routing. If we want to use static routing, we can use an arbitrary (default) value.
    • Where to find BGP ASN for e.g. UDM Pro?
    • If we want to use IPsec and dynamic routing, then our router device needs to support BGP over IPsec
    • When to use static and when to use dynamic routing?
  • IP address
    • Specify the IP address for our customer gateway device's external interface. This is internet-routable IP address for our gateway's external interface.
    • The address must be static and can't be behind a device performing Network Address Translation (NAT)
    • If office router is connected to ISP via e.g. WAN1 connection, this is the IP of that WAN connection 
    • Basically, this is the office's public IP address.
  • Certificate ARN
    • optional
    • The ARN of a private certificate provisioned in AWS Certificate Manager (ACM).
    • We can select certificate ARN from a drop-down list
    • How is this certificate used?
    • When to use this certificate?
  • Device
    • optional
    • A name for the customer gateway device.
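The ASN rules above can be condensed into a small validity check (a sketch; the reserved list is the per-Region set quoted above):

```python
RESERVED_ASNS = {7224, 9059, 10124, 17943}  # reserved, per the list above

def valid_customer_gateway_asn(asn: int) -> bool:
    """Within the supported 1 - 4,294,967,294 range and not reserved."""
    return 1 <= asn <= 4_294_967_294 and asn not in RESERVED_ASNS

def is_private_asn(asn: int) -> bool:
    """Private ASN ranges usable when we don't own a public ASN."""
    return 64_512 <= asn <= 65_534 or 4_200_000_000 <= asn <= 4_294_967_294

print(valid_customer_gateway_asn(65000))  # True - the default ASN
print(valid_customer_gateway_asn(7224))   # False - reserved in all Regions
print(is_private_asn(65000))              # True
```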

Creating and configuring a Virtual private gateway


A virtual private gateway is the VPN concentrator on the Amazon side of the site-to-site VPN connection. We create a virtual private gateway and attach it to the VPC we want to use for the site-to-site VPN connection.


A VPN concentrator is a specialized networking device designed to manage numerous secure connections (VPN tunnels) for remote users or sites accessing a central network. It acts as a central point for establishing, processing, and maintaining these connections, enabling large organizations to securely connect many users simultaneously. 

Key Functions:
  • Multiple VPN Tunnel Management: VPN concentrators handle a large number of encrypted VPN tunnels simultaneously, allowing multiple users to securely connect to the network. 
  • Centralized Security: They provide a central point for managing and enforcing security policies for all remote connections, ensuring consistent protection. 
  • Scalability: VPN concentrators are designed to handle a large number of users and connections, making them suitable for large organizations with many remote workers or sites. 
  • Traffic Encryption: They encrypt all data transmitted between the remote user and the central network, ensuring secure communication and protecting sensitive information. 
  • Enhanced Security Posture: By managing and controlling all VPN connections, they help organizations maintain a strong security posture and minimize risks associated with remote access. 
How it Works:
  • 1. Remote User Connection: Remote users initiate a VPN connection, which is then routed to the VPN concentrator. 
  • 2. Authentication and Authorization: The concentrator authenticates and authorizes the user, verifying their identity and permissions. 
  • 3. Tunnel Establishment: If the user is authorized, the concentrator establishes an encrypted VPN tunnel between the user's device and the central network. 
  • 4. Secure Communication: All data transmitted through the tunnel is encrypted, protecting it from eavesdropping or interception. 
  • 5. Traffic Management: The concentrator manages and prioritizes traffic within the network, ensuring efficient and secure communication. 
Use Cases:
  • Large Enterprises: Companies with numerous remote employees often use VPN concentrators to provide secure access to their internal network. 
  • Extranet VPNs: VPN concentrators are also used in extranet setups, where multiple organizations need to securely share resources and information. 
  • Large Scale Remote Access: They are ideal for organizations that need to provide secure remote access to a large number of users from various locations. 
In essence, a VPN concentrator is a robust and scalable solution for managing secure remote access in larger organizations, providing the necessary infrastructure for secure and efficient communication across the network.




If we click on the Create button, we'll get this form to fill in:


If we select Custom ASN:



Upon creation, the VGW will be in a detached state. We want to attach it to a VPC.
We can select which VPC to attach it to.

Tuesday, 8 July 2025

How to install MongoDB Shell (mongosh) on Mac

 


The main Homebrew repository no longer includes MongoDB due to licensing changes made by MongoDB Inc. So to install MongoDB-related tools (like mongosh, mongodb-community, or mongod), we need to use their own tap (mongodb/brew), which contains these formulas.

Tap is a package source (formula repository).


Let's add tap maintained by MongoDB to our local Homebrew setup:

% brew tap mongodb/brew

To install mongo shell:

% brew install mongosh

Verification:

% mongosh --version 
2.5.3


If we now run it with no arguments, it will try to connect to the local instance:

% mongosh                    
Current Mongosh Log ID: 6853eadee32b5e6cd3cc5d2f
Connecting to: mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.5.3
MongoNetworkError: connect ECONNREFUSED 127.0.0.1:27017

Let's explore the arguments:

% mongosh --help

  $ mongosh [options] [db address] [file names (ending in .js or .mongodb)]

  Options:

    -h, --help                                 Show this usage information
    -f, --file [arg]                           Load the specified mongosh script
        --host [arg]                           Server to connect to
        --port [arg]                           Port to connect to
        --build-info                           Show build information
        --version                              Show version information
        --quiet                                Silence output from the shell during the connection process
        --shell                                Run the shell after executing files
        --nodb                                 Don't connect to mongod on startup - no 'db address' [arg] expected
        --norc                                 Will not run the '.mongoshrc.js' file on start up
        --eval [arg]                           Evaluate javascript
        --json[=canonical|relaxed]             Print result of --eval as Extended JSON, including errors
        --retryWrites[=true|false]             Automatically retry write operations upon transient network errors (Default: true)

  Authentication Options:

    -u, --username [arg]                       Username for authentication
    -p, --password [arg]                       Password for authentication
        --authenticationDatabase [arg]         User source (defaults to dbname)
        --authenticationMechanism [arg]        Authentication mechanism
        --awsIamSessionToken [arg]             AWS IAM Temporary Session Token ID
        --gssapiServiceName [arg]              Service name to use when authenticating using GSSAPI/Kerberos
        --sspiHostnameCanonicalization [arg]   Specify the SSPI hostname canonicalization (none or forward, available on Windows)
        --sspiRealmOverride [arg]              Specify the SSPI server realm (available on Windows)

  TLS Options:

        --tls                                  Use TLS for all connections
        --tlsCertificateKeyFile [arg]          PEM certificate/key file for TLS
        --tlsCertificateKeyFilePassword [arg]  Password for key in PEM file for TLS
        --tlsCAFile [arg]                      Certificate Authority file for TLS
        --tlsAllowInvalidHostnames             Allow connections to servers with non-matching hostnames
        --tlsAllowInvalidCertificates          Allow connections to servers with invalid certificates
        --tlsCertificateSelector [arg]         TLS Certificate in system store (Windows and macOS only)
        --tlsCRLFile [arg]                     Specifies the .pem file that contains the Certificate Revocation List
        --tlsDisabledProtocols [arg]           Comma separated list of TLS protocols to disable [TLS1_0,TLS1_1,TLS1_2]
        --tlsFIPSMode                          Enable the system TLS library's FIPS mode

  API version options:

        --apiVersion [arg]                     Specifies the API version to connect with
        --apiStrict                            Use strict API version mode
        --apiDeprecationErrors                 Fail deprecated commands for the specified API version

  FLE Options:

        --awsAccessKeyId [arg]                 AWS Access Key for FLE Amazon KMS
        --awsSecretAccessKey [arg]             AWS Secret Key for FLE Amazon KMS
        --awsSessionToken [arg]                Optional AWS Session Token ID
        --keyVaultNamespace [arg]              database.collection to store encrypted FLE parameters
        --kmsURL [arg]                         Test parameter to override the URL of the KMS endpoint

  OIDC auth options:

        --oidcFlows[=auth-code,device-auth]    Supported OIDC auth flows
        --oidcRedirectUri[=url]                Local auth code flow redirect URL [http://localhost:27097/redirect]
        --oidcTrustedEndpoint                  Treat the cluster/database mongosh as a trusted endpoint
        --oidcIdTokenAsAccessToken             Use ID tokens in place of access tokens for auth
        --oidcDumpTokens[=mode]                Debug OIDC by printing tokens to mongosh's output [redacted|include-secrets]
        --oidcNoNonce                          Don't send a nonce argument in the OIDC auth request

  DB Address Examples:

        foo                                    Foo database on local machine
        192.168.0.5/foo                        Foo database on 192.168.0.5 machine
        192.168.0.5:9999/foo                   Foo database on 192.168.0.5 machine on port 9999
        mongodb://192.168.0.5:9999/foo         Connection string URI can also be used

  File Names:

        A list of files to run. Files must end in .js and will exit after unless --shell is specified.

  Examples:

        Start mongosh using 'ships' database on specified connection string:
        $ mongosh mongodb://192.168.0.5:9999/ships

  For more information on usage: https://mongodb.com/docs/mongodb-shell.

To test connection:

% mongosh "mongodb://myuser:pass@mongo0.example.com:27017,mongo1.example.com:27017,mongo2.example.com:27017"
Current Mongosh Log ID: 6853ed1a745676a15bb62b1f
Connecting to: mongodb://<credentials>@mongo0.example.com:27017,mongo1.example.com:27017,mongo2.example.com:27017/?appName=mongosh+2.5.3
Using MongoDB: 8.0.8-3
Using Mongosh: 2.5.3

For mongosh info see: https://www.mongodb.com/docs/mongodb-shell/


To help improve our products, anonymous usage data is collected and sent to MongoDB periodically (https://www.mongodb.com/legal/privacy-policy).
You can opt-out by running the disableTelemetry() command.

------
   The server generated these startup warnings when booting
   2025-06-11T15:17:19.361+00:00: While invalid X509 certificates may be used to connect to this server, they will not be considered permissible for authentication
------
[mongos] test>

To test if this is a master cluster:

[mongos] test> db.runCommand({ isMaster: 1 })
{
  ismaster: true,
  msg: 'isdbgrid',
  topologyVersion: {
    processId: ObjectId('68499daa6d644d093f3230a7'),
    counter: Long('0')
  },
  maxBsonObjectSize: 16777216,
  maxMessageSizeBytes: 48000000,
  maxWriteBatchSize: 100000,
  localTime: ISODate('2025-06-19T10:58:20.543Z'),
  logicalSessionTimeoutMinutes: 30,
  connectionId: 7122158,
  maxWireVersion: 25,
  minWireVersion: 0,
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1750330700, i: 1 }),
    signature: {
      hash: Binary.createFromBase64('2yaQ2MNlRYg3aXYhfRzQ4jxXIA0=', 0),
      keyId: Long('7511354796578701336')
    }
  },
  operationTime: Timestamp({ t: 1750330700, i: 1 })
}

To issue hello command (which returns a document that describes the role of the mongod instance): 

[mongos] test> db.runCommand({ hello: 1 })
{
  isWritablePrimary: true,
  msg: 'isdbgrid',
  topologyVersion: {
    processId: ObjectId('68499d52d772b382ee78bcc8'),
    counter: Long('0')
  },
  maxBsonObjectSize: 16777216,
  maxMessageSizeBytes: 48000000,
  maxWriteBatchSize: 100000,
  localTime: ISODate('2025-06-19T10:58:43.289Z'),
  logicalSessionTimeoutMinutes: 30,
  connectionId: 7126567,
  maxWireVersion: 25,
  minWireVersion: 0,
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1750330723, i: 1 }),
    signature: {
      hash: Binary.createFromBase64('4en7J3oSF9fRGUUOHmkq4icWsOQ=', 0),
      keyId: Long('7511354796578701336')
    }
  },
  operationTime: Timestamp({ t: 1750330723, i: 1 })
}
[mongos] test> 


Monday, 30 June 2025

Introduction to Amazon API Gateway


 Amazon API Gateway:

  • fully managed service to create, publish, maintain, monitor, and secure APIs at any scale
    • APIs act as the "front door" for applications to access data, business logic, or functionality from our backend services
  • allows creating:
    • RESTful APIs
      • optimized for serverless workloads and HTTP backends using HTTP APIs
        • they act as triggers for Lambda functions
      • HTTP APIs are the best choice for building APIs that only require API proxy functionality
      • Use REST APIs if our APIs require both of the following in a single solution:
        • API proxy functionality 
        • API management features
    • WebSocket APIs that enable real-time two-way communication applications
  • supports:
    • containerized workloads
    • serverless workloads
    • web applications
  • handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including:
    • traffic management
    • CORS support
    • authorization and access control
    • throttling
    • monitoring
    • API version management
  • has no minimum fees or startup costs. We pay for the API calls we receive and the amount of data transferred out and, with the API Gateway tiered pricing model, we can reduce our cost as our API usage scales


RESTful APIs


What is the difference between REST API endpoints (apiGateway) and HTTP API endpoints (httpApi)?

The difference between REST API endpoints (apiGateway) and HTTP API endpoints (httpApi) in Amazon API Gateway primarily comes down to features, performance, cost, and use cases.


REST API endpoints (apiGateway):
  • Older, feature-rich, supports API keys, usage plans, request/response validation, custom authorizers, and more.
  • More configuration options, but higher latency and cost.
  • Defined under the provider.apiGateway section and function events: http.

HTTP API endpoints (httpApi):
  • Newer, simpler, faster, and cheaper.
  • Supports JWT/Lambda authorizers, CORS, and OIDC, but lacks some advanced REST API features.
  • Defined under provider.httpApi and function events: httpApi.


Friday, 27 June 2025

GitHub Workflows and AWS




A GitHub workflow can communicate with our AWS resources, either directly (via AWS CLI commands) or indirectly (via e.g. the Terraform AWS provider).

Before running AWS CLI commands, deploying AWS infrastructure with Terraform, or interacting with AWS services in any other way, we need to include a step that configures AWS credentials. It ensures that the workflow runner is authenticated with AWS and knows which region to target.

This step should use the configure-aws-credentials action provided by AWS. This action sets up the necessary environment variables so that AWS CLI commands and SDKs can authenticate with AWS services.

The aws-region input sets the default AWS region to us-east-2 (Ohio). All AWS commands run in later steps will use this region unless overridden.

We can use either IAM user or OIDC (temporary credentials) authentication.

IAM User Authentication


If using IAM user authentication, we can store the user's credentials in dedicated GitHub secrets:

env:
    AWS_ACCOUNT_ID: ${{ secrets.AWS_ACCOUNT_ID }}
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    AWS_REGION: us-east-2

# Define this step before any steps that access AWS:

- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v2
  with:
    aws-region: ${{ env.AWS_REGION }}

 OpenID Connect (OIDC) Authentication


In this authentication, configure-aws-credentials GitHub Action uses GitHub's OpenID Connect (OIDC) for secure authentication with AWS. It leverages the OIDC token provided by GitHub to request temporary AWS credentials from AWS STS, eliminating the need to store long-lived AWS access keys in GitHub Secrets. 

Note that we now need to grant the workflow write access to the id-token permission:
id-token: write allows the workflow to request and use OpenID Connect (OIDC) tokens. The write level is required for actions that need to generate or use OIDC tokens to authenticate with external systems, such as securely assuming AWS IAM roles from GitHub Actions. This enables secure, short-lived authentication to AWS and other cloud providers, and reduces the need for static secrets in CI/CD workflows.


env:
    AWS_REGION: us-east-2

permissions:
  id-token: write # aws-actions/configure-aws-credentials (OIDC)

...
- name: Configure AWS Credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/github-actions-role
    role-session-name: my-app
    aws-region: ${{ env.AWS_REGION }}



Here's how it works: 
  1. GitHub OIDC Provider: GitHub acts as an OIDC provider, issuing signed JWTs (JSON Web Tokens) to workflows that request them.
  2. configure-aws-credentials Action: This action, when invoked in a GitHub Actions workflow, receives the JWT from the OIDC provider.
  3. AWS STS Request: The action then uses the JWT to request temporary security credentials from AWS Security Token Service (STS).
  4. Credential Injection: AWS STS returns temporary credentials (access key ID, secret access key, and session token) which the action injects as environment variables into the workflow's execution environment.
  5. AWS SDKs and CLI: AWS SDKs and the AWS CLI automatically detect and use these environment variables for authenticating with AWS services.

Benefits of using OIDC with configure-aws-credentials:
  • Enhanced Security: Eliminates the need to store long-lived AWS access keys, reducing the risk of compromise.
  • Simplified Credential Management: Automatic retrieval and injection of temporary credentials, simplifying workflow setup and maintenance.
  • Improved Auditing: Provides better traceability of actions performed within AWS, as the identity is linked to the GitHub user or organization. 

Before using the action:
  • Configure an OpenID Connect provider in AWS: We need to establish an OIDC trust relationship between GitHub and our AWS account.
  • Create an IAM role in AWS: Define the permissions for the role that the configure-aws-credentials action will assume.
  • Set up the GitHub workflow: Configure the configure-aws-credentials action with the appropriate parameters, such as the AWS region and the IAM role to assume. 
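For the first two setup steps, the IAM role needs a trust policy that allows GitHub's OIDC provider to assume it. A minimal sketch, assuming the OIDC provider is already registered in the account; the account ID and the my-org/my-repo repository are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:*"
        }
      }
    }
  ]
}
```

The sub condition restricts which repository (and optionally branch or environment) may assume the role; tightening it is important, since any workflow matching the pattern gets the role's permissions.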

In an OpenID Connect (OIDC) authentication scenario, the aws-actions/configure-aws-credentials action creates the following environment variables when assuming a role with temporary credentials: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN. These variables are used by the AWS SDK and CLI to interact with AWS resources. 

Here's a breakdown:
  • AWS_ACCESS_KEY_ID: This environment variable stores the access key ID of the temporary credentials. 
  • AWS_SECRET_ACCESS_KEY: This environment variable stores the secret access key of the temporary credentials. 
  • AWS_SESSION_TOKEN: This environment variable stores the session token associated with the temporary credentials, which is required for operations with AWS Security Token Service (STS). 

These environment variables are populated by the action after successful authentication with the OIDC provider and assuming the specified IAM role. The action retrieves the temporary credentials from AWS and makes them available to subsequent steps in the workflow. 
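A quick way to confirm that the role was assumed correctly is to call STS from a subsequent step (a minimal sketch; the step name is arbitrary):

```yaml
- name: Verify AWS identity
  run: aws sts get-caller-identity
```

The output shows the account and the assumed-role ARN that later steps will act under.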


Once AWS authentication is done and these environment variables are created, the next steps in the workflow can access our AWS resources, e.g. read secrets from AWS Secrets Manager:

- name: Read secrets from AWS Secrets Manager into environment variables
  uses: aws-actions/aws-secretsmanager-get-secrets@v2
  with:
    secret-ids: |
      my-secret
    parse-json-secrets: true

- name: deploy
  run: |
    echo $AWS_ACCESS_KEY_ID
    echo $AWS_SECRET_ACCESS_KEY
  env:
    MY_KEY: ${{ env.MY_SECRET_MY_KEY }}

This example assumes that the AWS secret my-secret contains a key MY_KEY, set to the secret value we want to fetch and use.
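For illustration, if my-secret stores JSON like the following (the value is a placeholder), parse-json-secrets: true flattens each key into an environment variable named <SECRET_NAME>_<KEY>, uppercased with non-alphanumeric characters replaced by underscores, which is why MY_KEY becomes available as MY_SECRET_MY_KEY:

```json
{
  "MY_KEY": "some-secret-value"
}
```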

Friday, 13 June 2025

Introduction to Serverless Framework



Serverless Framework is a tool designed to streamline the development and deployment of serverless applications, including functions and infrastructure, by abstracting away the need to manage servers. 

We define the desired infrastructure in serverless.yml files and then deploy it by executing:

sls deploy

This command compiles the serverless.yml file into a larger AWS CloudFormation template, which is automatically filled with values from the YAML. 
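A few common invocations (a sketch; the stage and region values are placeholders). sls package performs the same compilation without deploying, which is handy for inspecting the generated template:

```shell
# Deploy the service described by serverless.yml in the current directory
sls deploy --stage production --region us-east-1

# Compile and package only; the generated CloudFormation template
# is left under the .serverless/ directory for inspection
sls package --stage production
```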

The sls deploy command in the Serverless Framework is effectively idempotent at the infrastructure level, but with important nuances:

How it works: 

sls deploy packages our service and deploys it via AWS CloudFormation. CloudFormation itself is designed to be idempotent: if we deploy the same stack with the same configuration and code, AWS will detect no changes and will not modify our resources. If there are changes, only those changes are applied.

What this means:

Repeated runs of sls deploy with no changes will not create duplicate resources or apply unnecessary updates.

If we make changes (to code, configuration, or infrastructure), only the differences are deployed.

Side effects in Lambda code: While infrastructure deployment is idempotent, our Lambda functions themselves must be written to handle repeated invocations safely if we want end-to-end idempotency. The deployment command itself does not guarantee idempotency at the application logic level.

Limitations:

If we use sls deploy function (to update a single function without CloudFormation), this command simply swaps out the function code and is also idempotent in the sense that re-uploading the same code does not cause issues.

If we use plugins or custom resources, their behavior may not always be idempotent unless explicitly designed that way.

To conclude:
  • sls deploy is idempotent for infrastructure: Re-running it with no changes is safe and does not cause duplicate resources or unintended side effects at the CloudFormation level.
  • Application-level idempotency is our responsibility: Ensure our Lambda functions and integrations handle repeated events if that is a requirement for our use case
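Application-level idempotency can be sketched as follows. This is an in-memory Node.js illustration only; a real handler would record processed event ids in a durable store, e.g. a DynamoDB PutItem with a condition expression, rather than in a Set:

```javascript
// Hypothetical illustration: deduplicate deliveries by a stable event id.
const processed = new Set();

function handleEvent(event) {
  if (processed.has(event.id)) {
    // Duplicate delivery: skip side effects so re-invocation is safe.
    return { status: "skipped" };
  }
  processed.add(event.id);
  // ... perform side effects exactly once per event id ...
  return { status: "processed" };
}

module.exports = { handleEvent };
```

With this shape, a retried SNS delivery or a duplicated scheduled invocation leaves the system in the same state as a single delivery.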

Serverless Yaml Configuration File


A serverless.yml file defines a serverless service. It is a good idea to break a serverless project into multiple services, each defined by its own serverless.yml file, rather than keeping everything in one big infrastructure stack. 

Example:
  • database e.g. DynamoDB
  • REST API, e.g. one which handles a submitted web form and stores data in DynamoDB
  • front-end website, e.g. a React app served from an S3 bucket

Services can be deployed in multiple regions. (Multi-region architecture is supported)


serverless.yml example:


service: my-service
frameworkVersion: "3"
useDotenv: true
plugins: 
  - serverless-plugin-log-subscription
  - serverless-dotenv-plugin
provider:
  name: aws
  runtime: nodejs14.x
  region: us-east-1
  memorySize: 512
  timeout: 900
  deploymentBucket:
    name: my-serverless-deployments
  vpc: 
    securityGroupIds: 
      - "sg-0123cf34f6c6354cb"
    subnetIds: 
      - "subnet-01a23493f9e755207"
      - "subnet-02b234dbd7d66d33c"
      - "subnet-03c234712e99ae1fb"
  iam: 
    role:
      statements:
        - Effect: Allow
          Action:
            - lambda:InvokeFunction
          Resource: arn:aws:lambda:us-east-1:123456789099:function:my-database
package:
  patterns:
    - "out/**"
    - "utils.js"
    - "aws-sdk"
functions:
  my-function:
    handler: lambda.handler
    events:
      - schedule:
          name: "my-service-${opt:stage, self:provider.stage}"
          description: "Periodically run my-service lambdas"
          rate: rate(4 hours)
          inputTransformer:
            inputTemplate: '{"Records":[{"EventSource":"aws:rate","EventVersion":"1.0","EventSubscriptionArn":"arn:aws:sns:us-east-1:{{accountId}}:ExampleTopic","Sns":{"Type":"Notification","MessageId":"95df01b4-1234-5678-9903-4c221d41eb5e","TopicArn":"arn:aws:sns:us-east-1:123456789012:ExampleTopic","Subject":"example subject","Message":"example message","Timestamp":"1970-01-01T00:00:00.000Z","SignatureVersion":"1","Signature":"EXAMPLE","SigningCertUrl":"EXAMPLE","UnsubscribeUrl":"EXAMPLE","MessageAttributes":{"type":{"Type":"String","Value":"populate_unsyncronised"},"count":{"Type":"Number","Value":"400"}}}}]}'
      - sns:
          arn: arn:aws:sns:us-east-2:123456789099:trigger-my-service
      - http:
          path: my-function
          method: get
custom:
  dotenv:
    dotenvParser: env.loader.js
  logSubscription:
      enabled: true
      destinationArn: ${env:KINESIS_SUBSCRIPTION_STREAM}
      roleArn: ${env:KINESIS_SUBSCRIPTION_ROLE}



  • service: - name of the service
  • useDotenv: boolean (true|false)
  • configValidationMode: error
  • frameworkVersion: e.g. "3"
  • provider - cloud provider settings:
    • name - provider name e.g. aws
    • runtime - e.g. nodejs18.x
    • region e.g. us-east-1
    • memorySize - how much memory the Lambda execution environment will have, e.g. 1024 (MB). It is good to check the actual memory usage and adjust the requested memory size - downsizing can lower costs!
    • timeout: (number) e.g. 60 [seconds] - the maximum time, in seconds, that a serverless function (such as an AWS Lambda function) is allowed to run before the platform forcibly terminates it. This ensures that our function does not run indefinitely: if execution exceeds the timeout, the platform stops it and returns a timeout error. The timeout property controls resource usage and prevents runaway executions, and is especially important for functions that interact with external services or perform long-running tasks. If not specified, AWS Lambda uses a default of 3 seconds; the maximum is 900 seconds (15 minutes).
    • httpApi:
      • id:
    • apiGateway:
      • minimumCompressionSize: 1024
      • shouldStartNameWithService: true
      • restApiId: ""
      • restApiRootResourceId: ""
    • stage: - name of the environment, e.g. production
    • iamManagedPolicies: a list of ARNs of policies that will be associated to the Lambda's computing instance e.g. policy which allows access to S3 buckets etc...
    • lambdaHashingVersion
    • environment: dictionary of environment variable names and values
    • vpc
      • securityGroupIds: list 
      • subnetIds - typically a list of private subnets with NAT gateway. 
  • functions: a dictionary which defines the AWS Lambda functions that are deployed as part of this Serverless service. 
    • <function_name>: string, a logical name of the function (e.g., my-function). This name is used to reference the function within the Serverless Framework and in deployment outputs. The name of the provisioned Lambda function has the format <service_name>-<stage>-<function_name>. Each function entry under functions specifies:
      • handler - specifies the entry point for the Lambda function, i.e. which file and exported function to execute (e.g., src/fn/lambda.handler points to the handler export in the src/fn/lambda module). When the function is invoked, AWS Lambda executes this handler.
      • events - (optional, array) a list of events that trigger this function
        • Some triggers:
          • schedule, scheduled events: for periodic invocation (cron-like jobs)
          • sns: for invocation via an AWS SNS topic
          • HTTP endpoints,
          • S3 events
          • messages from a Kafka topic in an MSK cluster (msk)
        • If the array is empty, that means that the function currently has no event sources configured and will not be triggered automatically by any AWS event.
  • plugins: a list of Serverless plugins, e.g. serverless-dotenv-plugin or serverless-plugin-log-subscription
  • custom: - section for serverless plugins settings e.g. for esbuild, logSubscription, webpack etc...
    • example: serverless-plugin-log-subscription plugin has the settings:
      logSubscription: {
          enabled: true,
          destinationArn: process.env.SUBSCRIPTION_STREAM,
          roleArn: process.env.SUBSCRIPTION_ROLE,
      }

    • example: serverless-domain-manager - used to define stage-specific domains.

domains: {
  production: {
    url: "app.api.example.com",
    certificateArn: "arn:aws:acm:us-east-2:123456789012:certificate/a8f8f8e2-95fe-4934-abf2-19dc08138f1f",
  },
  staging: {
    url: "app.staging.example.com",
    certificateArn: "arn:aws:acm:us-east-2:123456789012:certificate/a32e9708-7aeb-495b-87b1-8532a2592eeb",
  },
  dev: {
    url: "",
    certificateArn: "",
  },
}
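Such a per-stage map is typically consumed by the plugin's customDomain settings. A hedged sketch in serverless.yml form - the custom.domains map and the stage lookup are assumptions for illustration, not a layout the plugin requires:

```yaml
plugins:
  - serverless-domain-manager

custom:
  domains:
    production:
      url: app.api.example.com
      certificateArn: arn:aws:acm:us-east-2:123456789012:certificate/a8f8f8e2-95fe-4934-abf2-19dc08138f1f
  customDomain:
    domainName: ${self:custom.domains.${opt:stage, 'dev'}.url}
    certificateArn: ${self:custom.domains.${opt:stage, 'dev'}.certificateArn}
    createRoute53Record: true
```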