My Public Notepad

Monday, 12 August 2024

Introduction to GitHub Actions

GitHub Actions (GHA):

Workflow automation service offered by GitHub
Allows you to automate all kinds of repository-related processes and actions
Offers automations around the code stored on GitHub and around the repositories that hold that code
Free for public repositories

Two main areas of processes that GitHub Actions can automate:

CI/CD processes (Continuous Integration/Continuous Delivery/Continuous Deployment) - methods for automating app development, testing, building and deployment

Continuous Integration is about handling code changes automatically: integrating new or changed code into an existing code base by building it automatically, testing it automatically, and then merging it into the existing code.
Continuous Delivery or Deployment is about publishing new versions of your app, package or website automatically after the code has been tested and integrated.
Example: After we make a change to the website code, we want to automatically upload and publish a new version of our website
GitHub Actions helps with setting up, configuring and running such CI/CD workflows
It makes it easy to set up processes that automatically build, test, and publish new versions of our app, website, or package whenever we make a code change.

Code and repository management - automating:

code reviews
issue management

Key Elements

Workflows
Jobs
Steps

Workflows:

Attached to GitHub repositories
We can add as many workflows to a GitHub repository as we wish
The first thing we build/create when setting up an automation process with GHA
Include one or more jobs
Built to set up some automated process that should be executed
Not executed all the time but on assigned triggers or events which define when a given workflow will be executed. Here are some examples of events that can be added:

an event that requires manual activation of a workflow
an event that executes a workflow whenever a new commit is pushed to a certain branch

Defined in a YAML file at this path: <repo_root>/.github/workflows/<workflow_name>.y[a]ml

Workflow can have the following elements:

name - name of the workflow
on - defines a workflow trigger. It can have some of these values:

workflow_call - can have the following attributes:

inputs - an object containing one or more key-value pairs. Inputs are supported for workflow_call (reusable workflows) and workflow_dispatch (manual trigger). In each pair, the key is the name of the input variable, which is used to reference the input later in the YAML document as ${{ inputs.<INPUT_VAR_NAME> }}. The value is an object containing one or more key-value pairs, where the keys can be:

required: boolean
type: string |
default - string, number or boolean - a default value of the input variable, if one is not specified/set

pull_request

Runs the workflow when activity on a pull request in the workflow's repository occurs (by default when a pull request is opened or reopened or when the head branch of the pull request is updated)
We can use the branches or branches-ignore filter to configure our workflow to only run on pull requests that target specific branches. E.g. if we specify master under branches, that means that this workflow will be triggered on any new/updated Pull Request which has master as a target (base) branch. Name of the feature branch can be arbitrary and is not important.
Inputs are NOT allowed for workflows triggered by pull_request. The pull_request event does not accept user-defined inputs because it is triggered automatically when a PR is opened, synchronized, or updated.
If you need dynamic behavior in a pull_request workflow, you can:

Use environment variables (env)
Use a GitHub secret or variable (secrets / vars)

push - run the workflow when a push is made to any branch in the workflow's repository
[ array of values above ] - if we want to combine multiple different triggers

permissions - workflow permissions
jobs - map of jobs. Each job is defined by stating its ID followed by a colon (e.g. jobA:). For each job we can define:

name: - Descriptive name of the job
runs-on: e.g. ubuntu-latest
steps: List of steps. It's a YAML list so each step object needs to be prepended with "- ". For each step we can define:

name: - step name
uses: - name of the GitHub action and the ref (branch, tag or commit SHA) to use, e.g. actions/checkout@master

Best practice is not to reference a moving branch (@master) but a fixed version, e.g. actions/checkout@v4

with: - block used to pass input parameters to an action

persist-credentials: boolean <-- specific input for certain actions, such as actions/checkout

run: - if we want to specify raw commands
env:

env - define environment variables available in all jobs

values can be hardcoded or read from GitHub secrets

on: examples:

on:
  workflow_call:
    inputs:
      AWS_REGION:
        type: string
        default: us-east-2

on:
  pull_request:
    types: [opened, reopened]

on:
  pull_request:
    types:
      - opened
    branches:
      - 'releases/**'
    paths:
      - '**.js'

permissions: examples:

permissions:
  id-token: write # aws-actions/configure-aws-credentials (OIDC)
  contents: read
  pull-requests: write # actions/github-script to create comment in PR

env:
  AWS_DEFAULT_REGION: us-east-2
  AWS_ACCOUNT_ID: ${{ secrets.AWS_ACCOUNT_ID }}

Jobs:

Contain one or more steps that will be executed in the order in which they're specified
Define a runner

Execution environment, the machine and operating system that will be used for executing these steps
Can either be predefined by GitHub (runners for Linux, macOS, and Windows) or self-hosted, custom, configured by ourselves

Steps will be executed in the specified runner environment/machine
If we have multiple jobs they run in parallel by default, but we can also configure them to run in sequential order, one job after another
We can also set up conditional jobs which will not always run, but which instead need a certain condition to be met.

if: This keyword allows you to specify a condition that must be true for the job or step to execute.

if: startsWith(github.ref, 'refs/tags/')

This line is used in a GitHub Actions workflow to conditionally run a job or step only when the workflow is triggered by a tag event.

startsWith(github.ref, 'refs/tags/')

This expression uses the startsWith function to check if the github.ref context variable begins with the string 'refs/tags/'.

github.ref contains the full Git reference that triggered the workflow, such as refs/heads/main for a branch or refs/tags/v1.0.0 for a tag.

If the workflow was triggered by pushing a tag, github.ref will start with refs/tags/.
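To make the check concrete, here is a rough Python sketch of what the expression evaluates. The function name is_tag_ref is made up for this note. GitHub documents startsWith as not case sensitive, so the sketch lowercases the ref before comparing.

```python
def is_tag_ref(ref: str) -> bool:
    """Approximate startsWith(github.ref, 'refs/tags/') from a workflow expression."""
    # GitHub's startsWith ignores case, so compare lowercased strings.
    return ref.lower().startswith("refs/tags/")


if __name__ == "__main__":
    for ref in ("refs/tags/v1.0.0", "refs/heads/main"):
        print(ref, "->", is_tag_ref(ref))
```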

Each job can have the following attributes:

name - Job name which will be shown in GitHub Actions tab
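needs - one or more job IDs that must complete successfully before this job starts
runs-on - runner type: ubuntu-latest or custom
env - its value is an object containing one or more key-value pairs where key is the environment variable name and value is its value. These environment values have the scope of the current job.
steps - list of steps that the job executes
outputs - values that dependent jobs can read as ${{ needs.<job_id>.outputs.<output_name> }}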

If the specified runner does not exist, the job is not picked up and stays queued, showing messages like these:

Waiting for a runner to pick up this job...

Job is about to start running on the runner: non_existing_runner_name

Example:

# create job named "plan"
plan:
  needs: [conflict, format, validate, lint, security]
  runs-on: ${{ inputs.RUNNER }}
  defaults:
    run:
      working-directory: ${{ inputs.TF_ROOT }}
  outputs:
    tfplan_identifier: ${{ steps.set_tfplan_output.outputs.tfplan_identifier }}
    tag_name: ${{ steps.get_tag.outputs.tag_name }}
  steps:
    - name: Step1
      run: echo "This is a Step 1"
    # Debugging only: piping through rev defeats log masking,
    # so the secret values end up readable (reversed) in the log
    - name: Print all available GitHub secrets
      run: |
        echo "GitHub secrets:"
        echo "$SECRETS_CONTEXT" | rev
      env:
        SECRETS_CONTEXT: ${{ toJson(secrets) }}
    - name: Get tag
      id: get_tag
      run: echo "tag_name=${GITHUB_SHA::7}" >> $GITHUB_OUTPUT

# job "deploy" waits for "plan" and reads its output
deploy:
  runs-on: ubuntu-latest
  needs: plan
  steps:
    - name: Print tag
      run: echo "Tag is ${{ needs.plan.outputs.tag_name }}"
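The "Get tag" step above works by appending a name=value line to the file whose path is in the GITHUB_OUTPUT environment variable. A step that runs a Python script can do the same thing. This is only a sketch: the helper names set_output and short_sha are made up, and the fallback SHA is a placeholder for running it outside a runner.

```python
import os
import tempfile


def set_output(name: str, value: str) -> None:
    """Append a name=value line to the file that GITHUB_OUTPUT points to."""
    with open(os.environ["GITHUB_OUTPUT"], "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


def short_sha(sha: str, length: int = 7) -> str:
    """Same idea as ${GITHUB_SHA::7} in bash: keep the first `length` characters."""
    return sha[:length]


if __name__ == "__main__":
    # Outside a runner GITHUB_OUTPUT is not set, so point it at a temp file.
    if "GITHUB_OUTPUT" not in os.environ:
        fd, path = tempfile.mkstemp()
        os.close(fd)
        os.environ["GITHUB_OUTPUT"] = path
    # Placeholder SHA, used only when GITHUB_SHA is not provided.
    sha = os.environ.get("GITHUB_SHA", "0123456789abcdef0123456789abcdef01234567")
    set_output("tag_name", short_sha(sha))
```

Inside a workflow the step would still need an id so that other steps and the job's outputs block can reference steps.<id>.outputs.tag_name.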

Steps:

Define the actual things that will be done
Example:

download the code in the first step
install the dependencies in the second step
run automated tests in the third step

Belong to jobs, and a job can have one or more steps
And a step is either:

a shell script
a command in the command line that should be executed (e.g. for simple tasks), or
an action, which is another important building block

predefined scripts that perform a certain task
We can build our own actions or use third party actions

A job must have at least one step
Steps are then executed in order, they don't run in parallel, but instead, step after step
Steps can also be conditional

Steps are elements of yaml list named steps.

A step element can contain the following keys:

name - step name (step title)
id - a string uniquely identifying the step within enclosing job. Used in references to this step e.g. ${{ steps.<id>.outputs.<output_variable_name> }}
uses - name of the reusable GitHub Action
with - input parameters passed to the action
env - its value is an object containing one or more key-value pairs where key is the environment variable name and value is its value. These environment values have the scope of the current step. In a run script, read a variable as $<ENV_VAR_NAME>; in JavaScript executed by actions/github-script, use process.env.<ENV_VAR_NAME>.

environment variable can be used as workflow's internal variable
some actions require certain environment variables to be defined and set e.g. https://github.com/terraform-linters/tflint/blob/master/docs/user-guide/plugins.md#avoiding-rate-limiting

continue-on-error - a boolean (true|false) value denoting whether the job continues with the next steps even if this step fails
run - an arbitrary shell script (bash by default on Linux and macOS runners)

Example step:

- name: Step 1
  uses: actions/github-script@v7
  env:
    PLAN_OUTPUT: "${{ steps.plan.outputs.stdout }}"
  with:
    script: |
      // read the value passed in through the step's env block
      let plan = process.env.PLAN_OUTPUT

- name: TFLint Init
  run: tflint --init
  env:
    # https://github.com/terraform-linters/tflint/blob/master/docs/user-guide/plugins.md#avoiding-rate-limiting
    GITHUB_TOKEN: ${{ github.token }}

- name: Terraform Plan
  id: plan
  continue-on-error: true
  env:
    TF_ROOT: "${{ inputs.TF_ROOT }}"
  run: |
    tfplan_identifier="${{ inputs.TF_APP_NAME }}-tfplan-expected"
    echo "tfplan_identifier=$tfplan_identifier" >> $GITHUB_OUTPUT
    terraform plan -var-file="prod.tfvars" -input=false -no-color -out=${tfplan_identifier}
    terraform-bin show -no-color $tfplan_identifier > "$tfplan_identifier.log"

How to create a Workflow?

Workflow can be created in two ways:

directly on the remote, via browser
in the local repo, and then pushed to remote

If we use the browser, we go to our repo's web page and click on the Actions tab. There we can select the default workflow or choose some other template. The default workflow creates the following file:

my-repo/.github/workflows/blank.yml:

# This is a basic workflow to help you get started with Actions

name: ci-prod-my-app

# Controls when the workflow will run
on:
  # Triggers the workflow on push or pull request events but only for the "main" branch
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v4

      # Runs a single command using the runners shell
      - name: Run a one-line script
        run: echo Hello, world!

      # Runs a set of commands using the runners shell
      - name: Run a multi-line script
        run: |
          echo Add other actions to build,
          echo test, and deploy your project.

How to trigger a workflow?

Workflows can be triggered on various events:

pushing changes on an arbitrary branch
creating a pull request (including a draft PR)
pushing a tag to remote
manually
...

Certain triggers (like workflow_dispatch) require the workflow file to be on the default branch (e.g. main) before the workflow shows up in the GitHub Actions tab.

CI workflows are usually triggered on pushing changes to remote or on creating a pull request.

Trigger on push to any branch:

on:
  push:
    branches:
      - '**'

Trigger on pull requests targeting the master branch:

on:
  pull_request:
    branches:
      - master

Trigger on pull requests targeting the main or prod branch:

on:
  pull_request:
    branches: [main, prod]

Trigger on pull requests that change files at any of these paths:

on:
  pull_request:
    paths: ["path/to/my-app/prod/app/**", "terraform/modules/**", ".github/workflows/my-app-prod-tf-ci.yaml"]

CD workflows are usually triggered on pushing a tag to remote:

on:
  push:
    tags:
      - "my-app/prod/v*"

Workflows which un-deploy the apps or destroy resources are usually triggered manually:

on:
  workflow_dispatch:

workflow_dispatch-triggered workflow files need to be on the default branch (e.g. main). See Events that trigger workflows - GitHub Docs.

Once the files are there, we can select the branch against which we want to execute this workflow.

If we define input variables, they will be listed and we can enter their values.

Deployment Environments

CD (Deployment) workflows deploy applications and/or provision resources on the remote infrastructure. We want to be 100% sure that CD workflow won't deploy e.g. changes from test branch into production. To make sure there will be no mismatch in the commit hash and deployment environment, we can specify deployment environment in the workflow and also in GitHub specify protection rules for it e.g. which tag (defined by tag format) can deploy in that environment.

Environments in a GitHub repository serve as configurable targets like production, staging, or development that help manage settings, secrets, and deployment strategies for different phases of a project’s lifecycle. They enable you to define specific contexts for deployments, enhance the security and control of your workflow operations, and regulate who can approve or trigger deployments to sensitive environments like production.

Purpose of Environments

Segregation of Contexts: Environments allow you to separate deployment targets such as development, staging, and production. Each environment can have its own configuration, secrets, and rules.

Security: Environment-specific secrets (like API keys) are only accessible within jobs that use that environment, safeguarding sensitive information from unauthorized access or accidental leaks.

Controlled Deployments: You can enforce protections such as required reviewer approvals, deployment delays, or branch restrictions, which are especially useful for critical environments such as production.

Audit and Oversight: GitHub tracks and displays deployment activity for each environment, providing audit trails and deployment histories.

Using Environments in GitHub Workflows

Create Environments:

Go to your repository’s Settings and find the Environments section.

Add environments such as production, staging, or test, and configure options like required approvals, wait timers, or environment URLs.

Configure Secrets and Variables:

Add secrets or variables specific to each environment; these override repository-level secrets of the same name when the workflow references the environment.

Reference Environments in Workflows:

In your workflow YAML files, specify the environment with the environment keyword within jobs:

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      ...

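This gives the job access to the secrets and variables defined for the specified environment (on top of the repository-level ones), and the job only starts once the environment's protection rules are satisfied.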

Leverage Protections:

Set up approval rules, wait times, or limit which branches can trigger workflows for each environment to make deployment processes as robust or restricted as needed.

Example Use Case

A deployment workflow can have separate jobs for development, staging, and production, each referencing their respective environment. This provides strict control over what is deployed, where, and under what circumstances, enabling safer and more reliable releases.

Environments are essential for preserving project integrity, securing secrets, and establishing clear, traceable deployment processes across the CI/CD pipeline in GitHub Actions.

Usual workflows in a repository

Continuous Integration (CI) Workflow

Usually triggered on open PR or new commits in PR
Runs:

tests
builds
packaging

Continuous Delivery (CD) Workflow

Usually triggered on merging feature branch to main or cutting the tag on main
Runs:

builds
packages
deploys

Useful Reusable Actions

actions/github-script: Write workflows scripting the GitHub API in JavaScript

Examples

Automate Terraform with GitHub Actions | Terraform | HashiCorp Developer

GitHub Action Versions

Using @master as version is a bad practice. The risks:

No reproducibility — master is a moving target; the same workflow run can behave differently on different days
Silent breaking changes — any commit merged to the action's master immediately affects your workflow with no review or opt-in
Security risk — if the action's repo is compromised, a malicious commit to master runs in your pipeline instantly; pinning to a specific SHA or tag limits the blast radius

Best practice is to pin to a version tag (e.g. actions/checkout@v4) or, even better, a full commit SHA (e.g. actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683). The SHA is tamper-proof even if a tag is force-pushed.

Two options for tagging all actions:

All use SHAs — most secure (immune to tag force-push), but opaque (de0fac2e... tells you nothing without a comment) and harder to maintain
All use version tags — human-readable, easy to update, but a malicious/accidental force-push to a tag would silently affect you

GitHub's own security hardening guide recommends full SHAs. A common middle ground is SHAs with a version comment, e.g.:

uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

Note: @vM is called a floating tag. It always points to the latest release that has M as its major version, so over time it can point to different full versions; it floats.
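To see which style each action reference in a workflow uses, a small script can scan the uses: lines. This is a minimal sketch, not an official tool: the function names, the category labels and the regular expressions are my own assumptions, and treating vM and vM.N as floating tags is only a heuristic.

```python
import re

# Matches lines such as "- uses: owner/repo@ref  # comment".
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")
EXACT_TAG_RE = re.compile(r"^v\d+\.\d+\.\d+$")
FLOATING_TAG_RE = re.compile(r"^v\d+(\.\d+)?$")


def classify_ref(ref: str) -> str:
    """Label a ref as sha, version-tag, floating-tag or branch-or-other."""
    if FULL_SHA_RE.match(ref):
        return "sha"
    if EXACT_TAG_RE.match(ref):
        return "version-tag"
    if FLOATING_TAG_RE.match(ref):
        return "floating-tag"
    return "branch-or-other"


def audit(workflow_text: str) -> list[tuple[str, str, str]]:
    """Return (action, ref, label) for every uses: line that has an @ref."""
    findings = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match:
            action, ref = match.groups()
            findings.append((action, ref, classify_ref(ref)))
    return findings


SAMPLE = """\
steps:
  - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
  - uses: actions/checkout@v4
  - uses: actions/checkout@master
"""

if __name__ == "__main__":
    for action, ref, label in audit(SAMPLE):
        print(f"{action}@{ref}: {label}")
```

The script only reads text line by line, so it does not understand YAML anchors, multi-document files or local actions referenced by path.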

Is it safe to use "@master" for our own custom actions which reside in our internal org's repository?

It's significantly safer than for third-party actions. Since your org controls that repo, the supply chain attack risk is much lower (it comes down to who can push to that repo). The remaining risk is mostly accidental breakage from an unrelated commit landing on master mid-pipeline run.

That said, pinning to a SHA is still the more disciplined practice for reproducibility: you know exactly what ran in every pipeline. The downside is maintenance: whenever your-actions-repo/master gets meaningful updates (e.g. a new kubectl version), you have to manually bump the SHA in every workflow that uses it, whereas @master always picks it up automatically.

Both are defensible; pinning to a SHA simply prioritises reproducibility over convenience.

GitHub Actions and Authentication with AWS

On GitHub Actions runners, named AWS profiles are not set up by default. Environment variables (AWS_ACCESS_KEY_ID etc.) take precedence for credentials, but an explicit profile reference in the aws provider block can still cause the AWS SDK to error if the profile entry is missing from ~/.aws/credentials. Verify how the reusable workflow handles this (e.g. via AWS_PROFILE override or by unsetting the profile in CI).

GitHub Actions and NodeJS version

If workflow has "runs-on: ubuntu-latest", which Node will be initially installed on the runner?

ubuntu-latest currently (April 2026) maps to ubuntu-24.04, which comes with Node 20 pre-installed on the runner image.

This is an important distinction from the deprecation warnings — there are two separate Node versions in play:

Actions runner Node — the Node version GitHub uses to execute action JavaScript (like aws-actions/configure-aws-credentials). This is what the warnings are about: GitHub is moving this from Node 20 → Node 24 by default on June 2, 2026.
Pre-installed Node on the runner — what's available for your shell scripts/run: steps. Currently Node 20 on ubuntu-24.04.

This is relevant to jobs which use e.g. actions/setup-node@v6.4.0 with node-version: "16" to explicitly install Node 16 for the test run, which overrides the pre-installed Node 20. That install works independently and is unrelated to a deprecation warning like this one:

Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/setup-node@v3. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

So actions themselves are running on their own pre-configured Node version and it can happen that e.g. actions/setup-node runs on Node v24 but it installs Node v16 on the runner?

Exactly. The two things are completely independent:

The action's runtime (runs.using: node24 in the action's action.yml) — the Node version that executes the action's JavaScript code internally. This is what the deprecation warnings are about.

What the action does — actions/setup-node just downloads and installs a Node binary onto the runner's PATH. The version it installs (node-version: "16") has nothing to do with what Node version the action itself runs on.

So yes — actions/setup-node@v6.4.0 runs its own JavaScript on Node 24, but the end result is Node 16 becoming available on the runner for subsequent run: steps.

---

Sunday, 11 August 2024

Introduction to Microservices

From Adnan Maqbool Khan's post on LinkedIn:

𝐂𝐨𝐦𝐩𝐨𝐧𝐞𝐧𝐭𝐬 𝐨𝐟 𝐌𝐢𝐜𝐫𝐨𝐬𝐞𝐫𝐯𝐢𝐜𝐞𝐬 𝐀𝐫𝐜𝐡𝐢𝐭𝐞𝐜𝐭𝐮𝐫𝐞

Microservices architecture breaks down applications into smaller, independent services. Here's a rundown of the 𝟏𝟎 𝐤𝐞𝐲 𝐜𝐨𝐦𝐩𝐨𝐧𝐞𝐧𝐭𝐬 in this architecture:

1. 𝐂𝐥𝐢𝐞𝐧𝐭
These are the end-users who interact with the application via different interfaces like web, mobile, or PC.

2. 𝐂𝐃𝐍 (Content Delivery Network)
CDNs deliver static content like images, stylesheets, and JavaScript files efficiently by caching them closer to the user's location, reducing load times.

3. 𝐋𝐨𝐚𝐝 𝐁𝐚𝐥𝐚𝐧𝐜𝐞𝐫
It distributes incoming network traffic across multiple servers, ensuring no single server becomes a bottleneck and improving the application's availability and reliability.

4. 𝐀𝐏𝐈 𝐆𝐚𝐭𝐞𝐰𝐚𝐲
An API Gateway acts as an entry point for all clients, handling tasks like request routing, composition, and protocol translation, which helps manage multiple microservices behind the scenes.

5. 𝐌𝐢𝐜𝐫𝐨𝐬𝐞𝐫𝐯𝐢𝐜𝐞𝐬
Each microservice is a small, independent service that performs a specific business function. They communicate with each other via APIs.

6. 𝐌𝐞𝐬𝐬𝐚𝐠𝐞 𝐁𝐫𝐨𝐤𝐞𝐫
A message broker facilitates communication between microservices by sending messages between them, ensuring they remain decoupled and can function independently.

7. 𝐃𝐚𝐭𝐚𝐛𝐚𝐬𝐞𝐬
Each microservice typically has its own database to ensure loose coupling. This can involve different databases for different microservices.

8. 𝐈𝐝𝐞𝐧𝐭𝐢𝐭𝐲 𝐏𝐫𝐨𝐯𝐢𝐝𝐞𝐫
This component handles user authentication and authorization, ensuring secure access to services.

9. 𝐒𝐞𝐫𝐯𝐢𝐜𝐞 𝐑𝐞𝐠𝐢𝐬𝐭𝐫𝐲 𝐚𝐧𝐝 𝐃𝐢𝐬𝐜𝐨𝐯𝐞𝐫𝐲
This system keeps track of all microservices and their instances, allowing services to find and communicate with each other dynamically.

10. 𝐒𝐞𝐫𝐯𝐢𝐜𝐞 𝐂𝐨𝐨𝐫𝐝𝐢𝐧𝐚𝐭𝐢𝐨𝐧 (e.g., Zookeeper)
Tools like Zookeeper help manage and coordinate distributed services, ensuring they work together smoothly.

Image source: Adnan Maqbool Khan's post on LinkedIn

Saturday, 10 August 2024

How to perform an initial audit of DevOps pipelines

Performing an initial audit of DevOps pipelines involves evaluating the efficiency, security, and compliance of the processes and tools used to develop, test, deploy, and maintain software. Here’s a step-by-step guide to conducting this audit:

Understand the Existing Environment

Inventory of Tools and Technologies:

List all tools used in the CI/CD pipeline, including:

version control systems (e.g., Git)
CI/CD platforms (e.g., Jenkins, GitLab CI, CircleCI, TeamCity, GitHub Actions)
deployment tools
monitoring systems

Pipeline Architecture Overview:

Document the architecture of the current pipeline, including the stages from code commit to production deployment.

Team Roles and Responsibilities:

Identify the roles involved in the DevOps processes, including developers, DevOps engineers, QA, and security teams.

Security Review

Access Controls:

Ensure proper access controls are in place for all tools and environments, enforcing the principle of least privilege.

Secrets Management:

Check how secrets (API keys, passwords) are managed and stored. They should be encrypted and stored securely (e.g., in a vault).

Code Scanning and Analysis:

Verify that the following are integrated into the pipeline:

static code analysis
vulnerability scanning
dependency checks

Pipeline Security:

Assess the security of the CI/CD tools themselves, ensuring they are regularly patched and updated.

Compliance and Governance

Regulatory Requirements:

Identify any industry-specific regulations (e.g., GDPR, HIPAA) and ensure that the pipeline meets compliance requirements.

Audit Trails:

Ensure that all actions within the pipeline are logged and can be audited. Logs should include code changes, deployments, and access logs.

Data Handling:

Review how sensitive data is handled during the build and deployment process, ensuring it is not exposed.

Pipeline Efficiency

Build and Deployment Times:

Evaluate the time taken for builds and deployments, identifying bottlenecks in the process.

Resource Utilization:

Analyze the resource usage of the pipeline, including compute, storage, and bandwidth, looking for inefficiencies.

Parallelization and Automation:

Check if the pipeline leverages parallel execution where possible and whether manual steps can be automated.

Quality Assurance

Testing Integration:

Review the integration of testing frameworks in the pipeline, including unit tests, integration tests, and end-to-end tests.

Code Quality Metrics:

Ensure that code quality metrics (e.g., test coverage, code complexity) are tracked and enforced.

Rollback Mechanisms:

Assess the rollback mechanisms in place in case of deployment failures.

Monitoring and Logging

Continuous Monitoring:

Verify that application and infrastructure monitoring is integrated, with alerts set up for key performance indicators (KPIs).

Log Management:

Ensure logs from various stages of the pipeline are centralized and can be easily accessed for troubleshooting.

Incident Response:

Review the process for responding to incidents detected through monitoring.

Scalability and Flexibility

Pipeline Scalability:

Check if the pipeline can scale with the growing needs of the organization, both in terms of workload and the number of users.

Environment Flexibility:

Assess the ease of managing different environments (e.g., development, staging, production) and the consistency between them.

Documentation and Reporting

Pipeline Documentation:

Ensure that the entire pipeline is well-documented, including the purpose of each stage, tools used, and configuration settings.

Reporting:

Set up regular reporting on pipeline performance, security, and compliance, making it accessible to relevant stakeholders.

Feedback and Continuous Improvement

Stakeholder Feedback:

Gather feedback from all stakeholders, including developers, QA, and operations teams, on pain points and areas for improvement.

Continuous Improvement Process:

Implement a process for regularly updating and improving the pipeline based on audit findings and feedback.

Final Report and Recommendations

Compile Findings:

Prepare a detailed report summarizing the audit findings, highlighting strengths and areas needing improvement.

Actionable Recommendations:

Provide clear, actionable recommendations to address any identified issues, prioritize them based on impact and effort, and set timelines for implementation.

By following these steps, you can comprehensively assess the DevOps pipelines in a SaaS company, ensuring they are secure, efficient, and aligned with best practices and regulatory requirements.

Friday, 9 August 2024

Software Development Lifecycle, Environments and DevOps Metrics

The Agile Software Development Lifecycle can be visualised as in the following infographic:


image source: LinkedIn (Brij kishore Pandey)

Why do we need multiple environments?

Developers and testers usually don't want to work in the same environment: they would use and modify the same data, which can hurt the developers' ability to troubleshoot and the reliability of the testers' results. This is why DevOps teams may set up multiple copies of the same infrastructure stack and call them by different names (environments).

QA vs QC vs Testing

Before we list environments, we need to clarify that these terms are not the same:

Quality Assurance - ensures that processes and procedures are in place to achieve quality
Quality Control - ensures product quality
Testing - validates the product against specifications

functional
non-functional
acceptance testing

This is why a QA environment might not be the same as a Testing environment.

More on this: What is the difference between Testing and Quality Assurance? And, does it matter?

DevOps Environments

Continuous Testing is performed in at least two environment families:

Lower environments - any architecture which is not a direct copy of production; environments with different purposes, which don't necessarily need to replicate the Prod system.

Dev/Local development
Sandbox environments
CI environments
Test environments
QA environments
Nonfunctional testing envs

Production replica environments:

Pre-Production / Staging - test deployment into a Prod replica without Prod data; live environments with non-production data and beta testing
NPPD (Non-Production environment with Production Data) is a prod replica with prod data.
Customer UAT (User Acceptance Testing) /training environment

Production environment - for end users.

image source: LinkedIn (Brij kishore Pandey)

Thursday, 8 August 2024

Load Balancing Algorithms

Load balancing:

Used in distributed systems to distribute incoming network traffic across multiple servers or resources
Crucial for optimizing performance and ensuring even distribution of workload
Enhances system reliability by ensuring no single server becomes a bottleneck, thus reducing the risk of server overload and potential downtime

image source: Post | LinkedIn

Some popular load balancing algorithms:

Round Robin

distributes incoming requests sequentially to each server in a circular manner
simple and easy to implement but may not take into account server load or capacity
most used
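A minimal Python sketch of Round Robin, just to illustrate the circular order. The class name and the server names are made up, and a real load balancer would also handle health checks and concurrency.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers one after another, wrapping around at the end."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = cycle(servers)

    def pick(self):
        # Every call returns the next server in the fixed circular order.
        return next(self._cycle)


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([lb.pick() for _ in range(5)])
```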

Weighted Round Robin

similar to Round Robin, but with the ability to assign different weights to servers based on their capacity or performance
Servers with higher weights receive more requests
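One simple way to sketch Weighted Round Robin in Python is to repeat each server in the rotation as many times as its weight. The names below are made up, and this naive expansion sends requests to a heavy server in bursts; production balancers typically interleave the picks more smoothly.

```python
from itertools import cycle


class WeightedRoundRobinBalancer:
    """Round Robin over a rotation in which each server appears `weight` times."""

    def __init__(self, weights):
        # weights: dict mapping server name -> positive integer weight
        rotation = []
        for server, weight in weights.items():
            if weight <= 0:
                raise ValueError(f"weight for {server!r} must be positive")
            rotation.extend([server] * weight)
        if not rotation:
            raise ValueError("at least one server is required")
        self._cycle = cycle(rotation)

    def pick(self):
        return next(self._cycle)


if __name__ == "__main__":
    lb = WeightedRoundRobinBalancer({"big": 3, "small": 1})
    print([lb.pick() for _ in range(8)])
```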

IP Hash

Uses the client's IP address to determine which server to send the request to
Requests from the same IP address are consistently routed to the same server
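A small Python sketch of the idea: hash the client IP and use the result to index into the server list. The function name is hypothetical and SHA-256 is only my choice for a stable hash (Python's built-in hash() is salted per process, so it would not give stable routing across restarts). Note that with this simple modulo approach most clients get remapped when the number of servers changes.

```python
import hashlib


def pick_by_ip(client_ip: str, servers: list[str]) -> str:
    """Map a client IP to one server; the same IP always gets the same server."""
    if not servers:
        raise ValueError("at least one server is required")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into an integer and wrap it
    # around the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    pool = ["app-1", "app-2", "app-3"]
    for ip in ("203.0.113.7", "203.0.113.7", "198.51.100.23"):
        print(ip, "->", pick_by_ip(ip, pool))
```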

Least Connections

directs incoming requests to the server with the fewest active connections at the time
helps distribute the load evenly among servers based on their current workload
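A minimal Python sketch of Least Connections, assuming the balancer itself counts how many requests are in flight per server. The class and method names are made up for illustration.

```python
class LeastConnectionsBalancer:
    """Tracks active connections and picks the server with the fewest."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}
        if not self._active:
            raise ValueError("at least one server is required")

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties are resolved in the order the servers were given.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call this when the request served by `server` has finished.
        if self._active[server] > 0:
            self._active[server] -= 1


if __name__ == "__main__":
    lb = LeastConnectionsBalancer(["app-1", "app-2"])
    first = lb.acquire()
    second = lb.acquire()
    lb.release(second)
    print(first, second, lb.acquire())
```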

Least Response Time

Routes requests to the server with the lowest response time or latency
Aims to optimize performance by sending requests to the fastest server.
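A possible sketch, assuming the balancer keeps a moving average of observed latency per server. The smoothing factor, class name and sample latencies are illustrative assumptions, not part of any particular product.

```python
class LeastResponseTimeBalancer:
    """Picks the server with the lowest smoothed response time."""

    def __init__(self, servers, smoothing=0.5):
        self.smoothing = smoothing
        # Starting at 0.0 means servers without measurements are tried first.
        self.avg_latency = {server: 0.0 for server in servers}

    def record(self, server, latency_seconds):
        # Exponentially weighted moving average of observed latency.
        previous = self.avg_latency[server]
        self.avg_latency[server] = (
            self.smoothing * latency_seconds + (1 - self.smoothing) * previous
        )

    def pick(self):
        return min(self.avg_latency, key=self.avg_latency.get)


lb = LeastResponseTimeBalancer(["app-1", "app-2"])  # hypothetical names
lb.record("app-1", 0.200)
lb.record("app-2", 0.050)
choice = lb.pick()
print(choice)  # app-2
```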

Random

Randomly selects a server from the pool to handle each request
While simple, it may not ensure even distribution of load across servers

Each load balancing algorithm has its own advantages and considerations.

The choice of algorithm depends on the specific requirements of the system and the desired load distribution strategy.

Disclaimer:

All credit for the inspiration for this article, the infographic and part of the content goes to Sina Riyahi [https://www.linkedin.com/in/sina-riyahi/].

Monday, 5 August 2024

Introduction to Amazon Simple Queue Service (SQS)

Amazon Simple Queue Service (SQS) is a fully managed message queuing service provided by Amazon Web Services (AWS). It enables decoupling and scaling of microservices, distributed systems, and serverless applications.

Here's an overview of how Amazon SQS works:

Key Concepts

Queue:

A queue is a temporary storage location for messages waiting to be processed (polled by consumers). There are two types of queues in SQS:

Standard Queue: Offers maximum throughput, best-effort ordering, and at-least-once delivery.
FIFO Queue: Ensures exactly-once processing and preserves the exact order of messages.

Message:

A message is the data that is sent between different components. It can be up to 256 KB in size and contains the information needed for processing.

Producer:

The producer (or sender) sends messages to the queue. Producers can be applications, microservices and other AWS services.

Consumer:

The consumer (or receiver) retrieves and processes messages from the queue. Consumers can be Lambda functions, EC2 instances and other AWS services.

Visibility Timeout:

A period during which a message is invisible to other consumers after a consumer retrieves it from the queue. This prevents other consumers from processing the same message concurrently.

Dead-Letter Queue (DLQ):

A queue for messages that could not be processed successfully after a specified number of attempts. This helps in isolating and analyzing problematic messages.

Workflow

Sending Messages:

A producer sends messages to an SQS queue using the SendMessage action. Each message is assigned a unique ID and placed in the queue.

Receiving Messages:

A consumer retrieves messages from the queue using the ReceiveMessage action. This operation can specify:

number of messages to retrieve (up to 10)
duration to wait if no messages are available

Processing Messages:

After receiving a message, the consumer processes it. The message remains invisible to other consumers for a specified visibility timeout.

Deleting Messages:

Once processed, the consumer deletes the message from the queue using the DeleteMessage action. If not deleted within the visibility timeout, the message becomes visible again for other consumers to process.

Handling Failures:

If a message cannot be processed successfully within a specified number of attempts, it is moved to the Dead-Letter Queue for further investigation.
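The receive / process / delete cycle above can be illustrated with a toy in-memory model. This is not the SQS API: the ToyQueue class, its method names and the explicit `now` argument (used instead of a real clock) are assumptions made purely to show how the visibility timeout and the receive limit interact.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout and a DLQ."""

    def __init__(self, visibility_timeout, max_receive_count):
        self.visibility_timeout = visibility_timeout
        self.max_receive_count = max_receive_count
        self.messages = {}      # id -> {"body", "visible_at", "receives"}
        self.dead_letters = []  # bodies moved aside after too many receives
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self.messages[self._next_id] = {"body": body, "visible_at": 0, "receives": 0}
        return self._next_id

    def receive(self, now):
        for message_id, message in list(self.messages.items()):
            if message["visible_at"] > now:
                continue  # still in flight with another consumer
            if message["receives"] >= self.max_receive_count:
                # Too many unsuccessful attempts: move it aside.
                self.dead_letters.append(message["body"])
                del self.messages[message_id]
                continue
            message["receives"] += 1
            message["visible_at"] = now + self.visibility_timeout
            return message_id, message["body"]
        return None

    def delete(self, message_id):
        self.messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30, max_receive_count=2)
queue.send("order-1")
first = queue.receive(now=0)    # delivered, then hidden until t=30
hidden = queue.receive(now=10)  # None: the message is in flight
second = queue.receive(now=30)  # redelivered because it was never deleted
third = queue.receive(now=60)   # None: limit reached, moved to dead_letters
print(first, hidden, second, third, queue.dead_letters)
```

A consumer that succeeds would call delete() with the returned id before the timeout expires, and the message would never reappear.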

Additional Features

Long Polling:

Reduces the number of empty responses by allowing the ReceiveMessage action to wait for a specified amount of time until a message arrives in the queue.

Message Attributes:

Metadata about the message that can be used for filtering and routing.

Batch Operations:

SQS supports batch sending, receiving, and deleting of messages, which can improve efficiency and reduce costs.

Security and Access Control

IAM Policies:

Use AWS Identity and Access Management (IAM) policies to control access to SQS queues.

Encryption:

Messages can be encrypted in transit using SSL/TLS and at rest using AWS Key Management Service (KMS).

Use Cases

Decoupling Microservices:

SQS allows microservices to communicate asynchronously, improving scalability and fault tolerance.

Work Queues:

Distributing tasks to multiple workers for parallel processing.

Event Sourcing:

Storing a series of events to track changes in state over time.

Example Scenario

Order Processing System:

An e-commerce application has separate microservices for handling orders, inventory, and shipping.
The order service sends an order message to an SQS queue.
The inventory service retrieves the message, processes it (e.g., reserves stock), and then sends an updated message to another queue.
The shipping service retrieves the updated message and processes it (e.g., ships the item).

By using Amazon SQS, these microservices can operate independently and scale as needed, ensuring reliable and efficient order processing.

What is best-effort ordering?

Best-effort ordering is the default delivery logic for Amazon SQS Standard queues. Under this model, SQS attempts to deliver messages in the same order they were sent, but it does not guarantee it.

How It Works

General Alignment: SQS uses a highly distributed architecture to achieve nearly unlimited throughput. While it tries to maintain a "loose FIFO" (First-In, First-Out) flow, messages may occasionally be delivered out of sequence.
Cause of Reordering: Out-of-order delivery typically occurs due to the way messages are stored across multiple servers and availability zones for redundancy. Factors like high throughput, network delays, or failure recovery can cause a message sent later to be available for retrieval before an earlier one.

Comparison with FIFO Queues

If your application requires strict ordering, you must use SQS FIFO queues instead of Standard queues.

| Feature | Best-Effort Ordering (Standard) | Strict Ordering (FIFO) |
|---------|---------------------------------|------------------------|
| Ordering Guarantee | No (messages may arrive out of order) | Yes (exact order preserved) |
| Throughput | Nearly unlimited | Limited (unless High Throughput mode is used) |
| Delivery Model | At-least-once (duplicates possible) | Exactly-once (no duplicates) |
| Cost | Lower | Slightly higher |

Best Practices for Best-Effort Ordering

Idempotency: Ensure your application can handle the same message multiple times without unintended side effects.
Tolerance for Shuffle: Use Standard queues for workloads where order isn't critical, such as processing log data, real-time analytics, or distributing independent background tasks.
Application-Level Logic: If you need some ordering but want the high throughput of Standard queues, you can include sequence numbers in your message attributes and handle the reordering logic within your consumer application.
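The idempotency and application-level ordering practices above can be sketched together as a small resequencing buffer in the consumer. It assumes every message carries a gap-free sequence number starting at 1; the Resequencer class and its method names are hypothetical.

```python
import heapq


class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_sequence=1):
        self.next_sequence = first_sequence
        self._buffer = []   # min-heap of (sequence, body)
        self._seen = set()  # sequence numbers currently buffered

    def accept(self, sequence, body):
        """Return the messages that became deliverable, in order."""
        if sequence < self.next_sequence or sequence in self._seen:
            return []  # duplicate delivery: ignore it (idempotent handling)
        self._seen.add(sequence)
        heapq.heappush(self._buffer, (sequence, body))
        ready = []
        while self._buffer and self._buffer[0][0] == self.next_sequence:
            released_sequence, released_body = heapq.heappop(self._buffer)
            self._seen.discard(released_sequence)
            ready.append(released_body)
            self.next_sequence += 1
        return ready


resequencer = Resequencer()
delivered = []
# Message 2 arrives early and message 1 is delivered twice.
for sequence, body in [(2, "b"), (1, "a"), (1, "a"), (3, "c")]:
    delivered.extend(resequencer.accept(sequence, body))
print(delivered)  # ['a', 'b', 'c']
```

The trade-off is that a missing sequence number holds back everything after it, so a real consumer would also need a timeout or a gap-handling rule.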

How can an AWS service send messages to a queue?

AWS services send messages to an Amazon SQS queue through three primary methods: direct API calls, event-driven notifications, or as a downstream target for messaging services.

1. Direct API Integration (Producer Model)

Many compute services act as "producers" by calling the SendMessage or SendMessageBatch API actions directly using an AWS SDK (like Boto3 for Python or the SDK for Node.js).

AWS Lambda: A function can use an SDK to programmatically push results or tasks into a queue for further processing.
Amazon EC2 & ECS: Applications running on virtual machines or in containers can send messages to SQS to decouple from backend systems.
AWS Step Functions: You can use a "Task" state to publish a message directly to SQS as part of a workflow.

2. Event-Driven Notifications

Certain services can be configured to automatically "push" notifications into a queue when specific events occur.

Amazon S3: You can set up S3 Event Notifications to send a message to SQS whenever an object is created, deleted, or restored in a bucket.
Amazon EventBridge: You can create rules that match specific system events and route them to an SQS queue as a target.

3. Messaging Service Fan-out

SQS often acts as a subscriber or target for other messaging and integration services.

Amazon SNS: Using the "fan-out" pattern, a message published to an SNS topic can be automatically delivered to multiple SQS queues simultaneously.
Amazon API Gateway: You can integrate an API endpoint directly with SQS. This allows external clients to send messages to your queue via a REST API without needing a Lambda function in between.

Crucial Requirement: Permissions

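For any service to send messages, it must be allowed to perform the sqs:SendMessage action on the queue. Code running under an IAM role (Lambda, EC2, ECS, Step Functions) usually gets this through an IAM policy attached to that role. Services that push events on their own (S3, SNS, EventBridge) are authorised through the SQS queue access policy, which must explicitly allow the sending service. Cross-account senders need both: an IAM policy in the sender's account and a queue access policy that allows that account.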
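As an illustration of the queue access policy side, here is a sketch that builds a policy document allowing S3 to send messages to one queue, restricted to events coming from one bucket. The function name, ARNs, account ID and bucket name are placeholders; the resulting JSON would be set as the queue's Policy attribute.

```python
import json


def s3_to_sqs_queue_policy(queue_arn, bucket_arn):
    """Build a queue access policy that lets one S3 bucket send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only accept messages that originate from this bucket.
                "Condition": {"ArnLike": {"aws:SourceArn": bucket_arn}},
            }
        ],
    }


# Placeholder ARNs for illustration only.
policy = s3_to_sqs_queue_policy(
    "arn:aws:sqs:eu-west-1:111122223333:uploads-queue",
    "arn:aws:s3:::example-uploads-bucket",
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```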

Dead Letter Queues (DLQ)

In the context of AWS SQS, DLQ stands for Dead Letter Queue.

It is not a special type of queue; it is simply a regular SQS queue that is designated as a "holding pen" for messages that could not be processed successfully by your consumer application.

How it Works

When a message is picked up from your main queue, the consumer tries to process it. If the consumer fails (crashes, times out, or throws an error), the message returns to the main queue to be tried again. Without a DLQ, a "poison pill" message (one that causes a crash every single time) could cycle through your system forever, wasting resources. A DLQ solves this by setting a Redrive Policy.

Main Queue: Receives the message.

Maximum Receives: You define a limit (e.g., 3 or 5). Once a message has been received this many times without being deleted, SQS gives up.

The Move: SQS automatically moves that specific message from the main queue into the Dead Letter Queue.

Why use a DLQ?

Isolate Problematic Data: It separates "bad" messages from the "good" ones so your main pipeline can keep flowing.

Debugging: You can inspect the DLQ to see exactly what caused the failure (e.g., malformed JSON or an unexpected null value).

Manual Recovery: Once you fix the bug in your code, you can "redrive" the messages from the DLQ back into the main queue to be processed correctly.

Common SQS Attributes for DLQs

| Term | Meaning |
|------|---------|
| Source Queue | The original queue where messages arrive first. |
| Redrive Policy | The configuration that links the source queue to the DLQ. |
| maxReceiveCount | The number of times a message can fail before being sent to the DLQ. |

If queues ending in -dlq have 0.0 activity (number of messages written in past X days), this is actually a sign of a healthy system! It means:

Your consumers are processing messages successfully.
No "poison pill" messages have failed enough times to be kicked over to the dead letter storage.

Always set an alarm on your DLQs for ApproximateNumberOfMessagesVisible > 0. You want to know immediately when a message lands there, as it usually means a bug in your consumer code or a problem with your data.

Is DLQ created automatically for each SQS queue?

No, a Dead Letter Queue (DLQ) is not created automatically. It is a configuration you must set up yourself.

In AWS, a DLQ is just a regular SQS queue that you have "pointed" another queue toward. If you create a queue named orders-queue and don't explicitly tell it to use a DLQ, it simply won't have one.

The Two Steps to Creating a DLQ

To set up a DLQ, you have to perform two distinct actions:

Create the DLQ itself: You create a second, standalone SQS queue (usually named with a -dlq suffix).
Configure the Redrive Policy: You go back to your source queue and update its settings to point to the ARN (Amazon Resource Name) of the DLQ you just created.
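The second step boils down to setting the RedrivePolicy attribute on the source queue, whose value is a JSON string. The sketch below only builds that attribute; the helper name and the ARN are placeholders, and the result would be passed as the Attributes argument of an SDK call such as Boto3's set_queue_attributes.

```python
import json


def redrive_policy_attributes(dlq_arn, max_receive_count=5):
    """Build the queue attributes that link a source queue to its DLQ."""
    if not 1 <= max_receive_count <= 1000:
        raise ValueError("maxReceiveCount must be between 1 and 1000")
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # Queue attribute values are strings, so the policy is serialised to JSON.
    return {"RedrivePolicy": json.dumps(redrive_policy)}


# Placeholder ARN for illustration only.
attributes = redrive_policy_attributes(
    "arn:aws:sqs:eu-west-1:111122223333:orders-queue-dlq", max_receive_count=3
)
print(attributes["RedrivePolicy"])
```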

Key Requirements

For a DLQ relationship to work, a few rules must be followed:

Same Region and Account: The source queue and the DLQ must be in the same AWS Region and the same AWS Account.
Same Type: A FIFO queue must use a FIFO queue as its DLQ. A Standard queue must use a Standard queue as its DLQ.
Permissions: If you are using a custom KMS key for encryption, the source queue needs permission to use that key to move messages to the DLQ.

What happens if you don't have a DLQ?

If a message fails to process and you haven't configured a DLQ:

The message returns to the source queue after the Visibility Timeout expires.
The consumer picks it up again.
If the message is "poison" (causes a crash), this loop repeats indefinitely until the message's Message Retention Period (default 4 days) expires.
Once the retention period is up, the message is simply deleted by AWS and lost forever.

In professional environments it is considered best practice to create a DLQ for every production queue. We should see a 1:1 ratio of queue names and matching -dlq names in our list of SQS queues. Engineers should create and link them to ensure no data is lost during a processing failure.

Any queues in our list that don't have a matching -dlq might be candidates for a quick configuration update!
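Such a check can be automated once the queue names are available as a list. The sketch below assumes the -dlq naming convention mentioned above and that a FIFO DLQ keeps the .fifo ending (for example billing-dlq.fifo); the function names and queue names are made up. It only compares names and does not verify that a redrive policy is actually configured.

```python
FIFO_SUFFIX = ".fifo"


def split_fifo(name):
    """Split 'orders.fifo' into ('orders', '.fifo'); standard queues get ''."""
    if name.endswith(FIFO_SUFFIX):
        return name[: -len(FIFO_SUFFIX)], FIFO_SUFFIX
    return name, ""


def queues_missing_dlq(queue_names, dlq_suffix="-dlq"):
    """Return source queues that have no matching '<name>-dlq' queue."""
    names = set(queue_names)
    missing = []
    for name in names:
        base, fifo = split_fifo(name)
        if base.endswith(dlq_suffix):
            continue  # this queue is itself a DLQ
        if f"{base}{dlq_suffix}{fifo}" not in names:
            missing.append(name)
    return sorted(missing)


# Hypothetical queue names.
all_queues = [
    "orders-queue", "orders-queue-dlq",
    "emails-queue",
    "billing.fifo", "billing-dlq.fifo",
]
candidates = queues_missing_dlq(all_queues)
print(candidates)  # ['emails-queue']
```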

Message Queuing Service - Amazon Simple Queue Service - AWS

Saturday, 3 August 2024

Running Helm as Docker container

I tend to run tools as Docker containers when official or well-maintained Docker images are available. For Helm, such an image is alpine/helm (alpine/helm - Docker Image | Docker Hub).

To run the container (and its default command which is helm -h):

$ docker run --rm alpine/helm

The Kubernetes package manager

Common actions for Helm:

- helm search: search for charts

- helm pull: download a chart to your local directory to view

- helm install: upload the chart to Kubernetes

- helm list: list releases of charts

Environment variables:

| Name | Description |

|------------------------------------|------------------------------------------------------------------------------------------------------------|

| $HELM_CACHE_HOME | set an alternative location for storing cached files. |

| $HELM_CONFIG_HOME | set an alternative location for storing Helm configuration. |

| $HELM_DATA_HOME | set an alternative location for storing Helm data. |

| $HELM_DEBUG | indicate whether or not Helm is running in Debug mode |

| $HELM_DRIVER | set the backend storage driver. Values are: configmap, secret, memory, sql. |

| $HELM_DRIVER_SQL_CONNECTION_STRING | set the connection string the SQL storage driver should use. |

| $HELM_MAX_HISTORY | set the maximum number of helm release history. |

| $HELM_NAMESPACE | set the namespace used for the helm operations. |

| $HELM_NO_PLUGINS | disable plugins. Set HELM_NO_PLUGINS=1 to disable plugins. |

| $HELM_PLUGINS | set the path to the plugins directory |

| $HELM_REGISTRY_CONFIG | set the path to the registry config file. |

| $HELM_REPOSITORY_CACHE | set the path to the repository cache directory |

| $HELM_REPOSITORY_CONFIG | set the path to the repositories file. |

| $KUBECONFIG | set an alternative Kubernetes configuration file (default "~/.kube/config") |

| $HELM_KUBEAPISERVER | set the Kubernetes API Server Endpoint for authentication |

| $HELM_KUBECAFILE | set the Kubernetes certificate authority file. |

| $HELM_KUBEASGROUPS | set the Groups to use for impersonation using a comma-separated list. |

| $HELM_KUBEASUSER | set the Username to impersonate for the operation. |

| $HELM_KUBECONTEXT | set the name of the kubeconfig context. |

| $HELM_KUBETOKEN | set the Bearer KubeToken used for authentication. |

| $HELM_KUBEINSECURE_SKIP_TLS_VERIFY | indicate if the Kubernetes API server's certificate validation should be skipped (insecure) |

| $HELM_KUBETLS_SERVER_NAME | set the server name used to validate the Kubernetes API server certificate |

| $HELM_BURST_LIMIT | set the default burst limit in the case the server contains many CRDs (default 100, -1 to disable) |

| $HELM_QPS | set the Queries Per Second in cases where a high number of calls exceed the option for higher burst values |

Helm stores cache, configuration, and data based on the following configuration order:

- If a HELM_*_HOME environment variable is set, it will be used

- Otherwise, on systems supporting the XDG base directory specification, the XDG variables will be used

- When no other location is set a default location will be used based on the operating system

By default, the default directories depend on the Operating System. The defaults are listed below:

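| Operating System | Cache Path                | Configuration Path             | Data Path               |
|------------------|---------------------------|--------------------------------|-------------------------|
| Linux            | $HOME/.cache/helm         | $HOME/.config/helm             | $HOME/.local/share/helm |
| macOS            | $HOME/Library/Caches/helm | $HOME/Library/Preferences/helm | $HOME/Library/helm      |
| Windows          | %TEMP%\helm               | %APPDATA%\helm                 | %APPDATA%\helm          |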

Usage:

helm [command]

Available Commands:

completion generate autocompletion scripts for the specified shell

create create a new chart with the given name

dependency manage a chart's dependencies

env helm client environment information

get download extended information of a named release

help Help about any command

history fetch release history

install install a chart

lint examine a chart for possible issues

list list releases

package package a chart directory into a chart archive

plugin install, list, or uninstall Helm plugins

pull download a chart from a repository and (optionally) unpack it in local directory

push push a chart to remote

registry login to or logout from a registry

repo add, list, remove, update, and index chart repositories

rollback roll back a release to a previous revision

search search for a keyword in charts

show show information of a chart

status display the status of the named release

template locally render templates

test run tests for a release

uninstall uninstall a release

upgrade upgrade a release

verify verify that a chart at the given path has been signed and is valid

version print the client version information

Flags:

--burst-limit int client-side default throttling limit (default 100)

--debug enable verbose output

-h, --help help for helm

--kube-apiserver string the address and the port for the Kubernetes API server

--kube-as-group stringArray group to impersonate for the operation, this flag can be repeated to specify multiple groups.

--kube-as-user string username to impersonate for the operation

--kube-ca-file string the certificate authority file for the Kubernetes API server connection

--kube-context string name of the kubeconfig context to use

--kube-insecure-skip-tls-verify if true, the Kubernetes API server's certificate will not be checked for validity. This will make your HTTPS connections insecure

--kube-tls-server-name string server name to use for Kubernetes API server certificate validation. If it is not provided, the hostname used to contact the server is used

--kube-token string bearer token used for authentication

--kubeconfig string path to the kubeconfig file

-n, --namespace string namespace scope for this request

--qps float32 queries per second used when communicating with the Kubernetes API, not including bursting

--registry-config string path to the registry config file (default "/root/.config/helm/registry/config.json")

--repository-cache string path to the file containing cached repository indexes (default "/root/.cache/helm/repository")

--repository-config string path to the file containing repository names and URLs (default "/root/.config/helm/repositories.yaml")

Use "helm [command] --help" for more information about a command.

We'd get the same output if we run:

$ docker run --rm alpine/helm -h

Let's check the Helm version:

$ docker run --rm alpine/helm version

version.BuildInfo{Version:"v3.15.3", GitCommit:"3bb50bbbdd9c946ba9989fbe4fb4104766302a64", GitTreeState:"clean", GoVersion:"go1.22.5"}

Let's say we want to list releases in the local Minikube cluster. Let's assume kubectl's current context is set to minikube.

If we run:

$ docker run --rm alpine/helm list

Error: Kubernetes cluster unreachable: Get "http://localhost:8080/version": dial tcp [::1]:8080: connect: connection refused

The error tells us that Helm didn't find a kubeconfig inside the container, so it fell back to the default cluster API server URL (http://localhost:8080). We need to give Helm the correct kubeconfig. In our case it's ~/.kube/config, so we need to mount ~/.kube as a volume:

$ docker run --rm -v ~/.kube:/root/.kube alpine/helm list

Error: Kubernetes cluster unreachable: invalid configuration: [unable to read client-cert /home/bojan/.minikube/profiles/minikube/client.crt for minikube due to open /home/bojan/.minikube/profiles/minikube/client.crt: no such file or directory, unable to read client-key /home/bojan/.minikube/profiles/minikube/client.key for minikube due to open /home/bojan/.minikube/profiles/minikube/client.key: no such file or directory, unable to read certificate-authority /home/bojan/.minikube/ca.crt for minikube due to open /home/bojan/.minikube/ca.crt: no such file or directory]

This error shows that Helm could not access the certificates that are referenced in kubeconfig. Both of these commands:

$ cat ~/.kube/config

$ kubectl config view

...give the following output:

apiVersion: v1
clusters:
...
- cluster:
    certificate-authority: /home/bojan/.minikube/ca.crt
    extensions:
    - extension:
        last-update: Sat, 03 Aug 2024 00:16:58 BST
        provider: minikube.sigs.k8s.io
        version: v1.33.0
    server: https://192.168.59.100:8443
contexts:
...
- context:
    cluster: minikube
    extensions:
    - extension:
        last-update: Sat, 03 Aug 2024 00:16:58 BST
        provider: minikube.sigs.k8s.io
        version: v1.33.0
    namespace: default
    user: minikube
current-context: minikube
kind: Config
preferences: {}
users:
...
- name: minikube
  user:
    client-certificate: /home/bojan/.minikube/profiles/minikube/client.crt
    client-key: /home/bojan/.minikube/profiles/minikube/client.key

Helm running in the container needs access to the kubeconfig file and also to all the files referenced in it, in our case:

/home/bojan/.minikube/ca.crt
/home/bojan/.minikube/profiles/minikube/client.crt
/home/bojan/.minikube/profiles/minikube/client.key

We can mount /home/bojan/.minikube/ as a volume (see docker run | Docker Docs), at the same path (which will be created in container).

$ docker run --rm -v ~/.kube:/root/.kube -v /home/bojan/.minikube:/home/bojan/.minikube alpine/helm list

NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION

This shows we don't have any releases in our Minikube cluster.
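The manual steps above (find the files that kubeconfig points to, then mount their directories at the same paths) can be sketched as a small Python helper that assembles the docker run arguments. The function names are hypothetical, the kubeconfig is scanned with a simple regular expression rather than a YAML parser, and only the three file keys seen above are considered.

```python
import posixpath
import re

# Keys in a kubeconfig whose values are paths to files on disk.
FILE_KEYS = ("certificate-authority", "client-certificate", "client-key")


def referenced_dirs(kubeconfig_text):
    """Return the top-level directories holding files named in kubeconfig."""
    pattern = re.compile(
        r"^\s*(?:%s):\s*(\S+)\s*$" % "|".join(FILE_KEYS), re.MULTILINE
    )
    dirs = sorted({posixpath.dirname(p) for p in pattern.findall(kubeconfig_text)})
    roots = []
    for directory in dirs:
        # Skip directories already covered by a parent mount.
        if not any(directory.startswith(root + "/") for root in roots):
            roots.append(directory)
    return roots


def docker_helm_command(kubeconfig_dir, kubeconfig_text, helm_args):
    """Build the docker run argument list for running Helm in a container."""
    command = ["docker", "run", "--rm", "-v", f"{kubeconfig_dir}:/root/.kube"]
    for directory in referenced_dirs(kubeconfig_text):
        command += ["-v", f"{directory}:{directory}"]
    return command + ["alpine/helm", *helm_args]


sample_kubeconfig = """
clusters:
- cluster:
    certificate-authority: /home/bojan/.minikube/ca.crt
users:
- name: minikube
  user:
    client-certificate: /home/bojan/.minikube/profiles/minikube/client.crt
    client-key: /home/bojan/.minikube/profiles/minikube/client.key
"""
command = docker_helm_command("/home/bojan/.kube", sample_kubeconfig, ["list"])
print(" ".join(command))
```

The resulting list could be passed to subprocess.run; here it is only printed.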

Converting the original (non-templated) manifests to Helm chart

Let's say we have some original (non-templated) manifest files.

If we run Helm as a Docker container, we can use it to scaffold a new chart next to them. We mount the project directory at /apps, which is where the container creates the chart:

$ docker run --rm \
-v ~/.kube:/root/.kube \
-v $(pwd)/minikube/php-fmp-nginx-demo:/apps \
-v /home/bojan/.minikube:/home/bojan/.minikube \
alpine/helm create helm-chart

Creating helm-chart

This creates a helm-chart directory.

Note that the container runs as root by default, so this new directory and all its content will be owned by root, and any later local action on these files will require elevated privileges:

$ ls -la ./minikube/php-fmp-nginx-demo/helm-chart/

total 28

drwxr-xr-x 4 root root 4096 Aug 3 13:36 .

drwxrwxr-x 4 bojan bojan 4096 Aug 3 13:36 ..

drwxr-xr-x 2 root root 4096 Aug 3 13:36 charts

-rw-r--r-- 1 root root 1146 Aug 3 13:36 Chart.yaml

-rw-r--r-- 1 root root 349 Aug 3 13:36 .helmignore

drwxr-xr-x 3 root root 4096 Aug 3 13:36 templates

-rw-r--r-- 1 root root 2363 Aug 3 13:36 values.yaml

To prevent this we can tell Docker to use non-root user.

$ id

uid=1000(bojan) gid=1000(bojan) groups=1000(bojan),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),116(lpadmin),126(sambashare),142(libvirt),999(docker)

$ sudo rm -rf ./minikube/php-fmp-nginx-demo/helm-chart/

$ docker run --rm -v ~/.kube:/root/.kube -v $(pwd)/minikube/php-fmp-nginx-demo:/apps -v /home/bojan/.minikube:/home/bojan/.minikube --user 1000:1000 alpine/helm create helm-chart

Creating helm-chart

$ ls -la ./minikube/php-fmp-nginx-demo/helm-chart/

total 28

drwxr-xr-x 4 bojan bojan 4096 Aug 3 14:24 .

drwxrwxr-x 4 bojan bojan 4096 Aug 3 14:24 ..

drwxr-xr-x 2 bojan bojan 4096 Aug 3 14:24 charts

-rw-r--r-- 1 bojan bojan 1146 Aug 3 14:24 Chart.yaml

-rw-r--r-- 1 bojan bojan 349 Aug 3 14:24 .helmignore

drwxr-xr-x 3 bojan bojan 4096 Aug 3 14:24 templates

-rw-r--r-- 1 bojan bojan 2363 Aug 3 14:24 values.yaml

minikube/php-fmp-nginx-demo/helm-chart/values.yaml:

deployment-nginx:

minikube/php-fmp-nginx-demo/helm-chart/templates/nginx-deployment.yaml:

apiVersion: apps/v1

kind: Deployment

metadata:

$ docker run --rm -v ~/.kube:/root/.kube -v $(pwd)/minikube/php-fmp-nginx-demo:/php-fmp-nginx-demo -v /home/bojan/.minikube:/home/bojan/.minikube alpine/helm install --dry-run --debug php-fmp-nginx-demo-release-v1.0 /php-fmp-nginx-demo/helm-chart

install.go:222: [debug] Original chart version: ""

install.go:239: [debug] CHART PATH: /php-fmp-nginx-demo/helm-chart

Error: INSTALLATION FAILED: parse error at (helm-chart/templates/nginx-deployment.yaml:4): bad character U+002D '-'

helm.go:84: [debug] parse error at (helm-chart/templates/nginx-deployment.yaml:4): bad character U+002D '-'

INSTALLATION FAILED

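Fix: the parse error most likely comes from the values key deployment-nginx. Go templates cannot parse a dash in dot notation (.Values.deployment-nginx), so either rename the key without a dash (e.g. deploymentNginx) or access it with the index function: index .Values "deployment-nginx". See: Accessing values of the subchart with dash in the name · Issue #2192 · helm/helm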

Templating advice · Issue #3292 · helm/helm

https://stackoverflow.com/questions/75375090/merge-annotations-in-helm

https://github.com/helm/helm/issues/2192

https://v2.helm.sh/docs/chart_best_practices/

https://github.com/helm/helm-www/issues/1272

https://stackoverflow.com/questions/63853679/helm-templating-doesnt-let-me-use-dash-in-names

https://stackoverflow.com/questions/47844377/how-can-i-create-a-volume-for-the-current-user-home-directory-in-docker-compose

https://helm.sh/docs/helm/helm_install/

https://github.com/roboll/helmfile/issues/176

https://controlplane.com/community-blog/post/kubeconfig-file-for-the-aws-eks-cluster

https://discuss.kubernetes.io/t/the-connection-to-the-server-localhost-8080-was-refused-did-you-specify-the-right-host-or-port/1464/4

https://stackoverflow.com/questions/63066604/error-kubernetes-cluster-unreachable-get-http-localhost8080-versiontimeou

https://k21academy.com/docker-kubernetes/the-connection-to-the-server-localhost8080-was-refused/

https://docs.docker.com/reference/cli/docker/container/run/#volume

References:

Do not install Helm, use it within a Docker container
