Wednesday, 31 July 2024

How to use HashiCorp Cloud as a remote storage for Terraform state file

The Terraform state file keeps track of the infrastructure under Terraform's control. Terraform compares the resource configuration files against it to work out which resources need to be added, changed or deleted. If the state file gets lost, Terraform will try to re-create all resources.
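As a rough illustration of that comparison, the sketch below diffs a desired configuration against a recorded state. It is a simplified model and not Terraform's actual algorithm: the function name plan_actions and the dictionary layout are assumptions made for this example.

```python
def plan_actions(desired: dict, state: dict) -> dict:
    """Return the resource addresses to create, update, and delete.

    Both arguments map a resource address (e.g. "local_file.foo") to a
    dict of attributes. This layout is an assumption for the sketch.
    """
    actions = {"create": [], "update": [], "delete": []}
    for address, config in desired.items():
        if address not in state:
            actions["create"].append(address)  # configured but not tracked yet
        elif state[address] != config:
            actions["update"].append(address)  # tracked, but attributes differ
    for address in state:
        if address not in desired:
            actions["delete"].append(address)  # tracked, but no longer configured
    return actions


desired = {
    "local_file.foo": {"content": "new text"},
    "local_file.bar": {"content": "bar"},
}
state = {
    "local_file.foo": {"content": "old text"},
    "local_file.baz": {"content": "baz"},
}

print(plan_actions(desired, state))
# With an empty (lost) state, every configured resource is planned for creation.
print(plan_actions(desired, {}))
```

The second call shows why losing the state file hurts: with nothing recorded, every configured resource looks new.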

By default, the Terraform state file (terraform.tfstate) is stored locally, on the machine where we initialize Terraform. This carries two risks: the file (which may contain sensitive data) can be accidentally committed and pushed to a remote repository, which is a security risk, or it can be deleted by accident, which can be a painful experience - see Lessons learned after losing the Terraform state file | Trying things.

To minimize the chances of losing the Terraform state file and to enable multiple contributors to work on the same infrastructure in parallel, we should define remote storage for it. We can store it in an AWS S3 bucket, Google Cloud, etc., but one of the free options, which also includes a shared state file locking mechanism, is Terraform Cloud (now called HCP Terraform).

Here are the steps which explain how to do it.

Go to Terraform Cloud (https://app.terraform.io/) and create an account.
Create an organization (e.g. terraform-states) and a workspace (e.g. remote-state-demo) within Terraform Cloud. Workspaces are where state files are stored and managed.

Configure Terraform Cloud Backend:

Add the following backend configuration to e.g. terraform.tf file:

terraform {
  backend "remote" {
    organization = "terraform-states"

    workspaces {
      name = "remote-state-demo"
    }
  }
}

Log in to Terraform Cloud:

$ terraform login

Terraform will request an API token for app.terraform.io using your browser.

If login is successful, Terraform will store the token in plain text in

the following file for use by subsequent commands:

/home/<user>/.terraform.d/credentials.tfrc.json

Do you want to proceed?

Only 'yes' will be accepted to confirm.

Enter a value: yes

---------------------------------------------------------------------------------

Terraform must now open a web browser to the tokens page for app.terraform.io.

If a browser does not open this automatically, open the following URL to proceed:

https://app.terraform.io/app/settings/tokens?source=terraform-login

---------------------------------------------------------------------------------

Generate a token using your browser, and copy-paste it into this prompt.

Terraform will store the token in plain text in the following file

for use by subsequent commands:

/home/<user>/.terraform.d/credentials.tfrc.json

Token for app.terraform.io:

Enter a value: Opening in existing browser session.

Retrieved token for user <tf_user>

---------------------------------------------------------------------------------

Welcome to HCP Terraform!

Documentation: terraform.io/docs/cloud

New to HCP Terraform? Follow these steps to instantly apply an example configuration:

$ git clone https://github.com/hashicorp/tfc-getting-started.git

$ cd tfc-getting-started

$ scripts/setup.sh

During this process, a Terraform Cloud token generation page opens in the browser.

terraform login should automatically pick up the token and save it, but if this fails, we can copy the token from that page and paste it into the credentials file:

/home/<user>/.terraform.d/credentials.tfrc.json:

{
  "credentials": {
    "app.terraform.io": {
      "token": "1kLiQ....h3A"
    }
  }
}
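To check quickly whether a token is present before running terraform init, a small script can read the credentials file shown above. This is a minimal sketch: the helper names find_token and load_token are made up for this example, and the default path assumes the location printed by terraform login.

```python
import json
from pathlib import Path


def find_token(credentials_text: str, host: str = "app.terraform.io"):
    """Return the token stored for `host`, or None if there is none."""
    data = json.loads(credentials_text)
    return data.get("credentials", {}).get(host, {}).get("token")


def load_token(path=None, host: str = "app.terraform.io"):
    """Read the credentials file and return the token for `host`, if any."""
    if path is None:
        # Assumed default location, as printed by `terraform login`.
        path = Path.home() / ".terraform.d" / "credentials.tfrc.json"
    path = Path(path)
    if not path.exists():
        return None
    return find_token(path.read_text(), host)


sample = '{"credentials": {"app.terraform.io": {"token": "example-token"}}}'
print(find_token(sample))
print(find_token('{"credentials": {}}'))
```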

This authentication is necessary for the next step:

Initialize the Backend:

Run terraform init to initialize the backend configuration.

If we don't log in to Terraform Cloud first, terraform init fails:

$ terraform init

Initializing HCP Terraform...

╷

│ Error: Required token could not be found

│

│ Run the following command to generate a token for app.terraform.io:

│ terraform login

╵

If we're authenticated with Terraform:

$ terraform init

Initializing the backend...

Successfully configured the backend "remote"! Terraform will automatically

use this backend unless the backend configuration changes.

Initializing provider plugins...

- Finding latest version of hashicorp/local...

- Installing hashicorp/local v2.5.1...

- Installed hashicorp/local v2.5.1 (signed by HashiCorp)

Terraform has created a lock file .terraform.lock.hcl to record the provider

selections it made above. Include this file in your version control repository

so that Terraform can guarantee to make the same selections by default when

you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see

any changes that are required for your infrastructure. All Terraform commands

should now work.

If you ever set or change modules or backend configuration for Terraform,

rerun this command to reinitialize your working directory. If you forget, other

commands will detect it and remind you to do so if necessary.

Let's assume we have the following resource:

main.tf:

resource "local_file" "foo" {

filename = "${path.cwd}/temp/foo.txt"

content = "This is a text content of the foo file!"

}

We can now see the plan:

$ terraform plan

Running plan in the remote backend. Output will stream here. Pressing Ctrl-C

will stop streaming the logs, but will not stop the plan running remotely.

Preparing the remote plan...

To view this run in a browser, visit:

https://app.terraform.io/app/terraform-states/remote-state-demo/runs/run-nbxxG2TBxSYGEgCm

Waiting for the plan to start...

Terraform v1.9.3

on linux_amd64

Initializing plugins and modules...

Terraform used the selected providers to generate the following execution

plan. Resource actions are indicated with the following symbols:

+ create

Terraform will perform the following actions:

# local_file.foo will be created

+ resource "local_file" "foo" {

+ content = "This is a text content of the foo file!"

+ content_base64sha256 = (known after apply)

+ content_base64sha512 = (known after apply)

+ content_md5 = (known after apply)

+ content_sha1 = (known after apply)

+ content_sha256 = (known after apply)

+ content_sha512 = (known after apply)

+ directory_permission = "0777"

+ file_permission = "0777"

+ filename = "/home/tfc-agent/.tfc-agent/component/terraform/runs/run-nbxxG2TBxSYGEgCm/config/temp/foo.txt"

+ id = (known after apply)

}

Plan: 1 to add, 0 to change, 0 to destroy.

Notice that the plan runs in the remote backend and that the file path is one on the remote Terraform Cloud machine. This is because we left the workspace on the organization's default Execution Mode, which is Remote, so all resources would be created on the remote machine. This is not what we want: the remote should hold only the state file. Therefore we need to change the workspace's Execution Mode to Local in the workspace settings.

We can now run the plan again (after executing terraform init so that the new Execution Mode gets picked up):

$ terraform plan

local_file.foo: Refreshing state... [id=db5ca40b5588d44e9ec6c1b4005e11a6fd0c910e]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:

+ create

Terraform will perform the following actions:

# local_file.foo will be created

+ resource "local_file" "foo" {

+ content = "This is a text content of the foo file!"

+ content_base64sha256 = (known after apply)

+ content_base64sha512 = (known after apply)

+ content_md5 = (known after apply)

+ content_sha1 = (known after apply)

+ content_sha256 = (known after apply)

+ content_sha512 = (known after apply)

+ directory_permission = "0777"

+ file_permission = "0777"

+ filename = "/home/<user>/...hcp-cloud-state-storage-demo/temp/foo.txt"

+ id = (known after apply)

}

Plan: 1 to add, 0 to change, 0 to destroy.

We can now execute terraform apply and changes will be done on the local machine.

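If we create a resource on the remote (cloud), we can see it in the web console.

If we create a resource on the remote (cloud) by mistake, we can make Terraform stop tracking it by removing it from the state. Note that terraform state rm only removes the state entry; it does not destroy the resource itself: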

$ terraform state list

local_file.foo

$ terraform state rm local_file.foo

Removed local_file.foo

Successfully removed 1 resource instance(s).

All revisions of the state file are listed in Terraform Cloud.

We can also roll back to one of the previous versions.

After a rollback we need to unlock the state file.

Monday, 29 July 2024

Kubernetes Ingress Service

Ingress is a more flexible and powerful solution for managing external access to services within a Kubernetes cluster than a Kubernetes Service of type LoadBalancer.

It provides:

load balancing
SSL termination
name-based virtual hosting

Ingress controllers can be configured to handle traffic more efficiently and securely.

A full Ingress deployment manifest in Kubernetes defines how external traffic is routed to services within the cluster. It uses an Ingress controller (like NGINX) to manage the rules, and the manifest specifies which services should be exposed, based on hostnames, paths, or other criteria. The manifest typically includes details about the Ingress class, rules, and the services it exposes.

Manifest example:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <ingress-name>
  namespace: default
spec:
  ingressClassName: nginx # Or gce, aws, etc.
  rules:
    - host: example.com # or myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <service-name>
                port:
                  number: 80
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <service-name>
                port:
                  number: 80
    - host: myapp2.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <service-name>
                port:
                  number: 80

Here is the meaning of all fields in the manifest:

apiVersion: Kubernetes API version.
kind: Kubernetes object type (Ingress).
metadata: Metadata about the Ingress, including its name, namespace, and labels.
spec: The heart of the manifest, defining the Ingress rules and behavior.

    ingressClassName: Specifies the Ingress controller to use (e.g., nginx, gce, aws).
    rules: An array of rules that define how traffic should be routed. Each rule includes:

        host: The hostname to match (e.g., example.com).
        http: A section defining HTTP traffic routing.

            paths: An array of paths to match within the HTTP request.

                backend: The service to forward traffic to based on the path.

                    service.name: The name of the service (serviceName in older API versions).
                    service.port: The port of the service (servicePort in older API versions).

host

Removing the host (host-specific rules) from the ingress will make it accept traffic for any hostname, potentially exposing the service more broadly than intended. These changes could lead to unintended access to services if the ALB receives requests with different or no host headers.
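The effect of a missing host can be sketched with a small routing model. This is a simplification and not the code of any real Ingress controller: it only handles exact host matches and Prefix paths, and the rule layout and service names (web, api, catch-all) are hypothetical.

```python
def path_matches_prefix(rule_path: str, request_path: str) -> bool:
    """Prefix match on whole path elements: /api matches /api/v1 but not /apix."""
    rule_parts = [part for part in rule_path.split("/") if part]
    request_parts = [part for part in request_path.split("/") if part]
    return request_parts[:len(rule_parts)] == rule_parts


def route(rules, host, path):
    """Return the service for the longest matching path, or None."""
    best = None
    for rule in rules:
        rule_host = rule.get("host")
        # A rule without a host applies to requests for any hostname.
        if rule_host is not None and rule_host != host:
            continue
        for entry in rule["paths"]:
            if path_matches_prefix(entry["path"], path):
                if best is None or len(entry["path"]) > len(best["path"]):
                    best = entry
    return best["service"] if best else None


rules = [
    {"host": "example.com", "paths": [{"path": "/", "service": "web"},
                                      {"path": "/api", "service": "api"}]},
    {"paths": [{"path": "/", "service": "catch-all"}]},  # no host: matches everything
]

print(route(rules, "example.com", "/api/v1"))
print(route(rules, "other.org", "/"))
print(route(rules[:1], "other.org", "/"))
```

With the host-less rule in place, a request for an unrelated hostname still reaches a service; with only the host-specific rule, it is not routed at all.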

service.name

What is this service and what are its requirements so it can serve the traffic?

In Kubernetes (K8s), an Ingress manifest by itself does not directly create an AWS load balancer — but it can indirectly result in one being created depending on how your cluster is configured.

Ingress is just a K8s object that defines HTTP routing rules (e.g., "requests to /foo go to Service A"). It requires an Ingress Controller to actually implement those rules and expose the traffic. If we just apply an Ingress manifest and no Ingress Controller is installed, nothing will happen — no load balancer will be created.

To get an AWS Load Balancer via Ingress we need to install and use an Ingress Controller that integrates with AWS, such as:

AWS ALB Ingress Controller (now called AWS Load Balancer Controller)
Nginx Ingress Controller (but then you'd still need a Service of type LoadBalancer to expose it)

If we're using the AWS Load Balancer Controller and define an Ingress object with proper annotations, then it will create an AWS Application Load Balancer (ALB) and configure it with listeners and target groups according to the Ingress rules. This is the most direct way to have an Ingress lead to the creation of an AWS load balancer.

NOTE: LB creation takes some time e.g. 30 seconds.

Example: Ingress manifest that works with AWS Load Balancer Controller

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <ingress-name>
  namespace: default
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <service-name>
                port:
                  number: 80

Annotations

alb.ingress.kubernetes.io/scheme

The annotation alb.ingress.kubernetes.io/scheme in AWS Load Balancer Controller specifies the visibility and network exposure of the Application Load Balancer (ALB) created for your Kubernetes Ingress.

Possible values:

internet-facing

ALB is publicly accessible over the internet.
Use for services that need to be accessed publicly — e.g., websites, APIs, dashboards

internal

ALB is private, only accessible within your VPC (e.g., internal apps).
Use for internal microservices, admin interfaces, or when you only want access within a private network (e.g., from a VPN or other VPC resources).

Example (internet facing):

metadata:
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing

This will create a public-facing ALB with a public DNS name like:

my-ingress-1234567890.us-west-2.elb.amazonaws.com

The subnets in your cluster must be correctly tagged:

For internet-facing: subnets must be public and tagged with: kubernetes.io/role/elb = 1
For internal: subnets must be private and tagged with: kubernetes.io/role/internal-elb = 1

If you don't set the scheme explicitly, the AWS Load Balancer Controller defaults to internal.

Example: Internal Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <ingress-name>
  namespace: default
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/subnets: subnet-abc123,subnet-def456 # Optional: for precise control
    alb.ingress.kubernetes.io/group.name: internal-apps
    alb.ingress.kubernetes.io/tags: "env=dev,app=my-internal-service"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <service-name>
                port:
                  number: 80

This configuration instructs the AWS Load Balancer Controller to:

Create an internal ALB (i.e., not exposed to the public internet).
Place the ALB in private subnets (must be tagged: kubernetes.io/role/internal-elb = 1).
Make the ALB reachable only within the VPC or from connected networks (e.g. via VPN or Direct Connect).

The ALB will get a DNS name that looks like this:

internal-k8s-<name>-<hash>.<region>.elb.amazonaws.com

e.g.

internal-k8s-internalingr-abc123456789.us-west-2.elb.amazonaws.com

The internal- prefix on the DNS name indicates it is not public. It will not resolve outside the VPC (e.g., from your home network or the public internet).

We can verify the internal ALB:

AWS Console:

Go to EC2 → Load Balancers → Look at the Scheme column (internal).

CLI:

aws elbv2 describe-load-balancers \
  --names internal-k8s-internalingr-abc123456789

alb.ingress.kubernetes.io/listen-ports

The annotation alb.ingress.kubernetes.io/listen-ports in AWS Load Balancer Controller is used to customize the listener ports that are created on the Application Load Balancer (ALB) for a Kubernetes Ingress resource.

By default, the AWS Load Balancer Controller creates a single listener: HTTP on port 80, or HTTPS on port 443 if a certificate is configured for the Ingress.

If you need to override this behavior, for example to serve both HTTP and HTTPS or to use non-default ports, you can use this annotation.

Example: Both HTTP and HTTPS:

alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'

Creates both:

an HTTP listener on port 80
an HTTPS listener on port 443 (requires TLS configuration)

It must be a JSON array of objects, where each object specifies a protocol and port.
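The expected shape of the value can be checked with a few lines of code before the manifest is applied. The following is a sketch, not the controller's own validation: the function name parse_listen_ports is hypothetical, and the rule that each object holds exactly one protocol/port pair is an assumption that is stricter than the controller may be.

```python
import json


def parse_listen_ports(annotation: str):
    """Parse a listen-ports value into a list of (protocol, port) tuples."""
    entries = json.loads(annotation)
    if not isinstance(entries, list):
        raise ValueError("listen-ports must be a JSON array")
    listeners = []
    for entry in entries:
        # Assumption for this sketch: one protocol/port pair per object.
        if not isinstance(entry, dict) or len(entry) != 1:
            raise ValueError("each entry must be an object with one protocol/port pair")
        protocol, port = next(iter(entry.items()))
        if protocol not in ("HTTP", "HTTPS"):
            raise ValueError(f"unsupported protocol: {protocol}")
        if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
        listeners.append((protocol, port))
    return listeners


print(parse_listen_ports('[{"HTTP": 80}, {"HTTPS": 443}]'))
```

Passing a bare object such as '{"HTTP": 80}' instead of an array raises ValueError in this sketch.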

Example: HTTPS Only (no HTTP listener):

alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'

Useful if you want to force HTTPS and avoid exposing port 80. Make sure you also configure tls: section in the Ingress and associate an ACM certificate using:

alb.ingress.kubernetes.io/certificate-arn

Example: Custom Ports (e.g., HTTP on 8080):

alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 8080}]'

ALB will listen on port 8080 for HTTP traffic instead of the default 80.

We can only use ports supported by the ALB (1-65535), but ports 80 and 443 are standard and expected by most clients. Custom ports may not work well unless you control the client environment.

Example: Internal Service with Only HTTPS

annotations:
  alb.ingress.kubernetes.io/scheme: internal
  alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
  alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:...

If we specify only HTTPS, we must provide a valid certificate, e.g. via alb.ingress.kubernetes.io/certificate-arn.

If we omit this annotation, the controller decides based on whether a certificate is configured:

If a certificate is configured, an HTTPS listener on port 443 is created
If not, an HTTP listener on port 80 is created

alb.ingress.kubernetes.io/tags

Specifies which tags will be applied to the Application Load Balancer created by the LB Controller. Example (here with Terraform local values interpolated into the annotation):

alb.ingress.kubernetes.io/tags: "kubernetes.io/ingress-name=${local.ingress_name},Environment=${local.workspace},Team=${local.team}"

This means that the LB will have these tags:

kubernetes.io/ingress-name, set to the value of ${local.ingress_name}
Environment, set to the value of ${local.workspace}
Team, set to the value of ${local.team}

Note that by default, AWS LB Controller applies the following tags:

ingress.k8s.aws/resource = LoadBalancer
ingress.k8s.aws/stack = <namespace>/<ingress_name>
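To see which tags the load balancer should end up with, the annotation value can be split into key/value pairs and combined with the default tags listed above. This is only an illustrative sketch: parse_tags and effective_tags are hypothetical helpers, and how the controller resolves a collision between a custom and a default tag is not modelled here.

```python
def parse_tags(annotation: str) -> dict:
    """Split a "key1=value1,key2=value2" annotation value into a dict."""
    tags = {}
    for pair in annotation.split(","):
        key, separator, value = pair.partition("=")
        if not separator or not key.strip():
            raise ValueError(f"malformed tag: {pair!r}")
        tags[key.strip()] = value.strip()
    return tags


def effective_tags(annotation: str, namespace: str, ingress_name: str) -> dict:
    """Combine the default controller tags with the custom ones."""
    tags = {
        "ingress.k8s.aws/resource": "LoadBalancer",
        "ingress.k8s.aws/stack": f"{namespace}/{ingress_name}",
    }
    # Collision handling is arbitrary in this sketch (custom tags win).
    tags.update(parse_tags(annotation))
    return tags


print(effective_tags("Environment=dev,Team=platform", "default", "my-ingress"))
```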

alb.ingress.kubernetes.io/group.name

alb.ingress.kubernetes.io/group.name is another annotation used by the AWS Load Balancer Controller. It specifies the name of the IngressGroup, which groups multiple Ingress resources under a single Application Load Balancer (ALB).

By default, each Ingress resource gets its own ALB, which can be costly or unnecessary. Setting the alb.ingress.kubernetes.io/group.name annotation allows you to share a single ALB across multiple Ingress resources.

Example:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: <ingress-name>
  annotations:
    alb.ingress.kubernetes.io/group.name: my-apps
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
    - host: app1.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: <service-name>
                port:
                  number: 80

If we have another Ingress resource with the same group.name, it will be attached to the same ALB.

Benefits of using group.name:

Cost efficiency: Fewer ALBs to manage and pay for.
Simplified architecture: Group related services behind one ALB.
Custom routing: Combine multiple paths or hostnames on one load balancer.

All Ingresses in the same group must share the same ALB settings, such as:

alb.ingress.kubernetes.io/scheme (e.g. internet-facing or internal)
alb.ingress.kubernetes.io/load-balancer-name (optional but can be used)

The ALB is shared, but each rule/path is managed separately based on the Ingress definitions.
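The cost effect of grouping can be estimated by counting how many ALBs a set of Ingresses would need. The sketch below models only the rule described above (one ALB per group name, otherwise one ALB per Ingress); the function name and the dictionary fields are assumptions for this example.

```python
from collections import defaultdict


def albs_needed(ingresses):
    """Map each expected ALB to the names of the Ingresses it would serve."""
    albs = defaultdict(list)
    for ingress in ingresses:
        group = ingress.get("group_name")
        if group:
            key = ("group", group)  # shared ALB for the whole group
        else:
            key = ("ingress", ingress["namespace"], ingress["name"])  # dedicated ALB
        albs[key].append(ingress["name"])
    return dict(albs)


ingresses = [
    {"namespace": "default", "name": "app1", "group_name": "my-apps"},
    {"namespace": "default", "name": "app2", "group_name": "my-apps"},
    {"namespace": "default", "name": "admin"},
]

result = albs_needed(ingresses)
print(len(result), result)
```

In this example three Ingresses need two ALBs instead of three, because app1 and app2 share the my-apps group.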

external-dns.alpha.kubernetes.io/ingress-hostname-source

annotations-only: This annotation configures External DNS to only use explicit hostname annotations when determining the DNS records to create

---

Friday, 26 July 2024

Introduction to CI/CD Pipeline Development

CI (Continuous Integration) and CD (Continuous Delivery/Continuous Deployment) are practices in modern software development aimed at improving the process of integrating, testing, and deploying code changes.

image source: A Crash Course in CI/CD - ByteByteGo Newsletter

Continuous Integration (CI)

Continuous Integration is a software development practice where developers regularly merge (integrate) their code changes into a shared repository multiple times a day. Each merge triggers an automated build and testing process to detect integration issues early.

Key Aspects:

Frequent Commits: Developers commit code changes frequently, at least daily.
Automated Builds: Each commit triggers an automated build to compile the code.
Automated Testing: Automated tests (unit, integration, functional) are run to verify that the new code does not break existing functionality.

Unit tests verify that code changes didn't introduce any regression at the function level. They should be fast and should run on the dev machine before the code gets pushed to the remote (e.g. as part of git commit hooks) and also on the CI server. Running them on the CI server ensures that they are always executed, and that they are executed in a non-local environment, so there is no chance of an "it works on my machine" situation.
Integration tests verify that all components/modules of the product work together. Example:

API endpoint /createOrder indeed creates an order with all attributes and this can be verified by verifying /getOrder response content

Immediate Feedback: Developers receive immediate feedback on the build and test status, allowing them to address issues promptly.
Shared Repository: All code changes are merged into a central repository (e.g., Git).
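The integration test example mentioned above (create an order, then read it back) could look roughly like this. It is a sketch only: FakeOrderService is a hypothetical in-memory stand-in for the real /createOrder and /getOrder endpoints, which a real integration test would call over HTTP.

```python
import itertools


class FakeOrderService:
    """In-memory stand-in for the order API (hypothetical)."""

    def __init__(self):
        self._orders = {}
        self._ids = itertools.count(1)

    def create_order(self, attributes: dict) -> int:
        order_id = next(self._ids)
        self._orders[order_id] = dict(attributes)
        return order_id

    def get_order(self, order_id: int) -> dict:
        return dict(self._orders[order_id])


def check_create_then_get(service) -> bool:
    """Create an order and verify that reading it back returns all attributes."""
    attributes = {"item": "book", "quantity": 2}
    order_id = service.create_order(attributes)
    return service.get_order(order_id) == attributes


print(check_create_then_get(FakeOrderService()))
```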

Stages:

Locally (on dev machine):

code is added to the repository locally, to the feature branch
as part of the commit, unit tests are run locally
code is pushed to the remote

CI server:

detects new commit
runs unit tests
builds the output binary, package or Docker image
runs integration tests
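The CI server stages listed above run in order, and a failure stops the pipeline so that later stages never see a broken build. A minimal sketch of that control flow follows; the stage functions are placeholders, and a real CI server does far more (isolation, logs, artifacts).

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order and stop at the first failing step."""
    results = []
    for name, step in stages:
        passed = bool(step())
        results.append((name, passed))
        if not passed:
            break  # later stages are skipped after a failure
    return results


# Placeholder steps: the build is made to fail on purpose.
stages = [
    ("unit tests", lambda: True),
    ("build", lambda: False),
    ("integration tests", lambda: True),
]

print(run_pipeline(stages))
```

Here the integration tests are never started, because the build stage failed first.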

Benefits:

Improve the feedback loop

Faster feedback on business decisions is another powerful side effect of CI. Product teams can test ideas and iterate product designs faster with an optimized CI platform. Changes can be rapidly pushed and measured for success. Bugs or other issues can be quickly addressed and repaired. [What is Continuous Integration | Atlassian]

Early detection of bugs and integration issues: before new code is merged it must pass the CI test suite, which helps prevent new regressions.

Reduced integration problems.
Improved code quality.
Faster development cycles.

Enables Scaling

CI enables organizations to scale in engineering team size, codebase size, and infrastructure. By minimizing code integration bureaucracy and communication overhead, CI helps build DevOps and agile workflows. It allows each team member to own a new code change through to release. CI enables scaling by removing any organizational dependencies between development of individual features. Developers can now work on features in an isolated silo and have assurances that their code will seamlessly integrate with the rest of the codebase, which is a core DevOps process. [What is Continuous Integration | Atlassian]

Challenges:

Adoption and installation
Technology learning curve

Best Practices:

Test Driven Development (TDD) - the practice of writing out the test code and test cases before doing any actual feature coding.
Pull requests and code reviews

Pull requests:

critical practice to effective CI
created when a developer is ready to merge new code into the main codebase
notifies other developers of the new set of changes that are ready for integration
an opportune time to kick off the CI pipeline and run the set of automated approval steps. An additional, manual approval step is commonly added at pull request time, during which a non-stakeholder engineer performs a code review of the feature

foster passive communication and knowledge share among an engineering team. This helps guard against technical debt.

Optimizing pipeline speed

Given that the CI pipeline is going to be a central and frequently used process, it is important to optimize its execution speed.
It is a best practice to measure the CI pipeline speed and optimize as necessary.

Continuous Delivery (CD)

Continuous Delivery is a software development practice where code changes are automatically built, tested, and prepared for a release to production. It extends CI by ensuring that the codebase is always in a deployable state, but the actual deployment to production is done manually.

Key Aspects:

Automated Deployment Pipeline: Code changes go through an automated pipeline, including build, test, and packaging stages.
Deployable State: The codebase is always ready for deployment to production.
Manual Release: Deployment to production is triggered manually, ensuring final checks and balances.
Staging Environment: Changes are deployed to a staging environment for final validation before production.

Benefits:

Reduced deployment risk.
Faster and more reliable releases.
High confidence in code quality and stability.
Easier and more frequent releases.

Continuous Deployment (CD)

Continuous Deployment takes Continuous Delivery a step further by automatically deploying every code change that passes the automated tests to production without manual intervention.

Key Aspects:

Automated Deployment: Every code change that passes all stages of the pipeline (build, test, package) is automatically deployed to production.
Monitoring and Alerting: Robust monitoring and alerting systems are essential to detect and respond to issues quickly.

latency
performance
resource utilization
KPIs/Key business parameters e.g. number of new installs, number of install re-tries, number of engagements, monetization, usage of various features
errors/failures/warnings in logs

Rollbacks and Roll-forwards: Mechanisms to roll back or roll forward changes in case of failures.

ideally, rollbacks would be automated
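An automated rollback can be as simple as a health check that follows every deployment. The sketch below shows the idea under strong simplifications: deploy and is_healthy are hypothetical callables supplied by the caller, and real systems would add timeouts, retries and alerting.

```python
def deploy_with_rollback(current_version, new_version, deploy, is_healthy):
    """Deploy new_version; if it is unhealthy, redeploy current_version.

    Returns the version that is live afterwards.
    """
    deploy(new_version)
    if is_healthy(new_version):
        return new_version
    deploy(current_version)  # automated rollback, no manual step needed
    return current_version


deployed = []  # records every deployment, in order
live = deploy_with_rollback(
    "v1", "v2",
    deploy=deployed.append,
    is_healthy=lambda version: version != "v2",  # pretend v2 is broken
)
print(live, deployed)
```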

Benefits:

Accelerated release cycle.
Immediate delivery of new features and bug fixes.
Continuous feedback from the production environment.
Higher customer satisfaction due to faster updates.

image source: Azure CI CD Pipeline Creation with DevOps Starter

CI/CD Pipeline

A CI/CD pipeline is a series of automated processes that help deliver new software versions more efficiently.

The typical stages include:

Source Code Management

Developers commit code to a shared repository.

Build

The code is compiled and built into a deployable format (e.g., binary, Docker image).

Automated Testing

Automated tests are run to ensure the code functions correctly (unit tests, integration tests, functional tests).

Packaging

The build artifacts are packaged for deployment.

Deployment

Continuous Delivery: Artifacts are deployed to a staging environment, and deployment to production is manual.
Continuous Deployment: Artifacts are automatically deployed to production.

Monitoring

The deployed application is monitored for performance and errors

CI/CD Pipeline Development: Build and maintain a continuous integration/continuous deployment (CI/CD) pipeline to automate the testing and deployment of code changes.

image source: EP71: CI/CD Pipeline Explained in Simple Terms

Tools for CI/CD

CI Tools:

Jenkins
GitHub Actions - a popular CI/CD platform with a free tier

Linting & Testing

GitLab CI
CircleCI
Travis CI

CD Tools:

Spinnaker
ArgoCD
Tekton
AWS CodePipeline

Testing Tools:

JUnit
Selenium
Cypress
pytest

Build Tools:

Maven
Gradle
npm
Docker

Monitoring Tools:

Prometheus
Grafana
ELK Stack

By implementing CI/CD practices, development teams can achieve faster delivery cycles, higher code quality, and a more reliable deployment process.

How to enhance automation and scalability in CI/CD?

Enhancing automation and scalability in CI/CD practices involves implementing strategies and tools that streamline processes, reduce manual intervention, and ensure that the system can handle increasing workloads effectively. Here are some key practices to achieve these goals:

CI/CD Practices for Enhanced Automation and Scalability:

Automated Testing:

Unit Testing: Automated tests for individual units of code ensure that changes don’t break functionality.
Integration Testing: Tests that verify the interaction between different parts of the application.
End-to-End (E2E) Testing: Simulate real user scenarios to ensure the application works as expected.
Continuous Testing: Running tests automatically on every code change.

Pipeline as Code:

Define CI/CD pipelines using code (e.g., YAML files) stored in version control.
This makes it easier to track changes, review pipeline modifications, and replicate environments.
Example: In TeamCity it is possible to enable storing build configurations as a Kotlin code, in a dedicated repository

Infrastructure as Code (IaC):

Use tools like Terraform, Ansible, or CloudFormation to manage infrastructure.
IaC allows for automated provisioning, scaling, and management of environments.

Containerization:

Use Docker or similar containerization technologies to create consistent environments.
Containers ensure that applications run the same way regardless of where they are deployed, simplifying scaling and deployment.

Orchestration:

Use Kubernetes or other orchestration tools to manage containerized applications.
Orchestration tools help in scaling applications automatically based on demand.

Parallel Execution:

Run tests and build processes in parallel to reduce overall pipeline execution time.
This is especially useful for large test suites and complex builds.

Caching:

Implement caching for dependencies, build artifacts, and other frequently used resources to speed up the CI/CD pipeline.
Cache mechanisms reduce the time required for repetitive tasks.

Artifact Management:

Use artifact repositories like JFrog Artifactory or Nexus to store build artifacts.
Proper artifact management ensures reliable and consistent deployments.

Environment Consistency:

Ensure development, testing, staging, and production environments are as similar as possible.
Consistent environments reduce the likelihood of environment-specific bugs.

Monitoring and Logging:

Implement monitoring and logging throughout the CI/CD pipeline.
Use tools like Prometheus, Grafana, ELK Stack, or Splunk to gain insights and quickly identify issues.

Feature Toggles:

Use feature toggles to control the release of new features without deploying new code.
This allows for safer and more controlled feature releases and rollbacks.

Scalable Architecture:

Design applications to be stateless and horizontally scalable.
Use microservices architecture to break down applications into smaller, manageable services that can be scaled independently.

Automated Rollbacks:

Implement automated rollback mechanisms in case of deployment failures.
This ensures quick recovery from failed deployments without manual intervention.

Security Automation:

Integrate security checks into the CI/CD pipeline using tools like Snyk, OWASP ZAP, or Aqua Security.
Automated security scanning helps in identifying vulnerabilities early.

By adopting these practices, organizations can achieve a highly automated, reliable, and scalable CI/CD pipeline that supports rapid and safe software delivery.
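Of the practices above, feature toggles are easy to illustrate in a few lines. This is a minimal in-memory sketch; the class FeatureToggles and the flag name new_checkout are made up for this example, and real projects usually read flags from a configuration service.

```python
class FeatureToggles:
    """Tiny in-memory feature flag store (hypothetical)."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name: str) -> bool:
        # Unknown flags are treated as disabled.
        return self._flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled


def checkout(toggles: FeatureToggles) -> str:
    """Both code paths are deployed; the flag decides which one runs."""
    if toggles.is_enabled("new_checkout"):
        return "new checkout flow"
    return "old checkout flow"


toggles = FeatureToggles()
before = checkout(toggles)
toggles.set("new_checkout", True)   # release the feature without a deployment
after = checkout(toggles)
toggles.set("new_checkout", False)  # roll it back the same way
print(before, "->", after, "->", checkout(toggles))
```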

Example Workflow for Automation and Scalability

Here’s a high-level example of a CI/CD workflow that incorporates some of these practices:

github-actions-example.yaml:

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [12, 14, 16]
    steps:
      - uses: actions/checkout@v2
      - name: Cache dependencies
        uses: actions/cache@v2
        with:
          path: ~/.npm
          key: ${{ runner.os }}-node-${{ matrix.node-version }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-${{ matrix.node-version }}-
      - name: Setup Node.js
        uses: actions/setup-node@v2
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm install
      - run: npm test

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Staging
        run: |
          # Deployment commands
          echo "Deploying to staging environment..."
      - name: Automated Tests on Staging
        run: npm run test:e2e

  promote:
    needs: deploy
    runs-on: ubuntu-latest
    if: success()
    steps:
      - name: Deploy to Production
        run: |
          # Production deployment commands
          echo "Deploying to production environment..."
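To make the ordering in this workflow explicit, here is a small Python sketch (not part of the workflow itself) that models the needs relationships as a dependency graph and expands the build job once per matrix entry. The function name execution_plan is made up for illustration. In GitHub Actions the matrix builds run in parallel and deploy waits for all of them, so the flat list below only shows the order of the stages.

```python
from graphlib import TopologicalSorter

# Each job maps to the set of jobs it "needs", mirroring the workflow above.
needs = {"build": set(), "deploy": {"build"}, "promote": {"deploy"}}
node_versions = [12, 14, 16]  # the build matrix


def execution_plan(needs: dict, matrix: list) -> list:
    """Return the job order implied by `needs`, expanding `build` per matrix entry."""
    plan = []
    for job in TopologicalSorter(needs).static_order():
        if job == "build":
            plan.extend(f"build (node {version})" for version in matrix)
        else:
            plan.append(job)
    return plan


if __name__ == "__main__":
    for step in execution_plan(needs, node_versions):
        print(step)
```

A failing build for any Node.js version therefore blocks both deploy and promote, which is what keeps a broken commit away from production.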

References:

What is Continuous Integration | Atlassian

8 Key Continuous Delivery Principles | Atlassian

Friday, 12 July 2024

Introduction to Containers

Hardware Virtualization

Let's review the layers of service abstractions:

source: The New Era of Cloud Computing: SaaS, IaaS and PaaS | LinkedIn

Infrastructure as a service (IaaS):

uses Virtual Machines to virtualize the hardware
allows us to share compute resources with other developers
each developer can:

deploy their own operating system (OS)
configure the underlying system resources such as disk space, disk I/O, or networking
install their favorite runtime, web server, database or middleware
build their applications in a self-contained environment with access to underlying hardware (RAM, file systems, networking interfaces, etc.)
The smallest unit of compute is an app with its VM

If we want to scale the app, we also have to scale its VM, which is resource- and time-consuming

Shortcomings of (using only) hardware virtualization

The flexibility listed above comes at a cost:

The guest OS might be large (several gigabytes) and take a long time to boot
As demand for our application increases, we have to copy an entire VM and boot the guest OS for each instance of our app, which can be slow and costly

OS virtualization

PaaS: abstraction of the OS
IaaS: abstraction of hardware

source: What Is IaaS, PaaS, and SaaS? Examples and Definitions: A Cloud Report | Mindsight

Container:

gives us the independent scalability of workloads found in PaaS, and an abstraction layer over the OS and hardware as in IaaS
an invisible box around our code and its dependencies with limited access to its own partition of the file system and hardware
only requires a few system calls to create
starts as quickly as a process
All that's needed on each host is:

OS kernel that supports containers
container runtime

In essence, the OS is being virtualized. It scales like PaaS, but gives us nearly the same flexibility as IaaS. This makes code ultra portable, and the OS and hardware can be treated as a black box.

We can go from development to staging to production, or from our laptop to the cloud, without changing or rebuilding anything. As an example, let's say we want to scale a web server. With a container we can do this in seconds and deploy dozens or hundreds of them on a single host, depending on the size of our workload. That's just a simple example of scaling one container, running the whole application on a single host.

However, we'll probably want to build our applications using lots of containers, each performing its own function, like microservices. If we build them this way and connect them over the network, we can make them modular, deploy them easily and scale them independently across a group of hosts. The hosts can scale up and down, and start and stop containers, as demand for our app changes or as hosts fail.
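To make "scaling with demand" a bit more concrete, here is a rough sketch of the arithmetic an autoscaler might perform. The function name, the per-container capacity and the request rates are all made-up illustration values, not measurements.

```python
import math


def replicas_needed(
    requests_per_second: float,
    capacity_per_container: float,
    minimum: int = 1,
) -> int:
    """Return how many container replicas are needed to serve the given load."""
    if capacity_per_container <= 0:
        raise ValueError("capacity_per_container must be positive")
    wanted = math.ceil(requests_per_second / capacity_per_container)
    # Never scale below the configured minimum, even with no traffic.
    return max(minimum, wanted)


if __name__ == "__main__":
    # Hypothetical numbers: each container handles about 100 requests per second.
    for load in (0, 100, 950):
        print(load, "req/s ->", replicas_needed(load, 100), "container(s)")
```

Because each microservice runs in its own containers, this calculation can be done per service, so a busy web tier can grow without also multiplying the database or cache.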

Containers: Docker overview

Let's say we need to deploy a stack of various technologies:

Node.js Express as the web server
MongoDB as the database
Redis as the messaging system
Ansible as the orchestration tool

If we deployed them directly on a bare-metal host or a VM, each of these components would need to be compatible with the host's hardware, OS and installed dependencies and libraries. This is usually not the case, and the resulting compatibility problem is known as the "matrix from hell".

Docker helps prevent these dependency issues. E.g. we can run each of these components in its own container, which contains the libraries and dependencies that the component is compatible with.

Docker runs on top of the OS (Windows, macOS, Linux, etc.).

Containers are isolated environments. They have their own processes, network interfaces and mounts, just like virtual machines, except that they all share the same OS kernel (which interfaces with the hardware).

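Docker originally added an abstraction layer over LXC (LinuX Containers), acting like an extension of it. Since version 0.9, Docker uses its own container runtime (libcontainer, which later became the basis of runc) by default. [LXC vs Docker: Why Docker is Better | UpGuard]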

Ubuntu, Fedora, SUSE and CentOS share the same OS kernel (Linux) but have different software (GUI, drivers, compilers, file systems, ...) above it. This custom software is what differentiates these distributions from each other.

Docker containers share the underlying OS kernel. For example, Docker on Ubuntu can run any flavour of Linux, because they all run on the same Linux kernel as Ubuntu. This is why we can't run a Windows-based container on Docker running on a Linux OS: they don't share the same kernel.
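One way to observe the shared kernel is to print the distribution name and the kernel release from inside different containers. The sketch below uses only Python's standard library; the helper names are made up for illustration. Run in, say, an Ubuntu-based and a Fedora-based container on the same Linux host, the distribution should differ while the kernel release stays the same (on Docker Desktop for Windows or macOS it would be the kernel of the Linux VM that Docker runs in).

```python
import platform


def distro_name() -> str:
    """Return the distribution name from os-release, or 'unknown' if unavailable."""
    try:
        # Available since Python 3.10; raises OSError when no os-release file exists.
        return platform.freedesktop_os_release().get("PRETTY_NAME", "unknown")
    except (OSError, AttributeError):
        return "unknown"


def kernel_info() -> dict:
    """Return the OS family and kernel release as reported by the running kernel."""
    return {"system": platform.system(), "kernel_release": platform.release()}


if __name__ == "__main__":
    print("distribution:", distro_name())
    print("kernel:", kernel_info())
```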

Hypervisor:
Abstracts away hardware for the virtual machines so they can run an operating system
Coordinates between the machine's physical hardware and virtual machines.
A container engine (e.g. Docker Engine):
Abstracts away an operating system so containers can run applications
Coordinates between the operating system and (Docker) containers
Docker containers are process-isolated and don't require a hardware hypervisor. This means Docker containers are much smaller and require far fewer resources than a VM.

What is a hypervisor? A beginner’s guide | Ubuntu

Unlike hypervisors, Docker is not meant to virtualize and run different operating systems and kernels on the same hardware.

The main purpose of Docker is to containerise applications, ship them and run them.

In the case of Docker, the stack looks like this (from top to bottom):

Containers (one or more) containing:

Application
Libraries & Dependencies

Docker
OS
Hardware

In the case of virtual machines, the stack looks like this (from top to bottom):

Virtual Machines (one or more) containing:

Application
Libraries & Dependencies
OS

Hypervisor
OS
Hardware

Docker containers use less processing power and less disk space than VMs, and they start faster.

Docker containers share the same kernel, while different VMs are completely isolated from each other. With VMs we can, for example, run a Linux VM on a Windows host.

Many companies release and ship their software products as Docker images, published on Docker Hub, the public Docker registry.

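We can run each application from the example above in its own container (the image names below are illustrative; on Docker Hub the official images are named e.g. node, mongo and redis):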

$ docker run nodejs
$ docker run mongodb
$ docker run redis
$ docker run ansible

A Docker image is a template used to create one or more containers.

Containers are running instances of images. They are isolated from each other and have their own environments and sets of processes.

A Dockerfile describes how an image is built.
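The image/container relationship is similar to that between a class and its instances. The toy model below is only an analogy in plain Python (it does not use Docker or any Docker API, and all names are made up): the image is immutable, and every container started from it gets its own private, writable state.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)  # simple generator of unique container ids


@dataclass(frozen=True)
class Image:
    """Immutable template: a name plus the files baked into it."""
    name: str
    files: tuple[str, ...]

    def run(self) -> "Container":
        # Every container starts from a private copy of the image contents.
        return Container(image=self, container_id=next(_next_id), files=list(self.files))


@dataclass
class Container:
    """Running instance of an image with its own writable state."""
    image: Image
    container_id: int
    files: list[str] = field(default_factory=list)


if __name__ == "__main__":
    redis_image = Image("redis", ("/data",))
    first, second = redis_image.run(), redis_image.run()
    first.files.append("/tmp/scratch")  # a change made inside one container
    print(first.files, second.files, redis_image.files)
```

The change made in the first container is visible neither in the second container nor in the image itself.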

References:

Google Cloud Fundamentals: Core Infrastructure | Coursera

Kubernetes for the Absolute Beginners - Hands-on | Udemy
