Friday, 28 March 2025

Automatic Component Template Matching for Custom Component Templates in Elasticsearch




Index templates in Elasticsearch specify index mappings, settings, and aliases.

Component templates are building blocks for constructing index templates.

An index template can be composed of multiple component templates. To use a component template, specify it in an index template’s composed_of list. 

Both index and component templates can be managed (created by Elasticsearch out of the box) or custom (created later by users).
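
For example (the host, credentials, and template names here are placeholders), a component template can be created and then referenced from an index template via composed_of:

curl -u "user:pass" -X PUT "https://elasticsearch.my-corp.com:443/_component_template/my-settings" \
  -H 'Content-Type: application/json' \
  -d '{"template": {"settings": {"index.number_of_replicas": 2}}}'

curl -u "user:pass" -X PUT "https://elasticsearch.my-corp.com:443/_index_template/my-template" \
  -H 'Content-Type: application/json' \
  -d '{"index_patterns": ["my-index-*"], "composed_of": ["my-settings"]}'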


Automatic Component Template Matching


Key Mechanism: Custom Template Integration


Elasticsearch integrates custom component templates based on a naming convention. A component template named with a shared prefix (such as traces@custom) is automatically applied to the index templates that share that prefix, without the user explicitly adding it to their composed_of lists.


Matching Rules


Component templates are matched by their naming prefix: index templates with the corresponding prefix automatically incorporate these component templates. The matching happens behind the scenes, through references that the managed index templates already contain.


Example Scenario


Any index template whose name starts with "traces-" will automatically pick up the managed component templates "traces@settings" and "traces@mappings", as well as the custom component template "traces@custom" if it exists. This behavior is based on Elasticsearch's composable index template system and naming conventions.

Elasticsearch automatically creates managed index templates for various data types, including traces. These templates typically include references to managed component templates like "traces@settings" and "traces@mappings". The managed index templates also include a reference to a custom component template named "traces@custom" by default, even if it doesn't exist initially.

Automatic Pickup: When we create a custom component template named traces@custom, it is automatically integrated into the existing index templates without requiring manual addition to their composed_of lists.

When a new index matching the pattern defined in the index template (e.g., traces-*) is created, Elasticsearch applies all the referenced component templates, including the custom one.
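
As a concrete sketch (host and credentials are placeholders), creating traces@custom and then inspecting one of the managed index templates should show the reference in its composed_of list:

curl -u "user:pass" -X PUT "https://elasticsearch.my-corp.com:443/_component_template/traces@custom" \
  -H 'Content-Type: application/json' \
  -d '{"template": {"settings": {"index.number_of_replicas": 1}}}'

curl -u "user:pass" -X GET "https://elasticsearch.my-corp.com:443/_index_template/traces-apm.sampled?filter_path=index_templates.index_template.composed_of"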

Component Template: traces@custom

Matching Index Templates:

- traces-apm.sampled
- traces-apm.rum
- traces-apm.backend
- traces-...

How Auto-Matching Works


The managed index templates already list the corresponding component template (e.g., traces@custom) in their composed_of and declare it in ignore_missing_component_templates, so the reference is tolerated while the component template does not yet exist. As soon as a component template with that name is created, it starts contributing to those index templates. This happens without manually updating any composed_of list.

The @custom pickup applies to composable index templates that follow this naming convention:
  • Elasticsearch-managed (built-in) templates (e.g., metrics, logs, APM templates), which reference the <prefix>@custom component template out of the box
  • User-created index templates (created via the _index_template API) that include the same <prefix>@custom reference in their composed_of
It does NOT apply to legacy index templates created with the older _template API.

So when we added traces@custom and saw it automatically applied to multiple index templates like traces-apm.sampled, we were seeing the matching against Elasticsearch's managed templates and user-created managed templates.

When a component template is created (e.g., traces@custom):
  • Elasticsearch does not rewrite any index templates; the matching index templates already reference traces@custom
  • The new component template immediately contributes its settings and mappings to every index template that references it
  • The matching is therefore a result of the shared naming prefix, regardless of template origin.

Naming Convention Impact


Prefix matching is crucial.

traces@custom will match templates starting with traces-.
metrics@custom would match metrics- prefixed templates.

Prefix-based matching is a system-wide convention, not something specific to one data type. It applies uniformly across the managed index templates (logs, metrics, traces, and so on).

This system allows for easy customization of index settings and mappings without modifying the managed templates directly. It's important to note that while custom templates are automatically picked up, they must still be created manually if you want to add custom configurations.
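
To double-check which templates and settings would actually be applied, the simulate API resolves the index template and all component templates that would apply to a given index name (the index name below is just an illustration that matches a traces-* pattern; host and credentials are placeholders):

curl -u "user:pass" -X POST "https://elasticsearch.my-corp.com:443/_index_template/_simulate_index/traces-apm.sampled-example"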

---

Friday, 28 February 2025

Introduction to GitHub CLI

 


The git CLI client can perform the usual repository tasks such as creating and cloning a repo, committing, and pushing changes, but some common actions are specific to the repository hosting provider (GitHub, GitLab, etc.) and can't be done with the git client alone. Examples include creating a pull request, checking assigned issues and review requests, and adding a comment to a pull request. This article is a short introduction to GitHub's CLI client, gh.

Installation


To install it on Mac:

% brew install gh 

No admin (sudo) permissions are required. 

To verify installation:

% gh
Work seamlessly with GitHub from the command line.

USAGE
  gh <command> <subcommand> [flags]

CORE COMMANDS
  auth:        Authenticate gh and git with GitHub
  browse:      Open repositories, issues, pull requests, and more in the browser
  codespace:   Connect to and manage codespaces
  gist:        Manage gists
  issue:       Manage issues
  org:         Manage organizations
  pr:          Manage pull requests
  project:     Work with GitHub Projects.
  release:     Manage releases
  repo:        Manage repositories

GITHUB ACTIONS COMMANDS
  cache:       Manage GitHub Actions caches
  run:         View details about workflow runs
  workflow:    View details about GitHub Actions workflows

ALIAS COMMANDS
  co:          Alias for "pr checkout"

ADDITIONAL COMMANDS
  alias:       Create command shortcuts
  api:         Make an authenticated GitHub API request
  attestation: Work with artifact attestations
  completion:  Generate shell completion scripts
  config:      Manage configuration for gh
  extension:   Manage gh extensions
  gpg-key:     Manage GPG keys
  label:       Manage labels
  ruleset:     View info about repo rulesets
  search:      Search for repositories, issues, and pull requests
  secret:      Manage GitHub secrets
  ssh-key:     Manage SSH keys
  status:      Print information about relevant issues, pull requests, and notifications across repositories
  variable:    Manage GitHub Actions variables

HELP TOPICS
  actions:     Learn about working with GitHub Actions
  environment: Environment variables that can be used with gh
  exit-codes:  Exit codes used by gh
  formatting:  Formatting options for JSON data exported from gh
  mintty:      Information about using gh with MinTTY
  reference:   A comprehensive reference of all gh commands

FLAGS
  --help      Show help for command
  --version   Show gh version

EXAMPLES
  $ gh issue create
  $ gh repo clone cli/cli
  $ gh pr checkout 321

LEARN MORE
  Use `gh <command> <subcommand> --help` for more information about a command.
  Read the manual at https://cli.github.com/manual
  Learn about exit codes using `gh help exit-codes`


Setting up authentication with GitHub


On the GitHub website, go to Settings >> Developer Settings >> Personal access tokens and create a token with the desired permissions. Copy its value and add the following environment variable to the profile of the shell you use:

export GH_TOKEN=<github_personal_access_token_value>

For zsh:

% vi ~/.zshrc
% source ~/.zshrc

For bash:

% vi ~/.bash_profile
% source ~/.bash_profile

To verify it, execute this in each terminal:

% echo $GH_TOKEN 

The output should be <github_personal_access_token_value>.
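
Alternatively, gh itself can confirm that authentication works and show which account and token scopes are being used:

% gh auth status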


Checking the status across multiple repositories 

To check the status across your GitHub account's repositories:

% gh status
Assigned Issues
...
Assigned Pull Requests
...
Review Requests
...
Mentions
...
Repository Activity
...


Working with Pull Requests

Here are some common commands for managing pull requests. Make sure you run them inside a directory that is a clone of a GitHub repository; otherwise gh emits an error like:

failed to run git: fatal: not a git repository (or any of the parent directories): .git


To view all pull requests in a repository:

% gh pr list

It lists all open pull requests in the current repository. Use --state closed or --state all to see merged/closed PRs.

To view a specific pull request:

% gh pr view <PR_NUMBER>

or

% gh pr view <PR_URL>

It displays details of a specific PR, including description, status, and mergeability.

To open a pull request in the browser:

% gh pr view <PR_NUMBER> --web

It opens the PR page in your default web browser.

To show PRs created by you:

% gh pr list --author @me

To show PRs assigned to you:

% gh pr list --assignee @me
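
To create a pull request from the current branch (the title, body, and base branch below are just examples; when they are omitted, gh prompts for them interactively):

% gh pr create --title "My change" --body "Short description of the change" --base main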


Working with GitHub Workflows


The main command for working with workflows is workflow. Let's list its subcommands:


% gh workflow                               
List, view, and run workflows in GitHub Actions.

USAGE
  gh workflow <command> [flags]

AVAILABLE COMMANDS
  disable:     Disable a workflow
  enable:      Enable a workflow
  list:        List workflows
  run:         Run a workflow by creating a workflow_dispatch event
  view:        View the summary of a workflow

FLAGS
  -R, --repo [HOST/]OWNER/REPO   Select another repository using the [HOST/]OWNER/REPO format

INHERITED FLAGS
  --help   Show help for command


To show all workflows and their IDs in the current repository:

% gh workflow list

NAME                      STATE   ID       
my-workflow               active  131118881
...

To see the details of a specific workflow:

% gh workflow view my-workflow
my-workflow - my-workflow.yaml
ID: 131118881

Total runs 2
Recent runs
   TITLE        WORKFLOW     BRANCH                    EVENT              ID         
✓  my-workflow  my-workflow  project/app-upgrade/test  workflow_dispatch  13543363356
X  my-workflow  my-workflow  project/app-upgrade/test  workflow_dispatch  13158297800

To see more runs for this workflow, try: gh run list --workflow my-workflow.yaml
To see the YAML for this workflow, try: gh workflow view my-workflow.yaml --yaml

To run a workflow by creating a workflow_dispatch event:

% gh workflow run <workflow.yml> --ref <branch_name>

After triggering the workflow, you can check the status with:

% gh run list

Or watch a run until it completes, showing its progress:

% gh run watch

If your workflow requires inputs, pass them with the -f (or -F) flag:

% gh workflow run build.yml --ref main -f environment=staging

Inputs can also be read as JSON from standard input using the --json flag:

% echo '{"environment":"staging"}' | gh workflow run build.yml --ref main --json

To see more details of a specific run:

% gh run view <run_id>

To cancel a running workflow:

% gh run cancel <run_id>
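
Two more run subcommands that are often useful (run IDs are placeholders): printing the full log of a run, and re-running only its failed jobs:

% gh run view <run_id> --log
% gh run rerun <run_id> --failed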


Working with GitHub Workflow Runs



% gh run 
List, view, and watch recent workflow runs from GitHub Actions.

USAGE
  gh run <command> [flags]

AVAILABLE COMMANDS
  cancel:      Cancel a workflow run
  delete:      Delete a workflow run
  download:    Download artifacts generated by a workflow run
  list:        List recent workflow runs
  rerun:       Rerun a run
  view:        View a summary of a workflow run
  watch:       Watch a run until it completes, showing its progress

FLAGS
  -R, --repo [HOST/]OWNER/REPO   Select another repository using the [HOST/]OWNER/REPO format

INHERITED FLAGS
  --help   Show help for command



% gh run list --help 
List recent workflow runs.

Note that providing the `workflow_name` to the `-w` flag will not fetch disabled workflows.
Also pass the `-a` flag to fetch disabled workflow runs using the `workflow_name` and the `-w` flag.

For more information about output formatting flags, see `gh help formatting`.

USAGE
  gh run list [flags]

ALIASES
  gh run ls

FLAGS
  -a, --all               Include disabled workflows
  -b, --branch string     Filter runs by branch
  -c, --commit SHA        Filter runs by the SHA of the commit
      --created date      Filter runs by the date it was created
  -e, --event event       Filter runs by which event triggered the run
  -q, --jq expression     Filter JSON output using a jq expression
      --json fields       Output JSON with the specified fields
  -L, --limit int         Maximum number of runs to fetch (default 20)
  -s, --status string     Filter runs by status: {queued|completed|in_progress|requested|waiting|pending|action_required|cancelled|failure|neutral|skipped|stale|startup_failure|success|timed_out}
  -t, --template string   Format JSON output using a Go template; see "gh help formatting"
  -u, --user string       Filter runs by user who triggered the run
  -w, --workflow string   Filter runs by workflow

INHERITED FLAGS
      --help                     Show help for command
  -R, --repo [HOST/]OWNER/REPO   Select another repository using the [HOST/]OWNER/REPO format

JSON FIELDS
  attempt, conclusion, createdAt, databaseId, displayTitle, event, headBranch,
  headSha, name, number, startedAt, status, updatedAt, url, workflowDatabaseId,
  workflowName

LEARN MORE
  Use `gh <command> <subcommand> --help` for more information about a command.
  Read the manual at https://cli.github.com/manual
  Learn about exit codes using `gh help exit-codes`

Note that --json expects a comma-separated list of field names as its argument; in the command below it consumed --help as a (nonexistent) field name, so gh printed the list of available fields instead:
% gh run list --json --help
Unknown JSON field: "--help"
Available fields:
  attempt
  conclusion
  createdAt
  databaseId
  displayTitle
  event
  headBranch
  headSha
  name
  number
  startedAt
  status
  updatedAt
  url
  workflowDatabaseId
  workflowName
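
These fields can be combined with the --jq flag to extract exactly the data you need, for example (the field selection here is arbitrary):

% gh run list --limit 5 --json name,status,conclusion,headBranch --jq '.[] | [.name, .status, .conclusion, .headBranch] | @tsv'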

---

Thursday, 13 February 2025

Introduction to Amazon Bedrock

 


Build Generative AI Applications with Foundation Models - Amazon Bedrock - AWS

Amazon Bedrock is a fully managed service that simplifies the development of generative AI applications using foundation models (FMs) from providers like Anthropic, AI21 Labs, Stability AI, and Amazon itself. 


Key features include:

  • Foundation Models: Pre-trained models that can be lightly customized using techniques like fine-tuning or Retrieval Augmented Generation (RAG) without requiring extensive ML expertise.
  • Serverless Infrastructure: No need to manage infrastructure; it provides a streamlined, API-based experience for quick deployments
  • Security and Privacy: Data is encrypted, region-specific, and not shared with model providers.
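
As a quick illustration (assuming the AWS CLI is configured and Bedrock is available in the chosen region), the foundation models offered in a region can be listed from the command line:

% aws bedrock list-foundation-models --region us-east-1 --query 'modelSummaries[].modelId'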


    Use Cases

    Ideal for developers looking to rapidly integrate generative AI into applications such as:

      • chatbots
      • text generation
      • image creation


    Cost Model

    Pay-as-you-go pricing based on API usage, making it cost-effective for intermittent workloads.


    Best for developers or businesses without deep ML expertise who need a fast and easy way to deploy generative AI applications.

    Friday, 7 February 2025

    AWS Secrets Manager

     


    AWS Secrets Manager allows us to:
    • Centrally store and manage credentials, API keys, and other secrets.
    • Use AWS Identity and Access Management (IAM) permissions policies to manage access to your secrets.
    • Rotate secrets on demand or on a schedule, without redeploying or disrupting active applications.
    • Integrate secrets with AWS logging, monitoring, and notification services.

    Viewing Secrets

    To list all secrets in a particular region:

    % aws secretsmanager list-secrets --region us-east-2         
    {
        "SecretList": [
            {
                "ARN": "arn:aws:secretsmanager:us-east-2:700840607999:secret:my-app/stage/my-secret-bwwria",
                "Name": "my-app/stage/my-secret ",
                "Description": "Secret for my-app in staging env",
                "LastChangedDate": "2025-01-13T12:51:21.204000+00:00",
                "LastAccessedDate": "2025-02-07T00:00:00+00:00",
                "Tags": [
                    {
                        "Key": "environment",
                        "Value": "stage"
                    },
                    {
                        "Key": "service",
                        "Value": "main-app"
                    }
                ],
                "SecretVersionsToStages": {
                    "11877f11-1999-4f37-8311-283ad04d70f1": [
                        "AWSCURRENT"
                    ],
                    "ab81397d-eb1d-4dc1-8a44-961ce45de258": [
                        "AWSPREVIOUS"
                    ]
                },
                "CreatedDate": "2022-08-17T12:55:43.194000+01:00",
                "PrimaryRegion": "us-east-2"
            },
            ...
          ]
     }
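
    To read the value of a single secret (the secret name and region are placeholders), use get-secret-value:

    % aws secretsmanager get-secret-value --secret-id my-app/stage/my-secret --region us-east-2 --query SecretString --output text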


    Deleting a secret


    By default, a secret is not deleted immediately; it is scheduled for deletion after a recovery window (30 days by default, configurable between 7 and 30 days).

    To delete a secret immediately, use the --force-delete-without-recovery option:

    % aws secretsmanager delete-secret --secret-id my-app/stage/my-secret --force-delete-without-recovery --region eu-west-2
    {
        "ARN": "arn:aws:secretsmanager:eu-west-2:700859607999:secret:my-app/stage/my-secret-E0yyRM",
        "Name": "my-app/stage/my-secret",
        "DeletionDate": "2025-02-07T14:54:30.386000+00:00"
    }





    Thursday, 6 February 2025

    Introduction to AWS S3


    S3 = Simple Storage Service

    What is Amazon S3? - Amazon Simple Storage Service

    Amazon S3 provides:
    • storage for data (objects)
      • organized as key-value structure - each object has a unique key and url
      • divided into multiple buckets; each bucket contains objects
    • web service for upload/download
    S3 doesn't know anything about files, directories, or symlinks. It's just objects in buckets. [source]

    S3 also has no concept of symbolic links created by ln -s. By default, it will turn all links into real files by making real copies. You can use aws s3 cp --no-follow-symlinks ... to ignore links. [Use S3 as major storage — GEOS-Chem on cloud]


    Buckets

    Each bucket has its own subdomain. E.g. a bucket named my-bucket would have the URL:

    https://my-bucket.s3.amazonaws.com

    Objects

    • Data stored in buckets, which are logical containers
    • There is no official limit to the number of objects or amount of data that can be stored in a bucket
    • The size limit for objects is 5 TB
    • Every object has a key (object name) which uniquely identifies it within a bucket; this is usually the name of the uploaded file.
    E.g. if we upload a file and assign the key my-root-dir/dirA/fileA1 to it, its URL will be:

    https://my-bucket.s3.amazonaws.com/my-root-dir/dirA/fileA1
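
    To upload a local file as an object with that key (bucket name, profile, and file name are illustrative):

    % aws s3 cp ./fileA1 s3://my-bucket/my-root-dir/dirA/fileA1 --profile=my-profile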


    To list all objects:

    % aws s3api list-object-versions --bucket my-bucket --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile                   
    {
        "Objects": [
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-1NMp6hagSAu8qORLSpyVxw.dat",
                "VersionId": "KaKyog0yM41SG._aWTuDllb9kXp67vLr"
            },
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-3v3MO1kOTBOl0idDYWLstA.dat",
                "VersionId": "i3PJhBpFtHrdKEtiEZPU_MFHYb2GqX0s"
            },
            ...
    }

    Versioning

    • One of the bucket features
    • Means of keeping multiple variants of an object in the same bucket
    • Used to preserve, retrieve, and restore every version of every object stored in the bucket
    • Helps recovering more easily from both unintended user actions and application failures
    • If Amazon S3 receives multiple write requests for the same object simultaneously, it stores all of those objects
    • Buckets can be in one of three states:
      • Unversioned (the default)
      • Versioning-enabled
      • Versioning-suspended
        • After you version-enable a bucket, it can never return to an unversioned state. But you can suspend versioning on that bucket.
    • If versioning is enabled:
      • If you overwrite an object, Amazon S3 adds a new object version in the bucket. The previous version remains in the bucket and becomes a noncurrent version. You can restore the previous version.
      • Every object, apart from its key/name, also gets a VersionId. Each version of the same object has a different VersionId. When we overwrite an object (PUT command), a new version is created: an object with the same key but a new VersionId.
      • When we delete an object, a delete marker is created and becomes the current version. GET requests will then return 404 - Not Found. But if we pass the VersionId, the GET command will return the noncurrent version of the object.
      • To delete an old version of an object, we need to pass its VersionId in the DELETE command
      • Versioning flows: How S3 Versioning works - Amazon Simple Storage Service
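
    For example, versioning can be enabled on a bucket like this (bucket name and profile are placeholders):

    % aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled --profile=my-profile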

    Delete Markers

    • Placeholders for objects that have been deleted 
    • Created when versioning is enabled and a simple DELETE request is made 
    • The delete marker becomes the current version of the object, and the object becomes the previous version 
    • Delete markers have a key name and version ID, but they don't have data associated with them 
    • Delete markers don't retrieve anything from a GET request 
    • The only operation you can use on a delete marker is DELETE, and only the bucket owner can issue such a request 
    To list all delete markers:

    % aws s3api list-object-versions --bucket my-bucket --query '{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile  

    {
        "Objects": [
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/",
                "VersionId": "hYbkDv9egrv_WE2jI4y0Lys5Bc2dbvgb"
            },
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-1NMp6hagSAu8qORLSpyVxw.dat",
                "VersionId": "KdLBHImFlC23EXkfq4Ic0.2x6wGJ2FxR"
            },
            ...
    }


    Bucket Lifecycle Policy


    If versioning is enabled, each time an object is updated a new version becomes the current version while the previous version becomes the most recent noncurrent version. Over time the number of object versions grows, taking up more storage and driving costs up. If we want to keep only the last V versions and/or delete noncurrent versions after D days, we can define a lifecycle policy.
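
    As a sketch (the rule name, day count, and version count are illustrative), a lifecycle rule that expires noncurrent versions after 30 days while keeping the 5 newest noncurrent versions could be applied like this:

    % aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --profile=my-profile --lifecycle-configuration '{
        "Rules": [
            {
                "ID": "expire-noncurrent-versions",
                "Status": "Enabled",
                "Filter": {},
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": 30,
                    "NewerNoncurrentVersions": 5
                }
            }
        ]
    }'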


    Deleting a bucket


    Before we attempt to delete a bucket, we need to make sure that all objects and all delete markers are deleted.

    To delete all objects (current versions):

    % aws s3api delete-objects --bucket my-bucket --delete "$(aws s3api list-object-versions --bucket my-bucket --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile)" --profile=my-profile 
    {
        "Deleted": [
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-IOT1xReRSWuNLjW0es4SPg.dat",
                "VersionId": "FvX7ePtV5MOLK2cxsWE.smTwMgnLoFie"
            },
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-o29VZ5YOTYWL_pZ0aATn4g.dat",
                "VersionId": "rodR2lazpLXZf3p1cXrBEXnnQDRYDGRj"
            },
            ...
    }

    To delete all delete markers:

    % aws s3api delete-objects --bucket my-bucket --delete "$(aws s3api list-object-versions --bucket my-bucket --query '{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile)" --profile=my-profile
    {
        "Deleted": [
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-o29VZ5YOTYWL_pZ0aATn4g.dat",
                "VersionId": "KFGMc_HXfMxG6vt7vIvIxFoAUny9rDWT",
                "DeleteMarker": true,
                "DeleteMarkerVersionId": "KFGMc_HXfMxG6vt7vIvIxFoAUny9rDWT"
            },
            {
                "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-IOT1xReRSWuNLjW0es4SPg.dat",
                "VersionId": "en7k.4jeTceJNvQhZm0PlJUNa6pZEQiL",
                "DeleteMarker": true,
                "DeleteMarkerVersionId": "en7k.4jeTceJNvQhZm0PlJUNa6pZEQiL"
            },
            ...
    }
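
    Once the bucket contains no object versions and no delete markers, it can be deleted:

    % aws s3api delete-bucket --bucket my-bucket --profile=my-profile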




    Sunday, 2 February 2025

    Introduction to Large Language Models (LLMs)




    Single-turn vs Multi-turn conversation


    ...

    Token

    • the smallest unit of text that the model recognizes
    • can be a word, a number, or even a punctuation mark
    • 1  (English) word has approximately 1.3 tokens

    Context Caching


    In large language model API usage, a significant portion of user inputs tends to be repetitive. For instance, user prompts often include repeated references, and in multi-turn conversations, previous content is frequently re-entered.

    To address this, Context Caching technology caches content that is expected to be reused on a distributed disk array. When duplicate inputs are detected, the repeated parts are retrieved from the cache, bypassing the need for recomputation. This not only reduces service latency but also significantly cuts down on overall usage costs.


    Billing


    The price of using an LLM is usually expressed per 1M tokens.
    If 1 word is approximately 1.3 tokens, 1M tokens corresponds to about 10^6 / 1.3 ≈ 769,230 words, i.e. roughly 770k words.

    Billing is usually based on the total number of input and output tokens by the model.

    If Context Caching is implemented, input billing per 1M tokens can further be split into two categories:
    • 1M tokens - Cache Hit (1M tokens that were found in cache)
    • 1M tokens - Cache Miss (1M tokens that were not found in cache)

    Use Case: Chatbots


    Chat History

    LLMs don't have any concept of state or memory. Any chat history has to be tracked externally and then passed into the model with each new message. We can use a list of custom objects to track chat history. Since there is a limit on the amount of content that can be processed by the model, we need to prune the chat history so there is enough space left to handle the user's message and the model's responses. Our code needs to delete older messages.


    Retrieval-Augmented Generation (RAG)

    If the responses from the chatbot are based purely on the underlying foundation model (FM), without any supporting data source, they can potentially include made-up responses (hallucination). 

    Chatbots that incorporate the retrieval-augmented generation pattern ground their answers in retrieved supporting data and therefore return more accurate responses.

    ...


    Tuesday, 28 January 2025

    Introduction to VertexAI


    Vertex AI

    • Google's managed machine learning platform
    • Designed to help developers, data scientists, and businesses:
      • build
      • deploy
      • scale machine learning models
    • Provides a comprehensive suite of tools for every stage of the machine learning lifecycle, including:
      • data preparation
      • model training
      • evaluation
      • deployment
      • monitoring

    Here’s an overview of what Vertex AI offers:

    Key Features of Vertex AI:

    1. Unified ML Platform: It consolidates various Google AI services into one integrated platform, making it easier to manage end-to-end workflows.

    2. Custom and Pre-trained Models:

      • You can train your custom machine learning models using your own data.
      • Alternatively, use Google’s pre-trained models or APIs for common AI tasks (e.g., Vision AI, Translation AI, and Natural Language AI).
    3. AutoML:

      • Offers an automated way to train machine learning models, making it accessible even to those without deep expertise in ML.
    4. Notebooks:

      • Managed Jupyter Notebooks are available for building and experimenting with ML models.
    5. Data Preparation and Labeling:

      • Tools for managing datasets, preparing data, and labeling it for supervised learning tasks.
    6. Training and Tuning:

      • Supports large-scale training with powerful infrastructure and features like hyperparameter tuning for optimizing models.
    7. Model Deployment:

      • Seamlessly deploy models to an endpoint for real-time predictions or batch processing.
    8. Model Monitoring:

      • Tracks the performance of deployed models, monitoring metrics such as prediction drift or latency.
    9. Integration with BigQuery and Google Cloud Services:

      • Easily access and analyze data stored in BigQuery and integrate it with other Google Cloud services.
    10. ML Ops Features:

      • Tools for managing and automating ML workflows, like pipelines and version control for reproducibility.
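
    As a quick command-line illustration (assuming the gcloud CLI with the Vertex AI API enabled; the region is a placeholder), models uploaded to a project can be listed with:

    % gcloud ai models list --region=us-central1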

    Why Use Vertex AI?

    • Scalability: It handles infrastructure concerns so you can focus on model development.
    • Ease of Use: Tools like AutoML simplify machine learning for those with less technical expertise.
    • Cost-Effectiveness: Pay-as-you-go pricing lets you control costs.
    • Integration: Works seamlessly with Google Cloud services, making it a powerful choice for businesses already in the Google ecosystem.

    It’s ideal for both beginners looking for simplicity and experts needing advanced tools and customizability.


    Friday, 17 January 2025

    Introduction to Elasticsearch





    What is Elasticsearch?

    • An open-source analytics and full-text search engine.
    • Commonly used to add search functionality to applications such as blogs, webshops, or other systems. Example: in a blog, searching for blog posts, products, or categories.

    Capabilities of Elasticsearch:

    • Supports complex search functionality similar to Google:
      • Autocompletion.
      • Typo correction.
      • Highlighting matches.
      • Synonym handling.
      • Relevance adjustment.
    • Enables filtering and sorting, such as by price, brand, or other attributes.

    Advanced Use Cases:

    • Full-text search and relevance boosting (e.g., highly-rated products).
    • Filtering and sorting by various factors (price, size, brand, etc.).

    Analytics Platform:

    • Allows querying structured data (e.g., numbers) and aggregating results.
    • Useful for creating pie charts, line charts, and other visualizations.

    Application Performance Management (APM):

    • Common use case for monitoring logs, errors, and server metrics.
    • Examples include tracking web application errors or server CPU/memory usage, displayed on line charts.

    Event and Sales Analysis:

    • Analyze events like sales from physical stores using aggregations.
    • Examples include identifying top-selling stores or forecasting sales using machine learning.

    Machine Learning Capabilities:

    • Forecasting:
      • Sales predictions for capacity management.
      • Estimating staffing needs or server scaling based on historical data.
    • Anomaly detection:
      • Identifying significant deviations from normal behavior (e.g., drop in website traffic).
        • machine learning learns the “norm” and lets you know when there is an anomaly, i.e. when there is a significant deviation from the normal behavior.
      • Automates alerting for unusual activities without needing manual thresholds.
      • We can then set up alerting (email, Slack) for this and be notified whenever something unusual happens

    How Elasticsearch Works:

    • Data is stored as documents (JSON objects), analogous to rows in a relational database.
    • Each document has fields, similar to columns in a database table.
    • Uses a RESTful API for querying and interacting with the data.
    • Queries are written in JSON, making the API straightforward to use.

    Technology and Scalability:

    • Written in Java and built on Apache Lucene.
    • Highly scalable and distributed by nature, handling massive data volumes and high query throughput.
    • Supports lightning-fast searches, even for millions of documents.

    Community and Adoption:

    • Widely adopted by large companies and has a vibrant community for support and collaboration.


    Index Templates


    Deletion:

    curl -u "user:pass" -X DELETE "https://elasticsearch.my-corp.com:443/_index_template/index_template_name"
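
    To list all index templates (same placeholder host and credentials as above):

    curl -u "user:pass" -X GET "https://elasticsearch.my-corp.com:443/_index_template"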


    Thursday, 9 January 2025

    ELK Stack Interview Questions




    Elasticsearch (ES)





    Kibana


    • How to install Kibana on bare metal?
      • How to install Kibana in k8s cluster?
    • What are Dashboards?
    • What are Alerts?
    • How to back up Elastic objects like dashboards and alerts? How to restore them in another Elastic instance?
    • TBC


    Wednesday, 8 January 2025

    How to locally run Helm from a Docker container


    Instead of managing a local installation of Helm, I prefer using its latest version via a Docker container: alpine/helm - Docker Image | Docker Hub.

    % docker run -it --rm  -v ~/.helm:/root/.helm -v ~/.config/helm:/root/.config/helm -v ~/.cache/helm:/root/.cache/helm alpine/helm
    The Kubernetes package manager

    Common actions for Helm:

    - helm search:    search for charts
    - helm pull:      download a chart to your local directory to view
    - helm install:   upload the chart to Kubernetes
    - helm list:      list releases of charts

    Environment variables:

    | Name                               | Description                                                                                                |
    |------------------------------------|------------------------------------------------------------------------------------------------------------|
    | $HELM_CACHE_HOME                   | set an alternative location for storing cached files.                                                      |
    | $HELM_CONFIG_HOME                  | set an alternative location for storing Helm configuration.                                                |
    | $HELM_DATA_HOME                    | set an alternative location for storing Helm data.                                                         |
    | $HELM_DEBUG                        | indicate whether or not Helm is running in Debug mode                                                      |
    | $HELM_DRIVER                       | set the backend storage driver. Values are: configmap, secret, memory, sql.                                |
    | $HELM_DRIVER_SQL_CONNECTION_STRING | set the connection string the SQL storage driver should use.                                               |
    | $HELM_MAX_HISTORY                  | set the maximum number of helm release history.                                                            |
    | $HELM_NAMESPACE                    | set the namespace used for the helm operations.                                                            |
    | $HELM_NO_PLUGINS                   | disable plugins. Set HELM_NO_PLUGINS=1 to disable plugins.                                                 |
    | $HELM_PLUGINS                      | set the path to the plugins directory                                                                      |
    | $HELM_REGISTRY_CONFIG              | set the path to the registry config file.                                                                  |
    | $HELM_REPOSITORY_CACHE             | set the path to the repository cache directory                                                             |
    | $HELM_REPOSITORY_CONFIG            | set the path to the repositories file.                                                                     |
    | $KUBECONFIG                        | set an alternative Kubernetes configuration file (default "~/.kube/config")                                |
    | $HELM_KUBEAPISERVER                | set the Kubernetes API Server Endpoint for authentication                                                  |
    | $HELM_KUBECAFILE                   | set the Kubernetes certificate authority file.                                                             |
    | $HELM_KUBEASGROUPS                 | set the Groups to use for impersonation using a comma-separated list.                                      |
    | $HELM_KUBEASUSER                   | set the Username to impersonate for the operation.                                                         |
    | $HELM_KUBECONTEXT                  | set the name of the kubeconfig context.                                                                    |
    | $HELM_KUBETOKEN                    | set the Bearer KubeToken used for authentication.                                                          |
    | $HELM_KUBEINSECURE_SKIP_TLS_VERIFY | indicate if the Kubernetes API server's certificate validation should be skipped (insecure)                |
    | $HELM_KUBETLS_SERVER_NAME          | set the server name used to validate the Kubernetes API server certificate                                 |
    | $HELM_BURST_LIMIT                  | set the default burst limit in the case the server contains many CRDs (default 100, -1 to disable)         |
    | $HELM_QPS                          | set the Queries Per Second in cases where a high number of calls exceed the option for higher burst values |

    Helm stores cache, configuration, and data based on the following configuration order:

    - If a HELM_*_HOME environment variable is set, it will be used
    - Otherwise, on systems supporting the XDG base directory specification, the XDG variables will be used
    - When no other location is set a default location will be used based on the operating system

    By default, the default directories depend on the Operating System. The defaults are listed below:

    | Operating System | Cache Path                | Configuration Path             | Data Path               |
    |------------------|---------------------------|--------------------------------|-------------------------|
    | Linux            | $HOME/.cache/helm         | $HOME/.config/helm             | $HOME/.local/share/helm |
    | macOS            | $HOME/Library/Caches/helm | $HOME/Library/Preferences/helm | $HOME/Library/helm      |
    | Windows          | %TEMP%\helm               | %APPDATA%\helm                 | %APPDATA%\helm          |

    Usage:
      helm [command]

    Available Commands:
      completion  generate autocompletion scripts for the specified shell
      create      create a new chart with the given name
      dependency  manage a chart's dependencies
      env         helm client environment information
      get         download extended information of a named release
      help        Help about any command
      history     fetch release history
      install     install a chart
      lint        examine a chart for possible issues
      list        list releases
      package     package a chart directory into a chart archive
      plugin      install, list, or uninstall Helm plugins
      pull        download a chart from a repository and (optionally) unpack it in local directory
      push        push a chart to remote
      registry    login to or logout from a registry
      repo        add, list, remove, update, and index chart repositories
      rollback    roll back a release to a previous revision
      search      search for a keyword in charts
      show        show information of a chart
      status      display the status of the named release
      template    locally render templates
      test        run tests for a release
      uninstall   uninstall a release
      upgrade     upgrade a release
      verify      verify that a chart at the given path has been signed and is valid
      version     print the client version information

    Flags:
          --burst-limit int                 client-side default throttling limit (default 100)
          --debug                           enable verbose output
      -h, --help                            help for helm
          --kube-apiserver string           the address and the port for the Kubernetes API server
          --kube-as-group stringArray       group to impersonate for the operation, this flag can be repeated to specify multiple groups.
          --kube-as-user string             username to impersonate for the operation
          --kube-ca-file string             the certificate authority file for the Kubernetes API server connection
          --kube-context string             name of the kubeconfig context to use
          --kube-insecure-skip-tls-verify   if true, the Kubernetes API server's certificate will not be checked for validity. This will make your HTTPS connections insecure
          --kube-tls-server-name string     server name to use for Kubernetes API server certificate validation. If it is not provided, the hostname used to contact the server is used
          --kube-token string               bearer token used for authentication
          --kubeconfig string               path to the kubeconfig file
      -n, --namespace string                namespace scope for this request
          --qps float32                     queries per second used when communicating with the Kubernetes API, not including bursting
          --registry-config string          path to the registry config file (default "/root/.config/helm/registry/config.json")
          --repository-cache string         path to the directory containing cached repository indexes (default "/root/.cache/helm/repository")
          --repository-config string        path to the file containing repository names and URLs (default "/root/.config/helm/repositories.yaml")

    Use "helm [command] --help" for more information about a command.


    Example: Adding a Helm chart repository

    % docker run -it --rm  -v ~/.helm:/root/.helm -v ~/.config/helm:/root/.config/helm -v ~/.cache/helm:/root/.cache/helm alpine/helm repo add elastic https://helm.elastic.co
    "elastic" has been added to your repositories


    Example: Updating a Helm chart repository

    % docker run -it --rm  -v ~/.helm:/root/.helm -v ~/.config/helm:/root/.config/helm -v ~/.cache/helm:/root/.cache/helm alpine/helm repo update                             
    Hang tight while we grab the latest from your chart repositories...
    ...Successfully got an update from the "elastic" chart repository
    Update Complete. ⎈Happy Helming!⎈



    Example: View all configurable values in a chart

    % docker run -it --rm  -v ~/.helm:/root/.helm -v ~/.config/helm:/root/.config/helm -v ~/.cache/helm:/root/.cache/helm alpine/helm show values elastic/eck-operator
    # nameOverride is the short name for the deployment. Leave empty to let Helm generate a name using chart values.
    nameOverride: "elastic-operator"

    # fullnameOverride is the full name for the deployment. Leave empty to let Helm generate a name using chart values.
    fullnameOverride: "elastic-operator"

    # managedNamespaces is the set of namespaces that the operator manages. Leave empty to manage all namespaces.
    managedNamespaces: []

    # installCRDs determines whether Custom Resource Definitions (CRD) are installed by the chart.
    # Note that CRDs are global resources and require cluster admin privileges to install.
    # If you are sharing a cluster with other users who may want to install ECK on their own namespaces, setting this to true can have unintended consequences.
    # 1. Upgrades will overwrite the global CRDs and could disrupt the other users of ECK who may be running a different version.
    # 2. Uninstalling the chart will delete the CRDs and potentially cause Elastic resources deployed by other users to be removed as well.
    installCRDs: true

    # replicaCount is the number of operator pods to run.
    replicaCount: 1

    image:
      # repository is the container image prefixed by the registry name.
      repository: docker.elastic.co/eck/eck-operator
      # pullPolicy is the container image pull policy.
      pullPolicy: IfNotPresent
      # tag is the container image tag. If not defined, defaults to chart appVersion.
      tag: null
      # fips specifies whether the operator will use a FIPS compliant container image for its own StatefulSet image.
      # This setting does not apply to Elastic Stack applications images.
      # Can be combined with config.ubiOnly.
      fips: false

    # priorityClassName defines the PriorityClass to be used by the operator pods.
    priorityClassName: ""

    # imagePullSecrets defines the secrets to use when pulling the operator container image.
    imagePullSecrets: []

    # resources define the container resource limits for the operator.
    resources:
      limits:
        cpu: 1
        memory: 1Gi
      requests:
        cpu: 100m
        memory: 150Mi

    # statefulsetAnnotations define the annotations that should be added to the operator StatefulSet.
    statefulsetAnnotations: {}

    # statefulsetLabels define additional labels that should be added to the operator StatefulSet.
    statefulsetLabels: {}

    # podAnnotations define the annotations that should be added to the operator pod.
    podAnnotations: {}

    ## podLabels define additional labels that should be added to the operator pod.
    podLabels: {}

    # podSecurityContext defines the pod security context for the operator pod.
    podSecurityContext:
      runAsNonRoot: true

    # securityContext defines the security context of the operator container.
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      readOnlyRootFilesystem: true
      runAsNonRoot: true

    # nodeSelector defines the node selector for the operator pod.
    nodeSelector: {}

    # tolerations defines the node tolerations for the operator pod.
    tolerations: []

    # affinity defines the node affinity rules for the operator pod.
    affinity: {}

    # podDisruptionBudget configures the minimum or the maxium available pods for voluntary disruptions,
    # set to either an integer (e.g. 1) or a percentage value (e.g. 25%).
    podDisruptionBudget:
      enabled: false
      minAvailable: 1
      # maxUnavailable: 3

    # additional environment variables for the operator container.
    env: []

    # additional volume mounts for the operator container.
    volumeMounts: []

    # additional volumes to add to the operator pod.
    volumes: []

    # createClusterScopedResources determines whether cluster-scoped resources (ClusterRoles, ClusterRoleBindings) should be created.
    createClusterScopedResources: true

    # Automount API credentials for the Service Account into the pod.
    automountServiceAccountToken: true

    serviceAccount:
      # create specifies whether a service account should be created for the operator.
      create: true
      # Specifies whether a service account should automount API credentials.
      automountServiceAccountToken: true
      # annotations to add to the service account
      annotations: {}
      # name of the service account to use. If not set and create is true, a name is generated using the fullname template.
      name: ""

    tracing:
      # enabled specifies whether APM tracing is enabled for the operator.
      enabled: false
      # config is a map of APM Server configuration variables that should be set in the environment.
      config:
        ELASTIC_APM_SERVER_URL: http://localhost:8200
        ELASTIC_APM_SERVER_TIMEOUT: 30s

    refs:
      # enforceRBAC specifies whether RBAC should be enforced for cross-namespace associations between resources.
      enforceRBAC: false

    webhook:
      # enabled determines whether the webhook is installed.
      enabled: true
      # caBundle is the PEM-encoded CA trust bundle for the webhook certificate. Only required if manageCerts is false and certManagerCert is null.
      caBundle: Cg==
      # certManagerCert is the name of the cert-manager certificate to use with the webhook.
      certManagerCert: null
      # certsDir is the directory to mount the certificates.
      certsDir: "/tmp/k8s-webhook-server/serving-certs"
      # failurePolicy of the webhook.
      failurePolicy: Ignore
      # manageCerts determines whether the operator manages the webhook certificates automatically.
      manageCerts: true
      # namespaceSelector corresponds to the namespaceSelector property of the webhook.
      # Setting this restricts the webhook to act only on objects submitted to namespaces that match the selector.
      namespaceSelector: {}
      # objectSelector corresponds to the objectSelector property of the webhook.
      # Setting this restricts the webhook to act only on objects that match the selector.
      objectSelector: {}
      # port is the port that the validating webhook binds to.
      port: 9443
      # secret specifies the Kubernetes secret to be mounted into the path designated by the certsDir value to be used for webhook certificates.
      certsSecret: ""

    # hostNetwork allows a Pod to use the Node network namespace.
    # This is required to allow for communication with the kube API when using some alternate CNIs in conjunction with webhook enabled.
    # CAUTION: Proceed at your own risk. This setting has security concerns such as allowing malicious users to access workloads running on the host.
    hostNetwork: false

    softMultiTenancy:
      # enabled determines whether the operator is installed with soft multi-tenancy extensions.
      # This requires network policies to be enabled on the Kubernetes cluster.
      enabled: false

    # kubeAPIServerIP is required when softMultiTenancy is enabled.
    kubeAPIServerIP: null

    telemetry:
      # disabled determines whether the operator periodically updates ECK telemetry data for Kibana to consume.
      disabled: false
      # distributionChannel denotes which distribution channel was used to install the operator.
      distributionChannel: "helm"

    # config values for the operator.
    config:
      # logVerbosity defines the logging level. Valid values are as follows:
      # -2: Errors only
      # -1: Errors and warnings
      #  0: Errors, warnings, and information
      #  number greater than 0: Errors, warnings, information, and debug details.
      logVerbosity: "0"

      # (Deprecated: use metrics.port: will be removed in v2.14.0) metricsPort defines the port to expose operator metrics. Set to 0 to disable metrics reporting.
      metricsPort: 0

      metrics:
        # port defines the port to expose operator metrics. Set to 0 to disable metrics reporting.
        port: "0"
        # secureMode contains the options for enabling and configuring RBAC and TLS/HTTPs for the metrics endpoint.
        secureMode:
          # secureMode.enabled specifies whether to enable RBAC and TLS/HTTPs for the metrics endpoint.
          # * This option makes most sense when using a ServiceMonitor to scrape the metrics and is therefore mutually exclusive with the podMonitor.enabled option.
          # * This option also requires using cluster scoped resources (ClusterRole, ClusterRoleBinding) to
          #   grant access to the /metrics endpoint. (createClusterScopedResources: true is required)
          #
          enabled: false
          tls:
            # certificateSecret is the name of the tls secret containing the custom TLS certificate and key for the secure metrics endpoint.
            #
            # * This is an optional setting and is only required if you are using a custom TLS certificate. A self-signed certificate will be generated by default.
            # * TLS secret key must be named tls.crt.
            # * TLS key's secret key must be named tls.key.
            # * It is assumed to be in the same namespace as the ServiceMonitor.
            #
            # example: kubectl create secret tls eck-metrics-tls-certificate -n elastic-system \
            #            --cert=/path/to/tls.crt --key=/path/to/tls.key
            certificateSecret: ""

      # containerRegistry to use for pulling Elasticsearch and other application container images.
      containerRegistry: docker.elastic.co

      # containerRepository to use for pulling Elasticsearch and other application container images.
      # containerRepository: ""

      # containerSuffix suffix to be appended to container images by default. Cannot be combined with -ubiOnly flag
      # containerSuffix: ""

      # maxConcurrentReconciles is the number of concurrent reconciliation operations to perform per controller.
      maxConcurrentReconciles: "3"

      # caValidity defines the validity period of the CA certificates generated by the operator.
      caValidity: 8760h

      # caRotateBefore defines when to rotate a CA certificate that is due to expire.
      caRotateBefore: 24h

      # caDir defines the directory containing a CA certificate (tls.crt) and its associated private key (tls.key) to be used for all managed resources.
      # Setting this makes caRotateBefore and caValidity values ineffective.
      caDir: ""

      # certificatesValidity defines the validity period of certificates generated by the operator.
      certificatesValidity: 8760h

      # certificatesRotateBefore defines when to rotate a certificate that is due to expire.
      certificatesRotateBefore: 24h

      # disableConfigWatch specifies whether the operator watches the configuration file for changes.
      disableConfigWatch: false

      # exposedNodeLabels is an array of regular expressions of node labels which are allowed to be copied as annotations on Elasticsearch Pods.
      exposedNodeLabels: [ "topology.kubernetes.io/.*", "failure-domain.beta.kubernetes.io/.*" ]

      # ipFamily specifies the IP family to use. Possible values: IPv4, IPv6 and "" (auto-detect)
      ipFamily: ""

      # setDefaultSecurityContext determines whether a default security context is set on application containers created by the operator.
      # *note* that the default option now is "auto-detect" to attempt to set this properly automatically when both running
      # in an openshift cluster, and a standard kubernetes cluster.  Valid values are as follows:
      # "auto-detect" : auto detect
      # "true"        : set pod security context when creating resources.
      # "false"       : do not set pod security context when creating resources.
      setDefaultSecurityContext: "auto-detect"

      # kubeClientTimeout sets the request timeout for Kubernetes API calls made by the operator.
      kubeClientTimeout: 60s

      # elasticsearchClientTimeout sets the request timeout for Elasticsearch API calls made by the operator.
      elasticsearchClientTimeout: 180s

      # validateStorageClass specifies whether storage classes volume expansion support should be verified.
      # Can be disabled if cluster-wide storage class RBAC access is not available.
      validateStorageClass: true

      # enableLeaderElection specifies whether leader election should be enabled
      enableLeaderElection: true

      # Interval between observations of Elasticsearch health, non-positive values disable asynchronous observation.
      elasticsearchObservationInterval: 10s

      # ubiOnly specifies whether the operator will use only UBI container images to deploy Elastic Stack applications as well as for its own StatefulSet image. UBI images are only available from 7.10.0 onward.
      # Cannot be combined with the containerSuffix value.
      ubiOnly: false

    # Prometheus PodMonitor configuration
    # Reference: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#podmonitor
    podMonitor:

      # enabled determines whether a podMonitor should deployed to scrape the eck metrics.
      # This requires the prometheus operator and the config.metrics.port not to be 0
      enabled: false

      # labels adds additional labels to the podMonitor
      labels: {}

      # annotations adds additional annotations to the podMonitor
      annotations: {}

      # namespace determines in which namespace the podMonitor will be deployed.
      # If not set the podMonitor will be created in the namespace where the Helm release is installed into
      # namespace: monitoring

      # interval specifies the interval at which metrics should be scraped
      interval: 5m

      # scrapeTimeout specifies the timeout after which the scrape is ended
      scrapeTimeout: 30s

      # podTargetLabels transfers labels on the Kubernetes Pod onto the target.
      podTargetLabels: []

      # podMetricsEndpointConfig allows to add an extended configuration to the podMonitor
      podMetricsEndpointConfig: {}
      # honorTimestamps: true

    # Prometheus ServiceMonitor configuration
    # Only used when config.enableSecureMetrics is true
    # Reference: https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#servicemonitor
    serviceMonitor:
      # This option requires the following settings within Prometheus to function:
      # 1. RBAC settings for the Prometheus instance to access the metrics endpoint.
      #
      # - nonResourceURLs:
      #   - /metrics
      #   verbs:
      #   - get
      #
      # 2. If using the Prometheus Operator and your Prometheus instance is not in the same namespace as the operator you will need
      #    the Prometheus Operator configured with the following Helm values:
      #
      #   prometheus:
      #     prometheusSpec:
      #       serviceMonitorNamespaceSelector: {}
      #       serviceMonitorSelectorNilUsesHelmValues: false
      #
      # allows to disable the serviceMonitor, enabled by default for backwards compatibility
      enabled: true
      # namespace determines in which namespace the serviceMonitor will be deployed.
      # If not set the serviceMonitor will be created in the namespace where the Helm release is installed into
      # namespace: monitoring
      # caSecret is the name of the secret containing the custom CA certificate used to generate the custom TLS certificate for the secure metrics endpoint.
      #
      # * This *must* be the name of the secret containing the CA certificate used to sign the custom TLS certificate for the metrics endpoint.
      # * This secret *must* be in the same namespace as the Prometheus instance that will scrape the metrics.
      # * If using the Prometheus operator this secret must be within the `spec.secrets` field of the `Prometheus` custom resource such that it is mounted into the Prometheus pod at `caMountDirectory`, which defaults to /etc/prometheus/secrets/{secret-name}.
      # * This is an optional setting and is only required if you are using a custom TLS certificate.
      # * Key must be named ca.crt.
      #
      # example: kubectl create secret generic eck-metrics-tls-ca -n monitoring \
      #            --from-file=ca.crt=/path/to/ca.pem
      caSecret: ""
      # caMountDirectory is the directory at which the CA certificate is mounted within the Prometheus pod.
      #
      # * You should only need to adjust this if you are *not* using the Prometheus operator.
      caMountDirectory: "/etc/prometheus/secrets/"
      # insecureSkipVerify specifies whether to skip verification of the TLS certificate for the secure metrics endpoint.
      #
      # * If this setting is set to false, then the following settings are required:
      #   - certificateSecret
      #   - caSecret
      insecureSkipVerify: true

    # Globals meant for internal use only
    global:
      # manifestGen specifies whether the chart is running under manifest generator.
      # This is used for tasks specific to generating the all-in-one.yaml file.
      manifestGen: false
      # createOperatorNamespace defines whether the operator namespace manifest should be generated when in manifestGen mode.
      # Usually we do want that to happen (e.g. all-in-one.yaml) but, sometimes we don't (e.g. E2E tests).
      createOperatorNamespace: true
      # kubeVersion is the effective Kubernetes version we target when generating the all-in-one.yaml.
      kubeVersion: 1.21.0
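

    Example: Installing a chart

    To actually install a chart into a cluster, the container also needs access to the kubeconfig, so an extra volume mount is required (the release name, namespace, and the ~/.kube path are assumptions based on a default kubectl setup):

    % docker run -it --rm -v ~/.kube:/root/.kube -v ~/.helm:/root/.helm -v ~/.config/helm:/root/.config/helm -v ~/.cache/helm:/root/.cache/helm alpine/helm install elastic-operator elastic/eck-operator -n elastic-system --create-namespace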