Tuesday 30 April 2024

VS Code Extension for YAML & Kubernetes

In VS Code's Extensions Marketplace, search for YAML and install YAML by Red Hat. 

The following steps show how to enable support for Kubernetes.

After installation, on the extension's tab page, click the cog icon and choose Extension Settings.

In the settings, find Yaml: Schemas and click Edit in settings.json.

Then, in that file, add the following:

    "yaml.schemas": {
        "kubernetes": "*.yaml"
    }

After this, open a YAML file and type apiVersion (the first required property of a Kubernetes YAML definition file); auto-completion and auto-indentation for Kubernetes YAML files will kick in.

If the YAML formatting is wrong, or a Kubernetes YAML file does not match the Kubernetes YAML schema, the extension shows an error.

Monday 29 April 2024

YAML Ain't Markup Language (YAML)

YAML is a file format used to represent data, like other data serialization formats such as XML or JSON.

This table shows the same data, represented in all three formats:

Files that use the YAML format can have the extension .yaml or .yml.

Key-Value Pair

Data in its simplest form is a key-value pair, and that's how it's defined in YAML: a key and a value separated by a colon (the colon must be followed by a space).

name: Server1

The key above is name
The value is Server1.

A YAML file is basically a collection of one or more key-value pairs where the key is a string (usually written without quotation marks) and the value is a string or a more complex data structure such as a list, a dictionary or a list of dictionaries.


Array (List)

The array name is followed by a colon, and each item then goes on its own line with a dash in front:

Servers:
- Server1
- Server2
- ServerN

The dash indicates that it's an element of an array.

In the example above we actually have a key-value pair where the value is a list. As a YAML document is a collection of key-value pairs, we can have a file like this:

Servers:
- Server1
- Server2

DataCentres:
- London
- Frankfurt

This style of writing sequences is called block style. Sequences can also be written in flow style, where elements are separated by commas within square brackets:

Servers: [Server1, Server2]
DataCentres: [London, Frankfurt]

Dictionary (Map)

A dictionary is a set of properties grouped together under an item.

Technically, the example below is a key-value pair where the key is the name of the dictionary and the value is the dictionary itself:

Server1:
    name: Server1
    owner: John
    created: 123456
    status: active

Notice the blank space before each property. There must be an equal number of blank spaces before the properties of a single item so they are all aligned together, meaning that they are all siblings of their parent, which is key Server1.

The number of indentation spaces before these properties doesn't matter, but it must be the same for all of them, as they are siblings.

A YAML file uses spaces for indentation. You can use 2 or 4 spaces, but not tabs: tab indentation is forbidden. [php - A YAML file cannot contain tabs as indentation - Stack Overflow]

Why does YAML forbid tabs?

Tabs have been outlawed since they are treated differently by different editors and tools. And since indentation is so critical to proper interpretation of YAML, this issue is just too tricky to even attempt. Indeed Guido van Rossum of Python has acknowledged that allowing TABs in Python source is a headache for many people and that were he to design Python again, he would forbid them. [YAML Ain't Markup Language]

Notice the number of spaces before each property: it indicates that these key-value pairs fall within Server1.

Let's suppose that we have:

Server1:
    name: Server1
    owner: John
       created: 123456
    status: active

In this case created has more spaces on the left than owner, so it is now a child of the owner property instead of being its sibling, which is incorrect.

Also, these properties must have more spaces than their parent, which is Server1.

What if we had extra spaces for created and status?

Server1:
    name: Server1
    owner: John
       created: 123456
       status: active

Then they will fall under owner and thus become properties of owner. This results in a syntax error stating that mapping values are not allowed here, because owner already has a value set, which is John.

For the value of a key-value pair we can set either a direct value or a hash map. We cannot have both. So the number of spaces before each property is key in YAML.

Complex Data Types

A list containing dictionaries 

We have here a key-value pair where the value is a list of key-value pairs, each of which has a dictionary as its value (we can say that we have a list of dictionaries where each dictionary has a name):

Servers:
- Server1:
    name: Server1
    owner: John
    created: 123456
    status: active
- Server2:
    name: Server2
    owner: Jack
    created: 789012
    status: shutdown

We have here a list of servers and the elements of the list are key-value pairs Server1 and Server2.
Their values are dictionaries containing server information.

We can also have a list of (unnamed) dictionaries, where each element of the list is not a key-value pair but a dictionary itself:

  - name: Server1
    owner: John
    created: 123456
    status: active
  - name: Server2
    owner: Jack
    created: 789012
    status: shutdown

A list containing dictionaries containing list

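- Server1:
    name: Server1
    owner: John
    created: 123456
    status: active
    services:        # key name assumed; the original key line is missing
      - web server
      - authentication database
- Server2:
    name: Server2
    owner: Jack
    created: 789012
    status: shutdown
    services:        # key name assumed; the original key line is missing
      - caching database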

When to use a list, dictionary and list of dictionaries?

Use a dictionary if you need to represent information about, or multiple properties of, a single object.

A dictionary is a collection of key-value pairs grouped together:

name: Server1
owner: John
created: 123456
status: active

In case we need to split the owner further into name and surname, we could represent this as a dictionary within another dictionary:

name: Server1
owner:
   name: John
   surname: Smith
created: 123456
status: active

In this case the single value of owner is replaced by a small dictionary with two properties, name and surname. So this is a dictionary within another dictionary.

Use a list/array to represent multiple items of the same type of object.  
E.g. that type could be a string.

We have here a key-value pair where the value is a list of strings:

Servers:
- Server1
- Server2

What if we would like to store all information about each server? We'll expand each item in the array and replace the name with the dictionary. This way we are able to represent all information about multiple servers in a single YAML file using a list of dictionaries.

We have here a key-value pair where the value is a list of dictionaries:

Servers:
- name: Server1
  owner: John
  created: 123456
  status: active
- name: Server2
  owner: Jack
  created: 789012
  status: shutdown

When does the order of items matter?

A dictionary is an unordered collection.
Lists/arrays are ordered collections.

The dictionary:


name: Server1
owner: John

is the same as:

owner: John
name: Server1

But the list:

- Server1
- Server2

is not the same as the list:

- Server2
- Server1

Comments in YAML

Any line beginning with a hash (#) is treated as a comment and ignored by the parser.

# List of servers
- Server1
- Server2

How to break long string values into multiple lines?

Use one of the two block scalar indicators:

  • vertical bar (|), the pipe character, which preserves line breaks (literal style)
  • greater-than sign (>), which folds each single line break into a space (folded style)
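
As a minimal sketch of the two indicators, the fragment below stores the same two lines of text under two keys. The key names description_literal and description_folded are made up for illustration.

```yaml
# Literal style (|): line breaks are kept exactly as written.
description_literal: |
  Server1 hosts the web front end.
  It is located in London.

# Folded style (>): single line breaks are folded into spaces.
description_folded: >
  Server1 hosts the web front end.
  It is located in London.
```

With the literal style the value is two lines of text; with the folded style it is the single line "Server1 hosts the web front end. It is located in London.". In both cases the value ends with one trailing newline by default.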


Sunday 28 April 2024

Introduction to Kubernetes

These are custom notes that extend my notes from a Udemy course "Kubernetes for the Absolute Beginners - Hands-on". All course content rights belong to course creators. 


Web applications are nowadays developed with containerisation in mind, as containers contain everything that is needed to run the application: code, runtime, databases, system libraries, etc.

Kubernetes (k8s) is:
  • Platform for managing application containers (containerised applications, container-oriented applications, containerized workloads and services) across multiple hosts (one or more host clusters)
  • One of the most widely recommended ways to manage containers in production
  • It makes it easy to orchestrate many containers on many hosts, scale them as microservices, and perform deployments, rollouts and rollbacks
  • a set of APIs that we can use to deploy containers on a set of nodes called a cluster. 
  • We can describe a set of applications and how they should interact with each other, and Kubernetes determines how to make that happen
  • Workload scheduler with focus on containerized applications.
  • Container orchestration technology -  system for automating the operations and management of application containers in complex, multi-container workloads:
    • Container creation
    • Container deployment
    • Rolling deployment *)
    • Auto-scaling
    • Load balancing
    • Container health monitoring
    • Compute resource management
    • Volume management
    • Persistent storage
    • Networking
    • High availability by cluster federation
  • Open-source
  • Originally designed by Google, based upon their running of containers in production. Now maintained by the Cloud Native Computing Foundation
  • Supports hosting enhanced and complex applications on various kinds of architectures; it is designed to run anywhere:
    • on bare metal
    • in our data center
    • on the public cloud - supported on any cloud platform; it is platform-agnostic and integrates with a number of different cloud providers, allowing us to pick the platform that best suits our needs
    • on the hybrid cloud
  • Two steps are involved in scheduling containers on a Kubernetes cluster:
    • Provisioning the Kubernetes cluster, with all its components, somewhere
    • Defining the Kubernetes resources, such as Deployments, Services, etc.
  • With Kubernetes:
    • we can decide when our containers should run
    • increase, or decrease the size of application containers
    • check the resource consumption of our application deployments
  • To save time and effort when scaling applications and workloads, Kubernetes can be bootstrapped using:
    • Amazon Elastic Kubernetes Service (EKS)
    • Google Kubernetes Engine (GKE)

*) Rolling deployment:
  • A deployment strategy that slowly replaces previous versions of an application with new versions by completely replacing the infrastructure on which the application is running.
  • It is renowned for its ability to update applications without downtime: incrementally updating nodes or replicas ensures that the service remains available to users throughout the deployment process.
  • Rolling deployments use the concept of a window size—this is the number of servers that are updated at any given time. For example, if a Kubernetes cluster is running 10 instances of an application (10 pods), and you want to update two of them at a time, you can perform a rolling deployment with a window size of 2.

To revise our knowledge on containers, let's read Introduction to Containers.

A pod is a wrapper around one or more containers, and it is what Kubernetes uses to deploy containers on nodes. A pod is the smallest unit in Kubernetes that you can create or deploy. It represents a running process on your cluster as either a component of your application or an entire app. Generally, you only have one container per pod, but if you have multiple containers with a hard dependency, you can package them into a single pod and share networking and storage resources between them. The pod provides a unique network IP and set of ports for your containers, and configurable options that govern how your containers should run. 
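
To make the idea of a pod concrete, here is a minimal sketch of a single-container pod definition. The pod name, label and image (web-pod, app: web, nginx:1.25) are assumptions chosen for illustration and do not come from the course.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod            # hypothetical pod name
  labels:
    app: web               # hypothetical label, used later to select this pod
spec:
  containers:
    - name: web
      image: nginx:1.25    # assumed image; any container image works here
      ports:
        - containerPort: 80
```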

One way to run a container in a pod in Kubernetes is to use the kubectl run command. In current kubectl versions it starts a single pod with the container running inside; in older versions (before 1.18) it started a deployment by default. 

A deployment represents a group of replicas of the same pod and keeps your pods running even when the nodes they run on fail. A deployment could represent a component of an application or even an entire app. 

To see a list of the running pods in your project, run the command kubectl get pods.

When a deployment is exposed (for example with the kubectl expose command and a service of type LoadBalancer), Kubernetes creates a service with a fixed IP address for our pods, and a controller attaches an external load balancer with a public IP address to that service so that clients outside the cluster can access it. 

In GKE, the load balancer is created as a network load balancer. Any client that reaches that IP address will be routed to a pod behind the service. 

A service is an abstraction which defines a logical set of pods and a policy by which to access them. As deployments create and destroy pods, pods will be assigned their own IP addresses, but those addresses don't remain stable over time. 

A service groups a set of pods and provides a stable endpoint (a fixed IP address) for them. For example, if you create two sets of pods called frontend and backend, and put them behind their own services, the backend pods might change, but frontend pods are not aware of this. They simply refer to the backend service. 
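
A service like the one described above could be declared roughly as follows. This is a sketch under assumptions: the service name frontend, the app: frontend label and the port numbers are illustrative only.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend           # hypothetical service name
spec:
  type: LoadBalancer       # asks the cloud provider for an external load balancer
  selector:
    app: frontend          # traffic goes to pods carrying this (assumed) label
  ports:
    - port: 80             # port exposed by the service
      targetPort: 8080     # assumed port the containers listen on
```

Pods come and go, but clients keep using the service's address; the selector decides which pods receive the traffic at any moment.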

To scale a deployment, run the kubectl scale command. For example, after scaling to three replicas, three pods run in your deployment; they're placed behind the service and share one fixed IP address. You could also use autoscaling with other kinds of parameters. For example, you can specify that the number of pods should increase when CPU utilization reaches a certain limit. 
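
CPU-based autoscaling of this kind can be sketched with a HorizontalPodAutoscaler object. The names and numbers below (web-hpa, a Deployment called web, 3 to 10 replicas, 80% CPU) are assumptions for illustration.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                  # hypothetical name
spec:
  scaleTargetRef:                # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: web                    # assumed Deployment name
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80 # add pods when average CPU use exceeds 80%
```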

So far, we've seen how to run imperative commands like expose and scale. This works well to learn and test Kubernetes step by step. But the real strength of Kubernetes comes when you work in a declarative way. 

Instead of issuing commands, you provide a configuration file that tells Kubernetes what you want your desired state to look like, and Kubernetes determines how to do it. You accomplish this by using a deployment config file. You can check your deployment to make sure the proper number of replicas is running, by using either kubectl get deployments or kubectl describe deployments. To run five replicas instead of three, all you do is update the deployment config file and run the kubectl apply command to use the updated config file. 
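
A deployment config file of the kind described above might look like the following sketch. The name web, the app: web label and the nginx:1.25 image are assumptions for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # hypothetical deployment name
spec:
  replicas: 3                # desired state: three pods
  selector:
    matchLabels:
      app: web               # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25  # assumed image
```

To go from three replicas to five, change replicas to 5 and re-apply the file, for example with kubectl apply -f deployment.yaml (the file name is an assumption).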

You can still reach your endpoint as before by using kubectl get services to get the external IP of the service and reach the public IP address from a client. 

The last question is: what happens when you want to roll out a new version of your app? You want to update your containers to get new code in front of users, but rolling out all those changes at one time would be risky. In this case, you would use kubectl rollout, or change your deployment configuration file and then apply the change using kubectl apply. New pods are then created according to your update strategy. For example, a strategy can create new-version pods one at a time and wait for each new pod to become available before destroying one of the old pods.
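
One way to express such a one-at-a-time update is the rolling update strategy of a Deployment. This is a sketch, not the course's original example: the name, label and image tag are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # hypothetical deployment name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # create at most one extra (new-version) pod at a time
      maxUnavailable: 0      # never remove an old pod before a new one is available
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.26  # assumed new image version being rolled out
```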

Container Orchestration

Applications run in their own containers. 

What if one application depends on another e.g. web server, running in one container, depends on the DB running in another container? 
What if the number of users increases and we need to scale out our application?
How to scale down when the load decreases?
How to build services across multiple machines without dealing with cumbersome network and storage settings? 
How to manage and roll out our microservices with different release cycles?

We should have an underlying platform that takes care of these dependencies and scaling. This process of deploying and managing containers is called container orchestration.

Container orchestration technologies:
  • Docker Swarm
    • easy to set up
    • lacks advanced features
  • Kubernetes (Google)
    • most popular
    • difficult to set up
    • has lots of options to support deployments of complex architecture setups
    • supported on all main public cloud service providers like GCP, Azure, AWS
  • Mesos (Apache)
    • difficult to set up
    • has advanced features

Kubernetes advantages:
  • Used to deploy and manage hundreds or thousands of containers in a clustered environment
  • Kubernetes is designed for high availability (HA). We have multiple instances of our application running on different nodes, so hardware failures on some nodes won't impact availability. We are also able to create multiple master nodes to prevent a single point of failure.
  • Traffic is load balanced across multiple containers.
  • Scaling is done by increasing the number of containers running on a single host, but also by increasing the number of hosts (hardware scaling) if processing demands reach maximum thresholds on existing nodes.
  • The lifetime of containers might be short: they may be killed or stopped at any time when they exceed their resource limits. To ensure that a certain number of containers is always serving, a ReplicationController or ReplicaSet in Kubernetes ensures that a specified number of pods (groups of containers) is up.
  • Kubernetes also supports liveness probes to help you define your application's health.
  • For better resource management, we can also define the maximum capacity on Kubernetes nodes and the resource limit for each group of containers (a.k.a pod). Kubernetes scheduler will then select a node that fulfills the resource criteria to run the containers. 
  • Kubernetes provides an optional horizontal pod auto-scaling feature. With this feature, we could scale a pod horizontally by resource or custom metrics.
  • Perfect match for microservices, where it helps with their CD (Continuous Delivery). We can create a Deployment to roll out, roll over, or roll back selected containers. 
  • Containers are considered as ephemeral - they can quickly and/or often die. We can mount the volume into a container to preserve the data in a single host world. In the cluster world, a container might be scheduled to run on any host. Kubernetes Volumes and Persistent Volumes make the volume mounting work as permanent storage seamlessly.
  • This is all achieved with the set of declarative object configuration files.
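
Two of the points above, liveness probes and per-container resource limits, can be sketched in one declarative pod definition. All names and values below are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod                # hypothetical pod name
spec:
  containers:
    - name: web
      image: nginx:1.25        # assumed image
      resources:
        requests:              # what the scheduler reserves on a node
          cpu: 250m
          memory: 128Mi
        limits:                # hard cap for the container
          cpu: 500m
          memory: 256Mi
      livenessProbe:           # the container is restarted if this check keeps failing
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
```

The scheduler uses the requests to pick a node with enough free capacity, while the limits bound what the container may consume once it runs.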

Kubernetes Architecture

Kubernetes system is divided into:
  • a set of primary components that run as the control plane
  • a set of nodes that run containers. In Kubernetes, a node represents a computing instance, like a machine. Note that this is different from a node on Google Cloud, which is a virtual machine running in Compute Engine

  • Node (worker node, minion)
    • machine, physical or virtual, on which Kubernetes is installed
    • worker machine on which containers will be launched by Kubernetes; workers run containers
    • if node fails, our application will go down => we need to have more nodes
  • Cluster
    • Set of nodes grouped together
    • Even if one node fails, application is still accessible from other nodes
    • Having multiple nodes also helps sharing the load
    • Kubernetes cluster consists of two types of nodes, master nodes and worker nodes. 
  • Master (master node)
    • responsible for managing the cluster
    • controls and schedules all activities in the cluster
    • stores the information about all members of the cluster
    • monitors nodes
    • when node fails, moves workload of the failed node to other worker nodes
    • Master is a node with Kubernetes installed on it and is configured as a master node
    • Master watches over the nodes in the cluster and is responsible for orchestration of containers on the worker nodes
    • Master nodes host the K8s control plane components. The master node will hold configuration and state data used to maintain the desired state.

When we install Kubernetes on the host, we install multiple components on it.

There are two types of nodes/servers: master and worker. And there is a set of components that make up Kubernetes. How are these components distributed across different types of servers? How does one server become a master and the other a worker? 

master (controller) server (node):
  • API Server (kube-apiserver | Kubernetes)
    • this is what makes node a master
    • provides REST API and acts as the front end of Kubernetes cluster
    • users, management devices, command line interfaces talk to it in order to interact with Kubernetes cluster 
  • etcd service
    • all the information gathered is stored in a key-value store based on the popular etcd framework
    • the name combines the Unix /etc directory (where configuration is kept) with "d" for distributed
    • distributed, reliable key-value store used by Kubernetes to store all data used to manage the cluster
    • when we have multiple nodes and multiple masters in the cluster, etcd stores all that information on all the nodes in the cluster in the distributed manner
    • responsible for implementing locks within the cluster to ensure there are no conflicts between the masters
  • Controller Manager (kube-controller-manager | Kubernetes)
    • brain behind the orchestration
    • responsible for noticing and responding when nodes, containers or endpoints go down
    • make decisions to bring up new containers in such cases
  • Scheduler (kube-scheduler | Kubernetes)
    • responsible for distributing work of containers across multiple nodes
    • it looks for newly created containers and assigns them to nodes
Master components
(credit: DevOps with Kubernetes by Hideto Saito, Hui-Chuan Chloe Lee and Cheng-Yang Wu)

(I/F = Interface)

This article describes well the control plane of the master node:

API Server and its clients
(image credit: Rini Thomas; source: https://medium.com/@rinithomas/the-kubernetes-api-server-430a39aec2d7)

All communications and operations between the control plane components and external clients, such as kubectl (see Introduction to kubectl), are translated into RESTful API calls that are handled by the API server. 
Effectively, the API server is a RESTful web application that processes RESTful API calls over HTTP to store and update API objects in the etcd datastore.   
Control Plane on the master/controller node(s) consists of the API server, controller manager, and scheduler.  
API server is the central management entity and the only component that talks directly with the distributed storage component etcd. 
API server has the following core responsibilities:
  • To serve the Kubernetes API. This API is used:
    • cluster-internally by the:
      • master components 
      • worker nodes
      • our Kubernetes-native apps
    • externally by clients such as kubectl
  • To proxy cluster components, such as the Kubernetes dashboard, or to stream logs, service ports, or serve kubectl exec sessions.  
Serving the API means:
  • Reading state: getting single objects, listing them, and streaming changes
  • Manipulating state: creating, updating, and deleting objects.  
kubectl command is translated into an HTTP API request in JSON format and is sent to the API server. Then, the API server returns a response to the client, along with any requested information.  
API server is stateless (that is, its behavior will be consistent regardless of the state of the cluster) and is designed to scale horizontally. Usually, for the high availability of clusters, it is recommended to have at least three instances to handle the load and fault tolerance better.  


API Server internal processes 
(image credit: Rini Thomas; source: https://medium.com/@rinithomas/the-kubernetes-api-server-430a39aec2d7)


worker node (minion):
  • is where the containers are hosted e.g. Docker containers. 
  • kubelet service (agent) [kubelet | Kubernetes]
    • the agent that runs on each node in the cluster
    • interacts with a master to provide health information of the worker node and carry out actions requested by the master on the worker nodes
    • makes sure that containers are running as expected
  • Container Runtime 
    • underlying software required for running containers on a system
    • Container Runtime can be Docker, rkt or CRI-O
    • in our case it's Docker but there are other options as well 

Node components 
(credit: DevOps with Kubernetes by Hideto Saito, Hui-Chuan Chloe Lee and Cheng-Yang Wu)

kube-proxy (kube-proxy | Kubernetes) runs on both master and worker nodes.

Understanding what components constitute the master and worker nodes will help us install and configure the right components on different systems when we set up our infrastructure. 


Quiz:
  • What is a worker machine in Kubernetes known as? (A node.)
  • True or false: a node in Kubernetes can only be a physical machine and can never be a virtual machine. (False: a node can be physical or virtual.)
  • Multiple nodes together form what? (A cluster.)
  • Which process runs on the Kubernetes master node? (For example, kube-apiserver.)
  • Which component is a distributed, reliable key-value store used by Kubernetes to store all data used to manage the cluster? (etcd.)
  • Which service is responsible for distributing work or containers across multiple nodes? (The scheduler.)
  • Which is the underlying framework responsible for running applications in containers, like Docker? (The container runtime.)
  • Which command-line utility is used to manage a Kubernetes cluster? (kubectl.)

Why do we need Kubernetes?

  • Let's for a moment keep Kubernetes out of our discussion and talk about simple Docker containers. Let's assume we were developing a process or a script to deploy our application on a Docker host. Then we would first simply deploy our application using a simple docker run command, and the application runs fine and our users are able to access it:
    • docker run python-app
  • When the load increases, we deploy more instances of our application by running the docker run command many more times (options such as --name must come before the image name): 
    • docker run --name app1 python-app
    • docker run --name app2 python-app
    • docker run --name app3 python-app
    • docker run --name app4 python-app
  • Sometime in the future our application is further developed, undergoes architectural changes, grows and gets complex. We now have a new helper container that helps our web application by processing or fetching data from elsewhere (NOTE: --link is a legacy option for docker run; it is recommended to use user-defined networks to facilitate communication between two containers instead of using --link; see Legacy container links | Docker Docs):
    • docker run --link app1 helper
    • docker run --link app2 helper
    • docker run --link app3 helper
    • docker run --link app4 helper
  • These helper containers maintain a 1 to 1 relationship with our application containers and thus need to communicate with the application containers directly and access data from those containers. For this, we need to (manually):
    • maintain a map of what app and helper containers are connected to each other
    • establish network connectivity between these containers ourselves using links and custom networks
    • create shareable volumes and share it among the containers. We would need to maintain a map of that as well. 
    • monitor the state of the application container
      • When it dies, manually kill the helper container as well as it's no longer required.
      • When a new container is deployed, we would need to deploy a new helper container as well.
  • With pods, Kubernetes does all of this for us automatically. We just need to define which containers a pod consists of; the containers in a pod by default have access to the same storage and the same network namespace, and share the same fate, in that they are created together and destroyed together.
  • Even if our application isn't so complex and we could live with a single container, Kubernetes still requires us to create pods. This is good in the long run, as our application is then equipped for architectural changes and scaling in the future.
  • However, multi-container pods are a rare use case. A single container per pod is the most common use case.
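
The application-plus-helper pairing discussed above could be sketched as a single two-container pod. The image names python-app and helper follow the docker run examples; the pod name, volume name and mount path are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-helper        # hypothetical pod name
spec:
  volumes:
    - name: shared-data        # scratch volume that lives as long as the pod
      emptyDir: {}
  containers:
    - name: app
      image: python-app
      volumeMounts:
        - name: shared-data
          mountPath: /data     # assumed mount path
    - name: helper
      image: helper
      volumeMounts:
        - name: shared-data
          mountPath: /data     # same volume, so both containers see the same files
```

Both containers also share the pod's network namespace, so they can reach each other on localhost, and they are created and removed together.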
To learn more about pods, please read the next article in this series: Kubernetes Pods | My Public Notepad

Kubernetes Cloud-native Application Architecture

A typical cloud-native application architecture consists of three tiers:
  • frontend e.g. Nginx
  • backend e.g. Wordpress
  • persistence (database) e.g. MariaDB

We define and create multiple resources for each of these tiers in Kubernetes:

source: Get Started with Bitnami Charts using Minikube


Wednesday 24 April 2024

Installing Software on Linux


How to install software available in Package Repository?

Installing from Ubuntu Package Repository

Example: Installing VLC player

$ sudo apt-get update
$ sudo apt-get install vlc

It's best to run sudo apt-get update first as this updates local information about what packages are available from where in what versions. This can prevent a variety of installation errors (including some "unmet dependencies" errors), and also ensures you get the latest version provided by your enabled software sources.

There is also an apt version of this command:

$ sudo apt update
Reading package lists... Done
Building dependency tree       
Reading state information... Done
23 packages can be upgraded. Run 'apt list --upgradable' to see them.

To list all upgradable packages:

$ sudo apt list --upgradable

To upgrade all packages:

$ sudo apt upgrade

To see all installed packages:

$ sudo apt list --installed

To check if some package has already been installed:

$ sudo apt list --installed | grep package_name


If using the Alpine distribution, you need to use apk. [Comparison with other distros - Alpine Linux]

Installing from non-default (3rd Party) Package Repository

Example: Installing PowerShell from Microsoft Package Repository

# Download the Microsoft repository GPG keys
wget -q https://packages.microsoft.com/config/ubuntu/18.04/packages-microsoft-prod.deb

# Register the Microsoft repository GPG keys
sudo dpkg -i packages-microsoft-prod.deb

# Update the list of products
sudo apt-get update

# Enable the "universe" repositories
sudo add-apt-repository universe

# Install PowerShell
sudo apt-get install -y powershell

# Start PowerShell
pwsh

If you install the wrong version of packages-microsoft-prod.deb you can uninstall it with:

sudo dpkg -r packages-microsoft-prod
(Reading database ... 254902 files and directories currently installed.)
Removing packages-microsoft-prod (1.0-3) ...

How to install software distributed via Debian package (.deb) files?  

Installing the .deb package will automatically install the apt repository and signing key to enable auto-updating using the system's package manager. Alternatively, the repository and key can also be installed manually.

Some applications are not available in Debian Package Repository but can be downloaded as .deb files.

$ sudo dpkg -i /path/to/deb/file 
$ sudo apt-get install -f

The latter is necessary in order to fix broken packages (it installs any missing/unmet dependencies).

How to install a deb file, by dpkg -i or by apt?

Another example: Etcher

Debian and Ubuntu based Package Repository (GNU/Linux x86/x64)

Add Etcher debian repository:

echo "deb https://deb.etcher.io stable etcher" | sudo tee /etc/apt/sources.list.d/balena-etcher.list

Trust Bintray.com's GPG key:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 379CE192D401AB61

Update and install:

sudo apt-get update
sudo apt-get install balena-etcher-electron


Uninstall:

sudo apt-get remove balena-etcher-electron
sudo rm /etc/apt/sources.list.d/balena-etcher.list
sudo apt-get update

How to install applications distributed via snaps?

Snaps are containerised software packages, designed to install the programs within them on all major Linux systems without modification. This works because developers bundle the libraries a program needs inside the containerised app.

A snap updates automatically and carries all its library dependencies, so it is a good choice for users who want ease of deployment and want to stay on top of the latest developments.

Snapcraft - Snaps are universal Linux packages

Installing snap on Ubuntu | Snapcraft documentation

How to Install and Use Snap on Ubuntu 18.04 - codeburst

Installing binaries

Some applications are distributed as binary files. We need to download them and set executable permissions. Instead of using the cp and chmod commands, we can use the install command, which copies the file to the destination and, by default, sets -rwxr-xr-x (0755) permissions on it:

$ curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64

$ sudo install minikube-linux-amd64 /usr/local/bin/minikube

$ ls -la /usr/local/bin/minikube
-rwxr-xr-x 1 root root 95637096 Apr 24 01:01 /usr/local/bin/minikube

We can also pass flags to the install command to set the owner, the group and the permission mode (as in chmod) explicitly instead of relying on the default rwxr-xr-x:

$ sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

0755 permissions mean the following: the owner (user) gets 7 = read (4) + write (2) + execute (1); the group gets 5 = read (4) + execute (1); everyone else (world) gets 5 = read (4) + execute (1).
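
To see the mode arithmetic in practice, here is a small sketch that installs a throwaway script into a temporary directory (so no root is needed) and reads the resulting mode back. It assumes GNU coreutils, since stat -c is the GNU syntax; the file names hello.sh and hello are made up for the illustration.

```shell
# Work in a scratch directory so nothing on the system is touched.
workdir=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$workdir/hello.sh"

# Copy the file and set its mode in one step.
install -m 0755 "$workdir/hello.sh" "$workdir/hello"

# GNU stat: %a prints the octal mode, %A the symbolic form.
mode_octal=$(stat -c '%a' "$workdir/hello")
mode_symbolic=$(stat -c '%A' "$workdir/hello")
echo "$mode_octal $mode_symbolic"

# The same arithmetic as in the text: each digit is a sum of r=4, w=2, x=1.
r=4; w=2; x=1
owner=$((r + w + x))
group=$((r + x))
other=$((r + x))
echo "$owner$group$other"

rm -rf "$workdir"
```

The first echo should report 755 together with -rwxr-xr-x, and the second one rebuilds the same 755 from the individual permission bits.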

If you do not have root access on the target system, you can still install kubectl to the ~/.local/bin directory:

chmod +x kubectl
mkdir -p ~/.local/bin
mv ./kubectl ~/.local/bin/kubectl
# and then append (or prepend) ~/.local/bin to $PATH
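
The last step can be done in more than one way; the following is one possible sketch that prepends ~/.local/bin to PATH only when it is not already listed, so running it repeatedly does not keep growing the variable.

```shell
# Wrap PATH in colons so that every entry, including the first and the last,
# can be matched as ":entry:".
case ":$PATH:" in
  *":$HOME/.local/bin:"*)
    ;;                                  # already on PATH, nothing to do
  *)
    PATH="$HOME/.local/bin:$PATH"       # prepend so it wins over system copies
    export PATH
    ;;
esac
```

To make the change permanent, the same lines are usually placed in a shell startup file such as ~/.profile or ~/.bashrc; which file is read depends on the shell and on how the session is started.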

It's worth noting here that the install command can also just create a directory (including all missing components of the path, like mkdir -p) and set permissions on it. Creating a directory under /etc requires root privileges, hence sudo:

$ sudo install -m 0755 -d /etc/apt/keyrings
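
The directory form can be tried without root in a scratch location. This sketch assumes GNU coreutils (stat -c) and uses mode 0750 instead of 0755 purely so that the effect of -m is visible; the path under the temporary directory is made up for the illustration.

```shell
base=$(mktemp -d)

# Create the whole chain etc/apt/keyrings in one call; -m applies to the
# final directory.
install -m 0750 -d "$base/etc/apt/keyrings"

dir_mode=$(stat -c '%a' "$base/etc/apt/keyrings")
echo "$dir_mode"

rm -rf "$base"
```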


Thursday 18 April 2024

Cron Utility (Unix)

The cron command-line utility is a job scheduler on Unix-like operating systems. It runs as a daemon (background process).

The scheduled jobs (essentially commands) are called cron jobs. They are repetitive tasks, scheduled to run periodically at a fixed time or interval.

Cron jobs, together with the frequency and time of their execution, are defined in a cron table (crontab) file.
Each job is defined on its own line, which has the following format:

# ┌───────────── minute (0–59)
# │ ┌───────────── hour (0–23)
# │ │ ┌───────────── day of the month (1–31)
# │ │ │ ┌───────────── month (1–12)
# │ │ │ │ ┌───────────── day of the week (0–6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# * * * * * <command to execute>

* means "every"

* * * * * = every minute, every hour, every day, every month
0 * * * * = every full hour, every day (HH:MM = *:0)
0 0 * * * = every midnight (HH:MM=0:0)
0 0 1 * * = once a month on the midnight of the first day of the month
0 10 * * * = every day at 10:00h
*/10 * * * * = every 10 minutes of every hour, every day
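
To make the meaning of *, a plain number and */N concrete, here is a rough sketch of how a single field could be matched against the current value. cron_field_matches is a hypothetical helper written for this post, not part of cron. It handles only those three forms (no lists or ranges), and its */N logic is correct only for fields whose range starts at 0, that is minute and hour. Values must be passed without leading zeros.

```shell
# cron_field_matches FIELD VALUE
# Returns 0 (true) when FIELD matches VALUE.
cron_field_matches() {
  field=$1
  value=$2
  case "$field" in
    '*')
      return 0                          # "*" matches every value
      ;;
    '*/'*)
      step=${field#\*/}                 # strip the leading "*/" to get N
      [ $((value % step)) -eq 0 ]       # every N-th value, counted from 0
      ;;
    *)
      [ "$field" -eq "$value" ]         # plain number: exact match
      ;;
  esac
}

# The minute field of "*/10 * * * *", checked against a few minutes.
for minute in 0 5 10 15 20; do
  if cron_field_matches '*/10' "$minute"; then
    echo "minute $minute: run"
  else
    echo "minute $minute: skip"
  fi
done
```

The loop should report run for minutes 0, 10 and 20 and skip for 5 and 15, which is the "every 10 minutes" behaviour described above.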

$ crontab
crontab: usage error: file name or - (for stdin) must be specified
 crontab [options] file
 crontab [options]
 crontab -n [hostname]

 -u <user>  define user
 -e         edit user's crontab
 -l         list user's crontab
 -r         delete user's crontab
 -i         prompt before deleting
 -n <host>  set host in cluster to run users' crontabs
 -c         get host in cluster to run users' crontabs
 -T <file>  test a crontab file syntax
 -s         selinux context
 -V         print version and exit
 -x <mask>  enable debugging

Default operation is replace, per 1003.2

To list all cron jobs of the current user, use:

$ crontab -l
* * * * * aws s3 sync ~/dir1/ s3://my-bucket/dir1 --region us-east-1 >> ~/logs/crons/s3_sync.log 2>&1
0 * * * * redis-cli -h redis-cache-group-123.cache.amazonaws.com -p 6345 flushall >> ~/logs/crons/flushRedisCache.log 2>&1
* * * * * ~/path/to/my_script1.sh >> ~/logs/crons/my_script1.sh.log 2>&1
0 0 * * * ~/path/to/my_script2.sh >> ~/logs/crons/my_script2.log 2>&1
0 0 1 * * ~/path/to/my_script3.sh >> ~/logs/crons/my_script3.log 2>&1
0 10 * * * ~/path/to/my_script4.sh >> ~/logs/crons/my_script4.log 2>&1
*/10 * * * * rsync -avhl --delete ~/path/to/source ~/path/to/dest/ >> ~/logs/crons/source_dest_rsync.log 2>&1

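The crontab file should not be edited directly with a file editor but via the crontab command, which opens the current user's crontab in an editor and installs it on save:

$ crontab -e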

How to disable a single cron job?

Simply comment out its line in the crontab with #.
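
Commenting out can also be scripted. The sketch below defines comment_out_matching, a hypothetical filter written for this post, which reads a crontab on standard input and prefixes every active line containing a given fixed string with "# ". It is demonstrated on sample text taken from the listing above, so it does not touch the real crontab.

```shell
# comment_out_matching PATTERN
# Copies stdin to stdout, commenting out lines that contain PATTERN
# (plain substring match) and are not already comments.
comment_out_matching() {
  awk -v pat="$1" '
    index($0, pat) && $0 !~ /^[ \t]*#/ { print "# " $0; next }
    { print }
  '
}

sample='* * * * * ~/path/to/my_script1.sh
0 0 * * * ~/path/to/my_script2.sh'

result=$(printf '%s\n' "$sample" | comment_out_matching my_script2.sh)
printf '%s\n' "$result"

# Against the real crontab this could be combined with "crontab -",
# which reads the new table from stdin (back it up first):
#   crontab -l | comment_out_matching my_script2.sh | crontab -
```

Only the my_script2.sh line should come back with a leading "# "; the other line is passed through unchanged.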

How to disable all cron jobs?

Either comment out all lines in the crontab, or back it up to a file and then remove it:

$ crontab -l > crontab_backup.txt
$ crontab -r

-r = removes the current crontab 

To restore the crontab from the backup:

$ crontab crontab_backup.txt
$ crontab -l


cron - Wikipedia

Thursday 11 April 2024

Docker Interview Questions

Here are some Docker Interview Questions with answers and/or links to answers. Good luck! 💪🤞

  • What is .dockerignore? 
  • What is a Dockerfile?
  • Explain the Dockerfile command FROM.
  • Explain the Dockerfile command WORKDIR.
  • Explain the Dockerfile command COPY.
    • What are its arguments?
    • What's the use case for COPY and what's the use case for the volumes attribute in docker-compose.yaml? Which one is used in development and which one in production, and why?
  • Explain the Dockerfile command CMD.
  • How to create a Docker image?
    • Which Docker command shall be used? Which are its two main arguments?
    • What is the "build context"?
    • What's the meaning of its arguments: -t, -f?
  • How to download Docker image from the remote repository?
    • What are the long and short forms of the command used?
    • Which tag gets applied if no tag is specified?
    • How many layers are downloaded in parallel by default?
    • Can layers be shared among multiple images?
    • Where are these images downloaded to?
    • Where are the images downloaded from by default?
    • What would the command that pulls an image from a local registry look like?
    • Which protocol does Docker use to communicate with registry?
    • How to pull multiple images from a repository?
    • How to cancel downloading image(s)?
    • docker pull | Docker Docs
  • How to list all images (on the local host)?
  • How to see the content of the Docker image?
  • How to list all running containers?
  • How to list all containers (including those which are not running)?
  • How to inspect a container? 
    • How to extract only its ID? 
    • How to extract details about ports which are at this level in the JSON output: NetworkSettings >> Ports?
  • Which command is docker run an alias for?
  • Which command is docker create an alias for?
    • What does it do? 
    • What is its main argument? 
    • In which state is the container after this command is executed? 
    • How to start the container after this?
    • docker create | Docker Docs
  • docker-compose up
    • What is the difference between the --build and --force-recreate arguments?
  • To Be Continued...

Python Interview Questions

Here are some Python interview questions. Good luck!

  • What does the following expression do? if __name__ == "__main__"
    • https://stackoverflow.com/questions/419163/what-does-if-name-main-do

Terraform Interview Questions

Here are some Terraform Interview Questions with answers and/or links to answers. Good luck!

Linux Interview Questions

Here are some Linux Interview questions. Good luck!

  • How to check if any application is listening on port 80?
  • How to list all the environment variables?
  • How to list all the variables, including all local variables?
  • How to reference a variable?
  • How to print the value of a particular variable?
  • How to set an environment variable? How to export it so that it is available to other (child) processes? How should the value be quoted if it contains spaces?
  • How to set a local variable?
  • Where are local variables available?
  • How to unset a local variable?