
Thursday, 30 May 2024

How to monitor EC2 instance metrics in Amazon CloudWatch



Amazon CloudWatch offers 2 types of monitoring:
  • Basic
    • metrics are delivered in 5-minute periods
    • enabled by default, at no additional charge
  • Detailed
    • all metrics available in 1-minute periods
    • needs to be explicitly enabled for the instance; can be enabled upon launch or while the instance is running or stopped; enabling it does not affect the monitoring of the EBS volumes attached to the instance
    • charged per metric
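As a sketch, detailed monitoring can be toggled per instance with the AWS CLI (the instance ID below is a placeholder):

```shell
# Enable detailed (1-minute) monitoring for an instance (instance ID is a placeholder)
aws ec2 monitor-instances --instance-ids i-0123456789abcdef0

# Switch back to basic (5-minute) monitoring
aws ec2 unmonitor-instances --instance-ids i-0123456789abcdef0
```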

Amazon CloudWatch can monitor two types of EC2 instance metrics:
  • Basic (Default) Metrics
    • AWS/EC2 namespace includes the following metrics:
      • Instance metrics:
        • CPUUtilization
        • DiskReadOps
        • DiskWriteOps
        • DiskReadBytes
        • DiskWriteBytes
        • MetadataNoToken
        • MetadataNoTokenRejected
        • NetworkIn
        • NetworkOut
        • NetworkPacketsIn
        • NetworkPacketsOut
      • CPU credit metrics
        • CPUCreditUsage
        • CPUCreditBalance
        • CPUSurplusCreditBalance
        • CPUSurplusCreditsCharged
      • Dedicated Host metrics
        • DedicatedHostCPUUtilization
      • EBS metrics for Nitro-based instances
        • EBSReadOps
        • EBSWriteOps
        • EBSReadBytes
        • EBSWriteBytes
        • EBSIOBalance%
        • EBSByteBalance%
      • Status check metrics
        • StatusCheckFailed
        • StatusCheckFailed_Instance
        • StatusCheckFailed_System
        • StatusCheckFailed_AttachedEBS
    • AWS/EBS namespace includes the following status check metric:
      • VolumeStalledIOCheck
    • By default, Amazon EC2 sends metric data to CloudWatch in 5-minute periods
  • Additional (Custom) Metrics 
    • internal system-level metrics
    • e.g. RAM, swap usage, EBS disk utilization, etc.
    • require use of one of these technologies:
      • CloudWatch Agents (recommended way). It needs to be installed on our EC2 instances, and then configured to emit selected metrics.
      • CloudWatch Monitoring Scripts (legacy way)
    • Metrics collected by the CloudWatch agent are billed as custom metrics.

CloudWatch Monitoring Scripts are deprecated; the recommended way to collect logs and metrics is the CloudWatch agent.


How to use CloudWatch Agent for EC2 instance monitoring?


First, the CloudWatch agent needs to be installed on the EC2 instance. It is available as a package in Amazon Linux 2023 and Amazon Linux 2.


# yum install amazon-cloudwatch-agent

Amazon Linux 2023 repository                                                                                                                                                                                                                   31 kB/s | 3.6 kB     00:00    
Amazon Linux 2023 Kernel Livepatch repository                                                                                                                                                                                                  38 kB/s | 2.9 kB     00:00    
Dependencies resolved.
===============================================================================================================
 Package                 Architecture            Version                 Repository                  Size
 ===============================================================================================================
Installing:
 amazon-cloudwatch-agent   x86_64          1.300033.0-1.amzn2023        amazonlinux                   95 M

Transaction Summary
===============================================================================================================
Install  1 Package

Total download size: 95 M
Installed size: 360 M
Is this ok [y/N]: y
Downloading Packages:
amazon-cloudwatch-agent-1.300033.0-1.amzn2023.x86_64.rpm                                                                                                                                                                                       71 MB/s |  95 MB     00:01    
---------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                                                                                                          67 MB/s |  95 MB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                                                                                                                                                      1/1 
  Running scriptlet: amazon-cloudwatch-agent-1.300033.0-1.amzn2023.x86_64                                                                                                                                                                                                 1/1 
create group cwagent, result: 0
create user cwagent, result: 0

  Installing       : amazon-cloudwatch-agent-1.300033.0-1.amzn2023.x86_64                                                                                                                                                                                                 1/1 
  Running scriptlet: amazon-cloudwatch-agent-1.300033.0-1.amzn2023.x86_64                                                                                                                                                                                                 1/1 
  Verifying        : amazon-cloudwatch-agent-1.300033.0-1.amzn2023.x86_64                                                                                                                                                                                                 1/1 

Installed:
  amazon-cloudwatch-agent-1.300033.0-1.amzn2023.x86_64                                                                                                                                                                                                                        
Complete!

This was done on the EC2 instance with this OS:

# cat /etc/os-release

NAME="Amazon Linux"
VERSION="2023"
ID="amzn"
ID_LIKE="fedora"
VERSION_ID="2023"
PLATFORM_ID="platform:al2023"
PRETTY_NAME="Amazon Linux 2023.4.20240429"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2023"
HOME_URL="https://aws.amazon.com/linux/amazon-linux-2023/"
DOCUMENTATION_URL="https://docs.aws.amazon.com/linux/"
SUPPORT_URL="https://aws.amazon.com/premiumsupport/"
BUG_REPORT_URL="https://github.com/amazonlinux/amazon-linux-2023"
VENDOR_NAME="AWS"
VENDOR_URL="https://aws.amazon.com/"
SUPPORT_END="2028-03-15"

We then need to make sure that the IAM role attached to the EC2 instance has the CloudWatchAgentServerPolicy AWS managed policy attached to it.
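As a sketch, the policy can be attached to the instance's IAM role with the AWS CLI (the role name below is a placeholder):

```shell
# Attach the AWS managed policy to the instance's IAM role (role name is a placeholder)
aws iam attach-role-policy \
  --role-name my-ec2-cloudwatch-role \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
```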



How to check if CloudWatch agent is installed on EC2 instance?



We need to SSH into the EC2 instance and then run as root:

# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status

{
  "status": "stopped",
  "starttime": "",
  "configstatus": "not configured",
  "version": "1.300033.0"
}

If the service is configured and running, the output looks like this:

{
  "status": "running",
  "starttime": "2024-05-29T11:27:25+00:00",
  "configstatus": "configured",
  "version": "1.300033.0"
}


How to configure CloudWatch Agent?


Before running the CloudWatch agent on any server, we must create one or more CloudWatch agent configuration files on the server. A configuration file defines which metrics we want to collect, on which resources, and by which descriptors (dimensions) we want to group these metrics for visual representation.

Dimensions:
  • attributes that provide context for metrics by categorizing them according to specific criteria
  • describe and categorize
  • descriptive characteristic or attribute of data
Metrics:
  • they quantify and provide numerical details
  • assign numeric values to dimensions of our choice
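To illustrate the distinction (the values below are made up): one datapoint of the agent's disk usage metric could be described as a numeric value plus a set of dimensions that locate it:

```
Namespace:  CWAgent
Metric:     disk_used_percent = 42.7
Dimensions: InstanceId=i-0123456789abcdef0, path=/, device=nvme0n1p1, fstype=xfs
```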

Configuration file can have an arbitrary name e.g. amazon-cloudwatch-agent.json.


Example: Collecting metrics of EBS disks mounted on EC2 instance

We first need to find mounting points for root and data disks:

# lsblk

NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
nvme0n1       259:0    0   50G  0 disk 
├─nvme0n1p1   259:1    0   50G  0 part /
├─nvme0n1p127 259:2    0    1M  0 part 
└─nvme0n1p128 259:3    0   10M  0 part /boot/efi
nvme1n1       259:4    0  300G  0 disk /home/my-user/ebs-volume-1-mount-point-dir
nvme2n1       259:5    0  130G  0 disk /home/my-user/ebs-volume-2-mount-point-dir
nvme3n1       259:6    0   35G  0 disk /home/my-user/ebs-volume-3-mount-point-dir

We can then specify which metrics we want to collect on which disks (identified via their mount points) and then how to group them - in this case by InstanceId:

# vi /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json

{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "cwagent"
  },
  "metrics": {
    "append_dimensions": {
        "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "disk": {
        "measurement": [
  "free",
  "total",
  "used",
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "/",
          "/home/my-user/ebs-volume-1-mount-point-dir",
          "/home/my-user/ebs-volume-2-mount-point-dir",
          "/home/my-user/ebs-volume-3-mount-point-dir"
        ]
      }
    }
  }
}


We can create a dedicated user (e.g. cwagent) or use one of the existing users (e.g. my-user) on the instance.

We can also specify a namespace; if it's not specified, the default namespace (CWAgent) is used. That namespace appears in the AWS Console under CloudWatch >> Metrics >> All metrics >> Custom namespaces.
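For example, to publish under a custom namespace instead of CWAgent, the "metrics" section of the agent configuration file can include a "namespace" key (namespace value below is a placeholder):

```
  "metrics": {
    "namespace": "MyCustomNamespace",
    ...
  }
```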

If we use InstanceId as a dimension and don't specify any aggregation dimensions, CloudWatch will automatically use InstanceId, path, fstype and device. For the example above there will be 4 graphs: InstanceId is always the same, and there are 4 unique combinations of [path, fstype, device].

If our instances are managed by an Auto Scaling Group (ASG) and we want to monitor disks within it, we need to specify aggregation dimensions, as otherwise CloudWatch can't use AutoScalingGroupName alone to uniquely identify disks (it needs to draw one graph per disk). If we have at most 1 instance in the ASG, we can additionally specify e.g. path:

    "metrics": {
      "append_dimensions": {
        "AutoScalingGroupName":"${aws:AutoScalingGroupName}"
      },
      "aggregation_dimensions": [
        [ "AutoScalingGroupName", "path" ]
      ],
      ...
    }


If we have more than 1 instance per ASG, then we need to include InstanceId and path, as only the combination of InstanceId + path can uniquely identify the resource:

    "metrics": {
      "append_dimensions": {
        "AutoScalingGroupName":"${aws:AutoScalingGroupName}",
        "InstanceId": "${aws:InstanceId}"
      },
      "aggregation_dimensions": [
        [ "AutoScalingGroupName", "InstanceId", "path" ]
      ],
      ...
    }



After we change the agent configuration file, we must restart the agent for the changes to take effect. To restart the agent service:

# systemctl restart amazon-cloudwatch-agent
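Alternatively, the agent's own control script can load the configuration file, validate it and (re)start the agent in one step (the path below matches the file created earlier):

```shell
# Load the config file, translate it and (re)start the agent
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -s \
  -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
```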

Upon service restart we can check the local CloudWatch Agent log (on that instance):

# tail -f /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log

2024-05-13T16:26:13Z I! {"caller":"service@v0.89.0/telemetry.go:77","msg":"Skipping telemetry setup.","address":"","level":"None"}
2024-05-13T16:26:13Z I! {"caller":"service@v0.89.0/service.go:143","msg":"Starting CWAgent...","Version":"1.300033.0","NumCPU":2}
2024-05-13T16:26:13Z I! {"caller":"extensions/extensions.go:34","msg":"Starting extensions..."}
2024-05-13T16:26:13Z I! {"caller":"extensions/extensions.go:37","msg":"Extension is starting...","kind":"extension","name":"agenthealth/metrics"}
2024-05-13T16:26:13Z I! {"caller":"extensions/extensions.go:45","msg":"Extension started.","kind":"extension","name":"agenthealth/metrics"}
2024-05-13T16:26:13Z I! cloudwatch: get unique roll up list []
2024-05-13T16:26:13Z I! {"caller":"ec2tagger/ec2tagger.go:435","msg":"ec2tagger: Check EC2 Metadata.","kind":"processor","name":"ec2tagger","pipeline":"metrics/host"}
2024-05-13T16:26:13Z I! cloudwatch: publish with ForceFlushInterval: 1m0s, Publish Jitter: 15.443407438s
2024-05-13T16:26:13Z I! {"caller":"ec2tagger/ec2tagger.go:411","msg":"ec2tagger: EC2 tagger has started, finished initial retrieval of tags and Volumes","kind":"processor","name":"ec2tagger","pipeline":"metrics/host"}
2024-05-13T16:26:13Z I! {"caller":"service@v0.89.0/service.go:169","msg":"Everything is ready. Begin running and processing data."}


For the case where I used [ "AutoScalingGroupName", "path" ] as aggregation dimensions, the CloudWatch graphs look like this:



And if we click on the aggregate AutoScalingGroupName, path:




Thursday, 29 February 2024

Amazon Elastic Block Store (EBS)

 


EBS is block storage that can be attached to an EC2 instance and used as a virtual hard disk. An EBS volume can be up to 16 TiB in size (io2 Block Express volumes can reach 64 TiB).

  • Part of EC2 ecosystem
  • Manages 3 entities:
    • Volumes
    • Snapshots
    • Lifecycle Manager
  • system storage for AWS EC2 VMs
  • reduces risk
  • durable
  • secure
  • avoid risks of physical media handling
  • 2 types:
    • Solid State Drive (SSD) - backed:
      • general purpose
      • provisioned IOPS
    • Hard Disk Drive (HDD) - backed:
      • Throughput optimized
      • Cold
  • An EBS volume can be attached only to an EC2 instance in the same Availability Zone [amazon web services - Is it possible to change the EBS volume to different availability zones? - Server Fault]
  • The Multi-Attach feature allows up to 16 EC2 instances to share a single Provisioned IOPS (io1/io2) EBS volume, providing higher availability for Linux workloads

Data is broken down into blocks and each block is stored as a separate piece with a unique ID.
Unless Multi-Attach is used, only a single EC2 instance, in a single AZ, can access the data on an EBS volume.

When we're launching a new EC2 instance, we need to specify the storage for the:
  • root volume
    • Contains the image used to boot the instance
    • Each instance has a single root volume
  • (optionally) additional storage volumes
    • They can be added to EC2 instances when they are launched or after they are running
These volumes are basically "hard disks" which are used to persistently store the OS and (our) applications across (EC2) virtual machine restarts.





Storage type
The storage type used for the volume.

EBS volumes are block-level storage volumes that persist independently from the lifetime of an EC2 instance, so you can stop and restart your instance at a later time without losing your data. You can also detach an EBS volume from one instance and attach it to another instance. EBS volumes are billed separately from the instance’s usage cost.

Instance store volumes are physically attached to the host computer. These volumes provide temporary block storage that persists only during the lifetime of the instance. If you stop, hibernate, or terminate an instance, data on instance store volumes is lost. The instance type determines the size and number of the instance store volumes available and the type of hardware used for the instance store volumes. Instance store volumes are included as part of the instance's usage cost.

Device name
The available device names for the volume.

The device name that you assign is used by Amazon EC2. The block device driver for the instance assigns the actual volume name when mounting the volume. The volume name assigned by the block device driver might differ from the device name that you assign.

The device names that you're allowed to assign depends on the virtualization type of the selected instance.

Snapshot
The snapshot from which to create the volume. A snapshot is a point-in-time backup of an EBS volume.

When you create a new volume from a snapshot, it's an exact copy of the original volume at the time the snapshot was taken.

EBS volumes created from encrypted snapshots are automatically encrypted and you can’t change their encryption status. EBS volumes created from unencrypted snapshots can be optionally encrypted.

Size (GiB)
The size of the volume, in GiB.

If you are creating the volume from a snapshot, then the size of the volume can’t be smaller than the size of the snapshot.

Supported volume sizes are as follows:
io1: 4 GiB to 16,384 GiB
io2: 4 GiB to 65,536 GiB
gp2 and gp3: 1 GiB to 16,384 GiB
st1 and sc1: 125 GiB to 16,384 GiB
Magnetic (standard): 1 GiB to 1024 GiB


Volume type
The type of volume to attach. Volume types include:
  • General Purpose SSD (gp2 and gp3) volumes offer cost-effective storage that is ideal for a broad range of workloads.
  • Provisioned IOPS SSD (io1 and io2) volumes provide low latency and are designed to meet the needs of I/O-intensive workloads. They are best for EBS-optimized instances.
  • Throughput Optimized HDD (st1) volumes provide low-cost magnetic storage that is a good fit for large, sequential workloads.
  • Cold HDD (sc1) volumes provide low-cost magnetic storage that offers lower throughput than st1. sc1 is a good fit for large, sequential cold-data workloads that require infrequent access to data.
  • Magnetic (standard) volumes are best suited for workloads where data is accessed infrequently.
IOPS
The requested number of I/O operations per second that the volume can support.

It is applicable to Provisioned IOPS SSD (io1 and io2) and General Purpose SSD (gp2 and gp3) volumes only.

io1 volumes support between 100 and 64,000 IOPS, and io2 volumes support between 100 and 256,000 IOPS, depending on the volume size. For io1 volumes, you can provision up to 50 IOPS per GiB. For io2 volumes, you can provision up to 1,000 IOPS per GiB.

For General Purpose SSD (gp2) volumes, baseline performance scales linearly at 3 IOPS per GiB from a minimum of 100 IOPS (at 33.33 GiB and below) to a maximum of 16,000 IOPS (at 5,334 GiB and above). General Purpose SSD (gp3) volumes support a baseline of 3,000 IOPS. Additionally, you can provision up to 500 IOPS per GiB up to a maximum of 16,000 IOPS.
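The gp2 baseline rule above (3 IOPS per GiB, floored at 100 and capped at 16,000) can be sketched as a small shell function; this is a toy calculation for illustration, not an AWS tool:

```shell
# gp2 baseline IOPS: 3 IOPS per provisioned GiB, floored at 100, capped at 16,000
gp2_baseline_iops() {
  local size_gib=$1
  local iops=$(( size_gib * 3 ))
  (( iops < 100 )) && iops=100
  (( iops > 16000 )) && iops=16000
  echo "$iops"
}

gp2_baseline_iops 8      # -> 100 (below the ~33.33 GiB floor)
gp2_baseline_iops 1000   # -> 3000
gp2_baseline_iops 6000   # -> 16000 (capped)
```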

Magnetic (standard) volumes deliver approximately 100 IOPS on average, with a burst capability of up to hundreds of IOPS.

For Throughput Optimized HDD (st1) and Cold HDD (sc1) volumes, performance is measured in throughput (MiB/s).

Delete on termination
Indicates whether the volume should be automatically deleted when the instance is terminated.

If you disable this feature, the volume will persist independently from the running life of an EC2 instance. When you terminate the instance, the volume will remain provisioned in your account. If you no longer need the volume after the instance has been terminated, you must delete it manually.

You can also change the delete on termination behavior after the instance has been launched.
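As a sketch, the flag can be changed on a running instance with the AWS CLI (the instance ID and device name below are placeholders):

```shell
# Keep the root volume after termination (instance ID and device name are placeholders)
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --block-device-mappings '[{"DeviceName":"/dev/xvda","Ebs":{"DeleteOnTermination":false}}]'
```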

Encrypted
The encryption status of the volume.

Amazon EBS encryption is an encryption solution for your EBS volumes. Amazon EBS encryption uses AWS KMS keys to encrypt volumes.

Considerations:
  • If your account is enabled for encryption by default, you can't create unencrypted volumes.
  • If you selected an encrypted snapshot, the volume is automatically encrypted.
  • If your account is not enabled for encryption by default, and you did not select a snapshot or you selected an unencrypted snapshot, encryption is optional.
  • You can create an encrypted io2 volumes in any size and IOPS configuration. However, to create an encrypted volume that has a size greater than 16 TiB, or IOPS greater than 64,000 from an unencrypted snapshot, or a shared encrypted snapshot from an unencrypted snapshot, you must first create an encrypted snapshot in your account and then use that snapshot to create the volume.

KMS key
The KMS key that will be used to encrypt the volume.

Amazon EBS encryption uses AWS KMS keys when creating encrypted volumes and snapshots. EBS encrypts your volume with a data key using the industry-standard AES-256 algorithm. Your data key is stored on disk with your encrypted data, but not before EBS encrypts it with your KMS key. Your data key never appears on disk in plaintext. The same data key is shared by snapshots of the volume and any subsequent volumes created from those snapshots.

Throughput
The throughput that the volume can support. Applicable to General Purpose SSD (gp3) volumes only.


If we click on "Add new volume", Volume 2 (Custom) section appears:




EBS Volume Lifecycle



Here is the EBS Volume state diagram:

credit: View information about an Amazon EBS volume - Amazon EBS




Creating a volume snapshot


Why do we want to create an EBS volume snapshot?

If we terminate (intentionally or not) the EC2 instance, the root EBS volume (which might be the only one used by that EC2 instance) will be deleted:


If we take a snapshot of the root EBS volume, we'll later be able to restore that EC2 instance.






Create a point-in-time snapshot to back up the data on an Amazon EBS volume to Amazon S3.

You can back up the data on your Amazon EBS volumes to Amazon S3 by taking point-in-time snapshots. Snapshots are incremental backups, which means that only the blocks on the device that have changed since the last snapshot are backed up. Each snapshot that you create contains all of the information that is needed to fully restore an EBS volume.

When you create a snapshot, only data that has already been written to the volume is backed up. This might exclude data that has been cached by any applications or the operating system. To ensure a consistent and complete snapshot, we recommend that you pause write operations to the volume or that you unmount the volume from the instance before creating the snapshot.
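A snapshot can be taken, for instance, with the AWS CLI (the volume ID below is a placeholder):

```shell
# Create a point-in-time snapshot of a volume (volume ID is a placeholder)
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Backup of root volume before maintenance"
```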

Snapshots that are taken from encrypted volumes are automatically encrypted. Volumes that are created from encrypted snapshots are also automatically encrypted.

