S3 = Simple Storage Service
What is Amazon S3? - Amazon Simple Storage Service
Amazon S3 provides:
- storage for data (objects)
- organized as a key-value structure: each object has a unique key and URL
- divided into multiple buckets; each bucket contains objects
- web service for upload/download
S3 doesn't know anything about files, directories, or symlinks. It's just objects in buckets. [source]
S3 also has no concept of symbolic links created by ln -s. By default, the AWS CLI follows links when uploading and turns them into real files by copying their targets. You can use aws s3 cp --no-follow-symlinks ... to ignore links. [Use S3 as major storage — GEOS-Chem on cloud]
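Before uploading a directory it can help to see which entries are symbolic links, since those are the ones affected by the flag above. The following is a minimal sketch: the helper names list_symlinks and upload_without_symlinks are hypothetical, and the destination is assumed to be an s3://bucket/prefix URI.

```shell
# Hypothetical helpers; find and "aws s3 cp" are the only real commands used.

# list_symlinks DIR
# Print every symbolic link found under DIR.
list_symlinks() {
  find "$1" -type l
}

# upload_without_symlinks DIR S3_URI PROFILE
# Recursively upload DIR, skipping symbolic links instead of copying their targets.
upload_without_symlinks() {
  aws s3 cp "$1" "$2" --recursive --no-follow-symlinks --profile "$3"
}
```

Possible usage: list_symlinks ./data, then upload_without_symlinks ./data s3://my-bucket/data my-profile.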
Buckets
Each bucket has its own subdomain. E.g. a bucket named my-bucket has the URL:
https://my-bucket.s3.amazonaws.com
Objects
- Data stored in buckets, which are logical containers
- There is no official limit to the number of objects or amount of data that can be stored in a bucket
- The size limit for objects is 5 TB
- Every object has a key (object name). This is usually the name of the file.
The key uniquely identifies an object within a bucket. E.g. if we upload a file and assign the key my-root-dir/dirA/fileA1 to it, its URL will be:
https://my-bucket.s3.amazonaws.com/my-root-dir/dirA/fileA1
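The mapping from bucket name and key to URL can be sketched as a small shell helper. The function name object_url is hypothetical, and the sketch assumes the key contains only URL-safe characters, since it does no percent-encoding.

```shell
# object_url BUCKET KEY
# Build the URL of an object from its bucket name and key.
# Assumption: KEY needs no percent-encoding (no spaces or special characters).
object_url() {
  printf 'https://%s.s3.amazonaws.com/%s\n' "$1" "$2"
}
```

For example, object_url my-bucket my-root-dir/dirA/fileA1 prints the URL shown above.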
To list all objects (every version of every object):
% aws s3api list-object-versions --bucket my-bucket --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile
{
    "Objects": [
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-1NMp6hagSAu8qORLSpyVxw.dat",
            "VersionId": "KaKyog0yM41SG._aWTuDllb9kXp67vLr"
        },
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-3v3MO1kOTBOl0idDYWLstA.dat",
            "VersionId": "i3PJhBpFtHrdKEtiEZPU_MFHYb2GqX0s"
        },
        ...
    ]
}
Versioning
- One of the bucket features
- Means of keeping multiple variants of an object in the same bucket
- Used to preserve, retrieve, and restore every version of every object stored in the bucket
- Helps recovering more easily from both unintended user actions and application failures
- If Amazon S3 receives multiple write requests for the same object simultaneously, it stores all of those objects
- Buckets can be in one of three states:
  - Unversioned (the default)
  - Versioning-enabled
  - Versioning-suspended
- After you version-enable a bucket, it can never return to an unversioned state. But you can suspend versioning on that bucket.
- If versioning is enabled:
  - If you overwrite an object, Amazon S3 adds a new object version to the bucket. The previous version remains in the bucket and becomes a noncurrent version. You can restore the previous version.
  - Every object, apart from its key/name, also gets a VersionId. Each version of the same object has a different VersionId. When we overwrite an object (PUT request), a new version is created: an object with the same key but a new VersionId.
  - When we delete an object, a delete marker is created and becomes the current version. GET requests will return 404 - Not Found. But if we pass the VersionId, the GET request will return that noncurrent version of the object.
  - To delete an old version of the object, we need to pass its VersionId in the DELETE request
- Versioning flows: How S3 Versioning works - Amazon Simple Storage Service
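The versioning operations described above can be sketched with the AWS CLI as follows. The function names are hypothetical wrappers; the aws s3api subcommands they call (put-bucket-versioning, get-bucket-versioning, get-object) are the actual CLI commands. Bucket, key, version ID and profile are passed as arguments.

```shell
# Hypothetical wrapper functions around real aws s3api subcommands.

# enable_versioning BUCKET PROFILE
# Switch the bucket to the versioning-enabled state.
enable_versioning() {
  aws s3api put-bucket-versioning --bucket "$1" \
    --versioning-configuration Status=Enabled --profile "$2"
}

# suspend_versioning BUCKET PROFILE
# Suspend versioning (a versioned bucket cannot become unversioned again).
suspend_versioning() {
  aws s3api put-bucket-versioning --bucket "$1" \
    --versioning-configuration Status=Suspended --profile "$2"
}

# versioning_status BUCKET PROFILE
# Print the bucket's versioning Status field.
versioning_status() {
  aws s3api get-bucket-versioning --bucket "$1" \
    --query Status --output text --profile "$2"
}

# get_object_version BUCKET KEY VERSION_ID OUTFILE PROFILE
# Download one specific (possibly noncurrent) version of an object.
get_object_version() {
  aws s3api get-object --bucket "$1" --key "$2" \
    --version-id "$3" --profile "$5" "$4"
}
```

Possible usage: enable_versioning my-bucket my-profile, then versioning_status my-bucket my-profile to check the result.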
Delete Markers
- Placeholders for objects that have been deleted
- Created when versioning is enabled and a simple DELETE request is made
- The delete marker becomes the current version of the object, and the object becomes the previous version
- Delete markers have a key name and version ID, but they don't have data associated with them
- Delete markers don't retrieve anything from a GET request
- The only operation you can use on a delete marker is DELETE, and only the bucket owner can issue such a request
To list all delete markers:
% aws s3api list-object-versions --bucket my-bucket --query '{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile
{
    "Objects": [
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/",
            "VersionId": "hYbkDv9egrv_WE2jI4y0Lys5Bc2dbvgb"
        },
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-1NMp6hagSAu8qORLSpyVxw.dat",
            "VersionId": "KdLBHImFlC23EXkfq4Ic0.2x6wGJ2FxR"
        },
        ...
    ]
}
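Since DELETE is the only operation allowed on a delete marker, removing the marker is also how a deleted object is brought back: once the marker is gone, the previous version becomes the current one again. A minimal sketch, assuming hypothetical helper names and a key that contains no single quotes (it is embedded in the JMESPath query as a literal):

```shell
# Hypothetical helpers; the aws s3api subcommands are the real CLI calls.

# latest_delete_marker BUCKET KEY PROFILE
# Print the VersionId of the delete marker that is the current version of KEY.
# Assumption: KEY contains no single quotes.
latest_delete_marker() {
  aws s3api list-object-versions --bucket "$1" --prefix "$2" \
    --query "DeleteMarkers[?Key=='$2' && IsLatest].VersionId" \
    --output text --profile "$3"
}

# undelete_object BUCKET KEY MARKER_VERSION_ID PROFILE
# Delete the delete marker itself, which restores the previous version.
undelete_object() {
  aws s3api delete-object --bucket "$1" --key "$2" \
    --version-id "$3" --profile "$4"
}
```

Possible usage: take a VersionId from the listing above and pass it to undelete_object together with the matching key.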
Bucket Lifecycle Policy
If versioning is enabled, each time an object is updated the new version becomes the current version, while the previous version becomes the most recent noncurrent version. Over time the number of object versions grows, taking up more storage and driving costs up. If we want to keep only the last V versions and/or delete noncurrent versions after D days, we can define a lifecycle policy.
Terraform resource: aws_s3_bucket_lifecycle_configuration | Resources | hashicorp/aws | Terraform | Terraform Registry
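The same policy can also be applied with the AWS CLI. The sketch below generates a lifecycle configuration with a single rule; the helper names, the rule ID and the file name lifecycle.json are illustrative assumptions. NoncurrentDays corresponds to D, and NewerNoncurrentVersions is the number of noncurrent versions to retain, so the current version is kept in addition to those V.

```shell
# Hypothetical helpers; put-bucket-lifecycle-configuration is the real CLI call.

# write_lifecycle_json DAYS VERSIONS
# Print a lifecycle configuration that expires noncurrent versions older than
# DAYS days while always retaining the VERSIONS most recent noncurrent versions.
write_lifecycle_json() {
  cat <<EOF
{
  "Rules": [
    {
      "ID": "expire-noncurrent-versions",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": $1,
        "NewerNoncurrentVersions": $2
      }
    }
  ]
}
EOF
}

# apply_lifecycle BUCKET JSON_FILE PROFILE
# Replace the bucket's lifecycle configuration with the one in JSON_FILE.
apply_lifecycle() {
  aws s3api put-bucket-lifecycle-configuration --bucket "$1" \
    --lifecycle-configuration "file://$2" --profile "$3"
}
```

Possible usage: write_lifecycle_json 30 3 > lifecycle.json, then apply_lifecycle my-bucket lifecycle.json my-profile. The empty prefix is meant to apply the rule to every object in the bucket.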
Deleting a bucket
Before we attempt to delete a bucket we need to make sure that all object versions and all delete markers are deleted.
To delete all object versions (current and noncurrent):
% aws s3api delete-objects --bucket my-bucket --delete "$(aws s3api list-object-versions --bucket my-bucket --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile)" --profile=my-profile
{
    "Deleted": [
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-IOT1xReRSWuNLjW0es4SPg.dat",
            "VersionId": "FvX7ePtV5MOLK2cxsWE.smTwMgnLoFie"
        },
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-o29VZ5YOTYWL_pZ0aATn4g.dat",
            "VersionId": "rodR2lazpLXZf3p1cXrBEXnnQDRYDGRj"
        },
        ...
    ]
}
To delete all delete markers:
% aws s3api delete-objects --bucket my-bucket --delete "$(aws s3api list-object-versions --bucket my-bucket --query '{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}' --output=json --profile=my-profile)" --profile=my-profile
{
    "Deleted": [
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-o29VZ5YOTYWL_pZ0aATn4g.dat",
            "VersionId": "KFGMc_HXfMxG6vt7vIvIxFoAUny9rDWT",
            "DeleteMarker": true,
            "DeleteMarkerVersionId": "KFGMc_HXfMxG6vt7vIvIxFoAUny9rDWT"
        },
        {
            "Key": "tests-7H_N5XLAT2K_sW5aGZfM1g/data-IOT1xReRSWuNLjW0es4SPg.dat",
            "VersionId": "en7k.4jeTceJNvQhZm0PlJUNa6pZEQiL",
            "DeleteMarker": true,
            "DeleteMarkerVersionId": "en7k.4jeTceJNvQhZm0PlJUNa6pZEQiL"
        },
        ...
    ]
}
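The two delete commands and the final bucket removal can be combined into one sketch. The function names are hypothetical. The sketch assumes the CLI prints "Objects": null when there is nothing of the requested kind, in which case the delete call is skipped. Note that a single delete-objects request accepts at most 1000 keys, so a larger bucket would need the list split into batches, which this sketch does not do.

```shell
# Hypothetical helpers; only the aws s3api subcommands are real CLI calls.

# list_entries BUCKET PROFILE KIND   (KIND is Versions or DeleteMarkers)
# Print the entries in the JSON shape that delete-objects expects.
list_entries() {
  aws s3api list-object-versions --bucket "$1" --profile "$2" \
    --query "{Objects: $3[].{Key:Key,VersionId:VersionId}}" --output json
}

# purge_entries BUCKET PROFILE KIND
# Delete all entries of the given kind; do nothing if there are none.
purge_entries() {
  payload=$(list_entries "$1" "$2" "$3") || return 1
  case "$payload" in
    *'"Objects": null'*)
      echo "no $3 to delete"
      return 0 ;;
  esac
  aws s3api delete-objects --bucket "$1" --profile "$2" --delete "$payload"
}

# empty_and_delete_bucket BUCKET PROFILE
# Remove all object versions, then all delete markers, then the bucket itself.
empty_and_delete_bucket() {
  purge_entries "$1" "$2" Versions &&
  purge_entries "$1" "$2" DeleteMarkers &&
  aws s3api delete-bucket --bucket "$1" --profile "$2"
}
```

Possible usage: empty_and_delete_bucket my-bucket my-profile. Each step runs only if the previous one succeeded, so a failed delete leaves the bucket in place.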
S3 Batch Operations examples using the AWS CLI - Amazon Simple Storage Service
Uploaded images to AWS S3 – TeamCity Support | JetBrains