Thursday 22 February 2024

Introduction to AWS CloudFront


 

What is AWS CloudFront?

  • Content delivery network (CDN) provided by AWS

Why use it?

  • To speed up delivery of web content (dynamic, static, streaming, interactive)
  • Content is distributed with low latency and high data transfer speeds

How does it work?

  • Files are delivered to end-users using a global network of edge locations
  • Users who request web content are automatically routed to the edge location that gives them the lowest latency.

How to set it up for selected content?

  • Create a distribution and specify settings for it
    • Amazon S3 bucket or HTTP server that we want CloudFront to get the content from
    • whether we want only selected users to have access to that content
    • whether we want users to use HTTPS
    • Alternate domain name (CNAME). This optional setting is a custom domain name that we use in URLs for the files served by this distribution. Example: my-content.example.com.
  • CloudFront then assigns a domain name to the distribution (e.g. abcdef0123456.cloudfront.net), but it's possible to use a custom domain name (e.g. example.com)
  • We can now access our resource via URL:
    • http://abcdef0123456.cloudfront.net/index.html or
    • http://example.com/index.html
  • ...
CloudFront >> Distributions are not region-specific, they are global.
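The settings listed above map onto the configuration document that CloudFront's CreateDistribution API accepts. The following is a minimal sketch of that document as a plain Python dict, not a complete configuration: the function name, the origin label "primary-origin" and the comment text are made up for illustration, and the caller is assumed to supply a cache policy ID and (when alternate domain names are used) an ACM certificate ARN.

```python
import uuid


def build_distribution_config(origin_domain, cache_policy_id, origin_path="",
                              aliases=(), acm_certificate_arn=None):
    """Return a dict shaped like CloudFront's DistributionConfig (sketch only)."""
    origin_id = "primary-origin"  # hypothetical label; any unique string works
    config = {
        "CallerReference": str(uuid.uuid4()),  # must be unique per create request
        "Comment": "example distribution",     # illustrative text
        "Enabled": True,
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": origin_id,
                "DomainName": origin_domain,   # S3 bucket or HTTP server
                "OriginPath": origin_path,     # e.g. "/images", or "" for none
                "S3OriginConfig": {"OriginAccessIdentity": ""},
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": origin_id,
            "ViewerProtocolPolicy": "redirect-to-https",  # "whether we want users to use HTTPS"
            "CachePolicyId": cache_policy_id,
        },
    }
    if aliases:
        # Alternate domain names (CNAMEs) need a certificate that covers them.
        if acm_certificate_arn is None:
            raise ValueError("aliases require an ACM certificate ARN")
        config["Aliases"] = {"Quantity": len(aliases), "Items": list(aliases)}
        config["ViewerCertificate"] = {
            "ACMCertificateArn": acm_certificate_arn,
            "SSLSupportMethod": "sni-only",
            "MinimumProtocolVersion": "TLSv1.2_2021",
        }
    return config
```

With boto3 installed and credentials configured, a dict like this could be passed as boto3.client("cloudfront").create_distribution(DistributionConfig=...); a real distribution may need more fields than this sketch sets.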

Creating a new distribution in AWS Console



Origin:
  • Origin domain: domain name of the origin server
    • Clicking into the field opens a drop-down list of items grouped as:
      • Amazon S3 - list of S3 buckets
      • Elastic Load Balancer - list of ELBs
      • API Gateway - list of API GWs
      • MediaStore container
      • MediaPackage container
    • Example: my-test-bucket.s3.us-east-1.amazonaws.com
  • Origin path (optional): this can be:
    • a path in S3 bucket e.g. /images
  • Name: origin name
    • it gets populated automatically e.g. S3-my-test-bucket/images

If we choose an S3 bucket as the origin domain, the following options appear:

  • Origin access - You can limit the access to your origin to only authenticated requests from CloudFront.
    • Public
      • Not an option if bucket does not allow public access
    • Origin access control settings (OAC)
      • Recommended for its wider range of features, including support of S3 buckets in all AWS Regions
    • Legacy access identities 
      • Origin access identity (OAI)
      • Not recommended
If we choose Origin access control settings (OAC), the S3 bucket can restrict access to CloudFront only (Block public access stays ON). In that case the S3 bucket policy must be updated (S3 bucket >> Permissions >> Bucket Policy). CloudFront provides the policy statement after the distribution is created (once the distribution's ARN is known). Such a policy JSON might look like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCloudFrontServicePrincipalReadOnly",
            "Effect": "Allow",
            "Principal": {
                "Service": "cloudfront.amazonaws.com"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-test-bucket/*",
            "Condition": {
                "StringEquals": {
                    "AWS:SourceArn": "arn:aws:cloudfront::431123456827:distribution/E1WH4T3V3RREB3"
                }
            }
        }
    ]
}
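If the same policy has to be produced for several buckets or distributions, it can be generated instead of copied from the console. A minimal sketch, assuming the bucket name, AWS account ID and distribution ID are known; the function name is made up for illustration:

```python
import json


def cloudfront_read_policy(bucket, account_id, distribution_id):
    """Build a bucket policy that lets one CloudFront distribution read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringEquals": {
                        # Only requests made on behalf of this distribution are allowed.
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}"
                            f":distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


policy = cloudfront_read_policy("my-test-bucket", "431123456827", "E1WH4T3V3RREB3")
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the Bucket Policy editor, or applied with boto3's put_bucket_policy(Bucket=..., Policy=json.dumps(policy)).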

If the S3 bucket has Block public access ON but no attached policy that allows CloudFront to access it, requesting a resource such as https://example.com/image/background.jpg results in this response:

This XML file does not appear to have any style information associated with it. The document tree is shown below.
<Error>
   <Code>AccessDenied</Code>
   <Message>Access Denied</Message>
   <RequestId>ATEESSECBE0DZ92Z</RequestId>
   <HostId>81spSu0E0+kk7wpvd/IiG/4VuU9u2I7D/a43d+378SHau9HDRKyc724ert5Rxhh3j3oNemNXkZA=</HostId>
</Error>
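When checking a distribution from a script rather than a browser, this error document can be recognised by parsing the response body. A minimal sketch using only the standard library; the function name is made up for illustration, and the body is assumed to be available as a string:

```python
import xml.etree.ElementTree as ET


def s3_error_code(body):
    """Return the <Code> text of an S3-style error document, or None otherwise."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None  # not XML at all, e.g. an image or an HTML page
    if root.tag != "Error":
        return None  # XML, but not an error document
    return root.findtext("Code")


sample = """<Error>
   <Code>AccessDenied</Code>
   <Message>Access Denied</Message>
</Error>"""

if s3_error_code(sample) == "AccessDenied":
    print("Origin refused the request: check the bucket policy")
```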

  • Add custom header (optional)
    • CloudFront includes this header in all requests that it sends to your origin
  • Enable Origin Shield (yes/no)
    • Origin shield is an additional caching layer that can help reduce the load on your origin and help protect its availability.
  • Additional settings
    • Connection attempts
      • The number of times that CloudFront attempts to connect to the origin, from 1 to 3. The default is 3.
    • Connection timeout
      • The number of seconds that CloudFront waits when trying to establish a connection to the origin, from 1 to 10. The default is 10.
    • Response timeout - only applicable to custom origins
      • The number of seconds that CloudFront waits for a response from the origin, from 1 to 60. The default is 30.
    • Keep-alive timeout - only applicable to custom origins
      • The number of seconds that CloudFront maintains an idle connection with the origin, from 1 to 60. The default is 5.
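The ranges and defaults listed above can be captured in a small helper that fills in defaults and rejects out-of-range values before they are sent to CloudFront. This is a sketch only: the setting names and the function are made up for illustration, and the limits are the ones quoted in this post.

```python
# name: (minimum, maximum, default), as listed in the console
LIMITS = {
    "connection_attempts": (1, 3, 3),
    "connection_timeout": (1, 10, 10),   # seconds
    "response_timeout": (1, 60, 30),     # seconds, custom origins only
    "keepalive_timeout": (1, 60, 5),     # seconds, custom origins only
}


def resolve_origin_settings(**overrides):
    """Merge overrides with the defaults and validate each value's range."""
    settings = {}
    for name, (low, high, default) in LIMITS.items():
        value = overrides.pop(name, default)
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        settings[name] = value
    if overrides:
        # Anything left over is a name this sketch does not know about.
        raise TypeError(f"unknown settings: {sorted(overrides)}")
    return settings


print(resolve_origin_settings(connection_timeout=4))
```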





Default Cache Behavior:

  • Path pattern
    • Default (*)
  • Compress objects automatically (yes/no)
  • Viewer
    • Viewer protocol policy
      • HTTP and HTTPS
      • Redirect HTTP to HTTPS
      • HTTPS only
    • Allowed HTTP methods
      • GET, HEAD
      • GET, HEAD, OPTIONS
      • GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE
    • Restrict viewer access (yes/no)
      • If you restrict viewer access, viewers must use CloudFront signed URLs or signed cookies to access your content.
  • Cache key and origin requests. We recommend using a cache policy and origin request policy to control the cache key and origin requests.
    • Cache policy and origin request policy (recommended)
      • Cache policy - Choose an existing cache policy or create a new one.
        • Select cache policy


      • Origin request policy - optional. Choose an existing origin request policy or create a new one.
        • Select origin policy



    • Legacy cache settings
  • Response headers policy - optional
    • Choose an existing response headers policy or create a new one.
    • Select response headers
    • Create response headers policy
 


  • Additional settings
    • Smooth streaming (yes/no)
      • Choose No if your origin is configured to use Microsoft IIS for Smooth Streaming.
    • Field-level encryption
      • Choose a field-level encryption configuration.
      • Select a field-level encryption profile
    • Enable real-time logs (yes/no)
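To make the three viewer protocol policy options more concrete, here is a small model of how the edge answers a GET or HEAD request under each one. It is a sketch, not CloudFront code: the function name and the "forward" marker are made up for illustration, while the policy strings are the values the API uses for the three console options (allow-all, redirect-to-https, https-only).

```python
from urllib.parse import urlsplit

POLICIES = {"allow-all", "redirect-to-https", "https-only"}


def viewer_response(policy, url):
    """Return (status, location) for a GET/HEAD request under a viewer protocol policy.

    ("forward", None) means the request is served normally.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown viewer protocol policy: {policy}")
    parts = urlsplit(url)
    if parts.scheme == "https" or policy == "allow-all":
        return ("forward", None)
    if policy == "redirect-to-https":
        # Same host, path and query, but with the https scheme.
        return (301, parts._replace(scheme="https").geturl())
    return (403, None)  # https-only: plain HTTP is rejected


for policy in sorted(POLICIES):
    print(policy, viewer_response(policy, "http://example.com/index.html"))
```

Other HTTP methods are left out of this model on purpose, since redirects for them may use a different status code.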


CloudFront itself does not provide a DNS service but some other DNS provider can be used together with CloudFront:
  • Route 53
  • Cloudflare
  • GoDaddy

We need to:
  • Define the domain
  • Create a certificate in AWS Certificate Manager (ACM) and associate it with CloudFront
  • Use the DNS provider to point a CNAME record at the CloudFront distribution.

CloudFront and Cloudflare


Both CloudFront and Cloudflare provide a CDN service. We might be using Cloudflare for its other proxy capabilities (e.g. injecting headers like X-Country-Code) and also as a DNS provider (it holds the DNS records that map hostnames to IP addresses or to other hostnames).

Here are some examples of setups where Cloudflare is used as the DNS provider for an underlying CloudFront distribution:

Example #1: CloudFront distribution targets entire S3 bucket


On AWS we have:
  • S3 bucket:
    • contains only content for the website's blog
    • Name: my-website-blog-bucket.s3.us-east-1.amazonaws.com
    • Example of the inner hierarchy: 
      • images/background.png
      • posts/1/text.txt
      • ...
  • CloudFront distribution:
    • e.g. We want to use CDN only for blog content - entire S3 bucket
    • Domain name: a330nfs2u0uf9xj.cloudfront.net
    • Alternate domain names: blog.mywebsite.com
    • Origin
      • Domain: my-website-blog-bucket.s3.us-east-1.amazonaws.com
      • Origin path: (empty)

On Cloudflare we have:
  • Website: mywebsite.com
    • DNS record:
      • Type: CNAME (maps a subdomain, in our case blog.mywebsite.com, to the domain a330nfs2u0uf9xj.cloudfront.net)
      • Name: blog (name of the subdomain of the website)
      • Target: a330nfs2u0uf9xj.cloudfront.net
      • Proxy status: DNS only (don't cache, only serve as DNS service)
      • TTL: Auto

This means that the URL https://blog.mywebsite.com/ is now mapped to a330nfs2u0uf9xj.cloudfront.net, which is mapped to my-website-blog-bucket.s3.us-east-1.amazonaws.com


Here is how the original request gets transformed:

https://blog.mywebsite.com/images/background.png

https://a330nfs2u0uf9xj.cloudfront.net/images/background.png

https://my-website-blog-bucket.s3.us-east-1.amazonaws.com/images/background.png


Example #2: CloudFront distribution targets one folder in S3 bucket


On AWS we have:
  • S3 bucket:
    • e.g. contains entire content of the website
    • Name: my-website-bucket.s3.us-east-1.amazonaws.com
    • Example of the inner hierarchy: 
      • blog/images/background.png
      • store/itemA/description.txt
      • ...
  • CloudFront distribution:
    • e.g. We want to use CDN only for blog content - one folder within S3 bucket
    • Domain name: a330nfs2u0uf9xj.cloudfront.net
    • Alternate domain names: blog.mywebsite.com
    • Origin:
      • Domain: my-website-bucket.s3.us-east-1.amazonaws.com
      • Origin path (optional, but we need to specify it here): a URL path that CloudFront appends to the origin domain name for origin requests. In our case that path is /blog. This means that when a viewer requests /path/to/file from the distribution, CloudFront requests my-website-bucket.s3.us-east-1.amazonaws.com/blog/path/to/file from the origin.

On Cloudflare we have:
  • Website: mywebsite.com
    • DNS record:
      • Type: CNAME (maps a subdomain, in our case blog.mywebsite.com, to the domain a330nfs2u0uf9xj.cloudfront.net)
      • Name: blog
      • Target: a330nfs2u0uf9xj.cloudfront.net
      • Proxy status: DNS only
      • TTL: Auto

This means that the URL https://blog.mywebsite.com/ is now mapped to a330nfs2u0uf9xj.cloudfront.net, which is mapped to my-website-bucket.s3.us-east-1.amazonaws.com/blog

Here is how the original request gets transformed:

https://blog.mywebsite.com/images/background.png

https://a330nfs2u0uf9xj.cloudfront.net/images/background.png

https://my-website-bucket.s3.us-east-1.amazonaws.com/blog/images/background.png
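The mapping in both examples follows one rule: the origin path (possibly empty) is put in front of the path the viewer requested. A minimal sketch of that rule, assuming the origin is reached over HTTPS and ignoring query strings; the function name is made up for illustration:

```python
from urllib.parse import urlsplit


def origin_url(viewer_url, origin_domain, origin_path=""):
    """Return the URL CloudFront would request from the origin for a viewer URL."""
    path = urlsplit(viewer_url).path or "/"
    # The origin path starts with "/" and has no trailing "/", e.g. "/blog".
    return f"https://{origin_domain}{origin_path.rstrip('/')}{path}"


request = "https://blog.mywebsite.com/images/background.png"

# Example #1: the distribution targets the entire bucket (empty origin path).
print(origin_url(request, "my-website-blog-bucket.s3.us-east-1.amazonaws.com"))

# Example #2: the distribution targets the /blog folder of a shared bucket.
print(origin_url(request, "my-website-bucket.s3.us-east-1.amazonaws.com", "/blog"))
```

The viewer-facing URL is the same in both cases; only the origin settings decide which object in which bucket ends up being served.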




Distribution States

  • Enabled
  • Disabled
    • Distribution is offline and cannot respond to requests
    • We can enable the distribution later to restore it
    • A distribution must be disabled before it can be deleted.
    • After we disable a distribution, it may take a few minutes for CloudFront to propagate the disabled status to all edge locations. During this period the distribution cannot be deleted, so the "Delete" button is disabled.
    • Once propagation completes, we can select the distribution and click Delete in the CloudFront console.
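The same disable, wait, delete sequence can be scripted. The sketch below assumes a boto3 CloudFront client is passed in (for example boto3.client("cloudfront")); the function name is made up for illustration. It relies on the client's get_distribution_config, update_distribution and delete_distribution calls and on the distribution_deployed waiter.

```python
import copy


def disable_and_delete(client, distribution_id):
    """Disable a distribution, wait for propagation, then delete it."""
    current = client.get_distribution_config(Id=distribution_id)
    config = copy.deepcopy(current["DistributionConfig"])

    if config["Enabled"]:
        config["Enabled"] = False
        # IfMatch must carry the ETag of the version being replaced.
        client.update_distribution(
            Id=distribution_id,
            DistributionConfig=config,
            IfMatch=current["ETag"],
        )

    # Blocks until the change has reached the edge locations
    # (the point at which the console re-enables the Delete button).
    client.get_waiter("distribution_deployed").wait(Id=distribution_id)

    # The ETag changes after an update, so fetch it again before deleting.
    latest = client.get_distribution_config(Id=distribution_id)
    client.delete_distribution(Id=distribution_id, IfMatch=latest["ETag"])
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before running it against a real account.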
