
Tuesday, 9 July 2024

Google Cloud storage options

 


Most applications need to store data, e.g. media to be streamed or sensor data from devices.
Different applications and workloads require different storage and database solutions.

Google Cloud has storage options for different data types:
  • structured
  • unstructured
  • transactional
  • relational

Google Cloud has five core storage products:
  • Cloud Storage (like AWS S3)
  • Cloud SQL
  • Spanner
  • Firestore
  • Bigtable



(1) Cloud Storage


Object Storage


Let's first define Object Storage.

Object storage is a computer data storage architecture that manages data as “objects” and not as:
  • a file and folder hierarchy (file storage) or 
  • as chunks of a disk (block storage)


These objects are stored in a packaged format which contains:
  • binary form of the actual data itself
  • relevant associated meta-data (such as date created, author, resource type, and permissions)
  • globally unique identifier. These unique keys are in the form of URLs, which means object storage interacts well with web technologies. 

Data commonly stored as objects include:
  • video
  • pictures
  • audio recordings


Cloud Storage:

  • Service that offers developers and IT organizations durable and highly available object storage
  • Google’s object storage product
  • Allows customers to store any amount of data, and to retrieve it as often as needed
  • Fully managed scalable service

Cloud Storage Uses


Cloud Storage has a wide variety of uses. A few examples include:
  • serving website content
  • storing data for archival and disaster recovery
  • distributing large data objects to end users via Direct Download
Its primary use is whenever binary large-object storage (also known as a “BLOB”) is needed for:
  • online content such as videos and photos
  • backup and archived data
  • storage of intermediate results in processing workflows

Buckets


Cloud Storage files are organized into buckets

A bucket needs:
  • globally unique name
  • specific geographic location for where it should be stored
    • An ideal location for a bucket is where latency is minimized. For example, if most of our users are in Europe, we probably want to pick a European location, so either a specific Google Cloud region in Europe, or else the EU multi-region
The storage objects offered by Cloud Storage are immutable, which means that we do not edit them, but instead a new version is created with every change made. Administrators have the option to either allow each new version to completely overwrite the older one, or to keep track of each change made to a particular object by enabling “versioning” within a bucket. 
  • With object versioning:
    • Cloud Storage will keep a detailed history of modifications (overwrites or deletes) of all objects contained in that bucket
    • We can list the archived versions of an object, restore an object to an older state, or permanently delete a version of an object, as needed
  • Without object versioning:
    •  by default new versions will always overwrite older versions

Access Control


In many cases, personally identifiable information may be contained in data objects, so controlling access to stored data is essential to ensuring security and privacy are maintained. Using IAM roles and, where needed, access control lists (ACLs), organizations can conform to security best practices, which require each user to have access and permissions to only the resources they need to do their jobs, and no more than that. 

There are a couple of options to control user access to objects and buckets:
  • For most purposes, IAM is sufficient. Roles are inherited from project to bucket to object.
  • If we need finer control, we can create access control lists. Each access control list consists of two pieces of information:
    • scope, which defines who can access and perform an action. This can be a specific user or group of users
    • permission, which defines what actions can be performed, like read or write
Because storing and retrieving large amounts of object data can quickly become expensive, Cloud Storage also offers lifecycle management policies
  • For example, we could tell Cloud Storage to delete objects older than 365 days; or to delete objects created before January 1, 2013; or to keep only the 3 most recent versions of each object in a bucket that has versioning enabled 
  • Having this control ensures that we’re not paying for more than we actually need
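As a hedged example (the bucket name and the 365-day rule are illustrative), a lifecycle policy can be written as JSON and applied with gsutil. Save a rule like this to lifecycle.json to delete objects older than 365 days:

{
  "rule": [
    { "action": { "type": "Delete" }, "condition": { "age": 365 } }
  ]
}

...and apply it to a bucket:

$ gsutil lifecycle set lifecycle.json gs://my-unique-bucket-name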

Storage classes and data transfer


There are four primary storage classes in Cloud storage:
  • Standard storage
    • considered best for frequently accessed or hot data
    • great for data that's stored for only brief periods of time
  • Nearline storage
    • Best for storing infrequently accessed data, like reading or modifying data on average once a month or less 
    • Examples may include data backups, long term multimedia content, or data archiving. 
  • Coldline storage
    • A low cost option for storing infrequently accessed data. 
    • However, compared to Nearline storage, Coldline storage is meant for reading or modifying data at most once every 90 days.
  • Archive storage
    • The lowest cost option used ideally for data archiving, online backup and disaster recovery
    • It's the best choice for data that we plan to access less than once a year, because it has higher costs for data access and operations, and a 365-day minimum storage duration

Characteristics that apply across all of these storage classes:
  • unlimited storage
  • no minimum object size requirement
  • worldwide accessibility and locations
  • low latency and high durability
  • a uniform experience which extends to security tools and API's
  • geo-redundancy if data is stored in a multi-region or dual-region. This means placing physical servers in geographically diverse data centers to protect against catastrophic events and natural disasters, and load-balancing traffic for optimal performance.

Autoclass


Cloud Storage also provides a feature called Autoclass, which automatically transitions objects to appropriate storage classes based on each object's access pattern. The feature:
  • moves data that is not accessed to colder storage classes to reduce storage costs
  • moves data that is accessed to Standard storage to optimize future accesses
Autoclass simplifies and automates cost savings for our Cloud Storage data.

Cloud Storage has no minimum fee because we pay only for what we use; prior provisioning of capacity isn't necessary.
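As a sketch (assuming a gcloud version that supports the --enable-autoclass flag; the bucket name is a placeholder), Autoclass can be enabled when a bucket is created:

$ gcloud storage buckets create gs://my-unique-bucket-name --location=EU --enable-autoclass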

Data Encryption


Cloud Storage always encrypts data on the server side before it's written to disk, at no additional charge. Data traveling between a customer's device and Google is encrypted by default using HTTPS/TLS (Transport Layer Security).


Data Transfer into Google Cloud Storage


Regardless of which storage class we choose, there are several ways to bring data into Cloud storage:

  • Online Transfer
    • by using gcloud storage, which is the Cloud Storage command from the Cloud SDK
    • by using the drag-and-drop option in the Cloud console, if accessed through the Google Chrome web browser
  • Storage transfer service
    • enables us to import large amounts of online data into Cloud storage quickly and cost effectively 
    •  if we have to upload terabytes or even petabytes of data 
    • Lets us schedule and manage batch transfers to cloud storage from:
      • another Cloud provider
      • a different cloud storage region
      • an HTTPS endpoint
  • Transfer Appliance
    • A rackable, high capacity storage server that we lease from Google Cloud
    • We connect it to our network, load it with data, and then ship it to an upload facility where the data is uploaded to cloud storage
    • We can transfer up to a petabyte of data on a single appliance
  • Moving data in internally, from other Google Cloud services, as Cloud Storage is tightly integrated with other Google Cloud products and services. For example, we can:
    • import and export tables to and from both BigQuery and Cloud SQL
    • store App Engine logs, files for backups, and objects used by App Engine applications like images
    • store instance startup scripts, Compute Engine images, and objects used by Compute Engine applications
We should consider using Cloud Storage if we need to store immutable blobs larger than 10 megabytes, such as large images or movies. This storage service provides petabytes of capacity with a maximum unit size of 5 terabytes per object. 
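For the Storage Transfer Service described above, a bucket-to-bucket transfer job can be created from the command line. This is a hedged sketch; it assumes the gcloud transfer command group is available in our gcloud installation, and the bucket names are placeholders:

$ gcloud transfer jobs create gs://source-bucket gs://destination-bucket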


Provisioning Cloud Storage Bucket


We can use e.g. Google Cloud console >> Activate Cloud Shell:









Then execute the following commands in it.

Create environment variables containing the location and bucket name:

$ export LOCATION=EU
$ export BUCKET_NAME=my-unique-bucket-name

or we can use the project ID as it is globally unique:

$ export BUCKET_NAME=$DEVSHELL_PROJECT_ID

To create a bucket with CLI:

$ gcloud storage buckets create -l $LOCATION gs://$BUCKET_NAME

We might be prompted to authorize execution of this command:


To download an item from a bucket to the local host:

$ gcloud storage cp gs://cloud-training/gcpfci/my-excellent-blog.png my-excellent-blog.png

To upload a file from a local host to the bucket:

$ gcloud storage cp my-excellent-blog.png gs://$BUCKET_NAME/my-excellent-blog.png

To modify the Access Control List of the object we just created so that it's readable by everyone:

$ gsutil acl ch -u allUsers:R gs://$BUCKET_NAME/my-excellent-blog.png



We can verify in the Google Cloud console that the bucket exists and contains the image:






(2) Cloud SQL


It offers fully managed relational databases as a service, including:
  • MySQL
  • PostgreSQL
  • SQL Server 

It’s designed to hand off mundane, but necessary and often time-consuming, tasks to Google, like 
  • applying patches and updates
  • managing backups
  • configuring replications

Cloud SQL:
  • Doesn't require any software installation or maintenance
  • Can scale up to 128 processor cores, 864 GB of RAM, and 64 TB of storage. 
  • Supports automatic replication scenarios, such as from:
    • Cloud SQL primary instance
    • External primary instance
    • External MySQL instances
  • Supports managed backups, so backed-up data is securely stored and accessible if a restore is required. The cost of an instance covers seven backups
  • Encrypts customer data when on Google’s internal networks and when stored in database tables, temporary files, and backups
  • Includes a network firewall, which controls network access to each database instance

Cloud SQL instances are accessible by other Google Cloud services, and even external services. 
  • Cloud SQL can be used with App Engine using standard drivers like Connector/J for Java or MySQLdb for Python. 
  • Compute Engine instances can be authorized to access Cloud SQL instances, and the Cloud SQL instance can be configured to be in the same zone as our virtual machine
  • Cloud SQL also supports other applications and tools, like:
    • SQL Workbench
    • Toad
    • other external applications using standard MySQL drivers
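For a quick interactive check, gcloud can open a MySQL session against an instance directly from Cloud Shell (a sketch; blog-db is a placeholder instance name). The command temporarily allowlists our client IP and then prompts for the user's password:

$ gcloud sql connect blog-db --user=root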

Provisioning Cloud SQL Instance





SQL >> Create Instance:



...and then choose values for following properties:
  • Database engine:
    • MySQL
    • PostgreSQL
    • SQL Server
  • Instance ID - arbitrary string e.g. blog-db
  • Root user password: arbitrary string (There's no need to obscure the password because we use mechanisms to connect that aren't open access to everyone)
  • Choose a Cloud SQL edition:
    • Edition type:
      • Enterprise
      • Enterprise Plus
    • Choose edition preset:
      • Sandbox
      • Development
      • Production
  • Choose region - This should be the same region and zone into which we launched the Cloud Compute VM instance. The best performance is achieved by placing the client and the database close to each other.
  • Choose zonal availability
    • Single zone - In case of outage, no failover. Not recommended for production.
    • Multiple zones (Highly available) - Automatic failover to another zone within your selected region. Recommended for production instances. Increases cost.
  • Select Primary zone



During DB creation:



Once DB instance is created:



DB has root user created:


Default networking:





 Now we can:
  • see its Public IP address (e.g. 35.204.71.237)
  • Add User Account
    • username
    • password
  • set Connections
    • Networking >> Add a Network
      • Choose between Private IP connection and a Public IP connection
      • set Name
      • Network: <external_IP_of_VM_Instance>/32 (if a Public IP connection is chosen, use the instance's external IP address)
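The same provisioning steps can also be scripted with gcloud. This is a hedged sketch; the instance name, database version, region, tier, passwords, user name, and IP are illustrative and should be adapted:

$ gcloud sql instances create blog-db --database-version=MYSQL_8_0 --region=europe-west4 --tier=db-n1-standard-1 --root-password=REPLACE_ME
$ gcloud sql users create blogdbuser --instance=blog-db --password=REPLACE_ME
$ gcloud sql instances patch blog-db --authorized-networks=<external_IP_of_VM_Instance>/32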

Adding a user:


After user is added:



Adding a new network:


After new network is added:





(3) Spanner


Spanner:
  • Fully managed relational database service that scales horizontally, is strongly consistent, and speaks SQL
  • Service that powers Google’s $80 billion business (Google’s own mission-critical applications and services)
  • Especially suited for applications that require:
    • SQL relational database management system with joins and secondary indexes
    • built-in high availability
    • strong global consistency
    • high numbers of input and output operations per second (tens of thousands of reads and writes per second or more)

The horizontal scaling approach, sometimes referred to as "scaling out," entails adding more machines to further distribute the load of the database and increase overall storage and/or processing power. [A Guide To Horizontal Vs Vertical Scaling | MongoDB]

We should consider using Cloud SQL or Spanner if we need full SQL support for an online transaction processing system. 

Cloud SQL provides up to 64 terabytes, depending on machine type, and Spanner provides petabytes. 

Cloud SQL is best for web frameworks and existing applications, like storing user credentials and customer orders. If Cloud SQL doesn’t fit our requirements because we need horizontal scalability, not just through read replicas, we should consider using Spanner. 

(4) Firestore


Firestore is a flexible, horizontally scalable, NoSQL cloud database for mobile, web, and server development. 

With Firestore, data is stored in documents and then organized into collections. Documents can contain complex nested objects in addition to subcollections. Each document contains a set of key-value pairs. For example, a document to represent a user has the keys for the firstname and lastname with the associated values. 

Firestore’s NoSQL queries can then be used to retrieve:
  • individual, specific documents or 
  • all the documents in a collection that match our query parameters
Queries can include multiple, chained filters and combine filtering and sorting options. They're also indexed by default, so query performance is proportional to the size of the result set, not the dataset. 

Firestore uses data synchronization to update data on any connected device. However, it's also designed to make simple, one-time fetch queries efficiently. It caches data that an app is actively using, so the app can write, read, listen to, and query data even if the device is offline. When the device comes back online, Firestore synchronizes any local changes back to Firestore. 

Firestore leverages Google Cloud’s powerful infrastructure: 
  • automatic multi-region data replication
  • strong consistency guarantees
  • atomic batch operations
  • real transaction support
We should consider Firestore if we need massive scaling and predictability together with real time query results and offline query support. This storage service provides terabytes of capacity with a maximum unit size of 1 megabyte per entity. Firestore is best for storing, syncing, and querying data for mobile and web apps. 


(5) Bigtable

Bigtable:
  • Google's NoSQL big data database service
  • The same database that powers many core Google services, including Search, Analytics, Maps, and Gmail
  • Designed to handle massive workloads at consistent low latency and high throughput, so it's a great choice for both operational and analytical applications, including Internet of Things, user analytics, and financial data analysis. 

When deciding which storage option is best, we should choose Bigtable if: 
  • We work with more than 1TB of semi-structured or structured data
  • Data is fast with high throughput, or it’s rapidly changing
  • We work with NoSQL data. This usually means transactions where strong relational semantics are not required
  • Data is a time-series or has natural semantic ordering
  • We work with big data, running asynchronous batch or synchronous real-time processing on the data
  • We are running machine learning algorithms on the data

Bigtable can interact with other Google Cloud services and third-party clients. 

Using APIs, data can be read from and written to Bigtable through a data service layer like:
  • Managed VMs
  • HBase REST Server
  • Java Server using the HBase client
Typically this is used to serve data to applications, dashboards, and data services. 

Data can also be streamed in through a variety of popular stream processing frameworks like:
  • Dataflow Streaming
  • Spark Streaming
  • Storm
And if streaming is not an option, data can also be read from and written to Bigtable through batch processes like:
  • Hadoop MapReduce
  • Dataflow
  • Spark
Often, summarized or newly calculated data is written back to Bigtable or to a downstream database.

We should consider using Bigtable if we need to store a large number of structured objects. Bigtable doesn’t support SQL queries, nor does it support multi-row transactions. This storage service provides petabytes of capacity with a maximum unit size of 10 megabytes per cell and 100 megabytes per row. Bigtable is best for analytical data with heavy read and write events, like AdTech, financial, or IoT data. 
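As a small illustration (a sketch using the cbt command-line tool; the project, instance, table, column family, and row key are placeholders), writing and reading a cell could look like this:

$ cbt -project my-project -instance my-bt-instance createtable sensor-data
$ cbt -project my-project -instance my-bt-instance createfamily sensor-data stats
$ cbt -project my-project -instance my-bt-instance set sensor-data device001#20240709 stats:temperature=23.5
$ cbt -project my-project -instance my-bt-instance read sensor-data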


--- 

BigQuery hasn’t been mentioned in this section because it sits on the edge between data storage and data processing. The usual reason to store data in BigQuery is so we can use its big data analysis and interactive querying capabilities, but it’s not purely a data storage product.


Monday, 8 July 2024

Google Cloud DNS and Cloud CDN

 


DNS (Domain Name System) is what translates internet hostnames to addresses.

8.8.8.8 is a free Google public DNS.


Cloud DNS
  • DNS service for internet hostnames and addresses of applications built in Google Cloud
  • managed DNS service (like AWS Route53)
  • runs on the same infrastructure as Google
  • has low latency
  • high availability
  • cost-effective way to make our applications and services available to our users
  • DNS information we publish is served from redundant locations around the world
  • programmable: we can publish and manage millions of DNS zones and records using the Cloud Console, the command-line interface, or the API. 
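For example (a sketch; the zone name, domain, and IP are placeholders, and gcloud dns record-sets create assumes a reasonably recent gcloud version - older versions use the record-sets transaction workflow), a public zone and an A record can be created like this:

$ gcloud dns managed-zones create my-zone --dns-name="example.com." --description="Example zone"
$ gcloud dns record-sets create www.example.com. --zone=my-zone --type=A --ttl=300 --rrdatas=203.0.113.10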


Edge caching refers to the use of caching servers to store content closer to end users. 


Cloud CDN

  • Google's global system of edge caches (like Amazon CloudFront)
  • Used to accelerate content delivery in our application
    • our customers will experience lower network latency
    • the origins of our content will experience reduced load
  • can be enabled with a single checkbox, after HTTP(S) Load Balancing is set up
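Outside of the console checkbox, Cloud CDN can also be enabled on an existing HTTP(S) load balancer backend service with gcloud (a sketch; the backend service name is a placeholder):

$ gcloud compute backend-services update my-backend-service --enable-cdn --global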

Some other CDNs are part of Google Cloud’s CDN Interconnect partner program, and we can continue to use them.

Thursday, 4 July 2024

Google Cloud Compute Engine

This article extends my notes from the Coursera course Google Cloud Fundamentals: Core Infrastructure | Coursera






Compute Engine

  • example of IaaS
  • like AWS EC2
  • with it, users can create and run virtual machines on Google infrastructure
  • no upfront investments
  • thousands of virtual CPUs can run on a system that’s designed to be fast and to offer consistent performance


Each virtual machine contains the power and functionality of a full-fledged operating system. This means a virtual machine can be configured much like a physical server, by specifying required:

  • amount of CPU power (virtual CPUs)
  • memory
  • amount and type of storage
  • operating system

We can use machine types which are:
  • predefined
  • custom, created by us
    • We pay for what we need with custom machine types.
    • We can in fact configure very large VMs, which are great for workloads such as in-memory databases and CPU-intensive analytics
    • most Google Cloud customers start off with scaling out (autoscaling - see below), not up
    • The maximum number of CPUs per VM is tied to its machine family and is also constrained by the quota available to the user, which is zone-dependent (see cloud.google.com/compute/docs/machine-types)


A virtual machine instance can be created via:

  • Google Cloud console, which is a web-based tool to manage Google Cloud projects and resources
  • Google Cloud CLI
  • Compute Engine API


The instance can run:

  • Linux and Windows Server images provided by Google or any customized versions of these images
  • images of other operating systems that we build


Cloud Marketplace

  • Offers software packages e.g. Bitnami LAMP stack (Bitnami package for LAMP – Marketplace – Google Cloud console)
  • A quick way to get started with Google Cloud
  • Offers solutions from both Google and third-party vendors
  • With these solutions, there’s no need to manually configure the software, virtual machine instances, storage, or network settings, although many of them can be modified before launch if that’s required
  • Most software packages are available at no additional charge beyond the normal usage fees for Google Cloud resources


Compute Engine’s pricing and billing structure:

  • use of virtual machines is billed by the second with a one-minute minimum
  • sustained-use discounts start to apply automatically to virtual machines the longer they run
    •  for each VM that runs for more than 25% of a month, Compute Engine automatically applies a discount for every additional minute
  • also offers committed-use discounts: for stable and predictable workloads, a specific amount of vCPUs and memory can be purchased with a discount in return for committing to a usage term of one year or three years. 
  • if we have a workload that doesn’t require a human to sit and wait for it to finish, such as a batch job analyzing a large dataset, we can save money, in some cases up to 90%, by choosing Preemptible or Spot VMs to run the job.
    • Preemptible or Spot VM is different from an ordinary Compute Engine VM in only one respect: Compute Engine has permission to terminate a job if its resources are needed elsewhere. Although savings are possible with preemptible or spot VMs, we'll need to ensure that our job can be stopped and restarted. 
    • Spot VMs differ from Preemptible VMs by offering more features
      • preemptible VMs can only run for up to 24 hours at a time
      • Spot VMs do not have a maximum runtime
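As a sketch (the VM name, zone, and machine type are illustrative), a Spot VM can be requested by setting the provisioning model at creation time:

$ gcloud compute instances create my-batch-vm --zone=europe-west4-a --machine-type=e2-medium --provisioning-model=SPOT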


Storage

  • high throughput between processing and persistent disks is the default


Autoscaling

  • Compute Engine feature
  • VMs can be added to or subtracted from an application based on load metrics

Load Balancing

  • balancing the incoming traffic among the VMs
  • Google’s Virtual Private Cloud (VPC) supports several different kinds of load balancing



Provisioning 



When we create a new Compute Engine VM Instance we can specify:

  • Name
  • Region
  • Zone
  • Machine type e.g. Series E2, N2 etc...
  • Boot disk image e.g. Debian GNU/Linux 11 (bullseye)
    • boot disk's architecture (x86/64 or ARM) must be compatible with the selected machine type
  • Identity and API access
    • Service Account: e.g. Compute Engine default service account
      • Requires the Service Account User role (roles/iam.serviceAccountUser) to be set for users who want to access VMs with this service account.
      • Applications running on the VM use the service account to call Google Cloud APIs. Use Permissions on the console menu to create a service account or use the default service account if available.
    • Access scopes: Select the type and level of API access to grant the VM. Default: read-only access to Storage and Service Management, write access to Stackdriver Logging and Monitoring, read/write access to Service Control.
      • Default access
      • Full access to Google APIs
      • Set access for each API
  • Firewall. By default all incoming traffic from outside a network is blocked. Select the type of network traffic you want to allow.  Add tags and firewall rules to allow specific network traffic from the Internet
    • Allow HTTP traffic
    • Allow HTTPS traffic
    • Allow Load Balancer Health Checks
  • Advanced options
    • Networking - Hostname and network interfaces
    • Disks - Additional disks
    • Security - Shielded VM and SSH keys
    • Management - Description, deletion protection, reservations, and automation
      • Automation  >> Startup script - You can choose to specify a startup script that will run when your instance boots up or restarts. Startup scripts can be used to install software and updates, and to ensure that services are running within the virtual machine.
A startup script can be used e.g. to set up a LAMP stack:

# Refresh package lists
apt-get update
# Install Apache and PHP (with MySQL support), then restart Apache
apt-get install apache2 php php-mysql -y
service apache2 restart
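The same VM can also be provisioned from the command line. A minimal sketch, where the VM name, zone, machine type, image, and the startup.sh file holding the script above are all illustrative:

$ gcloud compute instances create my-web-vm \
    --zone=europe-west4-a \
    --machine-type=e2-medium \
    --image-family=debian-11 --image-project=debian-cloud \
    --tags=http-server \
    --metadata-from-file=startup-script=startup.sh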




Once the instance is running, we're able to see the internal (e.g. 10.164.0.2) and external (e.g. 34.90.152.162) IP addresses assigned to it:




SSH-ing to VM:







References:


Google Cloud Fundamentals: Core Infrastructure | Coursera.


Disclaimer: The course content rights belong to course creators (Google Cloud Training).


Virtual Private Cloud networking in Google Cloud

This article extends my notes from Coursera course Google Cloud Fundamentals: Core Infrastructure | Coursera




Virtual Private Cloud networking

Virtual private cloud (VPC):
  • secure, individual, private cloud-computing model hosted within a public cloud (e.g. Google Cloud)
  • on it customers can run code, store data, host websites, and do anything else they could do in an ordinary private cloud, but this private cloud is hosted remotely by a public cloud provider
  • provides networking functionality to:
    • Compute Engine virtual machine (VM) instances
    • Kubernetes Engine containers
    • App Engine flexible environment
  • without it we cannot create VM instances, containers, or App Engine applications => each Google Cloud project has a default network to get us started
  • Users can define their own (custom) virtual private cloud inside their Google Cloud project or simply use default virtual private cloud
  • VPCs combine the scalability and convenience of public cloud computing with the data isolation of private cloud computing
  • VPC networks connect Google Cloud resources to each other and to the internet. This includes:
    • segmenting networks
    • using firewall rules to restrict access to instances
    • creating static routes to forward traffic to specific destinations
  • We can think of a VPC network as similar to a physical network, except that it is virtualized within Google Cloud
  • VPC network is a global resource that consists of a list of regional virtual subnetworks (subnets) in data centers, all connected by a global wide area network (WAN)
  • VPC networks are logically isolated from each other in Google Cloud
  • Each Google Cloud project has a default network with subnets, routes, and firewall rules. 
  • Without a VPC network:
    • there are no routes and no firewall rules
    • we cannot create a VM instance 

Regions have three or more zones. For example, the us-west1 region denotes a region on the west coast of the United States that has three zones: us-west1-a, us-west1-b, and us-west1-c. Resources that live in a zone, such as virtual machine instances or zonal persistent disks, are referred to as zonal resources. Other resources, like static external IP addresses, are regional. Regional resources can be used by any resource in that region, regardless of zone, while zonal resources can only be used by other resources in the same zone. [Regions and zones  |  Compute Engine Documentation  |  Google Cloud]

Google VPC networks are global (cross-regional, as opposed to AWS VPCs which are per-region). This architecture makes it easy to define network layouts with global scope. 
  • They can also have subnets, which are segmented pieces of the larger network, in any Google Cloud region worldwide.
    • Subnets can span the zones that make up a region (=> subnets are per-region, they can span multiple zones - data centres). Resources can even be in different zones on the same subnet. 
    • Each subnet/region gets an IP address range assigned. When an instance is created for our VPC network, it will be assigned an IP from the appropriate region’s address range.
    • The size of a subnet can be increased by expanding the range of IP addresses allocated to it, and doing so won’t affect virtual machines that are already configured
    • The default network has a subnet in each Google Cloud region
    • Each subnet is associated with a Google Cloud region and a private RFC 1918 CIDR block for its internal IP addresses range and a gateway
  • Example: VPC network named vpc1 has two subnets defined in the asia-east1 and us-east1 regions. If the VPC has three Compute Engine VMs attached to it, it means they’re neighbors on the same subnet even though they’re in different zones.
    • Subnet 10.0.0.0/24 is in the asia-east1 region and VM instances can be in different zones, although in the same subnet:
      • VM with IP 10.0.0.2 is in asia-east1-a
      • VM with IP 10.0.0.3 is in asia-east1-b
      • VM with IP 10.0.0.4 is in asia-east1-c
    • Subnet 10.0.1.0/24 is in the us-east1 region and VM instances can be in different zones, although in the same subnet:
      • VM with IP 10.0.1.2 is in us-east1-b
      • VM with IP 10.0.1.3 is in us-east1-c
      • VM with IP 10.0.1.4 is in us-east1-d
    •  This capability can be used to build solutions that are resilient to disruptions yet retain a simple network layout.

This image shows a similar VPC and subnets layout:


source: https://cloud.google.com/static/vpc/images/vpc-overview-example.svg



Subnets let us create our own private cloud topology within Google Cloud. For Subnet creation mode we can choose Automatic to create a subnet in each region, or Custom to manually define the subnets.

When we create a VPC, if for Subnet creation mode we choose Automatic, we'll get Auto mode VPC Network. Auto mode networks create subnets in each region automatically.

If we ever delete the default network, we can quickly re-create it by creating an auto mode network. 
Custom creation mode supports IPv4, or IPv4 and IPv6 (dual-stack). Automatic creation mode supports IPv4 (single-stack) only.
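As a sketch (the network name, subnet name, region, and IP range are placeholders), a custom-mode VPC with one subnet can be created with gcloud:

$ gcloud compute networks create my-vpc --subnet-mode=custom
$ gcloud compute networks subnets create my-subnet --network=my-vpc --region=us-east1 --range=10.0.1.0/24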

By default, when we create a VM instance in a VPC, its External IP address is ephemeral. If an instance is stopped, any ephemeral external IP addresses assigned to the instance are released back into the general Compute Engine pool and become available for use by other projects. When a stopped instance is started again, a new ephemeral external IP address is assigned to the instance. Alternatively, we can reserve a static external IP address, which assigns the address to our project indefinitely until we explicitly release it.
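To keep the same address across restarts, a static external IP can be reserved and then assigned to the instance (a sketch; the address name and region are illustrative):

$ gcloud compute addresses create my-static-ip --region=europe-west4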
 

Virtual Private Cloud compatibility features


Routing Tables


Routes tell VM instances and the VPC network how to send traffic from an instance to a destination, either inside the network or outside Google Cloud. 

Each VPC network comes with some default routes to route traffic among its subnets and send traffic from eligible instances to the internet.

There is a route for each subnet. These routes are managed for us, but we can create custom static routes to direct some packets to specific destinations. For example, we can create a route that sends all outbound traffic to an instance configured as a NAT gateway.

Much like physical networks, VPCs have routing tables. VPC routing tables are:
  • built-in so we don’t have to provision or manage a router
  • used to forward traffic from one instance to another within the same network, across subnetworks, or even between Google Cloud zones, without requiring an external IP address

Firewall


Firewall rules control incoming or outgoing traffic to an instance. By default, incoming traffic from outside our network is blocked.

VPCs provide a global distributed firewall:
  • we don’t have to provision or manage it
  • it can be controlled to restrict access to instances through both incoming and outgoing traffic
  • Firewall rules can be defined through network tags on Compute Engine instances
    • Example: we can tag all our web servers with, say, “WEB,” and write a firewall rule saying that traffic on ports 80 or 443 is allowed into all VMs with the “WEB” tag, no matter what their IP address happens to be
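A hedged gcloud sketch of the example above (the rule name is illustrative):

$ gcloud compute firewall-rules create allow-web --network=default --allow=tcp:80,tcp:443 --target-tags=WEB --source-ranges=0.0.0.0/0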
Each VPC network implements a distributed virtual firewall that we can configure. 
Firewall rules allow us to control which packets are allowed to travel to which destinations. 
Every VPC network has two implied firewall rules that block all incoming connections and allow all outgoing connections.

Firewall rules table has the following columns: Targets, Filters, Protocols/ports, and Action.

In GCP Console, in the left pane, if we click Firewall, we'll notice that there are 4 Ingress firewall rules for the default network:
  • default-allow-icmp
  • default-allow-rdp
  • default-allow-ssh
  • default-allow-internal
These firewall rules allow ICMP, RDP, and SSH ingress traffic from anywhere (0.0.0.0/0) and all TCP, UDP, and ICMP traffic within the network (e.g. 10.128.0.0/9). 

When creating a new VPC, in Firewall section we can select all available rules. These are the same standard firewall rules that the default network has. The deny-all-ingress and allow-all-egress rules are also displayed, but you cannot check or uncheck them because they are implied. These two rules have a lower Priority (higher integers indicate lower priorities) so that the allow ICMP, custom, RDP and SSH rules are considered first.

After recreating the default network, the allow-internal firewall rule changes to the allow-custom firewall rule. We can ping another VM instance's internal IP from within one VM instance (the other instance can even be in a different region!) because of the allow-custom firewall rule.

The *-allow-icmp firewall rule allows pinging a VM's external IP.

We can SSH to VM instance we create because of the allow-ssh firewall rule, which allows incoming traffic from anywhere (0.0.0.0/0) for tcp:22. The SSH connection works seamlessly because Compute Engine generates an SSH key for us and stores it in one of the following locations:
  • By default, Compute Engine adds the generated key to project or instance metadata.
  • If our account is configured to use OS Login, Compute Engine stores the generated key with our account.
Alternatively, we can control access to Linux instances by creating SSH keys and editing public SSH key metadata.

To test connectivity to VM's internal or external IP we can use:

ping -c 4 <IP>

VPC Peering


VPCs belong to Google Cloud projects. VPC Peering:
  • used when our company has several Google Cloud projects, and the VPCs need to talk to each other
  • with it, a relationship between two VPCs can be established to exchange traffic

Shared VPC


Uses the full power of Identity and Access Management (IAM) to control who and what in one project can interact with a VPC in another.


Connecting networks to Google VPC


There are several effective ways to connect the Google Virtual Private Cloud networks to other networks such as on-premises networks or networks in other clouds:
  • Cloud VPN can be used to create a “tunnel” connection over the internet
    • To make the connection dynamic, a Google Cloud feature called Cloud Router can be used. Cloud Router lets other networks and a Google VPC exchange route information over the VPN using the Border Gateway Protocol. Using this method, if we add a new subnet to our Google VPC, our on-premises network will automatically get routes to it.
    • Using the internet to connect networks isn't always the best option, either because of security concerns or because of bandwidth reliability
  • Direct Peering for “peering” with Google 
    • Peering means putting a router in the same public data center as a Google point of presence (PoP) and using it to exchange traffic between networks
    • Google has more than 100 points of presence around the world. 
  • Carrier Peering program - for customers who aren’t already in a point of presence
    • Gives direct access from on-premises network through a service provider's network to Google Workspace and to Google Cloud products that can be exposed through one or more public IP addresses
    • Downside: it isn’t covered by a Google Service Level Agreement
  • Dedicated Interconnect can be used if getting the highest uptimes for interconnection is important
    • Allows for one or more direct, private connections to Google
    • If these connections have topologies that meet Google’s specifications, they can be covered by an SLA of up to 99.99%.
    • These connections can be backed up by a VPN for even greater reliability. 
    • Service Level Agreement is available
  • Partner Interconnect provides connectivity between an on-premises network and a VPC network through a supported service provider. 
    • Useful if a data center is in a physical location that can't reach a Dedicated Interconnect colocation facility, or if the data needs don’t warrant an entire 10 Gbps connection.
    • Depending on availability needs, Partner Interconnect can be configured to support mission-critical services or applications that can tolerate some downtime. 
    • As with Dedicated Interconnect, if these connections have topologies that meet Google’s specifications, they can be covered by an SLA of up to 99.99%
  • Cross-Cloud Interconnect helps establish high-bandwidth dedicated connectivity between Google Cloud and another cloud service provider
    • Google provisions a dedicated physical connection between the Google network and that of another cloud service provider. We can use this connection to peer our Google Virtual Private Cloud network with our network that's hosted by a supported cloud service provider. 
    • Supports our adoption of an integrated multicloud strategy. In addition to supporting various cloud service providers, Cross-Cloud Interconnect offers reduced complexity, site-to-site data transfer, and encryption. 
    • Connections are available in two sizes: 10 Gbps or 100 Gbps.


References:



Disclaimer: The course content rights belong to course creators (Google Cloud Training).

Saturday, 25 May 2024

Google Cloud Fundamentals: Core Infrastructure - Introduction

This article extends my notes from Google Cloud Fundamentals: Core Infrastructure | Coursera




Introducing Google Cloud


Google Cloud offerings can be broadly categorized as:

  • compute
  • storage
  • big data
  • machine learning
  • application services

...for web, mobile, analytics, and backend solutions.

Cloud computing overview


Cloud computing is a way of using information technology (IT) that has these five equally important traits:
  • customers get computing resources that are on-demand and self-service.
    • Through a web interface, users get the processing power, storage, and network they need with no need for human intervention
  • customers get access to those resources over the internet
  • the cloud provider has a big pool of those resources and allocates them to users out of that pool.
    • That allows the provider to buy in bulk and pass the savings on to the customers. Customers don't have to know or care about the exact physical location of those resources.
  • resources are elastic–which means they’re flexible, so customers can scale up or down as needed
    • If they need more resources they can get more, and quickly. If they need less, they can scale back
  • customers pay only for what they use, or reserve as they go
    • If they stop using resources, they stop paying

A bit of history

  • first wave of the trend towards cloud computing: colocation. Colocation gave users the financial efficiency of renting physical space, instead of investing in (private) data center real estate
  • Second wave: Virtualized data centers of today. Servers, CPUs, disks, load balancers, and so on—are virtual devices. With virtualization, enterprises still maintain the infrastructure; but it also remains a user-controlled and user-configured environment. 
  • Third-wave cloud: Google switched to a container-based architecture, a fully automated, elastic third-wave cloud that consists of a combination of automated services and scalable data. Services automatically provision and configure the infrastructure used to run applications.
Google believes that, in the future, every company—regardless of size or industry— will differentiate itself from its competitors through technology. Increasingly, that technology will be in the form of software. Great software is based on high-quality data. This means that every company is, or will eventually become, a data company.


IaaS and PaaS

In virtualized data centres:

  • IaaS - Infrastructure as a Service
    • provides
      • Raw compute e.g. Compute Engine (which is like AWS EC2)
      • Storage
      • Network capabilities
    • organized virtually into resources that are similar to physical data centers
    • customers pay for the resources they allocate ahead of time
  • PaaS - Platform as a Service
    • lets us bind our application code to libraries that give access to the infrastructure our application needs
    • This allows more resources to be focused on application logic.
    • customers pay for the resources they actually use
  • Serverless 
    • yet another step in the evolution of cloud computing
    • allows developers to concentrate on their code, rather than on server configuration, by eliminating the need for any infrastructure management.
    • Google serverless: 
      • Cloud Functions - manages event driven code as a pay as you go service
      • Cloud Run - allows customers to deploy their containerized microservices based application, in a fully managed environment.
  • SaaS - Software as a Service
    • SaaS applications aren't installed on your local computer. Instead, they run in the cloud as a service and are consumed directly over the internet by end users
    • Google SaaS applications: Gmail, Docs and Drive

The Google Cloud network

  • highest possible throughput
  • lowest possible latencies
  • 100+ content caching nodes worldwide
  • 5 major geographic locations (5 continents)
    • multiple regions at each location e.g.  europe-west2 (London)
    • regions are composed of zones e.g.  europe-west2-a, europe-west2-b and europe-west2-c 
  • resources can be run in different regions and/or zones within the same region
    • this brings the application closer to the user (lowers the latency) - Google Cloud customers use resources in several regions around the world for this reason
    • redundancy (improves the availability)
    • improves fault tolerance - the primary benefit to a Google Cloud customer of using resources in several zones within a region
    • some GC resources can be run in multi-region e.g. eur3-Europe
  • currently: 118 zones in 39 regions

Environmental impact

  • data centers worldwide use roughly 2% of the world's electricity, so Google works to make its data centers run as efficiently as possible

Google Infrastructure Security

  • Hardware Infrastructure Layer
    • HW design 
    • custom chips, hardware security chips
    • secure boot stack - to ensure that servers are booting the correct software stack
      • cryptographic signatures over the bios, bootloader, kernel and OS
    • Physical security of data centers
  • Service Deployment Layer
    • Encryption of inter-service communication
    • Google services communicate with each other using remote procedure calls (RPCs)
      • all RPC traffic between and inside data centers is encrypted
  • User Identity Layer
    • User Identity
      • 2FA, U2F
  • Storage Services Layer
    • Encryption at rest
  • Internet Communication Layer
    • Google Front End (GFE)
    • Denial of Service (DoS) protection
  • Operational Security Layer
    • Intrusion detection
      • red team exercises
    • Reducing insider risk
    • Employee Universal Second Factor (U2F) use
    • Software Development practices
      • peer reviews
      • libraries
      • vulnerability rewards

Open source ecosystems

  • Tensorflow
  • Kubernetes and Google Kubernetes Engine give the ability to mix and match microservices running across different clouds
  • Operations Suite

Pricing and billing

  • per-second billing for IaaS offerings such as Compute Engine
  • Budgets
    • fixed limit
    • bound to some metric like usage in the previous month
  • Alerts
  • Reports
  • Quotas 
    • prevent over-consumption
    • applied at project level
    • Rate quotas
      • reset after a specific time
      • e.g. GKE allows 3000 calls to its API from each Google Cloud project every 100 seconds; after that time the limit is reset
    • Allocation quotas
      • governs number of resources in projects
      • e.g. each project can have max 15 virtual private cloud networks
    • increase on request to Google support team

Resources and Access in the Cloud

Google Cloud resource hierarchy


Functional structure of Google Cloud: 
  • Level 4 (top): organization node
    • encompasses all the projects, folders, and resources in your organization
  • Level 3: folders and subfolders
  • Level 2: projects
    • Projects can be organized into folders, or even subfolders.
  • Level 1 (bottom): resources
    • virtual machines, Cloud Storage buckets, tables in BigQuery, or anything else in Google Cloud
    • Resources are organized into projects
  • This resource hierarchy directly relates to how policies are managed and applied when you use Google Cloud. 
  • Policies can be defined at the project, folder, and organization node levels. 
  • Some Google Cloud services allow policies to be applied to individual resources, too. 
  • Policies are also inherited downward. This means that if you apply a policy to a folder, it will also apply to all of the projects within that folder. 


Projects
  • the basis for enabling and using Google Cloud services, like managing APIs, enabling billing, adding and removing collaborators, and enabling other Google services  
  • Each project is a separate entity under the organization node, and each resource belongs to exactly one project
  • Projects can have different owners and users because they’re billed and managed separately
  • Each Google Cloud project has three identifying attributes:
    • project ID
      • globally unique identifier assigned by Google that can’t be changed after creation (they are immutable)
      • can be modified by the customer during creation
      • used in different contexts to inform Google Cloud of the exact project to work with
    • project name
      • user-created
      • don’t have to be unique and they can be changed at any time, so they are not immutable
    • project number
      • assigned by Google Cloud to each project
      • unique
      • mainly used internally by Google Cloud to keep track of resources


Google Cloud’s Resource Manager tool
  • designed to programmatically help you manage projects
  • an API that can gather a list of all the projects associated with an account, create new projects, update existing projects, and delete projects
  • it can even recover projects that were previously deleted
  • can be accessed through the RPC API and the REST API
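The same project operations can also be driven from the command line with gcloud, which calls the Resource Manager API (a sketch; the project ID is a placeholder):

$ gcloud projects list
$ gcloud projects create my-sample-project-12345
$ gcloud projects delete my-sample-project-12345
$ gcloud projects undelete my-sample-project-12345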

Folders
  • let you assign policies to resources at a level of granularity you choose
  • folder can contain projects, other folders, or a combination of both
  • You can use folders to group projects under an organization in a hierarchy. 
    • For example, your organization might contain multiple departments, each with its own set of Google Cloud resources.
    • Folders allow you to group these resources on a per-department basis.
    • Folders also give teams the ability to delegate administrative rights so that they can work independently. 
  • resources in a folder inherit policies and permissions from that folder
    • For example, if you have two different projects that are administered by the same team, you can put policies into a common folder so they have the same permissions. Doing it the other way, putting duplicate copies of those policies on both projects, could be tedious and error-prone. Also, if you needed to change permissions on both resources, you would now have to do that in two places instead of just one.
Organization 
  • Topmost resource in the Google Cloud hierarchy
  • To use folders, we must have an organization node
  • Everything else attached to that account goes under this node, which includes folders, projects, and other resources. 
  • There are some special roles associated with this top-level organization node. 
    • For example, we can designate an organization policy administrator so that only people with privilege can change policies. 
    • We can also assign a project creator role, which is a great way to control who can create projects and, therefore, who can spend money. 
  • How a new organization node is created depends on whether our company is also a Google Workspace customer
    • If we have a Workspace domain, Google Cloud projects will automatically belong to our organization node. Otherwise, we can use Cloud Identity, Google’s identity, access, application, and endpoint management platform, to generate one. 
  • Once created, a new organization node will let anyone in the domain create projects and billing accounts, just as they could before.  We'll also be able to create folders underneath it and put projects into it. 
  • Both folders and projects are considered to be “children” of the organization node.

Identity and Access Management (IAM)

  • used by administrators to apply IAM policies that define who can do what and on which resources
  • “who” part of an IAM policy
    • identity 
    • can be:
      • Google account
      • Google group
      • service account
      • Cloud Identity domain
    • also called a principal. Each principal has its own identifier, usually an email address.
  • “can do what” part of an IAM policy is defined by a role
    • An IAM role is a collection of permissions.
    • When you grant a role to a principal, you grant all the permissions that the role contains.
      • For example, to manage virtual machine instances in a project, we must be able to create, delete, start, stop and change virtual machines. So these permissions are grouped into a role to make them easier to understand and easier to manage. 
    • When a principal is given a role on a specific element of the resource hierarchy, the resulting policy applies to both the chosen element and all the elements below it in the hierarchy. 
    • We can define deny rules that prevent certain principals from using certain permissions, regardless of the roles they're granted. This is because IAM always checks relevant deny policies before checking relevant allow policies. Deny policies, like Allow policies, are inherited through the resource hierarchy. 
    • There are three kinds of roles in IAM: 
      • Basic
        • Broad in scope
        • When applied to a Google Cloud project, they affect all resources in that project
        • Include:
          • Owner - project owners can:
            • access a resource
            • make changes to a resource
            • manage the associated roles and permissions and set up billing
          • Editor - Project editors can:
            • access a resource
            • make changes to a resource
          • Viewer - Project viewers:
            • can access resources
            • can’t make changes
          • Billing Administrator
            • can control the billing for a project
            • can't change the resources in the project
        • If several people are working together on a project that contains sensitive data, basic roles are probably too broad
      • Predefined
        • Assigns permissions that are more specifically tailored to meet the needs of typical job roles
        • Specific Google Cloud services offer sets of predefined roles, and they even define where those roles can be applied
          • Example: With Compute Engine (a Google Cloud product that offers virtual machines as a service) we can apply specific predefined roles such as instanceAdmin to Compute Engine resources in a given project, a given folder, or an entire organization. This then allows whoever has these roles to perform a specific set of predefined actions e.g.
            • compute.instances.delete
            • compute.instances.get
            • compute.instances.list
            • compute.instances.setMachineType
            • compute.instances.start
            • compute.instances.stop
      • Custom
        • Finest-grained
        • Used when we need to assign a role that has even more specific permissions
        • Least-privilege model in which each person in our organization is given the minimal amount of privilege needed to do their job
        • Example: we want to define an instanceOperator role to allow some users to stop and start Compute Engine virtual machines, but not reconfigure them:
          • compute.instances.get
          • compute.instances.list
          • compute.instances.start
          • compute.instances.stop
        • we need to manage the permissions that define the custom role we created. Because of this, some organizations decide they’d rather use the predefined roles
        • can only be applied to either the project level or organization level
        • can’t be applied to the folder level
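For the instanceOperator example above, a custom role could be created with gcloud roughly like this (a sketch; the project ID is a placeholder):

$ gcloud iam roles create instanceOperator \
    --project=my-project-id \
    --title="Instance Operator" \
    --permissions=compute.instances.get,compute.instances.list,compute.instances.start,compute.instances.stop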

Service accounts

  • Type of an identity so it gets granted a role (roles can be attached to it)
  • Used when we want to give permissions to a Compute Engine virtual machine, rather than to a person.
    • Example: we have an application running in a virtual machine that needs to store data in Cloud Storage, but we don’t want anyone on the internet to have access to that data–just that particular virtual machine. We can create a service account to authenticate that VM to Cloud Storage. 
  • Named with an email address, but instead of passwords they use cryptographic keys to access resources. 
  • If a service account has been granted Compute Engine’s Instance Admin role, this would allow an application running in a VM with that service account to create, modify, and delete other VMs.
  • Service accounts need to be managed: they are both identities and resources, so they can have IAM policies of their own attached to them
    • Example: 
      • Alice needs to manage which Google accounts can act as service accounts so she can have the editor role on a service account
      • Bob just needs to be able to view a list of service accounts so he can have the viewer role. 
      • This is just like granting roles for any other Google Cloud resource.
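A minimal sketch of creating a service account and granting it a role on a project (the service account name, project ID, and role are illustrative):

$ gcloud iam service-accounts create storage-writer --display-name="Storage writer for my VM"
$ gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:storage-writer@my-project-id.iam.gserviceaccount.com" \
    --role="roles/storage.objectAdmin"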

Cloud Identity


New Google Cloud customers can log into the Google Cloud Console with a Gmail account and then use Google Groups to collaborate with teammates who are in similar roles. This approach is:
  • easy to start with
  • can present challenges later because the team's identities are not centrally managed
    • if someone leaves the organization there's no easy way to immediately remove a user's access to the team's cloud resources
Cloud Identity tool:
  •  allows organizations to define policies and manage their users and groups using the Google Admin Console
  • Admins can log in and manage Google Cloud resources using the same user names and passwords they already used in existing Active Directory or LDAP systems.
  • when someone leaves an organization, an administrator can use the Google Admin Console to disable their account and remove them from groups
  • available in a free and a premium edition that provides capabilities to manage mobile devices.
    • free for Google Workspace customers


Interacting with Google Cloud


There are four ways to access and interact with Google Cloud:
  • Cloud Console - Google Cloud’s graphical user interface (GUI)
    • simple web-based interface
    • helps you deploy, scale, and diagnose production issues
    • you can easily find your resources, check their health, have full management control over them, and set budgets to control how much you spend on them
    • also provides a search facility to quickly find resources and connect to instances via SSH in the browser
    • similar to AWS Console
  • Cloud SDK and Cloud Shell
    • The Cloud SDK is a set of tools that you can use to manage resources and applications hosted on Google Cloud. When installed, all of the tools within the Cloud SDK are located under the bin directory. These tools are:
      • Google Cloud CLI (gcloud command) which provides the main command-line interface for Google Cloud products and services
      • bq, a command-line tool for BigQuery
    • Cloud Shell provides command-line access to cloud resources directly from a browser.
      • it is a Debian-based virtual machine with a persistent 5 gigabyte home directory, which makes it easy to manage Google Cloud projects and resources
      • With Cloud Shell, the Cloud SDK gcloud command and other utilities are always installed, available, up to date, and fully authenticated
  • APIs
    • The services that make up Google Cloud offer APIs so that code you write can control them. 
    • The Cloud Console includes a tool called the Google APIs Explorer that shows which APIs are available, and in which versions
    • Google provides Cloud Client libraries and Google API Client libraries in many popular languages (Java, Python, PHP, C#, Go, Node.js, Ruby, and C++)
  • Google Cloud App
    • can be used to start, stop, and use SSH to connect to Compute Engine instances and see logs from each instance
    • also lets you stop and start Cloud SQL instances
    • you can administer applications deployed on App Engine by viewing errors, rolling back deployments, and changing traffic splitting
    • provides up-to-date billing information for your projects and billing alerts for projects that are going over budget. You can set up customizable graphs showing key metrics such as CPU usage, network usage, requests per second, and server errors
    • offers alerts and incident management
    • you can download it at cloud.google.com/app

References:


Disclaimer:  The course content rights belong to course creators (Google Cloud).