My Public Notepad: AWS Lambda

Friday, 13 June 2025

Introduction to Serverless Framework

Serverless Framework is a tool designed to streamline the development and deployment of serverless applications, including functions and infrastructure, by abstracting away the need to manage servers.

We define the desired infrastructure in serverless.yml files and then deploy it by executing:

sls deploy

This command translates the serverless.yml file into a larger AWS CloudFormation template, which is filled in automatically with values from the YAML.

The sls deploy command in the Serverless Framework is effectively idempotent at the infrastructure level, but with important nuances:

How it works:

sls deploy packages our service and deploys it via AWS CloudFormation. CloudFormation itself is designed to be idempotent: if we deploy the same stack with the same configuration and code, AWS will detect no changes and will not modify our resources. If there are changes, only those changes are applied.

What this means:

Repeated runs of sls deploy with no changes will not create duplicate resources or apply unnecessary updates.

If we make changes (to code, configuration, or infrastructure), only the differences are deployed.

Side effects in Lambda code: While infrastructure deployment is idempotent, our Lambda functions themselves must be written to handle repeated invocations safely if we want end-to-end idempotency. The deployment command itself does not guarantee idempotency at the application logic level.

Limitations:

sls deploy function updates a single function without going through CloudFormation. It simply swaps out the function code, and it is also idempotent in the sense that re-uploading the same code does not cause issues.

If we use plugins or custom resources, their behavior may not always be idempotent unless explicitly designed that way.

To conclude:

sls deploy is idempotent for infrastructure: Re-running it with no changes is safe and does not cause duplicate resources or unintended side effects at the CloudFormation level.
Application-level idempotency is our responsibility: we must ensure that our Lambda functions and integrations handle repeated events if that is a requirement for our use case.
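
As a minimal sketch of application-level idempotency, the function below skips SNS records whose MessageId it has already handled. The names processedIds, handleMessage and processOnce are hypothetical, and the in-memory Set is only a stand-in: memory is not shared between Lambda execution environments, so a real service would need a durable store for the seen IDs.

```javascript
// Hypothetical sketch: de-duplicate SNS deliveries by MessageId.
// NOTE: this Set only lives as long as one warm execution environment;
// it stands in for a durable store of already-processed IDs.
const processedIds = new Set();

// Placeholder for the real side effect (e.g. a database write).
function handleMessage(message) {
  return `processed: ${message}`;
}

// Returns results only for records that have not been seen before.
function processOnce(event) {
  const results = [];
  for (const record of event.Records ?? []) {
    const id = record.Sns.MessageId;
    if (processedIds.has(id)) continue; // duplicate delivery: skip it
    processedIds.add(id);
    results.push(handleMessage(record.Sns.Message));
  }
  return results;
}
```

Calling processOnce twice with the same event performs the side effect only the first time; the second call returns an empty list.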

Serverless Yaml Configuration File

A serverless.yml file defines a serverless service. It is a good idea to break up a serverless project into multiple services, each defined by its own serverless.yml file, so that we don't end up with everything in one big infrastructure stack.

Example:

database, e.g. DynamoDB
REST API, e.g. one that handles a submitted web form and stores the data in DynamoDB
front-end website, e.g. a React app stored in an S3 bucket

Services can be deployed in multiple regions. (Multi-region architecture is supported)

serverless.yml example:

service: my-service

frameworkVersion: "3"

useDotenv: true

plugins:
  - serverless-plugin-log-subscription
  - serverless-dotenv-plugin

provider:
  name: aws
  runtime: nodejs14.x
  region: eu-east-1
  memorySize: 512
  timeout: 900
  deploymentBucket:
    name: my-serverless-deployments
  vpc:
    securityGroupIds:
      - "sg-0123cf34f6c6354cb"
    subnetIds:
      - "subnet-01a23493f9e755207"
      - "subnet-02b234dbd7d66d33c"
      - "subnet-03c234712e99ae1fb"
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - lambda:InvokeFunction
          Resource: arn:aws:lambda:eu-east-1:123456789099:function:my-database

package:
  patterns:
    - "out/**"
    - "utils.js"
    - "aws-sdk"

functions:
  my-function:
    handler: lambda.handler
    events:
      - schedule:
          name: "my-service-${opt:stage, self:provider.stage}"
          description: "Periodically run my-service lambdas"
          rate: rate(4 hours)
          inputTransformer:
            inputTemplate: '{"Records":[{"EventSource":"aws:rate","EventVersion":"1.0","EventSubscriptionArn":"arn:aws:sns:eu-east-1:{{accountId}}:ExampleTopic","Sns":{"Type":"Notification","MessageId":"95df01b4-1234-5678-9903-4c221d41eb5e","TopicArn":"arn:aws:sns:eu-east-1:123456789012:ExampleTopic","Subject":"example subject","Message":"example message","Timestamp":"1970-01-01T00:00:00.000Z","SignatureVersion":"1","Signature":"EXAMPLE","SigningCertUrl":"EXAMPLE","UnsubscribeUrl":"EXAMPLE","MessageAttributes":{"type":{"Type":"String","Value":"populate_unsyncronised"},"count":{"Type":"Number","Value":"400"}}}}]}'
      - sns:
          arn: arn:aws:sns:us-east-2:123456789099:trigger-my-service
      - http:

custom:
  dotenv:
    dotenvParser: env.loader.js
  logSubscription:
    enabled: true
    destinationArn: ${env:KINESIS_SUBSCRIPTION_STREAM}
    roleArn: ${env:KINESIS_SUBSCRIPTION_ROLE}
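
The inputTemplate of the schedule event in the example has the same shape as an SNS notification, presumably so that one handler can serve both the scheduled trigger and the sns trigger. Here is a minimal sketch of reading such a payload. The function name readJobs and the returned field names are illustrative assumptions, not taken from the original service.

```javascript
// Hypothetical helper: extract the job type and count from each SNS-shaped record.
function readJobs(event) {
  return (event.Records ?? []).map((record) => {
    const attrs = record.Sns.MessageAttributes ?? {};
    return {
      // "aws:rate" for the scheduled payload in the example config
      source: record.EventSource,
      type: attrs.type?.Value,
      // Attribute values arrive as strings, so convert the count explicitly
      count: attrs.count ? Number(attrs.count.Value) : undefined,
    };
  });
}
```

A handler could call readJobs(event) first and then branch on type, without caring which of the two triggers delivered the event.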

service: name of the service
useDotenv: boolean (true|false)
configValidationMode: error
frameworkVersion: e.g. "3"
provider:

    name: provider name e.g. aws
    runtime: e.g. nodejs18.x
    region: e.g. us-east-1
    memorySize: amount of memory, in MB, allocated to the Lambda function e.g. 1024. It is worth checking the actual memory usage and adjusting this value: downsizing can lower the costs!
    timeout: (number) e.g. 60 [seconds] - the maximum time, in seconds, that a function is allowed to run before the platform forcibly terminates it and returns a timeout error. This keeps a function from running indefinitely, which controls resource usage and prevents runaway executions; it matters most for functions that call external services or perform long-running tasks. AWS Lambda's own default is 3 seconds and its maximum is 900 seconds (15 minutes); when timeout is omitted, Serverless Framework applies its own default of 6 seconds.
    httpApi:
    apiGateway:

        minimumCompressionSize: 1024
        shouldStartNameWithService: true
        restApiId: ""
        restApiRootResourceId: ""

    stage: name of the environment e.g. production
    iamManagedPolicies: a list of ARNs of managed policies that will be attached to the Lambda execution role, e.g. a policy which allows access to S3 buckets
    lambdaHashingVersion
    environment: dictionary of environment variable names and values
    vpc:

        securityGroupIds: list
        subnetIds: typically a list of private subnets with a NAT gateway
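
Because the platform terminates a function once its timeout elapses, a long-running handler may want to stop early and report what is left. The sketch below relies on the Lambda context method getRemainingTimeInMillis(); the names processUntilDeadline and workItems and the 1000 ms safety margin are hypothetical.

```javascript
// Hypothetical sketch: process items until the remaining time drops
// below a safety margin, then return the unprocessed remainder.
function processUntilDeadline(workItems, context, safetyMarginMs = 1000) {
  const done = [];
  const remaining = [...workItems];
  while (
    remaining.length > 0 &&
    context.getRemainingTimeInMillis() > safetyMarginMs
  ) {
    // Replace this line with the real per-item work.
    done.push(remaining.shift());
  }
  return { done, remaining };
}
```

The remainder can then be handed to a later invocation instead of being lost to a timeout error.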

functions: a dictionary which defines the AWS Lambda functions that are deployed as part of this Serverless service.

    <function_name>: string, a logical name of the function (e.g., my-function). This name is used to reference the function within the Serverless Framework and in deployment outputs. The name of the provisioned Lambda function has the format <service_name>-<stage>-<function_name>. Each function entry under functions specifies:

        handler - the entry point of the Lambda function: it tells Serverless which file and exported function AWS Lambda executes when the function is invoked (e.g., src/fn/lambda.handler, which points to the handler export in the src/fn/lambda module)
        events - (optional, array) a list of events that trigger this function

        Some triggers:

            schedule: scheduled events, for periodic invocation (cron-like jobs)
            sns: for invocation via an AWS SNS topic
            HTTP endpoints
            S3 events
            msk: messages from a Kafka topic in an MSK cluster

        If the array is empty, the function has no event sources configured and will not be triggered automatically by any AWS event.
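
The naming rule and the handler setting described above can be illustrated with two small helpers. They are hypothetical and only mirror the conventions; they are not part of the Serverless Framework API, which does this work internally.

```javascript
// Build the provisioned function name: <service_name>-<stage>-<function_name>.
function deployedFunctionName(service, stage, functionName) {
  return `${service}-${stage}-${functionName}`;
}

// Split a handler string such as "src/fn/lambda.handler" into
// the module path and the name of the exported function.
function parseHandler(handler) {
  const lastDot = handler.lastIndexOf(".");
  if (lastDot <= 0 || lastDot === handler.length - 1) {
    throw new Error(`Invalid handler string: ${handler}`);
  }
  return {
    modulePath: handler.slice(0, lastDot),
    exportName: handler.slice(lastDot + 1),
  };
}
```

For example, parseHandler("src/fn/lambda.handler") yields the module path src/fn/lambda and the export name handler.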

plugins: a list of serverless plugins e.g.

    serverless-webpack
    serverless-esbuild
    serverless-offline [https://www.serverless.com/plugins/serverless-offline, https://github.com/dherault/serverless-offline]

        Emulates AWS Lambda and API Gateway. It starts an HTTP server that handles the request lifecycle the way API Gateway does and invokes the handlers.
        sls offline --help

    serverless-plugin-log-subscription
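
To make the serverless-offline description more concrete: a handler behind an HTTP event receives the request as an event object and returns a response object with statusCode, headers and body, and that is the contract an emulated API Gateway has to honor as well. Below is a minimal sketch assuming the proxy-style event shape; the function name hello and the name query parameter are hypothetical.

```javascript
// Hypothetical HTTP handler. In a real module it would be exported
// under the name referenced by the function's handler setting.
async function hello(event) {
  // queryStringParameters may be null when the request has no query string
  const name = event.queryStringParameters?.name ?? "world";
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: `Hello, ${name}!` }),
  };
}
```

Since the handler is an ordinary async function, it can also be called directly with a hand-written event object in unit tests.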

custom: section for plugin settings e.g. for esbuild, logSubscription, webpack etc.

example: serverless-plugin-log-subscription plugin has the settings:

logSubscription: {
   enabled: true,
   destinationArn: process.env.SUBSCRIPTION_STREAM,
   roleArn: process.env.SUBSCRIPTION_ROLE,
}

example: serverless-domain-manager - used to define stage-specific domains.

domains: {
   production: {
      url: "app.api.example.com",
      certificateArn: "arn:aws:acm:us-east-2:123456789012:certificate/a8f8f8e2-95fe-4934-abf2-19dc08138f1f",
   },
   staging: {
      url: "app.staging.example.com",
      certificateArn: "arn:aws:acm:us-east-2:123456789012:certificate/a32e9708-7aeb-495b-87b1-8532a2592eeb",
   },
   dev: {
      url: "",
      certificateArn: ""
   },
},
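
How the domains map is consumed is not shown here. One plausible use, sketched below under that assumption, is to look up the entry for the stage being deployed and treat an empty url as "no custom domain". The function name domainForStage is hypothetical, and the certificate ARNs are omitted for brevity.

```javascript
// Stage-specific domains, mirroring the structure of the config above.
const domains = {
  production: { url: "app.api.example.com" },
  staging: { url: "app.staging.example.com" },
  dev: { url: "" },
};

// Hypothetical lookup: returns the domain for a stage, or null when the
// stage is configured without a custom domain (empty url).
function domainForStage(stage) {
  if (!Object.prototype.hasOwnProperty.call(domains, stage)) {
    throw new Error(`No domain configured for stage "${stage}"`);
  }
  return domains[stage].url || null;
}
```

Failing loudly on an unknown stage helps catch typos in the stage name before a deployment goes to the wrong place.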

Introduction to AWS Lambda

AWS Lambda falls under the Compute category of AWS services (among which are EC2, EBS and Elastic Load Balancing).

We only need to provide the code that should run. Servers are provided automatically, so we don't need to provision or manage them.

The AWS Lambda platform scales automatically with the workload, in response to each trigger it receives.

We are charged only for the time that our code is running, billed at 1 ms granularity.
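
The duration part of the bill can be estimated with a little arithmetic: billed time is rounded up to the next millisecond and multiplied by the configured memory, giving GB-seconds. The sketch below takes the price per GB-second as a parameter instead of hard-coding it, because the rate depends on region and architecture; the function name durationCost is hypothetical, and per-request charges are left out.

```javascript
// Hypothetical estimate of the duration charge for one invocation.
// pricePerGbSecond must be looked up for the region and architecture in use.
function durationCost(durationMs, memoryMb, pricePerGbSecond) {
  const billedMs = Math.ceil(durationMs); // 1 ms billing granularity
  const gbSeconds = (billedMs / 1000) * (memoryMb / 1024);
  return gbSeconds * pricePerGbSecond;
}
```

Halving memoryMb halves the estimate for the same duration, which is why right-sizing memory can lower costs as long as the function does not slow down as a result.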

Lambda can run almost any type of application or backend service. It supports many programming languages, such as C++, C#, Java, JavaScript, Python and Go.

It runs code (functions) in response to events received from other applications or AWS services. These events are effectively requests to AWS Lambda. Requests are handled by containers that run the code written to serve them. If the number of requests grows, so does the number of containers spawned and assigned to this Lambda function. If the number of requests decreases, fewer containers are used.

Use Case Example: Processing images uploaded to S3

image is uploaded to S3 bucket
this triggers AWS Lambda
the Lambda function processes the image and formats it into a thumbnail adjusted for the device it will be shown on (mobile, tablet, PC)
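
The first step of such a function is to find out which object was uploaded. Here is a minimal sketch that reads the bucket and key from an S3 event notification and derives one thumbnail key per device class. The function names, the thumbnails/ prefix and the pixel widths are hypothetical, and the actual resizing (which needs an image library) is left out.

```javascript
// Read bucket name and object key from each record of an S3 event.
function uploadedObjects(event) {
  return (event.Records ?? []).map((record) => ({
    bucket: record.s3.bucket.name,
    // Keys arrive URL-encoded, with spaces encoded as "+"
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
  }));
}

// Hypothetical target widths per device class; the use case above
// names the device classes but does not specify sizes.
const THUMBNAIL_WIDTHS = { mobile: 320, tablet: 768, pc: 1280 };

// Derive one destination key per device class for a given source key.
function thumbnailKeys(key) {
  return Object.keys(THUMBNAIL_WIDTHS).map(
    (device) => `thumbnails/${device}/${key}`
  );
}
```

Writing thumbnails under a separate prefix (or to a separate bucket) matters in practice: otherwise each thumbnail written back could trigger the function again.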

Use Case Example: Extracting trending social media hashtags

social media data, e.g. hashtags, is added to Amazon Kinesis (a streaming data processing platform)
this triggers AWS Lambda
data is stored in DBs for further processing
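
A sketch of the Lambda side of this pipeline: Kinesis delivers record payloads base64-encoded, so the function decodes each one and counts the hashtags it contains. The function name countHashtags and the assumption that each payload is plain UTF-8 text are illustrative; storing the counts in a database is left out.

```javascript
// Count hashtags across all records of a Kinesis event.
// Assumes each record's payload is plain UTF-8 text.
function countHashtags(event) {
  const counts = {};
  for (const record of event.Records ?? []) {
    const text = Buffer.from(record.kinesis.data, "base64").toString("utf8");
    for (const tag of text.match(/#\w+/g) ?? []) {
      const key = tag.toLowerCase(); // treat #AWS and #aws as the same tag
      counts[key] = (counts[key] ?? 0) + 1;
    }
  }
  return counts;
}
```

Each invocation only sees one batch of records, so trending totals would have to be accumulated in the database, not in the function.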

Use Case Example: A near real-time data backup system

the goal is to save a copy of a document in a temporary storage system as soon as it's uploaded to the server
create two S3 buckets: one where data is uploaded and another one for storing its copy
to allow these buckets to talk to each other, we need to set up Identity and Access Management (IAM) roles and policies
the code which copies data between the buckets lives in a Lambda function
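
A minimal sketch of the copying logic, assuming the function is triggered by S3 upload events: for each uploaded object it builds the parameters of an S3 CopyObject request (Bucket, Key, CopySource). The function name buildCopyParams and the bucket names in the usage note are hypothetical, and the SDK call that would send the request is omitted so that the snippet has no dependencies.

```javascript
// Build S3 CopyObject parameters for every object in an S3 event.
// backupBucket is the (hypothetical) destination bucket name.
function buildCopyParams(event, backupBucket) {
  return (event.Records ?? []).map((record) => {
    const sourceBucket = record.s3.bucket.name;
    // Event keys are URL-encoded, with spaces encoded as "+"
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));
    // CopySource is "<bucket>/<key>" with the key URL-encoded per path segment
    const encodedKey = key.split("/").map(encodeURIComponent).join("/");
    return {
      Bucket: backupBucket,
      Key: key,
      CopySource: `${sourceBucket}/${encodedKey}`,
    };
  });
}
```

For an upload of docs/annual report.pdf to a bucket named uploads, buildCopyParams(event, "backup") produces a request that copies it to the same key in backup. The function's IAM role needs read access to the source bucket and write access to the destination bucket.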