2. Who Am I?
● Senior Software Engineer @ OpsGenie
● Co-organizer of Serverless Meetup Turkey
● Oracle Open-Source Contributor
● PhD. Cand. @ METU Computer Eng.
● 8+ years in software development
● Hard-core JVM ninja
● Actively working on Serverless and AWS Lambda
● Part-time Big Data researcher
● Building a new product Thundra
6. “If your PaaS can efficiently start
instances in 20 ms that run for half
a second, then call it serverless.”
Adrian Cockcroft, VP Cloud Architecture Strategy at AWS
7. What is AWS Lambda?
- AWS’s FaaS (Function as a Service)
- Run code without provisioning or managing servers
- Supported invocation types:
- Request/response (sync)
- Event driven (async)
- Supported languages:
- Go
- C#
- Java
- Node.js
- Python
9. Why AWS Lambda?
- PAYG - pay as you go
- Highly available
- Scale fast
- Horizontally
- Vertically
- Don’t manage servers
- Built-in integration with other AWS services
- Security
11. The Devil is in the Detail
- Container reuse
- Container freeze
- More memory => More CPU
- One execution per container at any time
- At-least-once delivery guarantee
- New container for
- new deployments
- configuration updates
- even for environment variable updates
- Container is destroyed on timeout
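Container reuse can be observed from inside a function. The following is a minimal sketch, assuming a Python runtime; the `handler` name and the returned fields are illustrative and not taken from the slides. Module-level state is set up once per container, so it is still there when a later invocation lands on the same container.

```python
import time

# Module-level code runs once per container, i.e. on a cold start.
# The values below persist while the container is reused.
_container_started_at = time.time()
_invocation_count = 0


def handler(event, context):
    """Report whether this invocation hit a fresh or a reused container."""
    global _invocation_count
    _invocation_count += 1
    return {
        "cold_start": _invocation_count == 1,
        "invocation_count": _invocation_count,
        "container_age_seconds": time.time() - _container_started_at,
    }
```

Since delivery is at least once and containers can be replaced at any time, state kept this way is best treated as a cache, not as a source of truth.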
12. How to Limit Container?
- cgroup (Control Group)
- Engineers at Google started the work in 2006
- Merged into Linux kernel in January 2008
- Can limit
- CPU
- Memory
- Disk bandwidth
- Network bandwidth
13. CPU Throttling
- cgcreate
- cgcreate -g cpu:/cg1
- cgcreate -g cpu:/cg2
- cpu.cfs_quota_us / cpu.cfs_period_us
- how to configure cg1 to run 0.2 seconds out of every 1 second?
- cgset -r cpu.cfs_quota_us=200000 -r cpu.cfs_period_us=1000000 cg1
- cpu.shares
- how to configure 1:2 CPU usage ratio between cg1 and cg2?
- cgset -r cpu.shares=512 cg1
- cgset -r cpu.shares=1024 cg2
- why is it not used by AWS Lambda?
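The arithmetic behind the two settings can be sketched as follows. This is only a worked calculation, not a cgroup API; the function names are made up for illustration. A quota/period pair gives a hard ceiling on CPU time, while shares only define relative weights between groups that compete for the same CPU.

```python
def cpu_fraction(quota_us, period_us):
    """Hard cap: fraction of one CPU the group may use in each period."""
    return quota_us / period_us


def share_split(shares):
    """Relative split between groups when all of them are busy."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# cg1 may run 0.2 seconds out of every 1 second.
print(cpu_fraction(200000, 1000000))

# cg1 and cg2 split the CPU 1:2 under contention.
print(share_split({"cg1": 512, "cg2": 1024}))
```

Shares only take effect under contention: an otherwise idle machine lets a low-share group use the whole CPU, whereas a quota is enforced regardless of load.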
17. Resource Limits
Max execution time: 5 minutes
Memory allocation range: 128 MB - 3 GB
Ephemeral disk capacity ("/tmp" space): 512 MB
Invoke request body payload size (sync invocation): 6 MB
Invoke request body payload size (async invocation): 128 KB
Number of file descriptors: 1024
Number of processes and threads (combined total): 1024
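A caller can check the payload limits before invoking. The sketch below assumes the payload is sent as UTF-8 encoded JSON and that the limit applies to that serialized size; `payload_fits` is a hypothetical helper, not part of any AWS SDK.

```python
import json

SYNC_LIMIT_BYTES = 6 * 1024 * 1024   # 6 MB for request/response invocations
ASYNC_LIMIT_BYTES = 128 * 1024       # 128 KB for event (async) invocations


def payload_fits(payload, asynchronous=False):
    """Return True if the JSON-serialized payload is within the limit."""
    size = len(json.dumps(payload).encode("utf-8"))
    limit = ASYNC_LIMIT_BYTES if asynchronous else SYNC_LIMIT_BYTES
    return size <= limit
```

Payloads that do not fit are commonly uploaded to S3 first, with only the object key passed in the invocation.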
18. Deployment Limits
Function deployment package size (compressed .zip/.jar file): 50 MB
Size of code/dependencies that you can zip into a deployment package (uncompressed .zip/.jar size): 250 MB
Total size of all the deployment packages per region: 75 GB
Total size of environment variables set: 4 KB
19. Execution Limits
- Account level concurrent execution limit is 1000
- It is per region
- It is a soft limit
- Function level concurrent execution limit
- It is reserved
- the value is deducted from the unreserved concurrency pool
- ENI Limit is 350
- It applies to Lambda functions in a VPC
- It is per region
- It is a soft limit
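How reserved concurrency eats into the shared pool can be shown with a small calculation. The function and the example function names below are hypothetical; the account limit of 1000 is the default from the slide.

```python
def unreserved_concurrency(account_limit, reserved_by_function):
    """Concurrency left for functions without a reserved value."""
    reserved_total = sum(reserved_by_function.values())
    if reserved_total > account_limit:
        raise ValueError("reserved concurrency exceeds the account limit")
    return account_limit - reserved_total


# Two functions reserve 100 and 250 out of the default 1000.
print(unreserved_concurrency(1000, {"checkout": 100, "thumbnails": 250}))
```

Every function without a reserved value then competes for the remaining pool, so one busy function can still throttle the others that share it.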
21. Writing Logs
- Logs are written to CloudWatch asynchronously
- Log group per function
- /aws/lambda/my-func
- Log stream per container under log group
- 2018/01/27/[$LATEST]f95da1aaf0384ed6ad642d8299f7503d
- How to log
- Standard output/error
- Lambda API
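The naming scheme above is regular enough to build and parse by hand. The sketch below uses only the two examples from this slide as its model; the helper names and the assumption that the trailing id is lowercase hex are mine.

```python
import re

# Matches e.g. 2018/01/27/[$LATEST]f95da1aaf0384ed6ad642d8299f7503d
_STREAM_PATTERN = re.compile(
    r"(\d{4})/(\d{2})/(\d{2})/\[([^\]]+)\]([0-9a-f]+)"
)


def log_group_name(function_name):
    """Log group that holds all streams of one function."""
    return "/aws/lambda/" + function_name


def parse_log_stream_name(name):
    """Split a log stream name into date, function version and container id."""
    match = _STREAM_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError("unexpected log stream name: " + name)
    year, month, day, version, container_id = match.groups()
    return {
        "date": f"{year}-{month}-{day}",
        "version": version,
        "container_id": container_id,
    }
```

Grouping log lines by the parsed container id is a cheap way to see how many containers served a function on a given day.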
23. Collecting Logs
- Subscribe to CloudWatch log groups
- Only one subscription per log group
- Filter by pattern
- Stream to AWS Lambda
- Stream to AWS Elasticsearch
25. Environment Variables
- No limit to the number of env. variables
- Max total size is 4 KB
- Must start with letters [a-zA-Z]
- Can only contain alphanumeric char. and “_” [a-zA-Z0-9_]
- KMS
- Encrypt at rest (default)
- Encrypt in transit
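The naming and size rules can be checked before a deployment. This is a sketch under one stated assumption: the total size is taken as the sum of the UTF-8 bytes of all names and values, which may differ from how AWS counts it. `validate_environment` is a hypothetical helper.

```python
import re

# Must start with a letter, then letters, digits or underscore.
NAME_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")
MAX_TOTAL_BYTES = 4 * 1024  # 4 KB for all variables together


def validate_environment(variables):
    """Return a list of problems found; an empty list means no problem."""
    problems = []
    total = 0
    for name, value in variables.items():
        if NAME_PATTERN.fullmatch(name) is None:
            problems.append("invalid name: " + name)
        # Assumption: size is counted as name bytes plus value bytes.
        total += len(name.encode("utf-8")) + len(value.encode("utf-8"))
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds {MAX_TOTAL_BYTES}")
    return problems
```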
26. SSM
- Centralized config management
- share between functions
- update once
- Fine-grained access to sensitive data via IAM
- Integrates with KMS out-of-the-box
- Records a history of changes
28. VPC
- Define/select VPC and configure
- Subnets (recommended one subnet in each AZ)
- Security Groups
- To be able to access the internet
- NAT Gateway
- Internet Gateway
- Route table configuration
- Be aware of ENI limit (default 350)
- Make sure that the subnets have enough free IP addresses for the ENIs
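A rough capacity check for the chosen subnets can be done with the standard library. The sketch assumes the subnets are otherwise empty and relies on AWS reserving five addresses in every subnet; the CIDR blocks and function names are examples only.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # addresses AWS keeps for itself in each subnet


def usable_ip_count(cidr_blocks):
    """Addresses available for ENIs across the given subnets."""
    total = 0
    for cidr in cidr_blocks:
        network = ipaddress.ip_network(cidr)
        total += max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)
    return total


def has_room_for_enis(cidr_blocks, eni_count):
    """True if the subnets could hold eni_count ENIs (one IP each)."""
    return usable_ip_count(cidr_blocks) >= eni_count


# Two /24 subnets, checked against the default ENI limit of 350.
print(has_room_for_enis(["10.0.1.0/24", "10.0.2.0/24"], 350))
```

Addresses already taken by other resources in the same subnets are not accounted for here, so the real headroom is smaller.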
29. Role
- Each Lambda function has an associated IAM role
- For accessing AWS resources
- grant the role the permissions that your Lambda function needs
- e.g. permission for Lambda to put an item into a DynamoDB table
- For non-stream based event sources
- grant the event source permission to invoke the function
- e.g. permission for an S3 bucket to invoke Lambda on upload
- For stream based event sources
- grant AWS Lambda permissions for the relevant stream actions
- e.g. permission for Lambda to read the Kinesis stream records it is invoked with
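The DynamoDB example can be written down as two IAM policy documents, built here as plain Python dictionaries. This is a sketch: the table ARN, account id and function names are placeholders, and a real role usually also needs permissions for writing logs.

```python
import json


def put_item_policy(table_arn):
    """Permission policy: allow putting items into one DynamoDB table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:PutItem"],
                "Resource": table_arn,
            }
        ],
    }


def lambda_trust_policy():
    """Trust policy: let the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder ARN, for illustration only.
arn = "arn:aws:dynamodb:eu-west-1:123456789012:table/orders"
print(json.dumps(put_item_policy(arn), indent=2))
```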
30. Others
- Inbound connections are blocked
- For outbound connections only TCP/IP sockets are supported
- “ptrace” (debugging) system calls are blocked
- TCP port 25 is also blocked as an anti-spam measure
32. Retries
- For sync invocations (Lambda API call, …)
- client is responsible for retries
- For async invocations
- Non-Stream based events (S3, SNS, CloudWatch, …)
- retry a few times (2 or more) with delays
- If it still fails, put it into the DLQ (if specified)
- Stream based events (Kinesis , DynamoDB streams)
- retry until succeeded or
- retry until data expires
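For sync invocations the retry loop lives in the client. A minimal sketch with exponential backoff follows; `invoke` stands for any callable that performs the actual call, and the attempt count and delays are arbitrary choices, not values prescribed by AWS.

```python
import time


def invoke_with_retries(invoke, max_attempts=3, base_delay_seconds=0.5,
                        sleep=time.sleep):
    """Call invoke() and retry on failure with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts, surface the last error
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay_seconds * 2 ** (attempt - 1))
```

Retried calls can run the same request more than once, so the function behind `invoke` should be idempotent.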
33. DLQ - Dead Letter Queue
- Can be
- SNS topic
- SQS queue
- Requests are redirected if the invocation is
- Asynchronous and
- Event source is non-stream based (S3, SNS, …)
- Requires permission to access the DLQ resource
- Monitor “DeadLetterErrors” metrics
37. CloudWatch Metrics
- The following metrics are supported on a per-function basis:
- Invocations
- Errors
- Duration
- Dead Letter Errors
- Throttles
- Iterator Age
- The following metrics are supported across all functions:
- Concurrent Executions
- Unreserved Concurrent Executions
38. Distributed Tracing with AWS X-Ray
- Shows durations, responses and errors
- Segment for Lambda invocation
- Sub-Segments for
- initialization
- calls to external services
- custom ones
- Custom properties
- can be indexed and queried as “Annotations”
- can be stored as raw, unindexed data in “Metadata”
41. API Logging with CloudTrail
- CloudTrail can log
- function definition/configuration CRUD
- function invocations
- log entry contains information about
- who generated the request
- the requested action
- the action parameters
- ...
- CloudTrail logs can be published to
- S3
- SNS
42. Full Observability with Thundra
- Provides three pillars of observability:
- Trace
- Metric
- Log
- Zero overhead with async data publishing
- Has automated instrumentation and profiling support
- Integrated with AWS X-Ray
- www.thundra.io
46. Creating Alarm
- Create alarm on CloudWatch by metrics
- Following metrics are supported per function basis:
- Duration
- Errors
- Invocations
- Throttles
- Following metrics are supported across all functions:
- Concurrent Executions
- Unreserved Concurrent Executions
- Notify through SNS
- E-Mail
- Lambda
- ...
48. Writing Test
- Unit Test
- do our objects work as expected on their own?
- Integration Test
- do our objects work well together?
- Functional Test
- does the whole system work from end to end?
- Local Lambda development
- SAM Local
- LocalStack
- Cloud9
49. SAM Local
- Works with a SAM template
- Simulates some AWS service events (not services)
- S3, Kinesis, DynamoDB, CloudWatch, Scheduled Event, API GW
- Runs API Gateway locally
- Allows local debugging
- https://github.com/awslabs/aws-sam-local
50. LocalStack
- Spins up many core cloud APIs on your local machine
- Lambda, API GW, DynamoDB, Kinesis, Firehose, S3, SNS, SQS, ...
- Supports error injection
- ProvisionedThroughputExceededException, ...
- Can be run in Docker
- Integrated with some test frameworks
- JUnit for Java
- nosetests for Python
- https://github.com/localstack/localstack
53. Versioning
- Each deploy/upload is a new version
- Aliases map to versions
- There is an N:1 relation
- An alias maps to only one version
- A version can be mapped by multiple aliases
- By default, the latest version (“$LATEST”) is invoked
- Shift traffic using aliases with weighted versions
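What weighted traffic shifting means for individual invocations can be simulated locally. The sketch below only mimics the effect with a weighted random choice; it is not how AWS Lambda implements routing, and the version names and weights are made up.

```python
import random


def pick_version(weights, rng=random):
    """Pick a version for one invocation according to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


# Send roughly 10% of the traffic to the new version "2".
rng = random.Random(0)
picks = [pick_version({"1": 0.9, "2": 0.1}, rng) for _ in range(10000)]
print(picks.count("2") / len(picks))
```

Raising the weight of the new version step by step while watching the error metrics gives a simple canary rollout.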
64. What Affects Cold Start?
- Depends on language
- Java and C# have more cold start overhead
- Depends on code size
- Smaller artifact size = shorter cold start (but not significantly)
- Depends on memory size
- More memory = shorter cold start
- Depends on network configuration
- VPC has more cold start overhead (because of ENI)
- SSL handshake adds cold start overhead
- Depends on application and 3rd party libs
65. Cold start times by language + memory
read.acloud.guru/does-coding-language-memory-or-package-size-affect-cold-starts-of-aws-lambda-a15e26d12c76
66. Response times by language
read.acloud.guru/does-coding-language-memory-or-package-size-affect-cold-starts-of-aws-lambda-a15e26d12c76
Charts: average response time, maximum response time
67. Cold Start on JVM
- Loading and initializing
- Application classes
- Core JDK classes
- Security (SSL, encryption, …) related JDK classes
- Initializing 3rd party libraries/frameworks
- AWS SDK
- Spring, Jackson, ...
68. How to Startup Faster on JVM? [1]
- Enable CDS (Class Data Sharing)
- -Xshare:on
- Already enabled by AWS Lambda
- Enable AppCDS (Application Class Data Sharing)
- -XX:+UseAppCDS -XX:SharedArchiveFile=hello.jsa
- For OpenJDK only available at Java 9 :(
- Use AOT (Ahead of Time Compilation)
- Build custom runtime image with “jlink”
- Only available at Java 9 :(
69. How to Startup Faster on JVM? [2]
- Use Tiered Compilation
- Tiered compilation is disabled on AWS Lambda
- -XX:+TieredCompilation -XX:TieredStopAtLevel=1
- Disable bytecode verification
- -Xverify:none
- No classpath scan
- Prefer programmatic or XML configuration for Spring
- Prefer lightweight libraries if possible
- Spring => Guava, Dagger, ...
- Jackson => Gson, ...
70. Warmup
- Periodically send empty messages
- So AWS Lambda might think that the container is active
- Not a perfect solution for cold start
- AWS’s new experimental container pre-initializer
- How to keep multiple containers up?
- https://github.com/opsgenie/sirocco
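On the function side, warmup messages should be answered as cheaply as possible. A minimal sketch follows, assuming the periodic message carries a marker field; the `warmup` key and the `process` function are hypothetical and not taken from the linked project.

```python
WARMUP_KEY = "warmup"  # hypothetical marker set by the periodic warmup sender


def process(event):
    """Stand-in for the real business logic of the function."""
    return {"echo": event}


def handler(event, context):
    """Return early for warmup messages, do real work otherwise."""
    if isinstance(event, dict) and event.get(WARMUP_KEY):
        # Keep the container alive without touching downstream systems.
        return {"warmed": True}
    return {"warmed": False, "result": process(event)}
```

A single periodic message keeps only one container warm; keeping several up requires concurrent warmup invocations that overlap in time.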