High performance Serverless Java on AWS
Vadym Kazulkin | @VKazulkin | ip.labs GmbH

Contact
Vadym Kazulkin, ip.labs GmbH, Bonn, Germany
Co-Organizer of the Java User Group Bonn
v.kazulkin@gmail.com
@VKazulkin
https://dev.to/vkazulkin
https://github.com/Vadym79/
https://de.slideshare.net/VadymKazulkin/
https://www.linkedin.com/in/vadymkazulkin
https://www.iplabs.de/

Java Popularity
https://distantjob.com/blog/programming-languages-rank/

Life of the Java (Serverless) Developer on AWS

AWS Java Versions Support for AWS Lambda
▪ Corretto Java 8
  ▪ With extended long-term support until 2026
▪ Corretto Java 11 (since 2019)
▪ Corretto Java 17 (since April 2023)
▪ Corretto Java 21 (since November 2023)
▪ Waiting for the support of Java 25
▪ AWS supports only Long Term Support (LTS) versions

Java is a very fast and mature programming language…
… but serverless adoption of Java looks like this!

Percent of AWS Lambda Invocations by Language, 2021 vs 2023
Python is the most popular Lambda runtime
https://www.datadoghq.com/state-of-serverless-2021
https://www.datadoghq.com/state-of-serverless/

Developers love Java and will be happy to use it for Serverless applications
But what are the challenges?

Serverless with Java Challenges
▪ “cold start” times (latencies)
▪ memory footprint (high cost in AWS)

Demo Application
https://github.com/Vadym79/AWSLambdaJavaSnapStart

API Gateway Proxy Request Event JSON
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format

AWS Lambda Function with Java runtime

Challenge No. 1: A Big Cold Start

Lambda function lifecycle: a full cold start
Sources:
Ajay Nair "Become a Serverless Black Belt" https://www.youtube.com/watch?v=oQFORsso2go
Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications" https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications

New Lambda function execution environment required / Lambda function cold starts
▪ When the Lambda function is invoked for the first time
  ▪ After a new Lambda function was deployed
  ▪ After the existing Lambda function code was modified and re-deployed
▪ When there are not enough warm execution environments in the pool
  ▪ More concurrent Lambda invocation requests than execution environments in the pool
▪ When the execution environment was destroyed by AWS
  ▪ For cost saving reasons, as the execution environment wasn’t in use for a long time
  ▪ For security and other reasons, to patch the execution environment(s)

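The cold start triggers listed above can be illustrated with a toy model. This is only a sketch, not AWS code: the class `WarmPoolModel` and its methods are hypothetical names invented for illustration, and the real Lambda placement logic is far more involved. It shows that a request causes a cold start whenever no idle warm environment is available, whether because the function is new, because concurrency exceeds the pool, or because the environments were destroyed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model (hypothetical, not AWS code): a pool of warm execution environments.
// A request that finds no idle warm environment causes a cold start.
class WarmPoolModel {
    private final Deque<Integer> idle = new ArrayDeque<>();
    private int nextId = 0;
    private int coldStarts = 0;

    // Returns the id of the environment that serves the request.
    int acquire() {
        Integer env = idle.poll();
        if (env == null) {
            coldStarts++;      // no warm environment available: create a new one
            return nextId++;
        }
        return env;            // reuse a warm environment
    }

    // The environment becomes idle (warm) again after the invocation.
    void release(int env) {
        idle.push(env);
    }

    // Models a re-deployment or AWS reclaiming idle environments.
    void destroyAll() {
        idle.clear();
    }

    int coldStarts() {
        return coldStarts;
    }

    public static void main(String[] args) {
        WarmPoolModel pool = new WarmPoolModel();
        int a = pool.acquire();   // first invocation: cold
        int b = pool.acquire();   // concurrent invocation, pool is empty: cold
        pool.release(a);
        pool.release(b);
        int c = pool.acquire();   // warm
        pool.release(c);
        pool.destroyAll();        // environments destroyed
        pool.acquire();           // cold again
        System.out.println("cold starts: " + pool.coldStarts());
    }
}
```

In this run, three of the four requests are cold starts; only the request that arrives while a released environment is still idle is served warm.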
Lambda function lifecycle
▪ Start Firecracker VM (execution environment)
▪ AWS Lambda starts the Java runtime
▪ Java runtime loads and initializes Lambda function code (Lambda handler Java class)
  ▪ Class loading
  ▪ Static initializer block of the handler class is executed (e.g. AWS service client creation)
  ▪ Runtime dependency injection
  ▪ Just-in-Time (JIT) compilation
▪ Lambda invokes the handler method
Sources:
Ajay Nair "Become a Serverless Black Belt" https://www.youtube.com/watch?v=oQFORsso2go
Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications" https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications
Michael Hart "Shave 99.93% off your Lambda bill with this one weird trick" https://hichaelmart.medium.com/shave-99-93-off-your-lambda-bill-with-this-one-weird-trick-33c0acebb2ea

AWS Lambda Function with Java runtime
Invocation of the handleRequest method is the warm start

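The split between initialization (paid once per execution environment) and the handler method (paid on every invocation) can be sketched in plain Java. This is a simplified stand-in, not the demo application's handler: the class `ProductHandlerSketch`, the `PRODUCTS` map and the `String`-based `handleRequest` signature are assumptions made so that the example runs without AWS dependencies. A real handler would implement `RequestHandler` from `aws-lambda-java-core`.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for a Lambda handler class (hypothetical, no AWS dependencies).
class ProductHandlerSketch {
    // Counts how often the initialization code ran.
    static int initCount = 0;

    // Stands in for an expensive resource such as an AWS service client.
    private static final Map<String, String> PRODUCTS = new HashMap<>();

    // Static initializer: runs once when the class is initialized,
    // which on Lambda is part of the cold start.
    static {
        initCount++;
        PRODUCTS.put("1", "Photobook");
    }

    // Runs on every invocation; once the class is initialized,
    // only this method is on the warm start path.
    String handleRequest(String productId) {
        return PRODUCTS.getOrDefault(productId, "not found");
    }

    public static void main(String[] args) {
        ProductHandlerSketch handler = new ProductHandlerSketch();
        System.out.println(handler.handleRequest("1"));
        System.out.println(handler.handleRequest("2"));
        System.out.println("init ran " + initCount + " time(s)");
    }
}
```

Moving expensive work into the static initializer or constructor keeps it off the warm path, at the price of a longer cold start.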
Demo Application
https://github.com/Vadym79/AWSLambdaJavaSnapStart
▪ Lambda has 1024 MB memory setting
▪ Lambda uses x86 architecture
▪ Default (Apache) Http Client for communication with DynamoDB
▪ 14 MB artifact size, all dependencies in the POM file
▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
▪ Info about the experiments:
  ▪ Approx. 1 hour duration
  ▪ Approx. first* 100 cold starts
  ▪ Approx. first 100,000 warm starts
*after the Lambda function was re-deployed

Cold and warm starts with Java 21
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

Measurements in ms                   p50     p75     p90     p99     p99.9   max
Amazon Corretto Java 21 cold start   3158    3214    3270    3428    3601    3725
Amazon Corretto Java 21 warm start   5.77    6.50    7.81    20.65   90.20   1423.63

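To read the percentile columns above, it may help to see how such values can be derived from raw latency samples. The following is a minimal sketch that assumes the nearest-rank definition of a percentile; the slides do not state which method or tool produced the published numbers, and the class name `LatencyPercentiles` and the sample values are invented for illustration.

```java
import java.util.Arrays;

// Sketch: nearest-rank percentiles over latency samples in milliseconds.
// Assumption: the slides do not say which percentile method was used.
class LatencyPercentiles {
    // Returns the smallest sample such that at least p percent of all samples are <= it.
    static double percentile(double[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samplesMs.clone();   // do not modify the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Invented warm start samples with one slow outlier.
        double[] warmStartsMs = {5.0, 6.0, 7.0, 8.0, 100.0};
        System.out.println("p50 = " + percentile(warmStartsMs, 50));
        System.out.println("p99 = " + percentile(warmStartsMs, 99));
    }
}
```

A single slow outlier leaves p50 untouched but dominates p99 and max, which is the pattern visible in the warm start row of the table.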
Options To Reduce Cold Start Time
▪ AWS SnapStart
▪ GraalVM (Native Image)

AWS Lambda SnapStart
https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
▪ Lambda SnapStart for Java can improve startup performance for latency-sensitive applications
▪ SnapStart is fully managed

AWS Lambda SnapStart
https://github.com/Vadym79/AWSLambdaJavaDockerImage/
▪ Currently available for Lambda managed Java Runtimes (Java 11, 17 and 21), Python and .NET
▪ Not available for all other Lambda runtimes:
  ▪ Docker Container Image
  ▪ Custom (Lambda) Runtime (a way to ship GraalVM Native Image)

AWS SnapStart Deployment & Invocation
Firecracker microVM: create & restore snapshot
https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
https://dev.to/vkazulkin/measuring-java-11-lambda-cold-starts-with-snapstart-part-1-first-impressions-30a4
https://aws.amazon.com/de/blogs/compute/using-aws-lambda-snapstart-with-infrastructure-as-code-and-ci-cd-pipelines/

AWS Lambda SnapStart with Priming

Ideas behind priming
▪ Pre-load as many Java classes as possible before SnapStart takes the snapshot
  ▪ Java loads classes on demand (lazy-loading)
▪ Pre-initialize as much as possible before SnapStart takes the snapshot
  ▪ Http Clients (Apache, UrlConnection) and JSON Marshallers (Jackson) require expensive one-time initialization per (Lambda) lifecycle. Both are used when creating the Amazon DynamoDbClient

AWS SnapStart Deployment & Invocation
Lambda uses the CRaC APIs for runtime hooks for Priming
Firecracker microVM: create & restore snapshot
https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/

Priming
▪ Prime dependencies during the initialization phase (when it is worth doing)
▪ “Fake” the calls to pre-initialize “some other expensive stuff” or execute some critical code paths (this technique is called Priming)

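The two priming ideas above, loading classes early and running a "fake" call through a critical code path before the snapshot is taken, can be sketched as follows. To stay runnable without extra dependencies, the example uses a hypothetical local interface `SnapshotHook` as a simplified stand-in for the CRaC hooks; in a real function the class would implement `org.crac.Resource` and register itself via `Core.getGlobalContext().register(this)`. The class names, the primed class list and the `handleRequest` body are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the CRaC runtime hooks.
interface SnapshotHook {
    void beforeCheckpoint();  // called before the snapshot is taken
    void afterRestore();      // called after an environment is restored from the snapshot
}

class PrimingSketch implements SnapshotHook {
    final List<String> primedClasses = new ArrayList<>();
    int warmUpCalls = 0;

    @Override
    public void beforeCheckpoint() {
        // Class priming: force loading and initialization of classes the JVM
        // would otherwise load lazily during the first real invocation.
        for (String name : List.of("java.time.Instant",
                                   "java.util.concurrent.ConcurrentHashMap")) {
            try {
                Class.forName(name);
                primedClasses.add(name);
            } catch (ClassNotFoundException e) {
                // A missing class only means there is less to prime.
            }
        }
        // Invocation priming: send a "fake" request through the critical code path.
        handleRequest("priming-request");
        warmUpCalls++;
    }

    @Override
    public void afterRestore() {
        // Nothing to do in this sketch; real code would refresh state
        // that must not be shared between restored environments.
    }

    String handleRequest(String productId) {
        return "product-" + productId;
    }

    public static void main(String[] args) {
        PrimingSketch function = new PrimingSketch();
        function.beforeCheckpoint();   // SnapStart would trigger this via CRaC
        System.out.println("primed classes: " + function.primedClasses);
    }
}
```

Whatever runs in the before-checkpoint hook ends up in the snapshot, so the restored environment starts with those classes loaded and those code paths already exercised.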
Product Repository DAO class
▪ Expensive initialization of the HTTP Client
▪ Expensive initialization of the Jackson Marshaller (ObjectMapper)

SnapStart Enabled with DynamoDB Request Priming
https://dev.to/aws-builders/measuring-java-11-lambda-cold-starts-with-snapstart-part-5-priming-end-to-end-latency-and-deployment-time-jem

Lambda SnapStart Priming Guide
https://github.com/marksailes/snapstart-priming-guide
▪ SnapStart Priming guide aims to explain techniques for priming Java applications.
▪ It assumes a base understanding of AWS Lambda, Lambda SnapStart, and CRaC.

[Chart: Cold starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p50, p75, p90, p99, p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

[Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p50, p75, p90, p99. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

[Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

AWS Lambda Function with Java runtime

Product Repository DAO class

Demo Application
https://github.com/Vadym79/AWSLambdaJavaSnapStart
▪ Lambda has 1024 MB memory setting
▪ Lambda uses x86 architecture
▪ Default (Apache) Http Client for communication with DynamoDB
▪ 18 MB artifact size, all dependencies in the POM file
▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
▪ Info about the experiments:
  ▪ Approx. 1 hour duration
  ▪ Approx. first* 100 cold starts
  ▪ Approx. first 100,000 warm starts
*after the Lambda function was re-deployed

Lambda Performance Tuning Approaches
Experiment with:
▪ Lambda memory settings
▪ Java compilation options
▪ HTTP Client implementations (sync and async)
▪ Lambda architecture (x86 vs arm64)
▪ Lambda SnapStart (with priming techniques)
To find the right trade-off between Lambda cost and performance for your particular use case

Preview Release of the AWS SDK Java 2.x HTTP Client built on Apache HttpClient 5.5.x
https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/

Lambda Deployment Artifact Size

[Chart: Cold starts of Lambda function with Java 21 runtime using different deployment artifact sizes, p90. Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming. Series: p90 small, p90 medium, p90 big. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-11-measuring-cold-starts-with-java-21-using-different-deployment-artifact-sizes-4g29
▪ Small: 137 KB (“Hello World”)
▪ Medium: 14 MB (our sample application)
▪ Big: 50 MB (our sample application + additional dependencies to other AWS services)

Best Practices & Recommendations
▪ Less (dependencies, classes) is more
▪ Include only required dependencies (e.g. not the whole AWS SDK 2.0 for Java, but only the dependencies for the clients used in the Lambda function)
▪ Exclude dependencies which you don’t need at runtime, e.g. test frameworks like JUnit

    <!-- test-only dependency: not packaged into the deployment artifact -->
    <dependency>
      <groupId>org.junit.jupiter</groupId>
      <artifactId>junit-jupiter-api</artifactId>
      <version>5.4.2</version>
      <scope>test</scope>
    </dependency>

    <!-- only the client that the function uses -->
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
      <version>2.22.2</version>
    </dependency>

    <!-- BOM import: belongs inside <dependencyManagement> -->
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>bom</artifactId>
      <version>2.22.2</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>

AWS SnapStart Deployment & Invocation
https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
• Lambda stores function snapshots in Amazon S3, dividing them into 512 KB chunks to optimize retrieval latency.
• Retrieval latency from Amazon S3 can take up to hundreds of milliseconds for each 512 KB chunk.
• Therefore, Lambda uses a two-layer cache to speed up snapshot retrieval.

Storing snapshots for low-latency retrieval at Lambda scale
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
▪ Lambda also maintains a layer one (L1) cache located on Lambda worker nodes, the (Amazon EC2) instances handling function invocations.
▪ This layer is available locally, thus it provides the fastest performance, typically 1 millisecond for a 512 KB chunk.
▪ Functions with more frequent invocations are more likely to have their snapshot chunks cached in this layer.
▪ Functions with fewer invocations are automatically evicted from this cache, because it is bound by the worker instance disk capacity.
▪ When a snapshot chunk is not available in the L1 cache, Lambda retrieves the chunk from the L2 cache layer and, if it is not available there, from S3.

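The lookup order described above (L1, then L2, then S3) can be sketched with three in-memory maps. This is a toy model, not Lambda's implementation: the class `TieredChunkStore`, its fields and the rule of promoting a fetched chunk into the faster tiers are assumptions made for illustration, and cache eviction is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical) of the tiered snapshot chunk lookup: L1 -> L2 -> S3.
class TieredChunkStore {
    final Map<Integer, byte[]> l1 = new HashMap<>();  // local to the worker, fastest
    final Map<Integer, byte[]> l2 = new HashMap<>();  // second cache layer
    final Map<Integer, byte[]> s3 = new HashMap<>();  // origin, slowest

    // Returns the name of the tier that served the chunk.
    // Assumption: a chunk found in a slower tier is copied into the faster ones.
    String fetch(int chunkId) {
        if (l1.containsKey(chunkId)) {
            return "L1";
        }
        String tier = "L2";
        byte[] chunk = l2.get(chunkId);
        if (chunk == null) {
            tier = "S3";
            chunk = s3.get(chunkId);
            if (chunk == null) {
                throw new IllegalArgumentException("unknown chunk " + chunkId);
            }
            l2.put(chunkId, chunk);
        }
        l1.put(chunkId, chunk);
        return tier;
    }

    public static void main(String[] args) {
        TieredChunkStore store = new TieredChunkStore();
        store.s3.put(7, new byte[512 * 1024]);   // one 512 KB snapshot chunk
        System.out.println(store.fetch(7));      // first access comes from the origin
        System.out.println(store.fetch(7));      // later accesses hit the local cache
    }
}
```

In the model, repeated access moves a chunk toward the fastest tier, which mirrors why frequently invoked functions tend to see lower restore latency.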
Storing snapshots for low-latency retrieval at Lambda scale
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
▪ Resuming execution from snapshots with low latency is the final SnapStart stage. This involves loading the retrieved snapshot chunks into your function execution environment.
▪ Typically, only a subset of the retrieved snapshot is needed to serve an invocation. Storing snapshots as chunks lets Lambda optimize the resume process by proactively loading only the necessary subset of chunks.
▪ To achieve this, Lambda tracks and records the snapshot chunks that the function accesses during each function invocation.
▪ After the first function invocation, Lambda refers to this recorded chunk access data for subsequent invokes.
▪ Lambda proactively retrieves and loads this “working set” of chunks before they are needed for execution. This significantly reduces cold-start latency.

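The "working set" idea described above, recording which chunks an invocation touches and loading exactly those ahead of time on the next restore, can be sketched as below. This is an illustrative model only: `WorkingSetTracker` and its methods are hypothetical names, and how Lambda actually stores and orders the access data is not described in the slides.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model (hypothetical): records chunk accesses and derives a prefetch plan.
class WorkingSetTracker {
    // LinkedHashSet removes duplicates and keeps first-access order.
    private final Set<Integer> workingSet = new LinkedHashSet<>();

    // Called whenever the function touches a snapshot chunk during an invocation.
    void recordAccess(int chunkId) {
        workingSet.add(chunkId);
    }

    // Chunks to load proactively on the next restore, in first-access order.
    List<Integer> prefetchPlan() {
        return new ArrayList<>(workingSet);
    }

    public static void main(String[] args) {
        WorkingSetTracker tracker = new WorkingSetTracker();
        for (int chunk : new int[] {3, 1, 3, 2}) {
            tracker.recordAccess(chunk);
        }
        System.out.println("prefetch plan: " + tracker.prefetchPlan());
    }
}
```

The more invocations contribute access data, the closer the recorded set gets to everything the function really needs, which is the sense in which the data becomes "complete".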
SnapStart function performance
https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
▪ The speed of restoring a snapshot depends on its contents, size, and the caching tier used. As a result, SnapStart performance can vary across individual functions.
▪ Frequently invoked functions are more likely to have their snapshots cached in the L1 layer, which provides the fastest retrieval latency.
▪ Infrequently accessed portions of snapshots for functions with sporadic invokes are less likely to be present in the L1 layer, resulting in slower retrieval latency from the L2 and S3 cache layers.
▪ Chunk access data for functions with more invocations is also more likely to be “complete”, which speeds up snapshot restore latency.

AWS SnapStart tiered cache
https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
• Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
• After a certain number of invocations is reached, the cold start times become stable

AWS Lambda under the Hood
https://www.infoq.com/articles/aws-lambda-under-the-hood/
https://www.infoq.com/presentations/aws-lambda-arch/

[Chart: Comparison between all approx. 100 vs last 70 cold starts of the Lambda function. Groups: with SnapStart w/o Priming, with SnapStart w/o Priming (last 70), with SnapStart with Priming, with SnapStart with Priming (last 70). Series: p50, p75, p90, p99, p99.9, max. Unit: ms]
Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef

AWS Lambda Profiler Extension for Java
https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
• The Lambda profiler extension allows you to profile your Java functions invoke by invoke, with high fidelity, and no code changes.
• It uses the async-profiler project to produce profiling data and automatically uploads the data as HTML flame graphs to S3.

AWS Lambda Implementation

API Gateway Proxy Request Event JSON
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format

AWS Lambda Profiler Extension for Java
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler

Full Priming including APIGatewayProxyRequestEvent Deserialization
https://dev.to/aws-heroes/aws-lambda-profiler-extension-for-java-part-2-improving-lambda-performance-with-lambda-snapstart-4p06
This priming technique leads to up to 25% reduction of the cold start times vs. DynamoDB request priming alone

AWS SnapStart Pricing
https://aws.amazon.com/lambda/pricing/?nc1=h_ls

AWS SnapStart Challenges around uniqueness
https://docs.aws.amazon.com/lambda/latest/dg/snapstart-uniqueness.html
▪ Avoid saving state that depends on uniqueness during initialization
  ▪ Avoid UUID uniqueSandboxId = UUID.randomUUID() or long envCreationTime = System.currentTimeMillis() in the Lambda constructor
▪ Use cryptographically secure pseudorandom number generators
  ▪ Software that always gets random numbers from /dev/random or /dev/urandom also maintains randomness with SnapStart.
  ▪ Use java.security.SecureRandom instead of new Random()
▪ Avoid logic relying on time-based caches

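The uniqueness problem above can be made concrete with a short sketch. Anything computed during initialization is frozen into the snapshot, so every execution environment restored from that snapshot would see the same value. The class `UniquenessSketch` and its method names are hypothetical, and the example runs on a plain JVM, so it only demonstrates the structure of the fix (generate per invocation), not an actual snapshot restore.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Sketch (hypothetical names): where to create values that must be unique.
class UniquenessSketch {
    // Risky with SnapStart: evaluated during initialization, so the value is part
    // of the snapshot and shared by all environments restored from it.
    private final UUID idFromInit = UUID.randomUUID();

    // SecureRandom is the generator recommended on the slide instead of new Random().
    private final SecureRandom random = new SecureRandom();

    UUID idFromInit() {
        return idFromInit;
    }

    // Safer: create the unique value inside the invocation.
    String handleRequest() {
        return UUID.randomUUID().toString();
    }

    long nextToken() {
        return random.nextLong();
    }

    public static void main(String[] args) {
        UniquenessSketch function = new UniquenessSketch();
        System.out.println("fixed at init:  " + function.idFromInit());
        System.out.println("per invocation: " + function.handleRequest());
        System.out.println("per invocation: " + function.handleRequest());
    }
}
```

The same reasoning applies to timestamps taken in the constructor and to caches that expire based on a time recorded at initialization.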
AWS SnapStart Challenges & Limitations
https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
▪ SnapStart supports only the Java 11, 17 and 21 (Corretto), Python and .NET managed runtimes
▪ Deployment with SnapStart enabled takes an additional 2 to 2.5 minutes or more
▪ Snapshot is deleted from cache if the Lambda function is not invoked for 14 days
▪ SnapStart currently does not support:
  ▪ Provisioned concurrency
  ▪ Amazon Elastic File System (Amazon EFS)
  ▪ Ephemeral storage greater than 512 MB

GraalVM Architecture

GraalVM Ahead-of-Time Compilation
Source: Oleg Šelajev, Thomas Wuerthinger, Oracle: “Deep dive into using GraalVM for Java and JavaScript” https://www.youtube.com/watch?v=a-XEZobXspo

AOT vs JIT
Source: “Everything you need to know about GraalVM” by Oleg Šelajev & Thomas Wuerthinger https://www.youtube.com/watch?v=ANN9rxYo5Hg

GraalVM Native Image
Promise: Java Function compiled into a native executable using GraalVM Native Image significantly reduces
▪ “cold start” times
▪ memory footprint

Current Challenges with Native Executable using GraalVM
▪ AWS doesn’t provide GraalVM (Native Image) as Java Runtime out of the box
▪ AWS provides Custom Runtime Option

Custom Lambda Runtimes
https://github.com/Vadym79/AWSLambdaGraalVMNativeImage

[Chart: Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image. Series: p50, p75, p90, p99, p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

[Chart: Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image. Series: p50, p75, p90, p99. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

[Chart: Groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image. Series: p99.9, max. Unit: ms]
https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k

Frameworks and libraries Ready for GraalVM Native Image
https://www.graalvm.org/native-image/libraries-and-frameworks/

GraalVM Native Image
https://github.com/Vadym79/AWSLambdaGraalVMNativeImage/blob/master/pure-lambda-graalvm-jdk-21-native-image/src/main/reflect.json
You can run into runtime errors (ClassNotFoundExceptions) when configuration is missing

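The reason such configuration is needed can be shown with a small reflective lookup. On a regular JVM the code below works for any class on the classpath, because classes are resolved by name at run time. A native image is built ahead of time, so a class that is only referenced by a string has to be declared in metadata such as the reflect.json linked above; otherwise the lookup can fail at run time as the slide describes. The class `ReflectionSketch` and its method are hypothetical names used only for this illustration.

```java
// Sketch (hypothetical names): a reflective lookup that static analysis cannot see.
class ReflectionSketch {
    // Creates an instance of a class that is known only by its name at run time.
    static Object instantiate(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // On a JVM this happens for a wrong name; in a native image it can also
            // happen when the class is missing from the reflection configuration.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Object list = instantiate("java.util.ArrayList");
        System.out.println(list.getClass().getName());
    }
}
```

Libraries that rely heavily on this pattern, for example JSON marshallers and logging frameworks, are where most of the configuration effort tends to go.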
Particularly logging configuration in GraalVM Native Image is complex

Log4j natively supports GraalVM Native Image since 2.25.0
https://logging.staged.apache.org/log4j/2.x/graalvm.html

Assisted Configuration with GraalVM Tracing Agent
https://www.graalvm.org/latest/reference-manual/native-image/metadata/AutomaticMetadataCollection/
https://www.graalvm.org/latest/reference-manual/native-image/guides/configure-with-tracing-agent/
Run the GraalVM tracing agent during the execution of your tests

GraalVM Conclusion
▪ GraalVM is really powerful and has a lot of potential
▪ GraalVM Native Image improves cold starts and memory footprint significantly
▪ GraalVM Native Image is currently not without challenges
  ▪ Complex GraalVM Native Image configuration files
  ▪ AWS Lambda Custom Runtime requires a Linux executable
  ▪ Building a Custom Runtime requires some additional effort
    ▪ e.g. you need a scalable CI/CD pipeline to build the memory-intensive native image
  ▪ Build time is a factor
  ▪ You need to test carefully to avoid runtime errors

Wrap up and personal suggestions
▪ With AWS SnapStart and GraalVM Native Image you can reduce cold start times of AWS Lambda with the Java 21 runtime to acceptable values
▪ If you’re willing to accept slightly higher cold and warm start times for certain Lambda function(s) and solid priming is applicable -> use fully managed AWS SnapStart with priming
▪ If very high performance for certain Lambda function(s) is really crucial for your business -> go for GraalVM Native Image

Powertools for AWS Lambda (Java) v2
https://docs.powertools.aws.dev/lambda/java/2.4.0/
https://github.com/Vadym79/AWSPowertoolsForLambdaJavaV2

The Future of GraalVM
https://blogs.oracle.com/java/post/detaching-graalvm-from-the-java-ecosystem-train

Project Leyden
The primary goal of this Project is to improve the startup time, time to peak performance, and footprint of Java programs.
https://www.youtube.com/watch?v=teXijm79vno
https://openjdk.org/projects/leyden/

Word of Caution
Re-measure for your use case!
Even my example measurements might already produce different results due to:
▪ Lambda Amazon Corretto Java 21 managed runtime minor version changes
▪ Lambda SnapStart snapshot create and restore improvements
▪ Firecracker microVM improvements
▪ GraalVM (major and minor version) and Native Image improvements
▪ There are still servers behind Lambda
  ▪ Java Memory Model impact (L-cache or RAM hits and misses)
▪ Upgrading dependencies (e.g. AWS SDK for Java) tends to make them bigger, increasing the cold start time

“AWS Lambda SnapStart” series
https://dev.to/vkazulkin/series/24979
Article series covers the why and what behind Lambda SnapStart and priming techniques including measurements for the cold and warm starts with different settings for:
▪ Java 11
▪ Java 17
▪ Java 21

“Spring Boot 3.4 / Quarkus 3 / Micronaut 4 application on AWS Lambda” series
https://dev.to/vkazulkin/series/30408
https://dev.to/vkazulkin/series/31519
https://dev.to/vkazulkin/series/26067
Article series covers different ways to write, run and optimize Spring Boot 3.4 / Quarkus 3 / Micronaut 4 applications on AWS Lambda using:
▪ Managed Java 21 Lambda runtime + SnapStart + priming
▪ GraalVM Native Image
Cold and warm start time measurements are also provided

“Data API for Amazon Aurora Serverless v2 with AWS SDK for Java” series
https://dev.to/vkazulkin/series/26067
Article series covers pure Java 21 cold and warm start time measurements and optimization techniques for Amazon Aurora Serverless v2 database with JDBC and Data API

“Serverless applications with Java and Aurora DSQL” series
https://dev.to/vkazulkin/series/32326
Article series covers pure Java 21 cold and warm start time measurements and optimization techniques (SnapStart + priming vs GraalVM Native Image) for Amazon Aurora DSQL database

Thank you

Practical Performance Tuning for Serverless Java on AWS- InfoQ Dev Summit

  • 1.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH 1 High performance Serverless Java on AWS
  • 2.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Vadym Kazulkin ip.labs GmbH Bonn, Germany Co-Organizer of the Java User Group Bonn v.kazulkin@gmail.com @VKazulkin https://dev.to/vkazulkin https://github.com/Vadym79/ https://de.slideshare.net/VadymKazulkin/ https://www.linkedin.com/in/vadymkazulkin https://www.iplabs.de/ Contact
  • 3.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Java Popularity Vadym Kazulkin | @VKazulkin | ip.labs GmbH
  • 4.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH https://distantjob.com/blog/programming-languages-rank/
  • 5.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Life of the Java (Serverless) Developer on AWS 6
  • 6.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ Corretto Java 8 ▪ With extended long-term support until 2026 ▪ Coretto Java 11 (since 2019) ▪ Coretto Java 17 (since April 2023) ▪ Corretto Java 21(since November 2023) ▪ Waiting for the support of Java 25 ▪ Only Long Term Support (LTS) by AWS AWS Java Versions Support for AWS Lambda 7
  • 7.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH … but serverless adoption of Java looks like this! 8 Java is a very fast and mature programming language…
  • 8.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Percent of AWS Lambda Invocations by Language 2021 vs 2023 https://www.datadoghq.com/state-of-serverless-2021 https://www.datadoghq.com/state-of-serverless/ PHYTON IS THE MOST POPULAR LAMDA RUNTIME
  • 9.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Developers love Java and will be happy to use it for Serverless applications But what are the challenges ? 10
  • 10.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ “cold start” times (latencies) ▪ memory footprint (high cost in AWS) Serverless with Java Challenges
  • 11.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Demo Application https://github.com/Vadym79/AWSLambdaJavaSnapStart
  • 12.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format API Gateway Proxy Request Event JSON
  • 13.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH AWS Lambda Function with Java runtime 14
  • 14.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Challenge No. 1 A Big Cold-Start
  • 15.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Lambda function lifecycle – a full cold start 16 Sources: Ajay Nair „Become a Serverless Black Belt” https://www.youtube.com/watch?v=oQFORsso2go Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications“ https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications
  • 16.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ When Lambda function has been invoked for the first time ▪ After a new Lambda function was deployed ▪ After the existing Lambda function code was modified and re-deployed ▪ When there are not enough warm execution environments in the pool ▪ More concurrent Lambda invocation requests as execution environments in the pool ▪ When the execution environment was destroyed by AWS ▪ For cost saving reasons as the execution environment wasn’t in use for a long time ▪ For security and other reasons to patch the execution environment(s) New Lambda function execution environment required/Lambda function cold starts
  • 17.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH ▪ Start Firecracker VM (execution environment) ▪ AWS Lambda starts the Java runtime ▪ Java runtime loads and initializes Lambda function code (Lambda handler Java class) ▪ Class loading ▪ Static initializer block of the handler class is executed (i.e. AWS service client creation) ▪ Runtime dependency injection ▪ Just-in-Time (JIT) compilation ▪ Lambda invokes the handler method 18 Sources: Ajay Nair „Become a Serverless Black Belt” https://www.youtube.com/watch?v=oQFORsso2go Tomasz Łakomy "Notes from Optimizing Lambda Performance for Your Serverless Applications“ https://tlakomy.com/optimizing-lambda-performance-for-serverless-applications Michael Hart: „Shave 99.93% off your Lambda bill with this one weird trick“ https://hichaelmart.medium.com/shave-99-93-off-your-lambda-bill-with-this-one-weird-trick-33c0acebb2ea Lambda function lifecycle
  • 18.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH AWS Lambda Function with Java runtime 19 Invocation of the handeRequest method is the warm start
  • 19.
    Vadym Kazulkin |@VKazulkin |ip.labs GmbH Demo Application https://github.com/Vadym79/AWSLambdaJavaSnapStart ▪ Lambda has 1024 MB memory setting ▪ Lambda uses x86 architecture ▪ Default (Apache) Http Client for communication with DynamoDB ▪ 14 MB artifact size, , all dependencies in the POM file ▪ Java compilation option - XX:+TieredCompilation - XX:TieredStopAtLevel=1 ▪ Info about the experiments: ▪ Approx. 1 hour duration ▪ Approx. first* 100 cold starts ▪ Approx. first 100.000 warm starts *after Lambda function being re-deployed
  • 20.
    Cold and warm starts with Java 21
    Measurements in ms                    p50    p75    p90    p99     p99.9   max
    Amazon Corretto Java 21 cold start    3158   3214   3270   3428    3601    3725
    Amazon Corretto Java 21 warm start    5.77   6.50   7.81   20.65   90.20   1423.63
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
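To make the percentile columns concrete, here is a small sketch of how such values can be derived from raw latency samples. It assumes the nearest-rank definition of a percentile; the slides do not say which method was used for the published numbers, and the sample values in `main` are made up for illustration.

```java
import java.util.Arrays;

class LatencyPercentiles {
    // Nearest-rank percentile: the smallest sample such that at least p percent of all samples are <= it.
    static double percentile(double[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need at least one sample and 0 < p <= 100");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up warm start durations in ms, not taken from the measurements above.
        double[] durations = {20.0, 5.0, 1400.0, 6.0, 90.0, 7.0, 50.0, 8.0, 10.0, 9.0};
        for (double p : new double[] {50, 90, 99}) {
            System.out.println("p" + p + " = " + percentile(durations, p) + " ms");
        }
    }
}
```

The sketch also shows why the max and high percentiles of warm starts can be far above the median: a single slow outlier dominates them.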
  • 21.
    Options To Reduce Cold Start Time
    ▪ AWS SnapStart
    ▪ GraalVM (Native Image)
  • 22.
    AWS Lambda SnapStart
  • 23.
    AWS Lambda SnapStart
    ▪ Lambda SnapStart for Java can improve startup performance for latency-sensitive applications
    ▪ SnapStart is fully managed
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
  • 24.
    AWS Lambda SnapStart
    ▪ Currently available for Lambda managed Java runtimes (Java 11, 17 and 21), Python and .NET
    ▪ Not available for all other Lambda runtimes:
      ▪ Docker Container Image
      ▪ Custom (Lambda) Runtime (a way to ship GraalVM Native Image)
    https://github.com/Vadym79/AWSLambdaJavaDockerImage/
  • 25.
    AWS SnapStart Deployment & Invocation
    [Figure: Firecracker microVM, create and restore snapshot]
    https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
  • 26.
    AWS SnapStart Deployment & Invocation
    https://dev.to/vkazulkin/measuring-java-11-lambda-cold-starts-with-snapstart-part-1-first-impressions-30a4
    https://aws.amazon.com/de/blogs/compute/using-aws-lambda-snapstart-with-infrastructure-as-code-and-ci-cd-pipelines/
  • 27.
    AWS Lambda SnapStart with Priming
  • 28.
    Ideas behind priming
    ▪ Pre-load as many Java classes as possible before SnapStart takes the snapshot
      ▪ Java loads classes on demand (lazy loading)
    ▪ Pre-initialize as much as possible before SnapStart takes the snapshot
      ▪ HTTP clients (Apache, UrlConnection) and JSON marshallers (Jackson) require expensive one-time initialization per (Lambda) lifecycle. Both are used when creating the Amazon DynamoDbClient
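The class pre-loading idea can be sketched with the standard `Class.forName` call, which loads and initializes a class eagerly instead of waiting for its first use. The names `ClassPreloader` and `ExpensiveMarshaller` are hypothetical; in a real function the list would contain classes that the first request would otherwise load lazily.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ClassPreloader {
    // Records which classes have run their static initializer. It lives outside the
    // primed class so that reading it does not itself trigger that class's initialization.
    static final List<String> INITIALIZED = new CopyOnWriteArrayList<>();

    // Hypothetical class whose static initializer simulates expensive one-time setup.
    static class ExpensiveMarshaller {
        static {
            INITIALIZED.add(ExpensiveMarshaller.class.getName());
        }
    }

    // Eagerly loads and initializes the given classes; returns how many succeeded.
    static int preload(String... classNames) {
        int loaded = 0;
        for (String name : classNames) {
            try {
                // 'true' requests initialization, so static initializers run now.
                Class.forName(name, true, ClassPreloader.class.getClassLoader());
                loaded++;
            } catch (ClassNotFoundException e) {
                System.err.println("Could not preload " + name);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        // A class literal alone does not initialize the class; preload() does.
        int count = preload(ExpensiveMarshaller.class.getName(), "java.util.concurrent.ConcurrentHashMap");
        System.out.println("preloaded classes: " + count);
        System.out.println("initialized: " + INITIALIZED);
    }
}
```

If such a call runs before the snapshot is taken, the initialized classes become part of the snapshot and the first request after a restore no longer pays for them.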
  • 29.
    AWS SnapStart Deployment & Invocation
    Lambda uses the CRaC APIs for runtime hooks for priming
    [Figure: Firecracker microVM, create and restore snapshot]
    https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
  • 30.
    Priming
    ▪ Prime dependencies during the initialization phase (when it is worth doing)
    ▪ "Fake" the calls to pre-initialize "some other expensive stuff" or execute some critical code paths (this technique is called priming)
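A minimal sketch of such a "fake" call is shown below. To stay self-contained it does not use the real CRaC library: `CheckpointHook` is a simplified, hypothetical stand-in for `org.crac.Resource` (whose real methods take a `Context` argument and which is registered via `Core.getGlobalContext().register(...)`), and `PrimedProductDao` is an invented class rather than the DAO from the demo application.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for org.crac.Resource; the real interface differs in its signatures.
interface CheckpointHook {
    void beforeCheckpoint();
    void afterRestore();
}

// Hypothetical DAO whose first call is expensive because it lazily builds its internal state.
class PrimedProductDao implements CheckpointHook {
    private Map<String, String> lazyClient; // placeholder for an HTTP client or JSON mapper
    int expensiveInitializations = 0;

    String getProductById(String id) {
        if (lazyClient == null) {           // one-time initialization on first use
            lazyClient = new HashMap<>();
            lazyClient.put("0", "primer");
            expensiveInitializations++;
        }
        return lazyClient.getOrDefault(id, "not found");
    }

    @Override
    public void beforeCheckpoint() {
        // "Fake" call: the result is irrelevant, it only forces the initialization before the snapshot.
        getProductById("0");
    }

    @Override
    public void afterRestore() {
        // Nothing needs to be re-created in this sketch.
    }

    public static void main(String[] args) {
        PrimedProductDao dao = new PrimedProductDao();
        dao.beforeCheckpoint();                       // in SnapStart the runtime would trigger this hook
        System.out.println(dao.getProductById("42")); // first real request finds everything initialized
        System.out.println("initializations: " + dao.expensiveInitializations);
    }
}
```

The design point is that the hook executes the same code path a real request would take, so whatever that path initializes ends up in the snapshot.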
  • 31.
    Product Repository DAO class
    ▪ Expensive initialization of the HTTP Client
    ▪ Expensive initialization of the Jackson Marshaller (ObjectMapper)
  • 32.
    SnapStart Enabled with DynamoDB Request Priming
    https://dev.to/aws-builders/measuring-java-11-lambda-cold-starts-with-snapstart-part-5-priming-end-to-end-latency-and-deployment-time-jem
  • 33.
    Lambda SnapStart Priming Guide
    ▪ The SnapStart Priming guide aims to explain techniques for priming Java applications.
    ▪ It assumes a base understanding of AWS Lambda, Lambda SnapStart, and CRaC.
    https://github.com/marksailes/snapstart-priming-guide
  • 34.
    [Chart: Cold starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Y-axis in ms; groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming; series: p50, p75, p90, p99, p99.9, max]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 35.
    [Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Y-axis in ms; groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming; series: p50, p75, p90, p99]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 36.
    [Chart: Warm starts of Lambda function with Java 21 runtime with 1024 MB memory setting, Apache Http Client, compilation -XX:+TieredCompilation -XX:TieredStopAtLevel=1. Y-axis in ms; groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming; series: p99.9, max]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 37.
    AWS Lambda Function with Java runtime
  • 38.
    Product Repository DAO class
  • 39.
    Demo Application
    https://github.com/Vadym79/AWSLambdaJavaSnapStart
    ▪ Lambda has 1024 MB memory setting
    ▪ Lambda uses x86 architecture
    ▪ Default (Apache) Http Client for communication with DynamoDB
    ▪ 18 MB artifact size, all dependencies in the POM file
    ▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
    ▪ Info about the experiments:
      ▪ Approx. 1 hour duration
      ▪ Approx. first* 100 cold starts
      ▪ Approx. first 100,000 warm starts
    *after the Lambda function was re-deployed
  • 40.
    Lambda Performance Tuning Approaches
    Experiment with:
    ▪ Lambda memory settings
    ▪ Java compilation options
    ▪ HTTP Client implementations (sync and async)
    ▪ Lambda architecture (x86 vs arm64)
    ▪ Lambda SnapStart (with priming techniques)
    to find the right trade-off between Lambda cost and performance for your particular use case
    https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/
  • 41.
    Preview Release of the AWS SDK Java 2.x HTTP Client built on Apache HttpClient 5.5.x
    https://aws.amazon.com/de/blogs/developer/preview-release-of-theaws-sdk-java-2-x-http-client-built-on-apache-httpclient-5-5-x/
  • 42.
    Lambda Deployment Artifact Size
  • 43.
    [Chart: Cold starts of Lambda function with Java 21 runtime using different deployment artifact sizes, p90. Y-axis in ms; groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming; series: p90 small, p90 medium, p90 big]
    ▪ Small: 137 KB ("Hello World")
    ▪ Medium: 14 MB (our sample application)
    ▪ Big: 50 MB (our sample application + additional dependencies on other AWS services)
    https://dev.to/aws-builders/aws-snapstart-part-11-measuring-cold-starts-with-java-21-using-different-deployment-artifact-sizes-4g29
  • 44.
    Best Practices & Recommendations
    ▪ Less (dependencies, classes) is more
    ▪ Include only required dependencies (e.g. not the whole AWS SDK 2.0 for Java, but only the dependencies for the clients used in Lambda)
    ▪ Exclude dependencies which you don't need at runtime, e.g. test frameworks like JUnit
    <dependency>
      <groupId>org.junit.jupiter</groupId>
      <artifactId>junit-jupiter-api</artifactId>
      <version>5.4.2</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
      <version>2.22.2</version>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>bom</artifactId>
      <version>2.22.2</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  • 45.
    Demo Application
    https://github.com/Vadym79/AWSLambdaJavaSnapStart
    ▪ Lambda has 1024 MB memory setting
    ▪ Lambda uses x86 architecture
    ▪ Default (Apache) Http Client for communication with DynamoDB
    ▪ 14 MB artifact size, all dependencies in the POM file
    ▪ Java compilation option -XX:+TieredCompilation -XX:TieredStopAtLevel=1
    ▪ Info about the experiments:
      ▪ Approx. 1 hour duration
      ▪ Approx. first* 100 cold starts
      ▪ Approx. first 100,000 warm starts
    *after the Lambda function was re-deployed
  • 46.
    AWS SnapStart Deployment & Invocation
    • Lambda stores function snapshots in Amazon S3, dividing them into 512 KB chunks to optimize retrieval latency.
    • Retrieval latency from Amazon S3 can take up to hundreds of milliseconds for each 512 KB chunk.
    • Therefore, Lambda uses a two-layer cache to speed up snapshot retrieval.
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
    https://aws.amazon.com/de/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/
  • 47.
    Storing snapshots for low-latency retrieval at Lambda scale
    ▪ Lambda also maintains a layer one (L1) cache located on Lambda worker nodes, the (Amazon EC2) instances handling function invocations.
    ▪ This layer is available locally, thus it provides the fastest performance, typically 1 millisecond for a 512 KB chunk.
    ▪ Functions with more frequent invocations are more likely to have their snapshot chunks cached in this layer.
    ▪ Functions with fewer invocations are automatically evicted from this cache, because it is bound by the worker instance disk capacity.
    ▪ When a snapshot chunk is not available in the L1 cache, Lambda retrieves the chunk from the L2 cache layer and, if it is not available there, from S3.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
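The lookup order described here (L1, then L2, then S3) can be illustrated with a toy model. This is only a sketch of the read-through idea: the class `TieredChunkCache`, the one-byte "chunks", and the promotion policy (a chunk found in a slower tier is copied into the faster ones) are assumptions made for illustration, not a description of Lambda's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a tiered chunk lookup; all names and policies here are illustrative assumptions.
class TieredChunkCache {
    private final Map<Integer, byte[]> l1 = new HashMap<>(); // worker-local cache
    private final Map<Integer, byte[]> l2 = new HashMap<>(); // second cache layer
    private final Map<Integer, byte[]> s3 = new HashMap<>(); // origin store holding every chunk

    TieredChunkCache(int chunkCount) {
        for (int i = 0; i < chunkCount; i++) {
            s3.put(i, new byte[] {(byte) i}); // real chunks are 512 KB; one byte suffices here
        }
    }

    // Returns the name of the tier that served the chunk and promotes it into the faster tiers.
    String fetch(int chunkId) {
        if (l1.containsKey(chunkId)) {
            return "L1";
        }
        if (l2.containsKey(chunkId)) {
            l1.put(chunkId, l2.get(chunkId));
            return "L2";
        }
        byte[] chunk = s3.get(chunkId);
        if (chunk == null) {
            throw new IllegalArgumentException("unknown chunk " + chunkId);
        }
        l2.put(chunkId, chunk);
        l1.put(chunkId, chunk);
        return "S3";
    }

    // Simulates eviction from the worker-local cache, e.g. because of limited disk capacity.
    void evictFromL1(int chunkId) {
        l1.remove(chunkId);
    }

    public static void main(String[] args) {
        TieredChunkCache cache = new TieredChunkCache(4);
        System.out.println(cache.fetch(2)); // first access goes to the origin store
        System.out.println(cache.fetch(2)); // repeated access is served locally
        cache.evictFromL1(2);
        System.out.println(cache.fetch(2)); // after eviction the second layer answers
    }
}
```

The model makes the practical consequence visible: frequently fetched chunks stay in the fastest tier, while rarely used ones fall back to slower tiers.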
  • 48.
    Storing snapshots for low-latency retrieval at Lambda scale
    ▪ Resuming execution from snapshots with low latency is the final SnapStart stage. This involves loading the retrieved snapshot chunks into your function execution environment.
    ▪ Typically, only a subset of the retrieved snapshot is needed to serve an invocation. Storing snapshots as chunks lets Lambda optimize the resume process by proactively loading only the necessary subset of chunks.
    ▪ To achieve this, Lambda tracks and records the snapshot chunks that the function accesses during each function invocation.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
  • 49.
    Storing snapshots for low-latency retrieval at Lambda scale
    ▪ After the first function invocation, Lambda refers to this recorded chunk access data for subsequent invokes.
    ▪ Lambda proactively retrieves and loads this "working set" of chunks before they are needed for execution. This significantly speeds up cold-start latency.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
  • 50.
    SnapStart function performance
    ▪ The speed of restoring a snapshot depends on its contents, size, and the caching tier used. As a result, SnapStart performance can vary across individual functions.
    ▪ Frequently invoked functions are more likely to have their snapshots cached in the L1 layer, which provides the fastest retrieval latency.
    ▪ Infrequently accessed portions of snapshots for functions with sporadic invokes are less likely to be present in the L1 layer, resulting in slower retrieval latency from the L2 and S3 cache layers.
    ▪ Chunk access data for functions with more invocations is also more likely to be "complete", which speeds up snapshot restore latency.
    https://aws.amazon.com/blogs/compute/under-the-hood-how-aws-lambda-snapstart-optimizes-function-startup-latency/
  • 51.
    AWS SnapStart tiered cache
    • Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
    • After a certain number of invocations is reached, the cold start times become stable
    https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
  • 52.
    AWS Lambda under the Hood https://www.infoq.com/articles/aws-lambda-under-the-hood/ https://www.infoq.com/presentations/aws-lambda-arch/
  • 53.
    [Chart: Comparison between all approx. 100 vs last 70 cold starts of the Lambda function. Y-axis in ms; groups: with SnapStart w/o Priming, with SnapStart w/o Priming (last 70), with SnapStart with Priming, with SnapStart with Priming (last 70); series: p50, p75, p90, p99, p99.9, max]
    Due to the effect of the snapshot tiered cache, cold start times decrease with the number of invocations
    https://dev.to/aws-builders/aws-snapstart-part-17-impact-of-the-snapshot-tiered-cache-on-the-cold-starts-with-java-21-52ef
  • 54.
    AWS Lambda Profiler Extension for Java
    • The Lambda profiler extension allows you to profile your Java functions invoke by invoke, with high fidelity, and no code changes.
    • It uses the async-profiler project to produce profiling data and automatically uploads the data as HTML flame graphs to S3.
    https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
  • 55.
    AWS Lambda Implementation
  • 56.
    API Gateway Proxy Request Event JSON
    https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
  • 57.
    AWS Lambda Profiler Extension for Java
    https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html#api-gateway-simple-proxy-for-lambda-input-format
    https://github.com/aws/aws-lambda-java-libs/tree/main/experimental/aws-lambda-java-profiler
  • 58.
    Full Priming including APIGatewayProxyRequestEvent Deserialization
    This priming technique leads to an up to 25% reduction of the cold start times vs. DynamoDB request priming alone
    https://dev.to/aws-heroes/aws-lambda-profiler-extension-for-java-part-2-improving-lambda-performance-with-lambda-snapstart-4p06
  • 59.
    AWS SnapStart Pricing
    https://aws.amazon.com/lambda/pricing/?nc1=h_ls
  • 60.
    AWS SnapStart Challenges around uniqueness
    ▪ Avoid saving state that depends on uniqueness during initialization
      ▪ Avoid UUID uniqueSandboxId = UUID.randomUUID() or long envCreationTime = System.currentTimeMillis() in the Lambda constructor
    ▪ Use cryptographically secure pseudorandom number generators
      ▪ Software that always gets random numbers from /dev/random or /dev/urandom also maintains randomness with SnapStart.
      ▪ Use java.security.SecureRandom instead of new Random()
    ▪ Avoid logic relying on time-based caches
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart-uniqueness.html
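The difference between state captured during initialization and values generated per invocation can be sketched as follows. The class `UniquenessAwareHandler` is hypothetical and uses only the JDK; whether a given random source keeps its randomness across a snapshot restore should be checked against the AWS documentation linked above.

```java
import java.security.SecureRandom;

// Hypothetical handler illustrating where unique values should (and should not) be created.
class UniquenessAwareHandler {
    // Problematic under SnapStart: computed once during initialization, so it would be part of
    // the snapshot and identical in every execution environment restored from that snapshot.
    private final long initTimeId = System.nanoTime();

    private final SecureRandom secureRandom = new SecureRandom();

    // Unique values are drawn inside the handler method, i.e. per invocation.
    String handleRequest() {
        byte[] bytes = new byte[16];
        secureRandom.nextBytes(bytes);
        StringBuilder id = new StringBuilder();
        for (byte b : bytes) {
            id.append(String.format("%02x", b)); // two hex digits per byte
        }
        return id.toString();
    }

    // Returns the value fixed at initialization time; it never changes for this instance.
    long snapshotCapturedId() {
        return initTimeId;
    }

    public static void main(String[] args) {
        UniquenessAwareHandler handler = new UniquenessAwareHandler();
        System.out.println("per-invocation id: " + handler.handleRequest());
        System.out.println("per-invocation id: " + handler.handleRequest());
        System.out.println("id fixed at init:  " + handler.snapshotCapturedId());
    }
}
```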
  • 61.
    AWS SnapStart Challenges & Limitations
    ▪ SnapStart supports only the Java 11, 17 and 21 (Corretto), Python and .NET managed runtimes
    ▪ Deployment with SnapStart enabled takes more than 2 to 2.5 additional minutes
    ▪ Snapshot is deleted from cache if the Lambda function is not invoked for 14 days
    ▪ SnapStart currently does not support:
      ▪ Provisioned concurrency
      ▪ Amazon Elastic File System (Amazon EFS)
      ▪ Ephemeral storage greater than 512 MB
    https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html
  • 63.
    GraalVM Architecture
  • 64.
    GraalVM Ahead-of-Time Compilation
    Source: Oleg Šelajev, Thomas Wuerthinger, Oracle: "Deep dive into using GraalVM for Java and JavaScript" https://www.youtube.com/watch?v=a-XEZobXspo
  • 65.
    AOT vs JIT
    Source: "Everything you need to know about GraalVM" by Oleg Šelajev & Thomas Wuerthinger https://www.youtube.com/watch?v=ANN9rxYo5Hg
  • 66.
    GraalVM Native Image
    Promise: a Java function compiled into a native executable using GraalVM Native Image significantly reduces
    ▪ "cold start" times
    ▪ memory footprint
  • 67.
    Current Challenges with Native Executable using GraalVM
    ▪ AWS doesn't provide GraalVM (Native Image) as Java Runtime out of the box
    ▪ AWS provides Custom Runtime Option
  • 68.
    Custom Lambda Runtimes https://github.com/Vadym79/AWSLambdaGraalVMNativeImage
  • 69.
    [Chart: Y-axis in ms (0 to 4000); groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image; series: p50, p75, p90, p99, p99.9, max]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 70.
    [Chart: Y-axis in ms (0 to 25); groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image; series: p50, p75, p90, p99]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 71.
    [Chart: Y-axis in ms (0 to 1600); groups: w/o SnapStart, with SnapStart w/o Priming, with SnapStart with Priming, GraalVM 23 Native Image; series: p99.9, max]
    https://dev.to/aws-builders/aws-snapstart-part-13-measuring-warm-starts-with-java-21-using-different-lambda-memory-settings-160k
  • 72.
    Frameworks and libraries Ready for GraalVM Native Image https://www.graalvm.org/native-image/libraries-and-frameworks/
  • 73.
    GraalVM Native Image
    You can run into runtime errors (e.g. ClassNotFoundException) when configuration is missing
    https://github.com/Vadym79/AWSLambdaGraalVMNativeImage/blob/master/pure-lambda-graalvm-jdk-21-native-image/src/main/reflect.json
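The following sketch shows the kind of code that typically needs such configuration. The class `ReflectionDemo` and its nested `ProductMapper` are hypothetical. On a regular JVM the reflective lookup works without extra setup; in a native image, a class that is only reachable through a name computed at runtime generally has to be registered in the reflection metadata (for example a `reflect.json` like the one linked above), otherwise the lookup is expected to fail at runtime.

```java
import java.lang.reflect.Constructor;

class ReflectionDemo {
    // Hypothetical class that is only ever referenced by its name.
    static class ProductMapper {
        ProductMapper() {}

        String describe() {
            return "mapper";
        }
    }

    // Instantiates a class by name; static analysis cannot see which class this will be.
    static Object instantiate(String className) {
        try {
            Class<?> type = Class.forName(className);
            Constructor<?> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            // On the JVM this happens for unknown names; in a native image it can also
            // happen for existing classes that were not registered for reflection.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // The name is assembled at runtime, so the compiler sees no direct reference to ProductMapper.
        String name = ReflectionDemo.class.getName() + "$ProductMapper";
        ProductMapper mapper = (ProductMapper) instantiate(name);
        System.out.println(mapper.describe());
    }
}
```

Because such failures only show up when the affected code path runs, they are easy to miss without tests that exercise the native executable.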
  • 74.
    In particular, logging configuration in GraalVM Native Image is complex
  • 75.
    Log4j natively supports GraalVM Native Image since 2.25.0
    https://logging.staged.apache.org/log4j/2.x/graalvm.html
  • 76.
    Assisted Configuration with GraalVM Tracing Agent
    Run the GraalVM tracing agent during the execution of your tests
    https://www.graalvm.org/latest/reference-manual/native-image/metadata/AutomaticMetadataCollection/
    https://www.graalvm.org/latest/reference-manual/native-image/guides/configure-with-tracing-agent/
  • 77.
    GraalVM Conclusion
    ▪ GraalVM is really powerful and has a lot of potential
    ▪ GraalVM Native Image improves cold starts and memory footprint significantly
    ▪ GraalVM Native Image is currently not without challenges
      ▪ Complex GraalVM Native Image configuration files
      ▪ AWS Lambda Custom Runtime requires Linux executable only
      ▪ Building Custom Runtime requires some additional effort
        ▪ e.g. you need a scalable CI/CD pipeline to build the memory-intensive native image
      ▪ Build time is a factor
      ▪ You need to carefully test to avoid runtime errors
  • 78.
    Wrap up and personal suggestions
    ▪ With AWS SnapStart and GraalVM Native Image you can reduce cold start times of AWS Lambda with the Java 21 runtime to acceptable values
    ▪ If you're willing to accept slightly higher cold and warm start times for certain Lambda function(s) and solid priming is applicable -> use fully managed AWS SnapStart with priming
    ▪ If very high performance for certain Lambda function(s) is really crucial for your business -> go for GraalVM Native Image
  • 79.
    Powertools for AWS Lambda (Java) v2 https://docs.powertools.aws.dev/lambda/java/2.4.0/ https://github.com/Vadym79/AWSPowertoolsForLambdaJavaV2
  • 80.
    The Future of GraalVM https://blogs.oracle.com/java/post/detaching-graalvm-from-the-java-ecosystem-train
  • 81.
    Project Leyden
    The primary goal of this Project is to improve the startup time, time to peak performance, and footprint of Java programs.
    https://www.youtube.com/watch?v=teXijm79vno
    https://openjdk.org/projects/leyden/
  • 82.
    Word of Caution
    Re-measure for your use case! Even my examples might already produce different measurement results due to:
    ▪ Lambda Amazon Corretto Java 21 managed runtime minor version changes
    ▪ Lambda SnapStart snapshot create and restore improvements
    ▪ Firecracker microVM improvements
    ▪ GraalVM (major and minor version) and Native Image improvements
    ▪ There are still servers behind Lambda
    ▪ Java Memory Model impact (L or RAM cache hits and misses)
    ▪ Upgrading dependencies (e.g. AWS SDK for Java) tends to make them bigger, increasing the cold start time
  • 83.
    "AWS Lambda SnapStart" series
    The article series covers the why and what behind Lambda SnapStart and priming techniques, including measurements for the cold and warm starts with different settings for:
    ▪ Java 11
    ▪ Java 17
    ▪ Java 21
    https://dev.to/vkazulkin/series/24979
  • 84.
    "Spring Boot 3.4 / Quarkus 3 / Micronaut 4 application on AWS Lambda" series
    The article series covers different ways to write, run and optimize Spring Boot 3.4 / Quarkus 3 / Micronaut 4 applications on AWS Lambda using:
    ▪ Managed Java 21 Lambda runtime + SnapStart + priming
    ▪ GraalVM Native Image
    Cold and warm start time measurements are also provided
    https://dev.to/vkazulkin/series/30408
    https://dev.to/vkazulkin/series/31519
    https://dev.to/vkazulkin/series/26067
  • 85.
    "Data API for Amazon Aurora Serverless v2 with AWS SDK for Java" series
    The article series covers pure Java 21 cold and warm start time measurements and optimization techniques for the Amazon Aurora Serverless v2 database with JDBC and Data API
    https://dev.to/vkazulkin/series/26067
  • 86.
    "Serverless applications with Java and Aurora DSQL" series
    The article series covers pure Java 21 cold and warm start time measurements and optimization techniques (SnapStart + priming vs GraalVM Native Image) for the Amazon Aurora DSQL database
    https://dev.to/vkazulkin/series/32326
  • 87.
    Thank you