Python on Serverless: Strategies for Peak Performance

Track:
DevOps, Cloud, Scalable Infrastructure
Type:
Talk
Level:
Advanced
Duration:
30 minutes

Abstract

We deployed Python in serverless environments and quickly saw the performance limits. Serverless systems suffer from startup latency, memory overhead, and repeated object creation as their executions start almost from scratch. Those extra seconds made our user experience painfully slow—and we couldn’t afford it.

We looked into the performance of our flight search engine, ran profilers, and applied optimizations. We found real gains when tuning Python’s GC, reducing stop-the-world pauses, and introducing an asynchronous post-execution process that runs after the handler returns. We reduced execution time from 1.2 seconds to 300 milliseconds—a 4× speedup with just a few tweaks.
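To make the "post-execution" idea concrete, here is a minimal sketch of deferring non-critical work (metrics, cache writes) off the latency-critical path. The names `defer` and `run_deferred` are illustrative, and the flush-on-next-invocation pattern is one possible mechanism on a warm container, not necessarily the exact one used in the talk:

```python
# Hypothetical sketch: queue non-critical work and flush it outside the
# latency-critical path of a Lambda-style handler.
_deferred = []

def defer(fn, *args):
    """Queue non-critical work (metrics, cache writes) for later."""
    _deferred.append((fn, args))

def run_deferred():
    """Flush queued work; on a warm container this can run at the start
    of the next invocation instead of delaying the current response."""
    while _deferred:
        fn, args = _deferred.pop()
        fn(*args)

log = []  # stands in for a metrics sink

def handler(event, context):
    run_deferred()                       # drain work left by the previous call
    result = {"offers": event["n"]}      # the latency-critical path
    defer(log.append, ("metrics", event["n"]))  # kept off the hot path
    return result
```

The trade-off is that deferred work may be lost if the container is recycled before the next invocation, so this pattern only suits best-effort tasks.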

Understanding Python’s memory model and its runtime behavior was essential, and it’s something we’ll dive into during the talk. What we share is based on AWS Lambda, but it can be applied to any short-lived Python system—whether serverless or containerized.
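As one illustration of the kind of GC tuning involved, a minimal sketch using the standard `gc` module (the thresholds here are illustrative, not the values from the talk, and `handler` is a hypothetical Lambda-style entry point):

```python
import gc

def init():
    # Raise the generation-0 threshold (default 700) so collections run
    # less often during bursts of short-lived allocations.
    gc.set_threshold(50_000, 10, 10)
    # Move everything allocated during cold start into the permanent
    # generation so later collections never rescan those objects.
    gc.freeze()

init()  # runs once per cold start, at import time

def handler(event, context):
    # Pause collection on the hot path to avoid stop-the-world pauses
    # mid-request; re-enable it before returning.
    gc.disable()
    try:
        result = {"status": "ok"}  # placeholder for the real work
    finally:
        gc.enable()
    return result
```

`gc.freeze()` (Python 3.7+) is particularly effective in fork- or snapshot-style runtimes, where it also reduces copy-on-write memory growth.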

Another big gain came from replacing Pydantic with TypedDict for faster parsing, using Redis strategically to distribute operations, and restructuring code to eliminate duplicated transformations. Our web API processes thousands of flight offers—deserialize, enrich, transform—and every millisecond counts.
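The TypedDict substitution can be sketched as follows. Unlike a Pydantic model, a `TypedDict` is an ordinary `dict` with static-only types, so building one costs a plain dict construction with no per-field runtime validation; the `FlightOffer` fields here are invented for illustration:

```python
from typing import TypedDict

class FlightOffer(TypedDict):
    # Static type information only -- no runtime validation overhead.
    id: str
    origin: str
    destination: str
    price_cents: int

def parse_offer(raw: dict) -> FlightOffer:
    # Trusting upstream data: copy only the fields we need.
    return FlightOffer(
        id=raw["id"],
        origin=raw["origin"],
        destination=raw["destination"],
        price_cents=int(raw["price_cents"]),
    )

offer = parse_offer({"id": "OF1", "origin": "GRU", "destination": "LIS",
                     "price_cents": 54900, "extra": "ignored"})
```

The trade-off is losing validation at the boundary, which is reasonable when the payload comes from a trusted internal source and the parse sits on a hot path processing thousands of offers.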

The talk covers these optimization techniques, when they matter, and how we measured their impact. The main goal is that you walk away with tactics that help your system stay fast as it grows.