There are many advantages to building primarily on serverless architectures. The engineering team can focus on the business’s points of differentiation and deliver business value more quickly. If you’re not familiar with serverless computing, I recommend reading my blog article “What is Serverless Computing”.
The pay-per-use pricing model of function as a service makes it appear really cheap. For a lot of workloads it really is a lot cheaper, and maybe even free. However, there are also a lot of costs that can quickly add up and surprise businesses.
AWS Lambda Free Tier
AWS Lambda has a really interesting pricing model with an excellent free tier of 1M requests per month and 400,000 GB-seconds of compute time per month. And it doesn’t expire after 1 year like a lot of AWS services’ free tiers.
This means the Lambda pricing model is all about invocations and execution time. AWS Lambda currently charges $0.20 USD per 1 million invocations, which usually isn’t the expensive part of the bill.
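To make the request side of the bill concrete, here is a minimal Python sketch that uses only the two figures quoted above (1M free requests and $0.20 per million). The function name `monthly_request_cost` is made up for illustration, and execution time is deliberately left out.

```python
FREE_REQUESTS_PER_MONTH = 1_000_000   # free tier figure quoted above
PRICE_PER_MILLION_REQUESTS = 0.20     # USD, figure quoted above


def monthly_request_cost(invocations: int) -> float:
    """Return the request charge in USD after the free tier is applied."""
    billable = max(0, invocations - FREE_REQUESTS_PER_MONTH)
    return billable / 1_000_000 * PRICE_PER_MILLION_REQUESTS


print(monthly_request_cost(500_000))     # still inside the free tier
print(monthly_request_cost(11_000_000))  # 10M billable requests
```

Even at eleven million invocations a month the request charge stays at a couple of dollars, which is why the rest of this article concentrates on execution time.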
Pricing Execution Time
The amount of memory allocated to a function is configurable and has a direct impact on the cost of the function and how fast it will execute.
AWS bills execution time in 100 ms increments. AWS also provides faster network access and more powerful CPUs as the configured memory goes up. The price increases with the amount of configured memory, though, regardless of how much is actually consumed during execution.
For example, 128 MB is the cheapest option but also the slowest, which means it will take longer to respond. 3 GB is the fastest option at time of writing, but costs over 20x as much for every 100 ms.
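The arithmetic behind that comparison can be sketched as follows. This is a rough estimate, not an official calculator: the per-GB-second rate is an assumption (it is not stated in this article and varies by region and over time), 3008 MB is assumed to stand in for the 3 GB setting, and `execution_cost` is a name invented for this example.

```python
import math

# Assumption: rate per GB-second, not taken from this article.
PRICE_PER_GB_SECOND = 0.0000166667
BILLING_INCREMENT_MS = 100  # billing granularity described above


def execution_cost(memory_mb: int, duration_ms: float, invocations: int = 1) -> float:
    """Estimate execution cost in USD, ignoring the free tier."""
    # Round each invocation up to the next billing increment.
    billed_ms = math.ceil(duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND


small = execution_cost(128, 100)
large = execution_cost(3008, 100)  # assumed size of the largest setting
print(f"128 MB:  ${small:.10f} per 100 ms")
print(f"3008 MB: ${large:.10f} per 100 ms")
print(f"ratio:   {large / small:.1f}x")
```

Because of the rounding step, a function that finishes in 101 ms is billed the same as one that takes 200 ms, so small speedups only pay off when they cross an increment boundary.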
So, how do we reduce our AWS Lambda Bill?
Optimizing execution time is really important: we need to make the most efficient use of resources and finish requests as soon as possible. At the same time, API responses don’t necessarily have to finish in 250 ms. It might be okay to reduce a Lambda function from, say, 1 GB to 512 MB of RAM and, provided the billed duration stays about the same, halve the execution bill.
A Lambda function should never sit idle waiting for something, because you pay for that time. Instead, try to use triggers, queues, or Step Functions to decouple the work.
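One way to reason about that trade-off is to measure the function at a few memory sizes and pick the cheapest one that is still fast enough. The sketch below does this in Python with entirely hypothetical measurements and a hypothetical latency budget; cost is expressed in relative units (memory multiplied by billed 100 ms increments), so no price is needed.

```python
import math

# Hypothetical measurements: average duration (ms) seen at each memory size (MB).
measured_ms = {1024: 180, 512: 190, 256: 420, 128: 900}
LATENCY_BUDGET_MS = 500     # hypothetical: slowest response we will accept
BILLING_INCREMENT_MS = 100  # billing granularity described earlier


def relative_cost(memory_mb: int, duration_ms: float) -> int:
    """Cost in arbitrary units: memory multiplied by billed increments."""
    return memory_mb * math.ceil(duration_ms / BILLING_INCREMENT_MS)


# Keep only the settings that meet the latency budget, then take the cheapest.
candidates = {
    memory: relative_cost(memory, duration)
    for memory, duration in measured_ms.items()
    if duration <= LATENCY_BUDGET_MS
}
cheapest = min(candidates, key=candidates.get)
print(f"cheapest acceptable setting: {cheapest} MB ({candidates[cheapest]} units)")
```

With these made-up numbers, 512 MB costs half as much as 1024 MB because both land in the same number of billed increments, while 256 MB is more expensive than 512 MB despite the lower memory price because it runs for more than twice as long.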
If you do need to wait for things, avoid awaiting each call one after another. Start independent operations together, whether with callbacks or by awaiting a group of promises, so a few things can execute at “the same time.”
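Whatever style you use, what matters is that independent operations are started together so their waiting time overlaps. Here is a small Python sketch of the idea using `asyncio.gather`; the `fetch` function is a stand-in that only sleeps, and the names and delays are invented for illustration.

```python
import asyncio
import time


async def fetch(name: str, delay_s: float) -> str:
    """Stand-in for an I/O call (database, HTTP API); it only sleeps."""
    await asyncio.sleep(delay_s)
    return name


async def sequential() -> list:
    # Each call waits for the previous one: total is roughly the sum of delays.
    return [await fetch("user", 0.1), await fetch("orders", 0.1), await fetch("prices", 0.1)]


async def concurrent() -> list:
    # All three calls start together: total is roughly the longest delay.
    results = await asyncio.gather(
        fetch("user", 0.1), fetch("orders", 0.1), fetch("prices", 0.1)
    )
    return list(results)


def timed(coro_fn):
    """Run a coroutine function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


seq_result, seq_elapsed = timed(sequential)
con_result, con_elapsed = timed(concurrent)
print(f"sequential: {seq_elapsed:.2f}s, concurrent: {con_elapsed:.2f}s")
```

Both versions return the same data, but the concurrent one spends less wall-clock time waiting, and wall-clock time is what the function is billed for.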
If we’re using a queue like SQS, we should let messages accumulate and then process as many as possible in each invocation, rather than invoking a function for every single message.
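A batch-oriented handler might look roughly like the Python sketch below. The `Records`, `body`, and `messageId` fields follow the shape of the SQS event that Lambda delivers; `process_message` and the `order_id` field are hypothetical business logic. The batch size itself is configured on the event source mapping, not in code, and the `batchItemFailures` response is only honored when partial batch responses (`ReportBatchItemFailures`) are enabled on that mapping.

```python
import json


def process_message(body: dict) -> None:
    """Hypothetical business logic; raises on input it cannot handle."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event: dict, context: object) -> dict:
    """Process every SQS record delivered in this single invocation."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Report only the failed message so the others are not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Local usage example with a hand-written event of three messages.
sample_event = {
    "Records": [
        {"messageId": "1", "body": json.dumps({"order_id": 42})},
        {"messageId": "2", "body": json.dumps({"customer": "abc"})},
        {"messageId": "3", "body": json.dumps({"order_id": 43})},
    ]
}
result = handler(sample_event, None)
print(result)
```

Handling the whole batch in one invocation means one request charge and one rounded-up duration instead of three.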