Also, if you're using Lambda to ingest data, consider using one Lambda as a dispatcher that pushes to SQS and another hanging off that queue for processing. That lets you control concurrency better, since you can set the batch size and concurrency on the SQS trigger.
Doing that makes your critical path super fast, because you're doing no processing in it. It works great if the incoming data needs no response from the back-end.
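A minimal sketch of the dispatcher half, in Python with boto3. The queue URL and function names here are made up for illustration, and the SQS client is passed in rather than created inline, so the shape is testable without AWS:

```python
import json


def make_dispatcher(sqs_client, queue_url):
    """Build a Lambda handler that forwards the raw event to SQS untouched.

    sqs_client is anything with boto3's SQS.Client.send_message signature
    (e.g. boto3.client("sqs")). No processing happens here, which is what
    keeps the critical path fast.
    """
    def handler(event, context):
        sqs_client.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps(event),
        )
        # 202 Accepted: we queued it; the second Lambda does the real work.
        return {"statusCode": 202}
    return handler
```

The processing Lambda then hangs off the queue via an SQS event source mapping, which is where the batch size and maximum concurrency knobs live.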
This confused me. Wouldn't 50 / 0.5 be 100?
Remember that PC (provisioned concurrency) is nothing but the total number of Lambda instances that AWS keeps warm and ready to go.
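The exact figures being discussed aren't quoted upthread, but the usual source of this confusion is that steady-state concurrency is a product, not a quotient (Little's law): warm instances needed ≈ requests per second × average duration in seconds. A quick sketch, assuming the numbers meant 50 req/s at 0.5 s per invocation:

```python
def required_concurrency(requests_per_sec: float, avg_duration_s: float) -> float:
    # Little's law: average number in flight = arrival rate * time in system.
    return requests_per_sec * avg_duration_s


# 50 req/s, each taking 0.5 s, means ~25 invocations in flight at once,
# not 50 / 0.5 = 100.
print(required_concurrency(50, 0.5))  # -> 25.0
```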
In my experience, they're clumsy and complex to set up and manage, you can't easily have CI/CD (I still don't know how you get the code of a Lambda to and from git?!), etc., etc...
Is it just me? Am I alone?
:)
Seriously, the reason you use lambdas is that they're small self-contained chunks of functionality that you need to scale out.
Let's take an easy example: you want to ingest tracking metrics.
You can write this as a server or as a Lambda. For a server, you'd listen on port 80 for POST or GET requests, then take the payload and write it to your data store. You can do this pretty easily in express/node.
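The thread says express/node; here's the same shape sketched with Python's stdlib instead so it's self-contained, with a plain list standing in for whatever data store you'd really use:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

metrics = []  # stand-in for your real data store


def ingest(raw: bytes, store: list) -> None:
    """Parse one tracking payload and write it to the data store."""
    store.append(json.loads(raw))


class IngestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        ingest(self.rfile.read(length), metrics)
        self.send_response(204)  # stored; nothing to say back
        self.end_headers()


if __name__ == "__main__":
    # One process, one port. The questions below are all about what
    # happens to this process under load.
    HTTPServer(("", 80), IngestHandler).serve_forever()
```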
Now the question is: how do you scale that up to a few hundred requests per second? How concurrent is your express app? Are you going to run out of memory on your server because you didn't set the TCP buffer sizes for server-sized usage? How do you take your service offline for upgrades/updates? Are you going to crush your database/datastore with hundreds of writes a second? If your ISP barfs, are you OK with losing data for that period? What happens when all of your clients try to connect at once because of a regional power failure?
From a code point of view, what if some random change in your code somewhere causes your ingestion stuff to fail?
Seriously, you don't have to deal with any of this crap if you don't want to. At 1 request every few seconds, who cares what you use. But once you start scaling, your problems with a server-side solution become more and more work to handle.
Again, if you don't need scale, don't use AWS. You can always do it cheaper with a server from lowendbox.com.
As for vendor lock-in, well, it's trivial to design your Lambdas so you can drop them into a server-side solution (or vice versa). And when you think about vendor lock-in, you also have to consider that a bespoke solution is lock-in too, with you as the vendor.
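One way "trivial to drop into a server-side solution" can look in practice: keep the business logic in a plain function, and make the Lambda handler a thin adapter you could swap for an express/Flask route. A sketch with made-up names:

```python
import json


def record_metric(payload: dict) -> dict:
    """Vendor-neutral core: knows nothing about AWS, HTTP, or queues.

    Real code would write to a data store; here we just return a receipt.
    """
    return {"stored": True, "fields": sorted(payload)}


def lambda_handler(event, context):
    """AWS-shaped adapter around the same core function.

    To leave Lambda, you rewrite only this wrapper as an HTTP route and
    keep record_metric() as-is.
    """
    result = record_metric(json.loads(event["body"]))
    return {"statusCode": 200, "body": json.dumps(result)}
```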
I guess what initially converted me was dealing with a data flow that hovered at about 10 requests/min and then every hour jumped to 10k requests/min. Being able to do that without thinking about scaling or cost really changed the way I imagined data flows in the backend.
I guess now that I have boilerplate code that works and is quickly deployed and tested with GH Actions, I consider it a no-brainer for most workflows.
I had to go through a couple of attempts at it before settling on building and pushing Docker images via GitHub Actions. Now deployment is a breeze.
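For reference, the container side of that can be as small as this sketch, assuming a Python function whose handler lives in `lambda_function.py` (swap the base image tag and handler name for your actual runtime):

```dockerfile
# AWS-provided Lambda base image for Python; other runtimes have their own tags.
FROM public.ecr.aws/lambda/python:3.12
COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN pip install -r requirements.txt
COPY lambda_function.py ${LAMBDA_TASK_ROOT}
# "module.function" that Lambda invokes.
CMD ["lambda_function.handler"]
```

The GitHub Actions side is then just a docker build/push to ECR followed by `aws lambda update-function-code --image-uri ...`.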
However, I don't know if using Docker images is optimal in terms of performance.
I have this setup myself; let me dig it up and I'll reply in a later post.