We're two programmers who have worked in core/platform engineering roles for most of our working lives. During that time, one of the main problems we've solved time and time again is to let people run their ad-hoc jobs and scripts on remote compute without hassle.
To solve this once and for everyone, we made Meadowrun, an open source tool that automates the tedious details of running Python code on cloud VMs. It runs in your AWS or Azure account; nothing else is required.
No need to mess around with containers, SSH into remote machines, copy code across, set up images or look up instance types that sound like Starbucks orders ("t3.venti.oatmilk.latte") and what they cost.
All with the same experience as you'd have running on your laptop - just change the code or dependencies locally and run - Meadowrun takes care of the rest.
We welcome any and all feedback!
But I think there's scope for both; data jobs need EC2.
Any plans to implement Fargate as an option? You mention the limitations of Lambda and Fargate pretty much takes care of all of those, without needing to provision EC2.
Fargate is more of a maybe for us as it doesn't seem to offer a ton of advantages over EC2. It still takes about 30 seconds to launch a Fargate job, and as far as I can tell there's no way to "keep an instance around". With Meadowrun-on-EC2 or Lambda, when you run two jobs one after another with the same libraries and the same code (or even slightly different code), there's almost 0 overhead for running the second job. So Fargate is only slightly better for a cold start (30s compared to 45-60s for an EC2 instance in my experience), and significantly worse for a warm start (still 30s). And that's the core experience we're trying to make amazing--run some code, look at the results/data, tweak it, run it again, repeat.
Meadowrun is taking care of all the messy details of provisioning and managing the EC2 instances, so Meadowrun-on-Fargate won't be any easier to use than Meadowrun-on-EC2, and I don't see a ton of advantages to make up for the inability to get a warm start on Fargate. That said, AWS is super dynamic, so we're definitely keeping an eye on Fargate.
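To put rough numbers on the cold-start vs. warm-start argument, here's a back-of-the-envelope sketch in plain Python. It uses only the timings quoted above (about 30s per Fargate launch, 45-60s for an EC2 cold start) and treats "almost 0 overhead" on a warm EC2 instance as exactly 0, which is an idealisation. The function name and the dictionaries are made up for illustration and are not part of Meadowrun.

```python
def total_overhead_seconds(runs: int, cold_start_s: float, warm_start_s: float) -> float:
    """Startup overhead for `runs` back-to-back jobs: one cold start, then warm starts."""
    if runs <= 0:
        return 0.0
    return cold_start_s + (runs - 1) * warm_start_s


# Timings taken from the comment above. Fargate has no warm start, so every
# launch costs the same ~30s. The EC2 warm start of 0s is an assumption that
# stands in for "almost 0 overhead".
FARGATE = {"cold_start_s": 30.0, "warm_start_s": 30.0}
EC2_BEST = {"cold_start_s": 45.0, "warm_start_s": 0.0}
EC2_WORST = {"cold_start_s": 60.0, "warm_start_s": 0.0}

if __name__ == "__main__":
    # Compare cumulative startup overhead over an edit-run-tweak loop.
    for runs in (1, 2, 3, 5):
        print(
            f"runs={runs}: "
            f"Fargate={total_overhead_seconds(runs, **FARGATE):.0f}s, "
            f"EC2 (45s cold)={total_overhead_seconds(runs, **EC2_BEST):.0f}s, "
            f"EC2 (60s cold)={total_overhead_seconds(runs, **EC2_WORST):.0f}s"
        )
```

Under these assumptions Fargate only wins on the very first run. By the second run it is level with the slowest EC2 cold start, and from the third run onwards EC2 is ahead in every case, which is the tweak-and-rerun loop described above.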
https://news.ycombinator.com/item?id=28191450
We got quite a few comments on Hacker News, but unfortunately we didn't see a lot of uptake.
If you want to connect, I'm danbmil99 at gmail etc
Another pro is if your workflows aren't already container-based, not running on Kubernetes means we can build your containers for you on Meadowrun so you don't need to e.g. install Docker locally to get your libraries/code running on Meadowrun (it's hard to build containers in Kubernetes itself).
I mentioned this in another comment, but this also means we can e.g. use AWS Lambda as the compute layer, or if you have software that's hard to containerize, you can even use a custom AMI. (Both of these are features on the roadmap, so this is a bit theoretical at this point.)
The biggest con is probably that a lot of people already use Kubernetes, especially if they have an on-prem/hybrid deployment, or maybe if they have services with e.g. a load balancer that interact with their ad-hoc/batch jobs.
We are planning on adding the ability for Meadowrun to target Kubernetes as well, so Kubernetes takes care of the resource scheduling, but you still get the benefits of Meadowrun--a really simple API for running ad-hoc/batch jobs.
[1] http://pywren.io
We've looked at the code for PyWren and in our opinion it's not practically usable as-is, even if you wanted to target only Lambda. Also, we initially focussed more on the deployment aspect (i.e. getting environment + code onto the target machines reproducibly) and on EC2, because we figured that to make this general enough, people would need an escape hatch anyway if Lambda didn't cut it for some reason.
[1] https://github.com/Vaishaal/numpywren
[2] "From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers" https://www.usenix.org/conference/atc19/presentation/fouladi
i4i instances recently launched. so much fast local disk. so much bandwidth to s3. needs more data processing.
subscribing to the commits on github.
Another signal of cloud rot: I see myself and my peers migrating away from AWS to smaller, less complicated, cheaper providers like Linode, but also Hetzner and Wasabi.
Nowadays the cloud fatigue is higher than the burden of self hosting your services.