I’m sure there are a bunch of things that make it the right choice for Stripe. Obviously if you just have too many things to run at a time and a dev laptop can’t handle it then it’s a dealbreaker. What’s the size of the cloud instances you have to run on?
I don't think there's confusion. I only have total access when the VM is provisioned, but I need to update the dev machine constantly.
Part of what makes a VM work well is that you can make changes and they're sticky. Folks will edit stuff in /etc, add dotfiles, add little cron jobs, build weird little SSH tunnels, whatever. You say "I can know versions", but with a VM, I can't! Devs will run updates locally.
As the person who "deploys" the VM, I'm left in a weird spot after you've made those changes. If I want to update everyone's VM, I blow away your changes (and potentially even the branches you're working on!). I can't update anything on it without destroying it.
In contrast, the dev servers update constantly. There's a dozen moving parts on them and most of them deploy several times a day without downtime. There's a maximum host lifetime and well-documented hooks for how to customize a server when it's created, so it's clear how devs need to work with them for their customizations and what the expectations are.
I guess it's possible you could have a policy about when the dev VM is reset and get developers used to it? But I think that would be taking away a lot of the good parts of a VM when looking at the tradeoffs.
> What’s the size of the cloud instances you have to run on?
We have a range of options devs can choose, but I don't think any of them are smaller than a high-end laptop.
In terms of needing to reset, it’s just a matter of git branch, push, reset, merge. In your world that sync complexity happens all the time, in mine just on reset.
Just to be clear, I think it’s interesting to have a healthy discussion about this to see where the tradeoffs are. Feels like the sort of thing where people try to emulate you and buy themselves a bunch of complexity where other options are reasonable.
I have no doubt Stripe does what makes sense for Stripe. I’d also wager that on balance it’s not the best option for most other teams.
PS thanks for chiming in. I appreciate the extra insights and context.
They do, but I can see those changes if I'm helping debug, and more importantly, we can set up the most important parts of the dev processes as services that we can update. We can't ssh into a VM on your laptop to do that.
For example, if you start a service on a Stripe machine, you're sending an RPC to a dev-runner program that allocates as many ports as are necessary, updates a local envoy to make it routable, sets up a systemd unit to keep it running, and so forth. If I need to update that component, I just deploy it like anything else. If someone configures their host until that dev runner breaks, it fails a healthcheck and that's obvious to me in a support role.
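To make that flow concrete, here's a rough sketch of the shape of such a runner. This is not Stripe's code: `DevRunner`, `render_unit`, the port range, and the in-memory `routes` table (standing in for the local proxy config) are all made up for illustration. It only shows the sequence of allocating ports, recording a route, rendering a systemd unit, and running a healthcheck that notices drift.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def render_unit(name: str, command: str, ports: list[int]) -> str:
    """Render a minimal systemd unit that keeps the service running."""
    port_env = ",".join(str(p) for p in ports)
    return (
        "[Unit]\n"
        f"Description=dev service {name}\n"
        "\n"
        "[Service]\n"
        f"ExecStart={command}\n"
        f"Environment=DEV_PORTS={port_env}\n"  # DEV_PORTS is a hypothetical variable name
        "Restart=always\n"
        "\n"
        "[Install]\n"
        "WantedBy=multi-user.target\n"
    )


@dataclass
class Service:
    name: str
    command: str
    ports: list[int]
    unit_text: str


@dataclass
class DevRunner:
    """Hypothetical stand-in for the dev-runner described above."""

    port_range: range = range(20000, 20100)  # assumed range, illustration only
    services: dict[str, Service] = field(default_factory=dict)
    # Stands in for the local proxy's routing config (service name -> ports).
    routes: dict[str, list[int]] = field(default_factory=dict)

    def _allocate_ports(self, count: int) -> list[int]:
        in_use = {p for svc in self.services.values() for p in svc.ports}
        free = [p for p in self.port_range if p not in in_use]
        if len(free) < count:
            raise RuntimeError("not enough free ports")
        return free[:count]

    def start_service(self, name: str, command: str, port_count: int = 1) -> Service:
        """Handle a 'start this service' request end to end."""
        if name in self.services:
            raise ValueError(f"{name} is already running")
        ports = self._allocate_ports(port_count)
        service = Service(name, command, ports, render_unit(name, command, ports))
        self.services[name] = service
        self.routes[name] = list(ports)  # make it routable
        return service

    def healthcheck(self) -> list[str]:
        """Return a list of problems; an empty list means healthy."""
        problems = []
        for name, svc in self.services.items():
            if self.routes.get(name) != svc.ports:
                problems.append(f"{name}: route missing or stale")
        return problems
```

If someone hand-edits the routing out from under the runner (here, `runner.routes.pop("api")` after starting a service named `api`), `healthcheck()` stops returning an empty list, which is the "obvious to me in a support role" part.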
> Just to be clear, I think it’s interesting to have a healthy discussion about this to see where the tradeoffs are. Feels like the sort of thing where people try to emulate you and buy themselves a bunch of complexity where other options are reasonable.
100% agree! I think we've got something pretty cool, but this stuff is coming from a well-resourced team; the group keeping the infra for it all running is larger than many startups. There are tradeoffs involved: cost, user support, and flexibility on the dev side (e.g. it's harder to add something to our servers than to test out a new kind of database on your local VM) come immediately to mind, but there are others.
There are startups doing lighter-weight, legacy-free versions of what we're doing that are worth exploring for organizations of any size. But remote dev isn't the right call for every company!