The basics of Kubernetes are actually pretty easy, despite what people say; I got a fairly simple cluster running with little pain. But it doesn't take long before you want multiple clusters, or VLAN peering, or customised DNS, or... and that is when it becomes complex.
What will fly.io do? Probably what everyone else does: start simple, become popular, and then cave in to the deluge of feature requests until they end up with an Azure/AWS/GCP clone. If it stays simple, lots of people will say you'll outgrow it quickly and need something else; if it adds functionality, it loses its USP of making infrastructure easy.
I think perhaps the abstractions are the problem: if you're abstracting at the same level as everyone else (docker images, orchestration, etc.), I don't understand how it can ever work differently.
To make my point: the very first comment below (above?) is about container formats, a really fundamental thing that noobs are not likely to know about; they will just immediately hit some kind of error.
What we did instead was build low-level primitives, then build opinionated PaaS-like magic on top of them.
If you're running a Phoenix app, `fly launch` gets you going, then `fly deploy` gets you updated.
If you want to skip the PaaS layer and do something more intense, you can use our Machines API (or use Terraform to power it) and run basically anything you want: https://fly.io/docs/reference/machines/
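Concretely, the happy path looks something like this (a sketch only; app and image names are illustrative, and exact CLI behavior may have changed since this thread):

```shell
# Detect the app type, generate fly.toml, and do a first deploy
fly launch

# Ship a new version after changes
fly deploy

# Or skip the PaaS layer and run an arbitrary image as a raw VM
# via the Machines layer (image and name are made up for illustration)
fly machine run nginx --name my-edge-proxy
```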
We are very, very different from k8s. In some ways we're lower level, with more powerful primitives.
We probably won't build an AWS clone. I don't think devs want that. I also don't think devs want a constrained PaaS that only works for specific kinds of apps.
I think what devs want is an easy way to deploy a boring app right now, and underlying infrastructure they can use to solve bigger problems over time.
I also don't want to set up my own log aggregator, grafana, and prometheus/alert manager, but for a quick "show everyone your app", I don't need those. I can add that harder crap later when the app shows promise and I actually need to debug performance.
No, mrkurt will not cave, I can guarantee you that. Fly will be a platform that says no to feature requests that don't make sense for their customer base.
I have no affiliation with Fly, other than I've used it on and off since the beginning of the platform's existence. They're a veteran team that knows how to build platforms. I definitely trust them to go in the right direction with their roadmap, and all my new projects go on Fly.
I concur.
Fly is not your typical startup with dreams of becoming the next big corp monster.
They are just a bunch of talented people with a vision having fun making cool stuff.
Then, Fly is not for such an application. Just not yet. I mean, we wouldn't buy a snowboard and complain we couldn't go skiing. Different tools.
The point really is: for the kinds of things Fly is capable of (and, to an extent, other NewCloud services like railway.app, render.com, replit.com, convex.dev, workers.dev, deno.com, pages.dev, vercel.com, temporal.io, etc.), you're better off NOT using AWS/GCP/Azure. I've certainly found that to be true for whatever toy apps I build.
There's certainly a limit, but it makes me so sad that developers see the current state of orchestration and say “welp, it's a complex problem, guess this is as good as it gets” (not you specifically, but it's a common sentiment on HN.)
Sure, there will always be use cases that require getting down to a lower level, but there's definitely space for reducing complexity for common use cases.
The question is whether customers need or want that "functionality".
https://docs.podman.io/en/latest/markdown/podman-manifest-pu...
Wonder why they don't use OCI format.
# podman inspect quay.io/my/image:latest | jq ".[].ManifestType"
"application/vnd.oci.image.manifest.v1+json"
# podman push --format v2s2 quay.io/my/image:latest registry.fly.io/my-app:latest
Your server doesn't have a static IP for outgoing requests, so to use it with RDS, you can't just open up a port on the RDS side. (They want you to set up your own proxy) https://community.fly.io/t/set-get-outgoing-ip-address-for-w...
The NFS kernel module isn't installed, so you can't use EFS. (They suggest some 3rd party userland tool)
They expect you to set up VPN access with WireGuard for any connection to your containers. You can't just SCP your files to a volume. It's so much more hassle than connecting with kubectl cp or scp, especially if you're hoping to script things.
All that said, I'm happy to see competition in the "we'll run your docker image" space.
It's a rough edge, to be sure! I just wouldn't want to leave it as "we think the status quo is the right way to handle getting files to and from instances".
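For reference, the least-hassle path today is roughly the following (a hedged sketch; the sftp subcommand and flags are from memory of flyctl, so verify against `fly ssh --help` before scripting around it):

```shell
# Open a shell on a running instance; flyctl handles the WireGuard
# tunnel for you under the hood
fly ssh console -a my-app

# Pull a file off an instance without setting up a VPN yourself
# (subcommand name assumed; app and paths are illustrative)
fly ssh sftp get /data/dump.sql ./dump.sql -a my-app
```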
[processes]
web = "bundle exec puma -C config/puma.rb"
worker = "config/cron_entrypoint.sh"
And the shell script looks like this:
#!/bin/sh
printenv | grep -v "no_proxy" >> /etc/environment
cron -f
Although I would say that cron isn't a great solution for containerized apps on most platforms; it seems like scheduled processes need a rethink for today's infra.
If you just want to run a container in multiple regions with anycast, Fly is really the best option out there IMHO. Nothing comes close.
There are some rough edges for certain use cases but they keep polishing the service and the DX keeps getting better.
Personally the only features I'm missing today are:
- PG backups/snapshots. AFAIK these are coming in the form of virtual disk snapshots.
- Scale apps from zero to, say, 100 VMs like Cloud Run does. There's some autoscaling right now, and the Machines API, but it still needs more polish, especially for certain use cases like concurrent CPU-heavy tasks (video encoding, etc). AFAIK some form of this is also coming in the next months.
I am very skeptical of this claim.
Just curious about other people's definitions, as I imagined Fly.io as more of a self-serve BaaS (mostly for web applications)
- startups, small teams, no devops.
- horizontally scalable monoliths.
- no complex infra needs.
Haven't used fly.io, but I'm strongly considering migrating over from Heroku, and the above is roughly what I'd say is a good fit for Heroku, so hopefully it's the same on fly.io.
Choose a docker image and just docker-compose up your application.
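i.e. something as small as this (image name and port are illustrative):

```yaml
# docker-compose.yml -- minimal single-service setup, names made up
services:
  web:
    image: ghcr.io/example/my-app:latest
    ports:
      - "8080:8080"
    restart: unless-stopped
```

then `docker-compose up -d` on any box with Docker installed.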
If you outgrow that, you might as well switch to Kubernetes and AWS/GCP/Azure.
But if you do, no amount of Kubernetes on the old-school cloud providers is going to get you there. You'll run into the hard problems Fly solves for.
Unless data is geographically sharded as well, is there really a benefit? Collaborative apps perhaps?
If you're trying all the things, https://railway.app/ is also a good option.
(I think I read Fly is planning to add scale to zero to their normal service, they currently have it at the api level with “fly machines”)
The other thing I want is a completely hands off managed DB with point in time restore. None of those three have that yet. Crunchy Data looks perfect but are not “in cloud” with them, only being on AWS/Azure/GCP. If one of those three added that capability in house I would probably just go for it.
koyeb was too hard to set up for me. railway was easier, but the images were extremely unstable.
fly was easiest to set up, esp. DNS for custom domains and Let's Encrypt, and works fine with docker images. there's no GitHub app, but the docs for a deploy action were good enough.
I would also be remiss not to mention Coherence (withcoherence.com) [I'm a cofounder] where we're trying to deliver some of the same magic as the best in class PaaS's above, but we are running your workloads in your own AWS or GCP account. We're really excited about the potential future of a great developer experience that can be delivered as a service instead of rebuilt over and again by platform teams in-house.
It has been pretty great and painless. I have my own Docker servers and so on, but I don't have a local registry set up, so getting images moved around and all that hassle was going to be annoying.
Made a free fly.io account, did the `flyctl deploy` after making the config file and it just worked moments later. Really a nice flow.
Not sure yet if I would use it for anything else, but this was nice and easy, so it's definitely on my list of things to check for new projects.
There's one thing that's outright frustrating: no support for easy environment variable management. Yes, you can add secrets, but it's hard to read them back. Not everything is a secret, e.g. log level.
https://fly.io/docs/reference/configuration/#the-env-variabl...
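Per that config doc, non-secret values can live in an [env] section of fly.toml, in plain sight and under version control (the keys here are just examples):

```toml
# fly.toml -- readable, versioned config; reserve `fly secrets` for
# values that actually need to be write-only
[env]
  LOG_LEVEL = "info"
  FEATURE_FLAGS = "beta_ui"
```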
You can't read secrets back once you've set them. That's because we can't read secrets back once we've set them --- at least, our API can't. The API has write-only access to our secret storage.
Environment is not code. And Heroku had it figured out.
And mind you, Fly.io is trying to sell itself as natural Heroku competitor.
We didn't know about Fly.io when we chose GCP for this setup.
The initial setup on GCP was exceedingly painful. We used CloudRun for our app server, with the value prop being that "it just works". It didn't. Our container failed to start with zero logs from our servers. Stackdriver was of no help. Eventually we found a Stackoverflow thread revealing that CloudRun didn't like Docker images built from Macs. As always, GCP's official docs and resources are incoherent. GCP docs address a hundred things you don't care about, and the signal-to-noise percentage is in the low teens, if we're being generous. We had to chase down half a dozen bureaucratic things to get our CloudRun app to see and talk to CloudSQL. Apparently with Fly.io, you just run a command to provision Postgres, and pass in an environment variable to your app.
We consoled ourselves that GCP was difficult to set up but would now be set-and-forget. This is also a lie. This week we saw elevated and unexplained 5xx. First, CloudRun randomly disconnected from CloudSQL. While AWS measures reliability in terms of 9s, GCP DevRel's response to this bug was that it's a distributed system and therefore acceptable that things fail at a reliably human-reproducible 1%+ error rate. Yesterday we saw botnet traffic scanning for vulnerabilities on our app. That happens to anyone on the web, not inherently GCP's fault. We have GCP's Cloud Load Balancer set up, but it's not very smart. We were able to manually block specific IP addresses, but it's nowhere near as effective as Cloudflare. Not a fan of Cloudflare the company, but their products address a need. The botnet somehow knocked over our "Serverless VPC connector" to CloudSQL. Basically, it's a proxy server you're forced to set up because CloudRun can't actually talk to CloudSQL directly. All the auto-scaling claims of GCP's serverless are diminished if we're forced to introduce a single point of failure like this in the loop. That serverless VPC connector requires a minimum of 2+ VMs, so the scale-to-zero of CloudRun is no more.
Our experience with GCP is constantly having to come up with workarounds and absorb their accidental complexity. This should not be the customer's problem. For example, CloudSQL doesn't have an interface to query your databases. If you use a private IP for security, you can't even use GCP's command line tools to access it. We found out that GCE VMs are automatically networked to talk to CloudSQL, so we ended up creating a "bastion" GCE VM instance and setting up the Postgres CLI tools in order to do ad-hoc queries of our DB state. For this we just needed the cheapest VM, but GCP makes even that difficult. As for Stackdriver, it's still an annoyingly painful UI.
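For comparison, the Fly flow alluded to above is roughly this (a sketch; the db/app names are made up and exact flags may differ, so check `fly postgres --help`):

```shell
# Provision a Postgres cluster (names and region are illustrative)
fly postgres create --name my-db --region iad

# Wire it to the app; attach sets a DATABASE_URL secret on the app
# (flag spelling assumed from memory)
fly postgres attach my-db --app my-app
```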
* support for abstract apps, not only HTTP/web apps like heroku, let's say I want to deploy a SIP app
* support for HTTP/2 and potentially HTTP/3
If they do support these two I would say it's enough to be considered Heroku killer.
We do support HTTP/2.
Yes, fly employees, I will file a bug somewhere - or email me.
The rest is just deploying and running containers. There are lots of ways to do that. I loved using Google Cloud Run a few years ago. Stupidly easy to get started with and flexible enough for many things. With some service discovery on top, it's perfect for a lot of stuff. Add some managed middleware & databases to the mix and you essentially have a close to zero ops CI/CD capable environment. No devops needed for this either. When I did this for the first time, I was up and running with our dockerized app in about 15 minutes. Most of that was just waiting for builds to finish.
I'm currently CTO of a company, and I've gotten sidetracked by enough lengthy, super expensive devops-type work in past projects that I'm deliberately steering clear of certain things, not because I can't do them but because I don't think they're worth spending any time on for us right now. So: no terraform, no kubernetes, no microservices. I just don't have the time or patience for that stuff. We run a monolith, so there's not a lot I actually need from my infrastructure. I need it to be fast, secure, and resilient, and able to run my monolith. But I don't need service discovery, complicated network setup (a bog-standard VPC is fine), or all the other stuff that devops people obsess about.
We use a load-balancer, I clicked one together in the Google UI. It's fine. Ten minute job. Doesn't need terraform scripting. We have two of them. And we have a couple buckets and our monolith behind that. I could grab the gcloud command that recreates this thing and put it somewhere. But I have more urgent things to do.
For deployment we use simple gcloud commands from github actions to update vms with new instance templates to tell them to run the latest container that our build produced. We started with cloud run but our monolith has a few worker threads that we don't want killed so we moved it to proper vms. Very easy to do in Google Cloud.
Our deploy command does a rolling restart. We have health checks, logging, monitoring, alerting, etc. Could be better but it works. Initial provisioning of the environment was manual and we scripted together all the commands that are part of our deploy process for automation. We added a managed redis, database, and elasticsearch to this. None of that was particularly hard or worth automating to me. Yes, it's bit of a snowflake. But not that complicated and I documented it. So, we can do it again in a few hours if we ever need to.
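The deploy step described above boils down to roughly this (a sketch; project, group, and template names are placeholders, and flags should be checked against current gcloud docs):

```shell
# Create an instance template pointing at the freshly built container
# (image tag and names are placeholders)
gcloud compute instance-templates create-with-container app-v42 \
  --container-image gcr.io/my-project/app:v42

# Ask the managed instance group to roll the new template out,
# which gives you the rolling restart with health checks
gcloud compute instance-groups managed rolling-action start-update app-mig \
  --version template=app-v42 --zone us-central1-a
```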
The dirty little secret of a lot of devops is that it's a lot of over-engineered YAGNI stuff that is super labor-intensive to set up and maintain, and you end up using it a lot less often than people think.
This is why freelance devops engineers are so in demand: this stuff just requires a lot of manual work! Companies need these people full time and usually more than just one. The devops alone can add up to hundreds of thousands of dollars/euros per year.
It's a lot of manual work that probably should be automated. However, hiring a lot of people at great expense to automate things that are cheap and not that complicated is not always the best use of resources. I've seen companies that spend an order of magnitude more on devops salaries than on the actual hosting bills. If you think about it, that's kind of weird to be spending so much for so little gains. And most of these companies are not particularly big or experience enormous scaling issues.
This is especially true when you realize you want a QA env, US/EU env, staging, etc. If it's one server (or server and DB), it's much much easier to create more environments.
They are also missing a proper managed database with point-in-time backups through the web UI, like those offered by most full-fledged PaaS services.
We did bake nixpacks into our CLI recently, they seem better for our particular environment than buildpacks. Railway.app did a great job with these: https://nixpacks.com/docs/getting-started
We're working on managed databases, but we're not doing them like Heroku did. We just launched a preview of managed Redis with Upstash: https://fly.io/docs/reference/redis/
This seems like the future of managed databases on a platform like ours. There are companies that build very good managed database services. We're getting to the size where these people will work with us. Getting well managed DBs onto the platform is basically what I'm spending all my time on these days.
Incidentally, we're a lot cheaper than Heroku because we run our own infrastructure.
If you happen to have a service beta testing anything I would be interested in joining it.
Is `fly postgres create --name restoredDb --snapshot-id backupId` so hard that it's a deal breaker?
> support for Heroku buildpacks
I haven't tried it but there's some buildpack support: https://fly.io/docs/reference/configuration/#builder
As for the managed DBs, one of their founders was from Compose, so yeah they know how these things work. But AFAIK Fly doesn't have much interest in DBs, their focus is really in VMs.