Seriously though, what are your RTO and RPO? We are talking about systems that put you on the news when they go down. Systems where minutes of downtime cost millions of dollars. I encourage you to set up some time with your CTO at Arist and talk through these questions.
2. I mentioned API Gateway and Lambda because OP asked whether it is difficult to go multi-region in general (not specifically about Roblox). Most startups, and most companies in general, are web app based and do not have Roblox's technical requirements around managing game state. For them, a series of load balancers + latency-based routing, or API Gateway + Lambda + latency-based routing, is a good approach, especially now that à la carte solutions like Ruby on Jets, Serverless Framework, etc. will do all the work for you.
3. That said, I do think we are on the verge of seeing a really strong, viable serverless-style option for game servers in the next few years, and when that happens costs are going to go way, way down because the execution context will live for the life of the game, and that's it. No need to over-provision. The only real technical limitations are (a) the hard 15-minute execution time limit and (b) mapping users to the correct running instance of the lambda. I have a side project where I've already resolved (b) by having the lambda initiate the connection to the clients directly, which ensures they are all communicating with the same instance of the lambda. I plan to solve (a) by preemptively spinning up a new lambda when time is about to run out and pre-negotiating all clients with the new lambda before shifting control over to it. It's not done yet, but I believe I can do the switch-over with zero noticeable lag or stuttering, so from a technical perspective, yes, I think serverless can be a panacea if you put in the effort to fully utilize it. If you're spinning up tens of thousands of servers that are doing something ephemeral that only needs to exist for 5-30 minutes, I think you're at the point where it's time to put in that effort.
4. I am in fact the CTO at Arist. You shouldn't assume people don't know what they're talking about just because they find the status quo of devops at [insert large gaming company here] a little bit antiquated. In particular, I think you're fighting a losing battle if you have to even think about what instance type is cheapest for X workload in Y year. That sounds like work that I'd rather engineer around with a solution that can handle any scale and do so as cheaply as possible even if I stop watching it for 6 months. You may say it's crazy, but an approach like this will completely eat your lunch if someone ever gets it working properly and suddenly can manage a Roblox-sized workload of game states without a devops team. Why settle for anything less?
5. Regarding the systems I work with -- we send ~50 million messages a day (at specific times of day, mostly all at once) and handle ~20 million user responses a day on behalf of more than 15% of the current roster of Fortune 500 companies. In that case, going 100% Lambda works great and scales well, for obvious reasons. This is nowhere near the scale Roblox deals with, but they also have a completely different problem (managing game state) than we do (ensuring arbitrarily large or small numbers of messages go out at exactly the right time based on tens of thousands of complex messaging schedules and course cadences).
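To make the hand-off idea in point 3 a bit more concrete, here is a rough Python sketch of the timing only. It is not the side project's code. It just plans when each successive lambda would have to be invoked so that it overlaps with the previous one and no single instance runs past the 15-minute cap. The names `plan_handoffs` and `Instance`, and the 60-second `lead_s` warm-up window, are assumptions made up for illustration. In practice you would probably also set `limit_s` somewhat below 900 to leave a safety margin.

```python
from dataclasses import dataclass

LAMBDA_LIMIT_S = 900  # the hard 15-minute execution cap, in seconds


@dataclass
class Instance:
    index: int
    start_s: int     # when this instance is invoked
    takeover_s: int  # when it becomes authoritative for the game
    end_s: int       # when it hands off control (or the game ends)


def plan_handoffs(game_length_s, lead_s=60, limit_s=LAMBDA_LIMIT_S):
    """Plan overlapping instances for one game (hypothetical helper).

    Each instance after the first starts lead_s seconds before it takes
    control, so clients can be pre-negotiated with it during the overlap.
    """
    if not 0 <= lead_s < limit_s:
        raise ValueError("lead_s must be in [0, limit_s)")
    plan = []
    takeover = 0
    while takeover < game_length_s:
        # The first instance has no predecessor, so it needs no warm-up.
        start = max(0, takeover - lead_s)
        # The limit is measured from invocation, not from takeover.
        end = min(start + limit_s, game_length_s)
        plan.append(Instance(len(plan), start, takeover, end))
        takeover = end
    return plan


if __name__ == "__main__":
    for inst in plan_handoffs(30 * 60):
        print(inst)
```

For a 30-minute game with a 60-second overlap this plans three instances. The actual hard part (moving game state and client connections across during the overlap without stutter) is deliberately left out.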
Anyway, I'm quite aware devops at scale is hard -- I just find it puzzling when small orgs have it perfectly figured out (plenty of gaming startups with multi-region support) but a company on the NYSE is still treating us-east-1 or us-east-2 like the only region in existence. Bad look.
Also, you still sound like you don't understand how large systems like Roblox/Twitter/Apple/Facebook/etc. are designed, deployed, and maintained, which is fine (most people don't), but saying they should just move to Lambda shows inexperience with these systems. If it is "puzzling" to you, maybe there is something you are missing in your understanding of how these systems work.