As I've mentioned before, I'm writing a high performance metaverse client. Here's a demo video.[1] It's about 40,000 lines of Rust so far.
If you are doing a non-crappy metaverse, which is rare, you need to wrangle a rather excessive amount of data in near real time. In games, there's heavy optimization during game development to prevent overloading the play engine. In a metaverse, as with a web browser, you have to take what the users create and deal with it. You need 2x-3x the VRAM a comparable game would need, a few hundred megabits per second of network bandwidth to load all the assets from servers, a half dozen or so CPUs running flat out, and Vulkan to let you put data into the GPU from one thread while another thread is rendering.
So there will be some parallelism involved.
This is not like "web-scale" concurrency, which is typically a large number of mini-servers, each doing their own thing, that just happen to run in the same address space. This is different. There's a high priority render thread drawing the graphics. There's an update thread processing incoming events from the network. There are several asset loading and decompression threads, which use up more CPU time than I'd like. There are about a half dozen other threads doing various miscellaneous tasks - handling moving objects, updating levels of detail, purging caches, and such.
There's considerable locking, but no "static" data other than constants. No globals. Channels are used where appropriate to the problem. The main object tree is single ownership, and used mostly by the update thread. Its links to graphics objects are Arc reference counted, and those are updated by both the update thread and the asset loading threads. They in turn use reference counted handles into the Rend3 library, which, via WGPU and Vulkan, puts graphics content (meshes and textures) into the GPU. Rendering is a loop which just tells Rend3 "Go", over and over.
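To make that sharing pattern concrete, here's a minimal sketch. The `MeshHandle` and `SceneNode` names are hypothetical stand-ins, not the real Rend3 types: a loader thread fills in an Arc-shared slot that the update thread later reads.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for an opaque GPU resource handle.
#[derive(Clone, Debug, PartialEq)]
struct MeshHandle(u64);

// A scene node owns its place in the object tree, but shares the
// graphics handle with loader threads through an Arc.
struct SceneNode {
    mesh: Arc<Mutex<Option<MeshHandle>>>,
}

fn main() {
    let node = SceneNode { mesh: Arc::new(Mutex::new(None)) };

    // An asset-loading thread fills in the handle once decoding finishes.
    let slot = Arc::clone(&node.mesh);
    let loader = thread::spawn(move || {
        *slot.lock().unwrap() = Some(MeshHandle(42));
    });
    loader.join().unwrap();

    // The update thread sees the loaded mesh through the same Arc.
    assert_eq!(*node.mesh.lock().unwrap(), Some(MeshHandle(42)));
}
```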
This works out quite well in Rust. If I had to do this in C++, I'd be fighting crashes all the time. There's a reason most of the highly publicized failed metaverse projects didn't reach this level of concurrency. In Rust, I have about one memory related crash per year, and it's always been in someone else's "unsafe" code. My own code has no "unsafe", and I have "unsafe" locked out to prevent it from creeping in. The normal development process is that it's hard to get things to compile, and then it Just Works. That's great! I hate using a debugger, especially on concurrent programs. Yes, sometimes you can get stuck for a day, trying to express something within the ownership rules. Beats debugging.
I have my complaints about Rust. The main ones are:
- Rust is race condition free, but not deadlock free. It needs a static deadlock analyzer, one that tracks through the call chain and finds that lock A is locked before lock B on path X, while lock B is locked before lock A on path Y. Deadlocks, though, tend to show up early and are solid problems, while race conditions show up randomly and are hard to diagnose.
- Async contamination. Async is all wrong when there's considerable compute-bound work, and incompatible with threads running at multiple priorities. It keeps creeping in. I need to contact a crate maintainer and get them to make their unused use of "reqwest" dependent on a feature, so I don't pull in Tokio. I'm not using it, but it's there.
- Single ownership with a back reference is a very common need, and it's too hard to do. I use Rc and Weak for that, but shouldn't have to. What's needed is a set of traits to manage consistent forward and back links (that's been done by others) and static analysis to eliminate the reference counts. The basic constraints are ordinary borrow checker restrictions - if you have mutable access to either parent or child, you can't have access to the other one. But you can have non-mutable access to both. If I had time, I'd go work on that.
- I've learned to live without objects, but the trait system is somewhat convoluted. There's one area of asset processing that really wants to be object oriented, and I have more duplicate code there than I like. I could probably rewrite it to use traits more, but it would take some bashing to make it fit the trait paradigm.
- The core graphics crates aren't finished. There was an article on HN a few days ago about this. "Rust has 5 games and 50 game engines". That's not a language problem, that's an ecosystem problem. Not enough people are doing non-toy graphics in Rust. Watch my video linked below.[1] Compared to a modern AAA game title, it's not that great. Compared to anything else being done in Rust (see [2]) it's near the front. This indicates a lack of serious game dev in Rust. I've been asked about this by some pro game devs. My comment is that if you have a schedule to meet, the Rust game ecosystem isn't ready. It's probably about five people working for a year from being ready.
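The Rc/Weak back-reference pattern mentioned in the ownership complaint above (single ownership of the children, plus a non-owning link back to the parent) looks roughly like this minimal sketch:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// A tree node: the parent owns its children via Rc, and each child
// keeps a non-owning Weak back link so there is no reference cycle.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn main() {
    let parent = Rc::new(Node {
        name: "scene".into(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        name: "mesh".into(),
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));

    // The Weak back link can be upgraded while the parent is alive...
    assert_eq!(child.parent.borrow().upgrade().unwrap().name, "scene");
    // ...and it does not keep the parent alive by itself, so no leak.
    assert_eq!(Rc::strong_count(&parent), 1);
}
```

This is the runtime-checked version; the complaint above is precisely that static analysis ought to be able to eliminate the counts and the RefCell in the common case.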
That sounds like a great idea. Something in the style of lockdep, that (when enabled) analyzes what locks are currently held while any other lock is taken, and reports any potential deadlocks (even if they haven't actually deadlocked).
That would require some annotation to handle cases of complex locking, so that the deadlock detection knows (for instance) that a given class of locks are always obtained in address order so they can't deadlock. But it's doable.
parking_lot has a deadlock detection feature that, IIRC, tells you what deadlocked when you do hit one (so you're not trying to figure it out with a debugger and a lot of time) https://amanieu.github.io/parking_lot/parking_lot/deadlock/i...
I also just found out about https://github.com/BurtonQin/lockbud which seems to detect deadlocks and a few other issues statically? (seems to require compiling your crate with the same version of rust as lockbud uses, which from the docs is an old 1.63 nightly build?)
It's quite nice, but for C++, not Rust.
If locks can be numbered or otherwise ordered, it would be easy to enforce a strict order of taking locks and an inverse strict order of releasing them, by looking up in the registry which locks your thread is currently holding. This would prevent deadlocks.
This, of course, would require having an idea of all the locks you may want to hold, and their relative order (at least partial), as Dijkstra described back in the day. But thinking about locks ahead of time is a good idea anyway.
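A toy sketch of that lock-ordering discipline: give each lock a rank, and assert at acquisition time that ranks only ever increase. This version deliberately never resets the rank when a guard drops; a real checker would restore the previous rank on unlock and track a set of held ranks per thread.

```rust
use std::cell::Cell;
use std::sync::{Mutex, MutexGuard};

thread_local! {
    // Highest rank this thread has acquired so far (0 = none yet).
    static HIGHEST_HELD: Cell<u32> = Cell::new(0);
}

// A mutex with a fixed rank; a thread may only acquire locks in
// strictly increasing rank order, which rules out the A-then-B vs
// B-then-A deadlock by construction.
struct OrderedMutex<T> {
    rank: u32,
    inner: Mutex<T>,
}

impl<T> OrderedMutex<T> {
    fn new(rank: u32, value: T) -> Self {
        assert!(rank > 0);
        OrderedMutex { rank, inner: Mutex::new(value) }
    }

    fn lock(&self) -> MutexGuard<'_, T> {
        HIGHEST_HELD.with(|h| {
            assert!(
                self.rank > h.get(),
                "lock order violation: rank {} taken while holding rank {}",
                self.rank,
                h.get()
            );
            h.set(self.rank);
        });
        self.inner.lock().unwrap()
    }
}

fn main() {
    let a = OrderedMutex::new(1, "scene graph");
    let b = OrderedMutex::new(2, "gpu queue");
    let _ga = a.lock(); // rank 1 first
    let _gb = b.lock(); // rank 2 second: fine
    // Taking b before a on some other code path would panic immediately,
    // turning a once-in-a-blue-moon deadlock into a deterministic failure.
}
```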
One quibble though. Rust isn't race condition free, it's data race free. You can still end up with race conditions outside of data access. https://news.ycombinator.com/item?id=23599598
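A minimal illustration of that distinction: the following compiles and has no data race (every access goes through the Mutex), yet it still has a race condition, because the check and the act are two separate critical sections.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// No data race: every access is under the lock. But check-then-act is
// still a race condition - another thread can run between the two
// separate lock() calls, and the compiler cannot see that.
fn withdraw_racy(balance: &Mutex<i64>, amount: i64) {
    if *balance.lock().unwrap() >= amount {   // check (lock released here)
        *balance.lock().unwrap() -= amount;   // act (state may have changed)
    }
}

fn main() {
    let balance = Arc::new(Mutex::new(100));
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let b = Arc::clone(&balance);
            thread::spawn(move || withdraw_racy(&b, 100))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // Depending on interleaving, both threads can pass the check and
    // the balance can go negative - a logic race, not a data race.
    println!("final balance: {}", *balance.lock().unwrap());
}
```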
The priority thing is relatively easy to fix:
Either create multiple thread pools, and route your futures to them appropriately.
Or, write your own event loop, and have it pull from more than one event queue (each with a different priority).
It should be even easier than that, but I don’t know of a crate that does the above out of the box.
One advantage of the second approach is (if your task runtime is bounded) that you can have soft realtime guarantees for high priority stuff even when you are making progress on low priority stuff and running at 100% CPU.
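A rough sketch of the first approach, using plain std threads and channels as the "pools". A real setup would size each pool and raise the OS priority of the high-priority workers via a platform call, which std doesn't expose directly:

```rust
use std::sync::mpsc;
use std::thread;

type Job = Box<dyn FnOnce() + Send>;

// One worker thread per "pool"; jobs routed to the high pool would run
// on a thread whose OS priority was raised (platform-specific, omitted).
fn spawn_pool() -> mpsc::Sender<Job> {
    let (tx, rx) = mpsc::channel::<Job>();
    thread::spawn(move || {
        for job in rx {
            job(); // runs at this worker thread's priority
        }
    });
    tx
}

fn main() {
    let high = spawn_pool(); // imagine: priority raised here
    let low = spawn_pool();

    let (done_tx, done_rx) = mpsc::channel();
    let t = done_tx.clone();
    high.send(Box::new(move || t.send("render tick").unwrap())).unwrap();
    low.send(Box::new(move || done_tx.send("cache purge").unwrap())).unwrap();

    // Both jobs complete; ordering between the pools is up to the OS.
    let mut results: Vec<_> = done_rx.iter().take(2).collect();
    results.sort();
    assert_eq!(results, ["cache purge", "render tick"]);
}
```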
I've been collecting a list[1] of what memory-management policies programmers actually want in their code; it is far more extensive than any particular language actually implements. Contributions are welcome!
I already had back reference on the list, but added some details. When the ownership is indirect (really common) it is difficult to automate.
One thing that always irritates me: Rust's decision to make all objects moveable really hurts it at times.
[1] https://gist.github.com/o11c/dee52f11428b3d70914c4ed5652d43f...
One challenge with rust is that (for better or worse) most gamedev talent is C++. If you ever open source it I’d be interested in contributing, though I’m not sure how effective the contributions would be.
Good luck!
I'm not that interested in self-promotion here as I am in getting more activity on Rust graphics development. I think the Rust core graphics ecosystem needs about five good graphics people for a year to get unstuck. Rust is a good language for this sort of thing, but you've got to have reliable heavy machinery down in the graphics engine room.
Until that exists, nobody can bet a project with a schedule and a budget on Rust. The only successful commercial high-detail game title I know of that uses Rust is a sailing race simulator. They simply linked directly to "good old DX11" (Microsoft Direct-X 11) and wrote the game logic in Rust. Bypassed Rust's own 3D ecosystems completely.
"I've learned to live without objects, but the trait system is somewhat convoluted. There's one area of asset processing that really wants to be object oriented, and I have more duplicate code there than I like. I could probably rewrite it to use traits more, but it would take some bashing to make it fit the trait paradigm."
Can you expand on this? I come from the C# world and the Rust trait system feels expressive enough to implement the good parts of OOP.

I've always wondered why the "color" of a function can't be a property of its call site instead of its definition. That would completely solve this problem - you declare your functions once, colorlessly, and then can invoke them as async anywhere you want.
If you have a non-joke type system (which is to say, Haskell or Scala) you can. I do it all the time. But you need HKT and in Rust each baby step towards that is an RFC buried under a mountain of discussion.
It isn't like JavaScript where there is truly only one thread of execution at a time and blocking it will block everything.
This is a far superior workflow when you factor in outcomes. More up front time to get a "correct"/more-reliable output scales infinitely better than churning out crap that you need to wrap in 10,000 lines of tests to keep from breaking/validate (See: the dumpster-fire that is Rails)
I’m a strong-typing enthusiast, too, but still, I’m not fully convinced that’s true.
It seems you can’t iterate fast at all in Rust because the code wouldn’t compile, but can iterate fast in C++, except for the fact that the resulting code may be/often is unstable.
If you need to try out a lot of things before finding the right solution, the ability to iterate fast may be worth those crashes.
Maybe, using C++ for fast iterations, and only using various tools to hunt down issues the borrow checker would catch on the iteration you want to keep beats using Rust.
Or do Rust programmers iterate fast using unsafe where needed and then fix things once they’ve settled on a design?
This is a big problem. Fast iteration time is very valuable.
And who likes doing this to themselves anyway? Isn't it a very frustrating experience? How is this the most loved language?
The thing is, these dependencies exist no matter what language you use, if they stem from an underlying concept. Rust just makes you write them explicitly, which is a good thing: in C++ all these dependencies would be more or less implicit, and every time somebody edits the code they need to think all these cases through and build a mental model (if they see them at all!). In Rust you at least have the lifetime annotations, which (a) make it obvious there is some special dependency going on and (b) show the explicit lifetimes.
So what I'm saying, you need to put in this work no matter which language you choose, writing it down is then not a big problem anymore. If you don't think about these rules your program will probably work most of the time but only most of the time, and that can be very bad for certain scenarios.
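As a tiny example of the point about explicit lifetimes: the signature below encodes a dependency that a C++ reviewer would have to re-derive by reading the body.

```rust
// The signature spells out the dependency: the returned slice borrows
// from `haystack`, not from `needle`. In C++ the same rule exists but
// lives only in the author's head (or a comment).
fn first_line<'a>(haystack: &'a str, needle: &str) -> &'a str {
    haystack
        .lines()
        .find(|line| line.contains(needle))
        .unwrap_or("")
}

fn main() {
    let text = String::from("alpha\nbeta\ngamma");
    let hit = first_line(&text, "bet");
    assert_eq!(hit, "beta");
    // Dropping `text` here while `hit` is still used would be a compile
    // error - the borrow checker enforces the rule the signature states.
}
```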
Personal preference and pain tolerance. Just like learning Emacs[1] - there's lots of things that programmers can prioritize, ignore, enjoy, or barely tolerate. Some people are alright with the fact that they're prototyping their code 10x more slowly than in another language because they enjoy performance optimization and seeing their code run fast, and there's nothing wrong with that. I, myself, have wasted a lot of time trying to get the types in some of my programs just right - but I enjoy it, so it's worth it, even though my productivity has decreased.
Plus, Rust seems to have pushed out the language design performance-productivity-safety efficiency frontier in the area of performance-focused languages. If you're a performance-oriented programmer used to buggy programs that take a long time to build, then a language that gives you the performance you're used to with far fewer bugs and faster development time is really cool, even if it's still very un-productive next to productivity-oriented languages (e.g. Python). If something similar happened with productivity languages, I'd get excited, too - actually, I think that's what's happening with Mojo currently (same productivity, greater performance) and I'm very interested.
Whereas after you prove the safety of a design once, it stays with you.
I have fought the ownership rules and lost (replaced references by integers to a common vector-ugly stuff, but I was time constrained). But I have seen people spend several weeks debugging a single problem, and that was really soul-crushing.
I don't personally mind debugging, too much, but if your goal is to avoid bugs in your running software, then Rust has some serious advantages. We mainly use TypeScript to do things, which isn't really comparable to Rust. But we do use C when we need performance, and we looked into Rust, even did a few PoCs on real world issues, and we sort of ended up in a situation similar to GP. Rust is great though a bit "verbose" to write, but its eco-system is too young to be "boring" enough for us, so we're sticking with C for the time being. But being able to avoid running into crashes by doing the work before you push your code is immensely valuable in fault-intolerant systems. Like, we do financial work with C, it cannot fail. So we're actually still doing a lot of the work up-front, and then we handle it by rigorously testing everything. Because it's mainly used for small performance enhancements, our C programs are small enough to where this isn't an issue, but it would be a nightmare to do with 40,000 lines of C code.
I would much rather bang my head against a compiler for N hours, and then finally have something that compiles -- and thus am fairly confident works properly -- than have something that compiles and runs immediately, but then later I find I have to spend N hours (or, more likely, >N hours) debugging.
Your preferences may differ on this, and that's fine. But in the medium to long term, I find myself much more productive in a language like Rust than, say, Python.
Just wondering, how long did it take you to hit 40k lines? I’m a new Rust developer and it’s taken me ages to get this far.
I totally relate to your experience though. When I finally get my code to compile, it “just works” without crashes. I’ve never felt so confident in my code before.
This isn't a new idea for a desirable state. Same experience with Modula-2 three decades ago. A page or more of compiler errors to clear, then suddenly adiabatic. A very satisfying experience.
If you want extreme low contention and extreme high utilization, you’re doing threading and event-driven simultaneously, and there are no easy answers for heavily contended data structures: you can’t duplicate the data to exploit immutability unless insane complexity is an acceptable alternative, and mistakes cost millions in real time.
There’s a reason why those places scout so heavily for the lock-free/wait-free galaxy brains the minute they finish their PhDs.
That's not a serious article. That's a humorous video.
And are you using an ECS based architecture? Do you feel you’d have a different opinion if you were?
Is there a ML to subscribe to, to learn when the viewer is more generally available for testing? Thanks again!
Do... you... wind up having to set TCP_NODELAY?
•͡˘㇁•͡˘
Why? I'd take modern C++ over Rust every day of the week.
Why? (Serious question)
(Plus some increase in content load over the network, which does exist ala runtime mod loading, streaming, etc)
Without judgment I must ask, what made you decide to target metaverse specifically? Is it more of a fun challenge, or do you see it having a bright/popular future?
It is data-race free however.
This is the metaverse data overload problem - many creators, little instancing. No art director. No Q/A department. No game polishing. It's quite solvable, though.
Those occasional flashes on screen are the avatar (just a block in this version) moving asynchronously from the camera. That's been fixed.
The trouble is, we actually have tens/hundreds of people, all working on their own. The blessing and curse of open source development
The guy's got a point in that doing a bunch of Arc, RwLock, and general sharing of state is going to get messy. Especially once you are sprinkling 'static all over the place, it infects everything, much like colored functions. I did this whole thing once back when I was starting off where I would Arc<RwLock> stuff, and try to be smart about borrow lifetimes. Total mess.
But then rust also has channels. When you read about it, it talks about "messages", which to me means little objects. Like a few bytes little. This is the solution, pretty much everything I write now is just a few tasks that service some channels. They look at what's arrived and if there's something to output, they will put a message on the appropriate channel for another task to deal with. No sharing objects or anything. If there's a large object that more than one task needs, either you put it in a task that sends messages containing the relevant query result, or you let each task construct its own copy from the stream of messages.
And yet I see a heck of a lot of articles about how to Arc or what to do about lifetimes. They seem to be things that the language needs, especially if you are implementing the async runtime, but I don't understand why the average library user needs to focus so much on this.
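For what it's worth, the small-messages pattern described above can be sketched with nothing but std channels - a task that owns its state outright and answers queries over a reply channel:

```rust
use std::sync::mpsc;
use std::thread;

// Small messages on channels instead of shared Arc<RwLock<...>> state:
// the task owns its data and reacts to whatever arrives.
enum Msg {
    Add(i64),
    Total(mpsc::Sender<i64>),
}

fn spawn_accumulator() -> mpsc::Sender<Msg> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let mut total = 0; // owned by this task alone - no lock needed
        for msg in rx {
            match msg {
                Msg::Add(n) => total += n,
                Msg::Total(reply) => {
                    let _ = reply.send(total);
                }
            }
        }
    });
    tx
}

fn main() {
    let acc = spawn_accumulator();
    acc.send(Msg::Add(2)).unwrap();
    acc.send(Msg::Add(3)).unwrap();

    // Query by sending a reply channel in the message.
    let (reply_tx, reply_rx) = mpsc::channel();
    acc.send(Msg::Total(reply_tx)).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), 5);
}
```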
When moving between threads I do what you suggest here and use channels to send signals rather than having a lot of shared state. Sometimes there is a crucial global state something that’s easier to just directly access, but I just write a struct that manages all the Arc/RwLock or whatever other exclusive access mechanism I need for the access patterns. From the caller's point of view everything is just a simple function call. When writing the struct I need to be thoughtful of sharing semantics but it’s a very small struct and I write it once and move on.
I also don’t understand their concern about making things Send+Sync. In my experience almost everything is easily Send+Sync, and things that aren’t shouldn’t or couldn’t be.
I get that sometimes you just want to wear sweatpants and write code without thought of the details, but most languages that offer that out of the box don’t really offer efficient concurrency and parallelism. And frankly you rarely actually need those things even if the “but it’s cool” itch is driving you. Most of the time a nodejs-esque single threaded async program is entirely sufficient, and a lot of the time Async isn’t even necessary or particularly useful. But when you need all these things, you probably need to hike up your sweatpants and write some actual computer code - because microseconds matter, profiled throughput is crucial, and nothing in life that’s complex is easy and anyone selling you otherwise is lying.
This is a recurring pattern I've started to notice with Rust: most things that repeatedly feel clunky, or noisy, or arduous, can be wrapped in an abstraction that allows your business logic to come back into focus. I've started to think this mentality is essential to any significant Rust project.
Async the keyword doesn’t, but Tokio forces all of your async functions to be multi thread safe. And at the moment, tokio is almost exclusively the only async runtime used today. 95% of async libraries only support tokio. So you’re basically forced to write multi thread safe code even if you’d benefit more from a single thread event loop.
Rust async’s set up is horrid and I wish the community would pivot away to something else like Project Loom.
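On the single-thread point: the Send bounds come from the executor, not from async itself. Here's a hand-rolled single-thread block_on - a toy that spin-polls and only works for futures that never actually wait - which happily runs futures holding !Send types like Rc:

```rust
use std::future::Future;
use std::pin::Pin;
use std::ptr;
use std::rc::Rc;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A no-op waker: acceptable here only because our futures never return
// Pending; a real runtime parks the thread and wakes on readiness.
fn no_op(_: *const ()) {}
fn clone_raw(_: *const ()) -> RawWaker {
    dummy_raw_waker()
}
static VTABLE: RawWakerVTable = RawWakerVTable::new(clone_raw, no_op, no_op, no_op);
fn dummy_raw_waker() -> RawWaker {
    RawWaker::new(ptr::null(), &VTABLE)
}

// Minimal single-thread executor: no Send bound anywhere, so futures
// can hold !Send types - something tokio::spawn's bounds would reject.
fn block_on<F: Future>(mut fut: F) -> F::Output {
    let waker = unsafe { Waker::from_raw(dummy_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    // Safety: `fut` is shadowed by its pinned form and never moved again.
    let mut fut = unsafe { Pin::new_unchecked(&mut fut) };
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) {
            return v;
        }
    }
}

fn main() {
    let shared = Rc::new(41); // Rc is !Send
    let answer = block_on(async move { *shared + 1 });
    assert_eq!(answer, 42);
}
```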
I write a fair amount of code in Elixir professionally and this isn't how I view it.
There are some specific Elixir/Erlang bits of ceremony you need to do to set up your supervision tree of GenServers, but then once that's done you get to write code that feels like single-threaded "ignore the rest of the world" code. Some of the function calls you're making might be "send a message and wait for a response" from GenServers etc. but the framework takes care of that.
I wrote some driver code for an NXP tag chip. Driving the inventory process is a bit involved, you have to do a series of things, set up hardware, turn on radio, wait a bit, send data, service the SPI the whole time in parallel. With the right setup for the hardware interface I just wrote the whole thing as a sequence, it was the simplest possible code you could imagine for it. And this at the same time as running a web server, and servicing hardware interrupts that cause it to reload the state of some registers and show them to each connected web session.
I imagine Rust to be a language far more similar to Go, in both use cases and functionality, than JS.
The dream of Smalltalk and true OOP is still alive.
If you say Smalltalk is better OOP I might agree, but calling it "true" is not correct.
When you need the absolute best performance sharing state is sometimes better - but you need a deep understanding of how your CPUs share state. A mutex or atomic write operation is almost always needed (the exceptions are really weird), and those will kill performance so you better spend a lot of time minimizing where you have them.
I would also suggest looking into ringbuffers and LMAX Disruptor pattern.
There is also Red Planet Lab's Rama, which takes the data flow idea and uses it to scale.
As a wise programmer once said, "Do not communicate by sharing memory; instead, share memory by communicating"
(But if you're only firing up a few tasks, why not just use threads? To get a nice wrapper around an I/O event loop?)
(This is assuming you are already switching to communicating using channels or similar abstraction.)
To get easier timers, to make cancellation at all possible (how to cancel a sync I/O operation?), and to write composable code.
There are patterns that become simpler in async code and much more complicated in sync code.
From https://news.ycombinator.com/item?id=37289579 :
> I haven't checked, but by the end of the day, I doubt eBPF is much slower than select() on a pipe()?
Channels have a per-platform implementation.
- "Patterns of Distributed Systems (2022)" (2023) https://news.ycombinator.com/item?id=36504073
Async code can scale essentially infinitely, because it can multiplex thousands of Futures onto a single thread. And you can have millions of Futures multiplexed onto a dozen threads.
This makes async ideal for situations where your program needs to handle a lot of simultaneous I/O operations... such as a web server:
http://aturon.github.io/blog/2016/08/11/futures/
Async wasn't invented for the fun of it, it was invented to solve practical real world problems.
Ultimately, it depends on your data model.
When you can guarantee sole ownership, why not put that exclusive pointer in the message? I’d think that this sort of compile-time lock would be an important advantage for the type system. (I think some VMs actually do this sort of thing dynamically, but I can’t quite remember where I read about it.)
On a multiprocessor, there’s of course a balance to be determined between the overhead of shuffling the object’s data back and forth between CPUs and the overhead of serializing and shuffling the queries and responses to the object’s owning thread. But I don’t think the latter approach always wins, does it? At least I can’t tell why it obviously should.
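The "exclusive pointer in the message" idea works today with owned types in channels; the move is checked at compile time, so the receiver mutates without any lock:

```rust
use std::sync::mpsc;
use std::thread;

// Putting the exclusive pointer in the message: the Box moves to the
// receiver, and the sender can no longer touch it - the "lock" is
// enforced at compile time by the ownership rules.
fn main() {
    let (tx, rx) = mpsc::channel::<Box<Vec<u8>>>();

    let worker = thread::spawn(move || {
        let mut buf = rx.recv().unwrap(); // sole owner now
        buf.push(99);                     // mutate without any lock
        buf.len()
    });

    let buf = Box::new(vec![1, 2, 3]);
    tx.send(buf).unwrap();
    // `buf` is moved; touching it here would not compile.
    assert_eq!(worker.join().unwrap(), 4);
}
```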
like “send request to channel A with message 123, make sure to get a response back from channel B exactly for that message”
But green threads were not and are not the right solution for Rust, so it's kind of beside the point. Async Rust is difficult, but it will eventually be possible to use Async Rust inside the Linux kernel, which is something you can't do with the Go approach.
Rust: it turns out that not every concurrency needs to be zero-cost abstraction
If you have a service that handles massive amounts of network calls at the core (think linkerd, nginx, etc.), or you want to have a massive amount of lightweight tasks in your game, or working on an embedded software where you want cooperative concurrency, async Rust is an amazing super-power.
Most system/application level things is not going to need async IO. Your REST app is going to be perfectly fine with a threadpool. Even when you do need async, you probably want to use it in a relatively small part of your software (network), while doing most of the things in threads, using channels to pass work around between async/blocking IO parts (aka hybrid model).
Rust community just mindlessly over-did using async literally everywhere, to the point where the blocking IO Rust (the actually better UX one) became a second class citizen in the ecosystem.
Especially visible with web frameworks where there is N well designed async web frameworks (Axum, Wrap, etc.) and if you want a blocking one you get:
tiny_http, absolute bare bones but very well done
rouille - more wholesome, on top of tiny_http, but APIs feel very meh comparing to e.g. Axum
astra - very interesting but immature, and rather barebones

But it also praises Go for its implementation, which is also based on a coroutine of a different kind: stackful coroutines, which do not have any of these problems.
Rust considered using those (and, at first, that was the project's direction). Ultimately, they went to the stackless model because stackful coroutines require a runtime that preempts coroutines (to do essentially what the kernel does with threads). This was deemed too expensive.
Most people forget, however, that almost no one is using runtime-free async Rust. Most people use Tokio, which is a runtime that does essentially everything the runtime they were trying to avoid building would have done.
So we are left in a situation where most people using async Rust have the worst of both worlds.
That being said, you can use async Rust without an async runtime (or rather, an extremely rudimentary one with extremely low overhead). People in the embedded world do. But they are few, and even they often are unconvinced by async Rust for their own reasons.
However, async Rust is not using stackless coroutines for this reason - it's using stackless coroutines because they achieve a better performance profile than stackful coroutines. You can read all about it on Aaron Turon's blog from 2016, when the futures library was first released:
http://aturon.github.io/blog/2016/08/11/futures/
http://aturon.github.io/blog/2016/09/07/futures-design/
It is not the case that people using async Rust are getting the "worst of both worlds." They are getting better performance by default and far greater control over their runtime than they would be using a stackful coroutine feature like Go provides. The trade off is that it's a lot more complicated and has a bunch of additional moving parts they have to learn about and understand. There's no free lunch.
I think that stackless coroutines are better than stackful, in particular for Rust. Everything was done correctly by the Rust team.
Again, this is all fair and good, as long as people understand the tradeoff and make good technical decisions around it. If they all jump on the async bandwagon blind to the obvious limitations, we get where the Rust ecosystem is now.
Stackful coroutines don't require a preemptive runtime. I certainly hope that we didn't end up with colored functions in Rust because of such a misconception.
I've used stackful coroutines many times in many codebases. It never required or used a runtime or preemption. I'm not sure why having a runtime that preempts them would even be useful, since it defeats the reason most people use stackful coroutines in the first place.
Yes. I just noticed that Tokio was pulled into my program as a dependency. Again. It's not being used, but I'm using a crate which has a function I'm not using which imports reqwest, which imports h2, which imports tokio.
I ask as someone who uses java and is about to rewrite a bunch of code to be able to chuck the entire async paradigm into the trash can and use a blocking model but on virtual threads where blocking is ok.
I enjoy Rust, and I love how the compiler helps me solve problems. However, the ecosystem is "async or gtfo", or "just write it yourself if you dont want async lmao", and that's not good enough.
Right now even building a library that supports multiple async runtimes is a PITA; I have done it a couple of times. So you end up supporting just Tokio, and maybe async-std.
https://docs.rs/futures/latest/futures/executor/fn.block_on....
imagine you have an:

    async fn do_things() -> Something { /* ... */ }

you can:

    use futures::executor::block_on;

    fn my_normal_code() {
        let something = block_on(do_things());
    }
but this does get messy if the async code you're running isn't runtime-agnostic :(

This is one of the goals of the async working group. Hopefully, when ready, that'll make it possible to swap out async runtimes underneath arbitrary code without issues.
If you’re learning the language, I would suggest starting out with some more vanilla sync code, loops and if statements, get used to the borrowing. Async is clearly still under heavy development, and not just from an implementation level, but also from the level of our philosophical paradigm about what async means and how it ought to work for the user. It’s entirely possible for humanity to have the wrong approach to this issue and maybe someone in this discussion will be able to answer it more effectively.
The compiler really depends on traits, and the ability for traits to handle async is not stable. Many highly intelligent people are hard at work thinking about how to make async rust more correct, readable, and accessible. For example, look here: https://blog.rust-lang.org/inside-rust/2022/11/17/async-fn-i...
I would argue, if the async functionality of traits is not stable in rust, then it is silly for us to attack rust for not having nice async code, because we’re effectively criticizing an early rough draft of what will eventually be a correct and performant and accessible book.
What does a good async API look like?
Also how do you prevent it spreading throughout a codebase?
I am trying to design a scalable architecture pattern for multithreaded and async servers. My design is that IO threads split asynchronous events into two halves, "submit" and "handle". For example, system events from liburing or epoll are routed to other components. Those IO thread event loops run and block on epoll.poll/io_uring_wait_cqe.
For example, if you create a "tcp-connection" you can subscribe to async events that are "ready-for-writing" and "ready-for-reading". Ready-for-writing would take data out of a buffer (that was written to with a regular mutex) for the IO thread to send when EPOLLOUT/io_uring_prep_writev.
We can use the LMAX Disruptor pattern - multiproducer multiconsumer ringbuffers to communicate events between threads. Your application or thread pool threads have their own event loops and they service these ringbuffers.
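std's bounded sync_channel is only a rough stand-in for a Disruptor-style ring (the real pattern uses a preallocated ring plus sequence counters, multi-producer multi-consumer), but it shows the shape: an IO thread publishes completion events, a worker consumes and routes them.

```rust
use std::sync::mpsc;
use std::thread;

// Events an IO thread would produce from epoll/io_uring completions.
#[derive(Debug, PartialEq)]
enum IoEvent {
    ReadyForReading(u32), // connection id (hypothetical)
    ReadyForWriting(u32),
}

fn main() {
    // Capacity-bounded queue between the IO event loop and a worker,
    // standing in for the Disruptor ringbuffer.
    let (tx, rx) = mpsc::sync_channel::<IoEvent>(1024);

    let worker = thread::spawn(move || {
        let mut handled = Vec::new();
        for ev in rx {
            handled.push(ev); // route to the right component here
        }
        handled
    });

    tx.send(IoEvent::ReadyForReading(7)).unwrap();
    tx.send(IoEvent::ReadyForWriting(7)).unwrap();
    drop(tx); // close the queue so the worker's loop ends

    let handled = worker.join().unwrap();
    assert_eq!(handled.len(), 2);
    assert_eq!(handled[0], IoEvent::ReadyForReading(7));
}
```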
I am working on a syntax to describe async event firing sequences. It looks like a bash pipeline, I call it statelines:
initialstate1 initialstate2 = state1 | {state1a state1b state1c} {state2a state2b state2d} | state3
It first waits for "initialstate1" and "initialstate2" in any order, then it waits for "state1", then it waits for the states "state1a state1b state1c" and "state2a state2b state2d" in any order.

Edit: Of course, since this is what "unstable" means, right?
The lifetime of an Arc isn’t unknowable, it’s determined by where and how you hold it.
I think maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it (such as garbage collection) rather than learning how to work with the language. It’s a common trap for anyone trying a new programming language, but Rust seems to trip people up more than most.
In the same sense that the lifetime of an object in a GC'd system has a lower bound of, "as long as it's referenced", sure. But that's nearly the opposite of what the borrow checker tries to do by statically bounding objects, at compile time.
> maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it
The opposite actually! I spent about a decade doing systems programming in C, C++, and Rust before writing a bunch of Haskell at my current job. The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.
Arc isn't an end-run around the borrow checker. If you need mutable references to the data inside of Arc, you still need to use something like a Mutex or Atomic types as appropriate.
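As a minimal sketch of that point (std only; the counter and function name are made up for the example): Arc alone only hands out shared references, and the Mutex is what grants the mutable access.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawn `n` threads that each bump a shared counter through Arc<Mutex<_>>.
fn parallel_count(n: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // Without the Mutex, `*counter += 1` would not compile:
                // Arc by itself only provides shared (&) access.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(parallel_count(4), 4);
}
```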
> The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.
I have the opposite experience, actually. I was an early adopter of Go and championed Garbage Collection for a long time. Then as our Go platforms scaled, we spent increasing amounts of our time playing games to appease the garbage collector, minimize allocations, and otherwise shape the code to be kind to the garbage collector.
The Go GC situation has improved continuously over the years, but it's still common to see libraries compete to reduce allocations and add complexity like pools specifically to minimize GC burden.
It was great when we were small, but as the GC became a bigger part of our performance narrative it started to feel like a burden to constantly be structuring things in a way to appease the garbage collector. With Rust it's nice to be able to handle things more explicitly and, importantly, without having to explain to newcomers to the codebase why we made a lot of decisions to appease the GC that appear unnecessarily complex at first glance.
Rust will do a lot of invisible memory relocations (moves) under the covers, which can work great in single-threaded contexts. However, once you start threading, those invisible moves become a hazard. The moment shared memory comes into play, everything gets a whole lot harder with the Rust async story.
Contrast that with a language like java or go. It's true that the compiler won't catch you when 2 threads access the same shared memory, but at the same time the mental burden around "Where is this in memory, how do I make sure it deallocates correctly, etc" just evaporates. A whole host of complex types are erased and the language simply cleans up stuff when nothing references it.
To me, it seems like GCs simply make a language better for concurrency. They generally solve a complex problem.
These are not the same.
The problem with GC'd systems is that you don't know when the GC will run and eat up your cpu cycles. It is impossible to determine when the memory will actually be freed in such systems. With ARC, you know exactly when you will release your last reference and that's when the resource is freed up.
In terms of performance, ARC offers massive benefits because the memory that's being dereferenced is already in the cache. It's hard to overstate how big of a deal this is. There's a reason people like ARC and stay away from GC when performance actually begins to matter. :)
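The deterministic-release point can be illustrated in a few lines, using Rc for brevity (Arc behaves the same across threads); the `Noisy` type and the flag are invented for the example.

```rust
use std::cell::Cell;
use std::rc::Rc;

thread_local! {
    static FREED: Cell<bool> = Cell::new(false);
}

// A made-up type that records exactly when it is dropped.
struct Noisy;
impl Drop for Noisy {
    fn drop(&mut self) {
        FREED.with(|f| f.set(true));
    }
}

// Returns (freed_after_first_drop, freed_after_last_drop).
fn release_points() -> (bool, bool) {
    FREED.with(|f| f.set(false));
    let a = Rc::new(Noisy);
    let b = Rc::clone(&a);
    drop(a);
    let after_first = FREED.with(|f| f.get()); // still referenced: not freed
    drop(b); // last reference gone: Drop runs exactly here, no GC pause
    let after_last = FREED.with(|f| f.get());
    (after_first, after_last)
}

fn main() {
    assert_eq!(release_points(), (false, true));
}
```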
The point about wrangling with Weak suggests that they're trying to build complex ownership structures (which, to be fair, would be easier to deal with in a single thread), and that isn't really easy to express in Rust in general. I use weak smart pointers exceedingly rarely.

Outside of the first section (which isn't talking about async Rust specifically, just concurrency generally), channels aren't even mentioned. They're the main thing I use for communication between different parts of my program when writing async code and when interfacing between async and non-async code, plus the other signalling abstractions like Notify, semaphores, etc.

Mutexes are slow and bottlenecky, and shared state quickly gets complicated to manage; this has been known for ages. I think the problem might be more the `BIG_GLOBAL_STATIC_REF_OR_SIMILAR_HORROR` in the first place.
The comment about nothing stopping you from calling blocking code in an async context is valid, but it's relatively manageable, and you can use `tokio::task::spawn_blocking` or similar when you must do it.
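The same escape hatch can be sketched without tokio: move the blocking call onto its own thread and hand the result back over a channel. The function name here is made up (tokio's real API is `tokio::task::spawn_blocking`), and this is a std-only approximation of the idea, not tokio's implementation.

```rust
use std::sync::mpsc;
use std::thread;

// Made-up stand-in for tokio::task::spawn_blocking: run a blocking closure
// on its own thread so the caller's (conceptual) executor thread stays free.
fn spawn_blocking_like<T, F>(f: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(f()); // ignore the error if the receiver was dropped
    });
    rx
}

fn main() {
    let rx = spawn_blocking_like(|| {
        // pretend this is a blocking syscall or a long computation
        (1..=10u64).sum::<u64>()
    });
    assert_eq!(rx.recv().unwrap(), 55);
}
```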
I think it's a fair assumption to say that the author is aware of what Arcs are and how they work. I believe their point is more so that because of how async works in Rust, users have to reach for Arc over normal RAII far more often than in sync code. So at a certain point, if you have a program where 90% of objects are refcounted, you might as well use a tracing GC and not have the overhead of many small heap allocations/frees plus atomic ops.
Perhaps there are in fact ways around Arc-ing things for the author's use cases. But in my (limited) experience with Rust async I've definitely run into things like this, and plenty of example code out there seems to do the same thing [1].
For what it's worth, I've definitely wondered whether a real tracing GC (e.g. [2]) could meaningfully speed up many common async applications like HTTP servers. I'd assume that other async use cases like embedded state machines would likely have pretty different performance characteristics, though.
[0] https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...
[1] https://tokio.rs/tokio/tutorial/shared-state
[2] https://manishearth.github.io/blog/2015/09/01/designing-a-gc...
Fair, but when reading an article like this I have to refer to what's written, not what we think the author knew but didn't write.
…on a server where you can have a ton of RAM. It's superior on client machines because it's friendlier to swapped out memory, which is why Swift doesn't have a GC.
Obviously it's not random. It's statically unknowable.
In many cases this means it's much cheaper than objects in languages with implicit reference counting.
I'm currently plumbing through some logic to call a sync method on a struct that implements Future and it's... an interesting challenge.
While we can make zero-cost async abstractions somewhat easy for users, the library developers are the ones who suffer the pain.
You cannot run scoped fibers, forcing you to "Arc shit up"; Pins are unusable without unsafe; and the tiniest change in an async function can make the future !Send across the entire codebase.
A good candidate for this is Graal. It can compile (JIT/AOT) both WASM and also LLVM bitcode directly so Rust programs can have full hardware/OS access without WASM limitations, and in theory it could allow apps to fully benefit from the work done on Loom and async. The pieces are all there. The main issue is you need to virtualize IO so that it goes back into the JVM, so the JVM controls all the code on the stack at all times. I think Graal can do this but only in the enterprise edition. Then you'd be able to run ~millions of Rust threads.
Async/await was a terrible idea for fixing JavaScript's lack of proper blocked threading that is currently being bolted onto every language. It splits every language and every library-ecosystem in half and will cause pains for many years to come.
Everyone who worked with multi-threading outside of JavaScript knows that using actors/communicating sequential processes is the best way to do multi-threading.
I recently found an explanation for that in Joe Armstrong's thesis. He argues that the only way to understand multi-threaded programs is writing strictly sequential code for every thread and not muddling all the code for all the threads in one place:
"The structure of the program should exactly follow the structure of the problem. Each real world concurrent activity should be mapped onto exactly one concurrent process in our programming language. If there is a 1:1 mapping of the problem onto the program we say that the program is isomorphic to the problem.
"It is extremely important that the mapping is exactly 1:1. The reason for this is that it minimizes the conceptual gap between the problem and the solution. If this mapping is not 1:1 the program will quickly degenerate, and become difficult to understand. This degeneration is often observed when non-CO languages ["non concurrency-oriented", looking at you JavaScript!] are used to solve concurrent problems. Often the only way to get the program to work is to force several independent activities to be controlled by the same language thread or process. This leads to an inevitable loss of clarity, and makes the programs subject to complex and irreproducible interference errors." [0]
[0] https://erlang.org/download/armstrong_thesis_2003.pdf
There is also a good rant against async/await by Ron Pressler who implemented project loom in java: https://www.youtube.com/watch?v=oNnITaBseYQ
As fun as it is to hate on JavaScript, it's really interesting to go back and watch Ryan Dahl's talk introducing Node.js to the world (https://www.youtube.com/watch?v=EeYvFl7li9E). He's pretty ambivalent about it being JavaScript. His main goal was to find an abstraction around the epoll() I/O event loop that didn't make him want to tear his eyes out, and he tried a bunch of other stuff first.
I actually think it was a great solution in JS/TS given it's a single threaded event loop. The lower level the language the worse of an abstraction it is though. So I think most of the complaints here about async Rust are valid.
The async patterns in Rust, especially with regards to data safety assurances for the compiler, are emblematic of this philosophy. Though there are complexities, the value proposition is a safer concurrency model that requires developers to think deeply about their data and execution flow. I do concur that Rust might not be the go-to for every massively concurrent userspace application, but for systems where robustness and safety are paramount, the trade-offs are justifiable. It's also worth noting that as the ecosystem evolves, we'll likely see more abstractions and libraries that ease these pain points.
Still, diving into the intricacies as this article does gives developers a better foundational understanding, which in itself is invaluable.
This implies that you can't statically guarantee that a future is cleaned up properly, which means that if you spawn some async work, something may std::mem::forget a future, and then the borrow checker won't know that the references that were transitively handed out by the future are still live.
Rather than sprinkle Arc everywhere, I just use an unsafe crate like this:
https://docs.rs/async-scoped/latest/async_scoped/
This catches 99% of the bugs I would have written in C++, so it's a reasonable compromise. There's been some work to try to implement non-'static futures in a safe way. I'm hoping it succeeds.
The other big problem with Rust (though this is on the roadmap to be fixed this year) is that async traits currently require Box'ed futures, which adds a malloc/free to function call boundaries(!!!)
As for the "just use a channel" advice: I've dealt with large codebases that are structured this way. It explodes your control flow all over the place. I think of channels as the modern equivalent of GOTO. (I do use them, but not often, and certainly not in cases where I just need to run a few things in parallel and then wait for completion.)
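For the "run a few things in parallel and then wait for completion" case, std's scoped threads (stable since Rust 1.63) cover it without channels; a small sketch, with the function and data invented for the example:

```rust
use std::thread;

// Square each input in parallel, writing results in place; thread::scope
// joins every spawned thread before returning, so no channels are needed.
fn squares_in_parallel(inputs: &[u64]) -> Vec<u64> {
    let mut results = vec![0u64; inputs.len()];
    thread::scope(|s| {
        for (slot, &x) in results.iter_mut().zip(inputs) {
            // The closures borrow from the enclosing stack frame,
            // which plain thread::spawn would reject ('static bound).
            s.spawn(move || *slot = x * x);
        }
    }); // all threads are joined here
    results
}

fn main() {
    assert_eq!(squares_in_parallel(&[1, 2, 3, 4]), vec![1, 4, 9, 16]);
}
```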
An important distinction to make is that tokio Futures aren't required to be 'static in general; rather, you can only spawn (that is, hand off to the runtime's concurrency) 'static Futures.
> This implies that you can't statically guarantee that a future is cleaned up properly.
Futures need to be Pin'd to be poll()'d. Any `T: !Unpin` that's pinned must eventually call Drop on it [0]. A type is `!Unpin` if it transitively contains a `PhantomPinned`. Futures generated by the compiler's `async` feature are such, and you can stick this in your own manually defined Futures. This lets you assume `mem::forget` shenanigans are UB once poll()'d and is what allows for intrusive/self-referential Future libraries [1]. The future can still be leaked from being kept alive by an Arc/Rc, but as a library developer I don't think you can/would-care-to reasonably distinguish that from normal use.
[0]: https://doc.rust-lang.org/std/pin/#drop-guarantee
[1]: https://docs.rs/futures-intrusive/latest/futures_intrusive/
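A tiny sketch of the !Unpin mechanism described above; the `SelfRefFuture` type and `is_unpin` helper are made up for illustration.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Embedding PhantomPinned makes the type !Unpin, like the futures the
// compiler generates for async blocks.
struct SelfRefFuture {
    data: u32,
    _pin: PhantomPinned,
}

fn is_unpin<T: Unpin>() -> bool {
    true
}

fn main() {
    // Plain data is Unpin...
    assert!(is_unpin::<u32>());
    // ...but `is_unpin::<SelfRefFuture>()` would not compile.

    // Once pinned (e.g. behind Box::pin), the value can no longer be moved
    // out in safe code; per the drop guarantee, Drop must run before its
    // memory is invalidated or reused.
    let pinned: Pin<Box<SelfRefFuture>> = Box::pin(SelfRefFuture {
        data: 7,
        _pin: PhantomPinned,
    });
    assert_eq!(pinned.data, 7);
}
```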
Would you prefer not to have internal mutability, not to have `Rc`, or have them but with infectious unsafe trait bounds, or something else?
Concurrency's correct primitive is Hoare's Communicating Sequential Processes mapped onto green threads. Some languages that have it right are Java (since JDK 21, with virtual threads), Go, and Kotlin.
The fact that a function can perform asynchronous operations matters to me and I want it reflected in the type system. I want to design my system on such a way that the asynchronous parts are kept where they belong, and I want the type system's help in doing that. "May perform asynchronous operations" is a property a calling function inherits from its callee and it is correctly modelled as such. I don't want to call functions that I don't know this about.
Now you can make an argument that you don't want to design your code this way and that's great if you have another way to think about it all that leads to code that can be maintained and reasoned about equally well (or more so). But calling the classes of functions red and blue and pretending the distinction has no more meaning than that is not such an argument. It's empty nonsense.
"We" don't all agree on this.
Futures aren't a fundamental CS mistake, they're a design decision. You may disagree with that decision, but the advantage Rust brings is that you don't need to worry about thread safety once your program actually compiles, at the cost of different code styles.
Neither asynchronous processing design is fundamentally wrong, they both have their strengths and weaknesses.
There is also nothing fundamentally bad with cooperative scheduling in scope of a single process.
1. The inability to read an async result from a sync function, which is a legitimately major architectural limitation.
2. The author's opinion on how function syntax should look (fully implicit, hiding how the functions are run).
And from this there is the endless confusion and drama.
The problem 1 is mostly limited to JS. Languages that have threads can "change colour" of their functions at will, so they don't suffer from the dramatic problem described in the article.
But people see languages that don't fit opinion 2, the magic implicit syntax, and treat it as as big a deal as the dead-end problem 1. Yet two syntaxes are somewhere between a minor inconvenience and an actual feature. In systems programming it's very important which type of locks you use, so you really need to know what runs async.
I’m hesitant towards not distinguishing different things anymore and let the underlying system “figure it out”. I’m sure this could work as long as you’re on the happy path, but that’s not the only path there is.
What I'm missing at the end of the article is the author's point: I believe they're advocating for the use of raw threads and manual management of concurrency, and doing away with the async paraphernalia. But, at the same time, earlier in the article they give the example of networking-related tasks as something that isn't so easy to deal with using only raw threads.
So, taking into account that await&co. are basically syntactic sugar + an API standard (iirc, I haven't used Rust so much lately), I wonder about what the alternative is. In particular, it seems to me like the alternative you could have would be everyone rolling their own "concurrency API", where each crate (inconsistently) exposes some sort of `await()` function, and you have to manually roll your async runtime every time. This would obviously also not be ideal.
> Maybe Rust isn’t a good tool for massively concurrent, userspace software. We can save it for the 99% of our projects that don’t have to be.
Personally, I'm a bit more radical than the author. You won't be able to write software like the example correctly. It should just not be done, ever. Machines can still optimize some sanely organized software into the same thing, maybe, if it happens to be a tractable problem (I'm not sure anybody knows). But people shouldn't touch that thing.
What that means is that when I'm writing async code, I have to audit every library I import to make sure that library is guaranteed to yield after a few microseconds of execution, otherwise my own core loops starve. Importing unknown code when using async rust is not safe for any application that needs to know its own threads won't starve.
A safe async language must guarantee that threads will make progress. Rust should change the scheduler so that it can pre-empt any code after that code has hogged a thread for too long.
Rust doesn't have a scheduler, and having one would be a no-go for any sufficiently low level code (e.g. in microcontrollers).
You might be looking for parallelism, not concurrency.
It was used because of the ineptitude of the languages where it became popular, and it's far easier to implement in GC-less languages than message-passing-based asynchrony, but it's just misery to write code in. I'd prefer to suffer Go's ineptitudes just to use the bastardised message passing called channels there rather than any of the Python/JS/Rust async.
It was created to be an improvement over the Javascript situation, and somehow every language that had a sane structure adopted it as if it was not only good, but the way to do things. This is insane.
JVM's futures are a joy to work with compared to JS's promises (or Kotlin's coroutines for that matter). While similar, I don't think you can conflate them.
I get the author's frustration; I often have the same feelings. Sometimes you just want to tell Rust to get out of your way.

Other times, however, Rust stops me from writing buggy code where I didn't quite understand what I was doing. In some sense it can help you understand your software better (when the problem isn't an implementation detail).
As an aside, I think there is room for a language similar to golang but with sum types and modules; it would be a joy.
Concurrency is a subtype of parallelism. All concurrency is parallelism, but leaving some aspects of parallelism off the table.
I've worked in both worlds: I've built codes that manage thousands of connections through the ancient select() call on single processes (classic concurrency- IO multiplexing where most channels are not active simultaneously, and the amount of CPU work per channel is small) to synchronous parallelism on enormous supercomputers using MPI to eke out that last bit from Amdahl's law.
Over time I've come to the conclusion that the sweet spot is a thread pool (possibly managed by the language runtime) that uses channels for communication, with optimizations for work stealing (to keep queues balanced) and for eliminating context switches. Although it does not reach the optimal throughput of the machine (because shared memory is faster than message passing), it's a straightforward paradigm to work with, and the developers of the concurrency/parallel frameworks tend to be wise.
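That model can be sketched with std only. No work stealing here: a single shared queue behind a mutex is the simplest approximation of the idea, and the function name and workload are invented for the example.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A minimal thread pool fed jobs over a channel; workers drain the shared
// queue and return their partial results when the channel closes.
fn pooled_sum_of_squares(workers: usize, upto: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<u64>();
    let rx = Arc::new(Mutex::new(rx));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut local = 0u64;
                loop {
                    // The lock is held across the blocking recv, so job
                    // pickup is serialized; fine for a sketch.
                    let job = rx.lock().unwrap().recv();
                    match job {
                        Ok(n) => local += n * n,
                        Err(_) => return local, // sender dropped: we're done
                    }
                }
            })
        })
        .collect();
    for n in 1..=upto {
        tx.send(n).unwrap();
    }
    drop(tx); // close the channel so the workers exit
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    assert_eq!(pooled_sum_of_squares(4, 10), 385); // 1^2 + ... + 10^2
}
```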
But these existential types can only be specified in function return or parameter position, so if you want to name the type of, e.g.:

let x = async { };

You can't! You can only refer to it as `impl Future<Output = ()>`, but that's not allowed in a variable binding!

I have some quibbles with this article:
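Concretely, as a hedged sketch (the function names and mini executor are invented, and the "rejected" line reflects stability at the time of this thread):

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, Waker};

// Returning the unnameable async-block type works, because `impl Trait`
// is allowed in return position:
fn make_answer() -> impl Future<Output = u32> {
    async { 21 * 2 }
}

// Tiny std-only executor (Waker::noop needs a recent Rust, 1.85+) so we
// can actually run the future here.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let mut cx = Context::from_waker(Waker::noop());
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    // In a let binding the type can only be inferred:
    let fut = make_answer();
    // let fut: impl Future<Output = u32> = make_answer();
    //          ^ rejected: `impl Trait` in let bindings was unstable
    //            at the time of this thread.
    assert_eq!(block_on(fut), 42);
}
```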
"Rust comes at this problem with an “async/await” model"
No, it does not. It allows for that, and there's a big ... community ... around the async stuff, but in reality the language is entirely fine with operating using explicit concurrency constructs. And in fact for most applications I think standard synchronous functions combined with communicating channels is cleaner. I work in code bases that do both, and I find the explicit approach easier to reason about.
In the end, Async is something people ideally reach for only after they hit the wall with blocking on I/O. But in reality they're often reaching for it just because -- either because it's cool... or because some framework they are relying on mandates it.
But I think the pendulum will swing back the other way at some point. I don't think it's fair to tar the whole language with it.
This is like saying C++ allows for templates, and there's a big community around it. Sure, but it's the entire community.
Rust is all about lifetimes and the borrow checker. Async code (a la C#) will introduce overhead to reason about lifetime and it might not be as "fun" as it is with other languages that makes use of GCs and bigger runtimes.
The CSP vs async/await discussion is valid, but as in the majority of cases, the drawbacks and benefits are largely independent of the language.
In CSP, the concurrent snippets behave just like linear/sequential code, as channels abstract away a lot of the ugly bits. Sequential code tends to be easier to reason about, and this might be very important for Rust considering its design.
A good tool for massively concurrent software will, as expected, depend on the aspects you're evaluating:
- Performance: the text does not show benchmarks establishing Rust as a slow language.
- Code/feature throughput: the overall conclusion from the text is that async Rust is a complex tool that exposes programmers, in many ways, to shooting themselves in the foot.
Assuming the "Maybe Rust..." is only talking about Async Rust, the existence of big Async Rust projects is a good counter argument. We also have the whole rest of the Rust language to code massively concurrent, userspace software.
Massively concurrent, userspace software tends to be complex and big, to the point that design decisions generally have far more impact than the choice of language.
Rust is a modern language with interesting features to prevent programmers from writing unsafe programs and this is a good head start to many when making those kind of programs, more than whether you want to use Async code or not.
* While the author states that not many apps "need" high concurrency in userspace, I would invert that and say that we may be missing so much performance, so many potential new applications, etc. because highly concurrent code is so hard to get right. One bit of evidence of this (to me at least) is how often in my career I have had to scale things up due to memory or other resource limitations rather than CPU. And when it is CPU, looking into it often turns up concurrency bugs that are the root cause, or at least exacerbate the issue.
* While I completely agree that Rust is not easy with async, and have myself poked around at which magical type incantations I need each time I have touched async Rust code, I don't really like the suggestion to "go use a different language". First, because if you are picking up Rust, you (IMHO) should already have a very good reason to have chosen it. Rust is not easy enough or ubiquitous enough that you should be choosing it "just for fun", and your reason for using Rust should be compelling enough that you are willing to put in the effort to learn async when you need it.
* What the author mentions in the body of the article, and what's closer to my own suggestion: don't use async unless you need it! While I would love to see Rust evolve to the point where async is "easy" (and think it should), maybe we instead just need to get more pragmatic in what is taught and written about. I think when people start Rust they want to use all the fanciness, which includes async. Some of that is just programmers being programmers, but it is also how tutorials, docs, and general communication about a programming language show the breadth of capability rather than the more realistic learning path, which leads people to feel like they aren't doing it right if they don't use async.
Finally, I do really hope Rust keeps working on the promise of these zero cost abstractions that can really simplify things... but if that doesn't work, I am at least hopeful of what people can build on top of the rust featureset/toolchain to help make things like async more realistic to be the default without the need for a complex VM/runtime.
I suspect that to take advantage of 1024-thread systems the only sane programming model will be structured concurrency with virtual threads instead of coroutines.
It’s the same progression as we saw in the industry going from unstructured imperative assembly programming to structured programming with modular features.
Both traditional mutexes and to a degree async programming are unstructured and global. They infect the whole codebase and can’t be reasoned about in isolation. This just doesn’t scale.
To your point, the C# guys seem to be interested in experimenting with green threads: https://twitter.com/davidfowl/status/1532880744732758018
It's an amazing combination.
Async functions don't have to always own their arguments. Just the outermost future that is getting spawned on another thread has to. The rest of the async program can borrow arguments as usual. You don't need to spawn() every task — there are other primitives for running multiple futures, with borrowed data, on the same thread.
In fact, this ability for a future to borrow from itself is the reason why Rust has native await instead of using callbacks. Futures can be "self-referential" in Rust, and nothing else is allowed to.
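To make the point concrete, here is a hedged sketch: the function names and mini executor are invented, and real code would use `block_on` and join primitives from the futures or tokio crates. The future merely borrows its argument, and because it is driven on the current thread rather than spawn()ed, no Arc or 'static bound is needed.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, Waker};

// Tiny std-only executor (Waker::noop needs a recent Rust, 1.85+).
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let mut cx = Context::from_waker(Waker::noop());
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// The future merely borrows its argument -- no Arc, no 'static bound.
async fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn count_locally(data: &str) -> usize {
    // Because we run the future on the current thread instead of
    // spawn()ing it, the plain borrow of `data` is fine. (With multiple
    // futures you'd use a join primitive, e.g. futures::join!, to run
    // them concurrently on the same thread, still with borrowed data.)
    block_on(word_count(data))
}

fn main() {
    let data = String::from("hello async world");
    assert_eq!(count_locally(&data), 3);
}
```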
Maybe in the 2000's but I feel this reasoning is no longer valid in 2023 and should be put to rest.
The C10K problem... wouldn't modern computing break if my Linux box couldn't spin up 10k threads? Htop says I'm currently at 4,000 threads on an 8-core machine.
The async case is suited to situations where you're blocking for things like network requests. In that case the thread will be doing nothing, so we want to hand off the work to another task of some kind that is active. Green threads mean you can do that without a context switch.
So do we discard existing ways of making software more efficient because we can be more wasteful on more recent hardware? What if we could develop our software such that 2000s computers are still useful, rather than letting those computers become e-waste?
> The numbers reported here paint an interesting picture on the state of Linux multi-threaded performance in 2018. I would say that the limits still exist - running a million threads is probably not going to make sense; however, the limits have definitely shifted since the past, and a lot of folklore from the early 2000s doesn't apply today. On a beefy multi-core machine with lots of RAM we can easily run 10,000 threads in a single process today, in production. As I've mentioned above, it's highly recommended to watch Google's talk on fibers; through careful tuning of the kernel (and setting smaller default stacks) Google is able to run an order of magnitude more threads in parallel.
By the 2010s the problem had been updated to C10M. The people discussing it (well, perhaps some) aren't idiots and understand that the threshold changes as hardware changes.
Also, the issue isn't creating 10k threads it's dealing with 10k concurrent users (or, again, a much higher number today).
Typically, if you want to build something with Rust, it'll have to use async, at least because gRPC and the like are implemented that way. So the vanilla (and excellent, IMO) Rust language doesn't exist there. Everything is async from the get-go.
A weird way to use Rust since you can do a lot of messaging within the process, and use the computing power much more efficiently.
RPC is essentially messaging and message-passing. Message-passing is a way to avoid mutable shared state - this is the model with which Go became successful.
RPC surely has its uses, but message passing is another, and very often inferior, solution to a problem set for which Rust has excellent solutions of its own.
If I'm implementing a library, how should I write it so that the consumer of the library doesn't have to pull in Tokio if they don't want to?
The arguments about Arc fall flat, because how else would you safely manage shared references, even in other lower-level languages? And so-called "modern GCs" still come with a significant performance hit; it's not just some "bizarre psyop".
Really the only problem I've run into with Rust's async/await is the fact that there is not much support for composing async tasks in a structured way (i.e. structured programming) and the author doesn't even touch on this issue.
Ultimately the goals and criticism of the author are just downright confusing because at the end he admits that he doesn't actually care for the fact that Rust is design constrained by being a low level language and instead advocates for using Haskell or Go for any application that requires significant concurrency. So to reformulate his argument: we should never use or design into low level languages an ergonomically integrated concurrency runtime because it may have a handful of engineering challenges. When put concisely, their thesis is really quite ridiculous.
With all this in mind, I really like Swift concurrency runtime. It does automatic thread migration and compaction to reduce the overhead of context switches, balances the thread allocation system-wide taking relative priorities into account, and it appears to be based on continuations instead of state machines. A very interesting design worth studying IMO.
It's too complex.
Something simpler is needed with the benefits of memory safety.
I've coded performant applications on an OS that used channels and it sucked. It just got in the way and was confusing to engineers used to lower level constructs. "Just get out of my way!"
I think rust async is hard.
And that's what it comes down to. 99.9% (maybe more nines) of people do not need that level of control. They need conceptually simple things, like channels, and GC, and that will work for nearly everyone. The ones who need to drop to rust either have the engineers to do that, or their problem is intractable (for them). I pity those who drop to rust prematurely because it's cool.
I'm very curious; what OS is this?
Isn't that already, in this strong generality, an almost always wrong assumption?
Sure, one can do massively parallel or embarrassingly parallel computation.
Sure, graphic cards are parallel computers.
Sure, OS kernels use multiple cores.
Sure, languages and concepts like Clojure exist and work - for a specific domain, like web services (and for that, Clojure works fascinatingly well).
But there are many, even conceptually simple algorithms which are not easy to parallelize. There is no efficient parallel Fast Fourier Transform I know of.
Try it. It'll probably work fine. It may be very expensive, memory wise, but it's easy to get a machine with a lot of memory.
It's been tried, periodically. Still sucks.
Or in other words, the goal is that you can think in abstract what the natural optimal machine code would be for a program, and you can write a Rust program that, in principle, can compile to that machine code, with as little constraints as possible on what that machine code looks like.
Unlike C, which also has this property, Rust additionally seeks to guarantee that any code will satisfy a bunch of invariants (such as that a variable of a data type actually always holds a valid value of that data type) provided the unsafe code satisfies a bunch of invariants.
If you use Go or Haskell, that's not possible.
For example, Go requires a GC (and thus wastes CPU cycles scanning memory), and Haskell uses memory to store thunks rather than the actual data and limits mutation (meaning you waste CPU cycles handling lazy computations and copying data). Neither of these costs is necessary for the vast majority of programs, so choosing such a language means your program is unfixably handicapped in terms of efficiency, and has no chance of compiling to the machine code that any reasonable programmer would conceive of as the best solution to the problem.
Out of curiosity: could Rust be limited to a language subset that mimics the simplicity of Golang (with channels and message passing), trading off some of the powerful features that seem to be causing pain?
Pardon a naïve question. I'm a systems engineer who occasionally dabbles with simple CLI tools in various languages for fun, but I don't have a serious need for them.
From what I can gather, such projects will never happen though. That's why I moved part of my work to Golang itself.
Rust is an amazing language. But the team takes the "systems language" thing very seriously and makes decisions and tradeoffs on that basis, so it seems we, its users, should adapt and not use Rust for everything. That's what I ended up doing.
Good call, re: garbage collection FUD. Ultimately, many programs have to reclaim memory once it's no longer needed, and at a certain scale it becomes necessary to write code that manages allocations and deallocations yourself; at that point you have effectively written your own garbage collector. Done well, this can outperform a GC in certain cases, but it's often done haphazardly and you end up with poor performance.
It seems a good amount of Rust evangelism has given up on the, "no GC is required for performance," maxim. Is that the case, Rust friends?
That being said, I think it would be neat if there were a language like Haskell where the compiler exposed an interface letting users supply their own GC.
[0] https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/line...
Async Rust is many language features and behaviors all interacting with each other at the same time, creating something more complicated than how you would describe the problem you're actually trying to solve (I want to do X when Y happens, and I want to do X when Y happens × the number of hardware threads). When you're using async Rust, you have to think more carefully about:
* memory management (Arc), safety, and performance
* concurrency
* parallelism and arbitrary interleavings
* thread safety
* lifetimes
* function colouring
All of these interact to create a high cognitive load.
Is the assembly equivalent of multithreading and async complicated?
Multithreading, async, coroutines, concurrency, and parallelism are my hobby research interests; I enjoy them and journal about them all the time.
* I think there's a way to design systems to be data intensive (Kleppmann) and data-oriented (Mike Acton) with known-to-scale practices.
* I want programming languages to adopt these known-to-scale practices and make them easy.
* I want programs written in the model of the language to scale (linearly) by default. Rama from Red Planet Labs is an example of a model that scales.
* HN user mgaunard [0] told me about "algorithmic skeletons" which might be helpful if you're trying to parallelise. https://en.wikipedia.org/wiki/Algorithmic_skeleton
I think the concurrency primitives in programming languages are sharp-edged and low-level; people reach for them and end up building on primitives that are too low-level for the desired outcome.
[0]: https://news.ycombinator.com/item?id=36792796
[1]: https://blog.redplanetlabs.com/2023/08/15/how-we-reduced-the...
Note: You can use async Rust without threading but I assumed you're using multithreading.
Re: the conclusion, I wonder if this is a problem that can be solved over time with abstractions (i.e. async Rust is a good foundation that's just too low-level for direct use)?
(They mention this extra constraint early in the article: "But this approach has its limitations. Inter-process communication is not cheap, since most implementations copy data to OS memory and back.")
I'm familiar with writing high-throughput services by offloading tasks onto a queue (say Redis/RabbitMQ, whatever) and having a lot of single-threaded "agents" or "workers" picking them off the queue and processing them.
But as implied in the quote above from the article, that's not a fast or cheap enough solution for the problems the author is talking about.
So now I'm left wondering: what are some examples of the class of (1%) problems the author is talking about in this article?
"Stackful" coroutines, on the other hand, do have runtime stacks (holding local variables) that get swapped out by the runtime on await points. It makes the code behave exactly like non-async code, but requires a runtime to manage those stacks. Rust didn't go this way, preferring the benefits of the stackless approach.
Until all the work you're trying to push is generating so many allocations that your GC goes to shit once every two minutes trying to clean up the mess you made. (https://discord.com/blog/why-discord-is-switching-from-go-to...)
I haven't investigated it deeply, but I was developing something in Rust, and whether something needs to be thread-safe or not depends entirely on the consumer's use case... it's bad separation of concerns for the provider of a generic interface to have to specify the specific type of boxed value. It would be 100% fine if the behavior in this case were to pre-allocate the maximum possible memory requirement of the boxed type.
This is the only thing I was really frustrated with in Rust.
Your generic interface just takes a reference to the value inside the box.
If it's dynamic, you can use Cow or the supercow/bos/... crates if you want Arc/Rc to be options as well.
I really want to use TypeScript, as I like the language, and I want to use this as a way to learn it better. I'm not expecting to have some super successful game, but the programmer part of my mind is upset at not utilizing all the cores of the machine. So, what do people do? Split the server up into multiple independently running components, or is my choice really to just use another language?
I know parallel ATA cables were all the rage. They had higher theoretical throughput than serial ATA, but there was too much cross-talk to make them actually faster in the end, so now we have serial ATA everywhere, with much higher throughput than parallel ATA could ever achieve.
Should we move back away from parallelism and focus on handling synchronous stuff faster instead?
> Should we move back away from parallelism and focus on handling synchronous stuff faster instead?
Rust already has excellent handling of synchronous computation, given that it can meet, and sometimes exceed, the performance of equivalent C. The problem is when you're I/O- or network-bound; you can either throw threads at the problem (and, by extension, throw memory at it for the thread stacks) or use async programming.
I want to write stuff to disk (an SSD these days). I can issue a request, then have to wait tens to hundreds of microseconds in the average case (the worst case can be far longer) for that request to finish and tell me whether it succeeded or failed. There's no getting around that with present-day technology.
The situation is worse and even less reliable with network I/O. If you're talking to a server on another continent, the speed of light sets a floor on how long it takes to hear back, even if the server and all the intermediary network links are lightly loaded and functioning perfectly.
Java is OK too if you want object-oriented parallelism with atomics and joins, but I only recommend using it on the server, where you need a VM anyhow.
C from the 1970s and Java from the 1990s still got things right.
Also, Vulkan/Metal/DX12 do not really help; OpenGL 3 with VAOs is enough.
Er, no. That’s not what those words mean.
“We want to use the whole computer. Code runs on CPU cores, and in 2023, even my phone has eight of the damn things. If I want to use more than 12% of the machine, I need several threads.”
Well, I would hardly mind using the GPU for any part of my program that would fit it. That's why I believe it would be a great idea for a modern programming language to include first-class GPU-accelerated types and instructions.
Please make it happen! I want my userspace software to be in Rust!
Although if it won't happen, then even better: free real estate for a RustScript.
Not my cup of tea.
I think having to mark functions with the async keyword is a frustrating design decision.
Sure, Rust is certainly verbose and very strict about how the ownership rules apply in the context of async, but this is a hard constraint of its memory-safety model. We could probably do better while retaining all the performance, but this is by far one of the best implementations. Another example of pleasant-to-use async/await is C#, which trades performance and memory (state has to be boxed if it is to live across continuations) for convenience (you just write it naturally without worrying about the underlying behavior).
There is a reason Rust toyed with green threads at its inception but decided against them. The only popular languages of today that use them are Go and Java (which was basically forced to, because you can't go async without introducing the feature early in the lifecycle of the language; and the authors of Project Loom are simply wrong in their claims about why this is superior to async/await).
Async/await is here to stay and is the right abstraction, git good, and it's not even difficult to use anyway.
[0] where feature name is green threads, not doing concurrency at all, doing it manually, etc.
It's probably the right abstraction for Haskell, or any other language that works well with functional programming, lambdas and monads. Loom is a better fit for Java. Rust also would have probably been better off with something else. Effect handlers might have been a good choice.
Why?