1. embedded hardware, like you mentioned
2. high-performance stuff
3. "embedding" in the cross-language sense, with foreign function calls
Of course, the "don't use a lot of resources" thing that makes Rust/C/C++ good for tiny hardware also tends to help performance on bigger iron. Similarly, the "don't assume much about your runtime" thing that's necessary for bare metal programming also helps a lot with interfacing with other languages. And "run on a GPU" is kind of all three of those things at once.
So yeah, which of those concerns was async Rust really designed around? All of them, I guess? Once you've had the systems programming goggles on for long enough, all of those things kind of blend together.
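To make the "don't assume much about your runtime" point a bit more concrete, here's a rough sketch of driving an `async fn` with nothing but `core`: no allocator, no threads, no executor crate. This is just my own illustration, not how any real executor does it. `block_on` and `add` are made-up names, and the busy-polling loop is a deliberate simplification (a real bare-metal executor would sleep until an interrupt fires the waker).

```rust
use core::future::Future;
use core::pin::pin;
use core::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A waker that does nothing when woken. That's enough for a loop
// that just keeps polling, and it needs no heap allocation.
fn noop_raw_waker() -> RawWaker {
    fn clone(_: *const ()) -> RawWaker {
        noop_raw_waker()
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    RawWaker::new(core::ptr::null(), &VTABLE)
}

// Hypothetical helper: drive a single future to completion by polling it
// in a loop. The future lives on the stack; nothing here touches `alloc`.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    // Safety: every vtable function ignores the data pointer, so the
    // RawWaker contract is trivially upheld.
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        core::hint::spin_loop();
    }
}

// Illustrative async function; it compiles down to a plain state machine.
async fn add(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    let sum = block_on(async { add(2, 3).await + 1 });
    assert_eq!(sum, 6);
}
```

Everything above comes from `core`, so the same shape works on a microcontroller, and because the future is just a value you poll, it's also straightforward to wrap the polling step behind an `extern "C"` function and let some other language's event loop drive it.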