It's a big deal because, while it has some downsides, being stackless means C++ coroutines can have next to no overhead, so it can be performant to use them to write asynchronous code even for very fast operations. The example given at https://www.youtube.com/watch?v=j9tlJAqMV7U&t=13m30s is that you can launch multiple coroutines that each issue a prefetch instruction and then process the fetched data, so you get clean code that issues multiple prefetches and processes the results. Whereas in Python (don't get me wrong, I love Python) you might use a generator to "asynchronize" slow operations like requesting and processing data from remote servers, C++ coroutines can be fast enough to asynchronously process "slow" operations like requesting data from main memory.