Coroutines basically make the same observation as transmit windows in TCP/IP: you don’t send data as fast as you can if the other end can’t process it, but if you send one packet at a time and wait for each acknowledgment, you’re going to be twiddling your thumbs an awful lot. So you send ten, or twenty, and you wait for signs of progress before you send more.
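With coroutines the bottleneck isn’t the network but the L1 cache. You’re better off running one function a dozen times in a row and then running the next than alternating between them on every call, because the function’s code and working data stay hot in cache while it repeats.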
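
To make the scheduling shape concrete, here is a rough sketch in Python. Every name in it (`batches`, `run_pipeline`, the two stage functions) is made up for illustration, and plain function calls stand in for real coroutines. Python won't show actual L1 cache effects; the sketch only counts how often control moves from one stage to the other, which is the thing batching cuts down.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def batches(items: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` items."""
    batch: List[int] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover partial batch
        yield batch


def run_pipeline(
    items: Iterable[int],
    stage_a: Callable[[int], int],
    stage_b: Callable[[int], int],
    batch_size: int,
) -> Tuple[List[int], int]:
    """Run stage_a over a whole batch, then stage_b over that batch.

    Returns the results plus the number of times a stage was entered.
    With batch_size=1 the stages alternate on every item.
    """
    results: List[int] = []
    stage_entries = 0
    for batch in batches(items, batch_size):
        stage_entries += 1  # enter stage_a for this batch
        intermediate = [stage_a(x) for x in batch]
        stage_entries += 1  # enter stage_b for this batch
        results.extend(stage_b(y) for y in intermediate)
    return results, stage_entries


def double(x: int) -> int:
    return x * 2


def increment(x: int) -> int:
    return x + 1


if __name__ == "__main__":
    one_at_a_time, entries_1 = run_pipeline(range(24), double, increment, 1)
    by_the_dozen, entries_12 = run_pipeline(range(24), double, increment, 12)
    print(one_at_a_time == by_the_dozen)  # same answers either way
    print(entries_1, entries_12)          # far fewer stage changes when batched
```

The batch size plays the role of the transmit window: too small and you spend your time switching, too large and the intermediate results themselves stop fitting in cache. A dozen is just the number from the comment above, not a tuned value.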