Netflix's edge nodes are optimized for streaming already-encoded video to end users. Netflix has to transcode the source into some number of formats and send all of them to the edge nodes to flow out. It's harder to cleanly manage a ton of different streams flowing out to the edge nodes.
I would guess YouTube, being built on Google's infrastructure, has powerful enough edge nodes that it sends one video stream to each edge location and the edges transcode for the clients. That leaves only one stream from source to edge to worry about, which is much simpler to support and reason about.
But that's just my wild-assed guess.
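To put rough numbers on the two layouts guessed at above, here is a minimal sketch that just counts origin-to-edge streams. The function name, the edge count, and the number of formats are all made up for illustration; nothing here reflects how Netflix or YouTube are actually built.

```python
def origin_to_edge_streams(num_edges: int, num_formats: int, transcode_at_edge: bool) -> int:
    """Count the streams leaving the origin for a single source video.

    transcode_at_edge=False: the origin encodes every format and ships
    all of them to every edge location.
    transcode_at_edge=True: the origin ships one stream per edge location
    and each edge transcodes for its own clients.
    """
    per_edge = 1 if transcode_at_edge else num_formats
    return num_edges * per_edge


# Hypothetical numbers, chosen only to show the ratio.
EDGES = 100
FORMATS = 8

print(origin_to_edge_streams(EDGES, FORMATS, transcode_at_edge=False))  # 800
print(origin_to_edge_streams(EDGES, FORMATS, transcode_at_edge=True))   # 100
```

The gap is simply the number of formats: every extra format multiplies what the origin has to push and monitor in the first layout, and costs edge compute instead in the second.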
I think this could be one of the upsells that Netflix could use.
Premium: get no delay
Normal users: get cache and delay
Sample size 1, but...
I saw a ton of buffering and failure on an embedded Netflix app on a TV, including some infinite freezes.
Moved over to laptop, zero buffering.
I assume the web app runs with a much bigger buffer than whatever is squeezed into the underpowered TV.
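A toy simulation shows why buffer size alone could explain that difference. This is only a sketch under made-up assumptions: the `count_stalls` helper, the 3-second and 30-second buffer sizes, and the network trace are all hypothetical, and real players are far more complicated.

```python
def count_stalls(throughput, capacity_s):
    """Return how many seconds playback stalls over a network trace.

    throughput: seconds of video downloaded during each wall-clock second
                (2.0 = downloading twice as fast as playback, 0.0 = outage).
    capacity_s: the most video, in seconds, the client can hold in its buffer.
    """
    buffered = 0.0
    stalls = 0
    for rate in throughput:
        # Download first, but never hold more than the buffer allows.
        buffered = min(capacity_s, buffered + rate)
        if buffered >= 1.0:
            buffered -= 1.0   # play one second of video
        else:
            stalls += 1       # nothing to play: the screen freezes
    return stalls


# Hypothetical trace: 10 good seconds, a 6-second outage, 10 good seconds.
trace = [2.0] * 10 + [0.0] * 6 + [2.0] * 10

print(count_stalls(trace, capacity_s=3))    # small "TV" buffer: 4
print(count_stalls(trace, capacity_s=30))   # large "laptop" buffer: 0
```

On the same connection, the small buffer can't bank enough video during the good seconds to ride out the outage, while the large one can.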
E.g. "give me this previous chunk" vs "send me the current stream"