I'm really not sure that's an issue: UTF-8 decoding is very, very cheap, and it's iterating either way.
It would have to be benchmarked, but I wouldn't be surprised if allocating the caches (at least one allocation per line of input) had way more overhead, especially given that the inputs are so short.
I'm not going to claim Rust's UTF-8 decoder is the fastest around, but it's very fast.
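
To make the comparison concrete, here's a rough sketch of the two approaches I'd put side by side. The task (counting positions where two short strings agree) and the names `matches_streaming`, `matches_cached` and `time_runs` are made up for illustration and aren't from the code under discussion. The timing helper is deliberately naive, so treat any numbers it gives as a hint at best.

```rust
use std::time::{Duration, Instant};

/// Decode UTF-8 on the fly: no heap allocation, chars are produced as we iterate.
fn matches_streaming(a: &str, b: &str) -> usize {
    a.chars().zip(b.chars()).filter(|(x, y)| x == y).count()
}

/// Decode each input into a Vec<char> "cache" first (one heap allocation
/// per input), then iterate over the cached chars.
fn matches_cached(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    a.iter().zip(b.iter()).filter(|(x, y)| x == y).count()
}

/// Naive timing helper (hypothetical): runs `f` over each adjacent pair of
/// lines and returns the summed result plus the elapsed wall-clock time.
fn time_runs(lines: &[&str], f: fn(&str, &str) -> usize) -> (usize, Duration) {
    let start = Instant::now();
    let mut total = 0;
    for pair in lines.windows(2) {
        total += f(pair[0], pair[1]);
    }
    (total, start.elapsed())
}
```

Calling `time_runs(&lines, matches_streaming)` and `time_runs(&lines, matches_cached)` on the same input should give identical totals, with only the elapsed time differing. For anything you'd actually draw conclusions from, a proper benchmark harness with realistic inputs is the way to go, since the optimizer can easily distort a loop this small.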