Yet we all use web browsers that copy copyrighted text from buffer to buffer all the time. This doesn't even include all of the copying that ISPs perform.
It might be fair to say that the reading performed during training has the same character, since no human is involved.
The real copyright violation would be using a derivative work.