How is this done? Are bits not written into RAM or disk? Are they not sent between machines in a training cluster? That's copying.
> it is seemingly not far removed from how humans consume content
Except that humans don't make full copies to RAM, disk, or paper.
AI doesn't need lasting copies to train, though I don't know what the actual implementation is. But if it's ruled that they can only use copyrighted data if it's not stored for longer than it would take a human to consume it, that wouldn't really cripple the models, but it would perhaps make training more logistically challenging.
It's important to understand that models are not data archives. They are statistical constructs built by being quizzed, where human-made content is used to generate the quiz questions.
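To make the "quiz" idea concrete, here is a toy sketch of how text can be turned into "given these words, what comes next?" questions. Everything in it is an assumption for illustration: the names `make_quiz` and `fit_counts` are made up, whitespace splitting stands in for real tokenization, and a count table stands in for the model. Actual LLM training adjusts neural network weights by gradient descent rather than counting, so read this as the shape of the quiz, not the real implementation.

```python
from collections import Counter, defaultdict

def make_quiz(text, context_size=3):
    """Turn text into (context, answer) pairs: given these words, what comes next?"""
    tokens = text.split()  # assumption: whitespace tokens, not real subword tokens
    return [
        (tuple(tokens[i:i + context_size]), tokens[i + context_size])
        for i in range(len(tokens) - context_size)
    ]

def fit_counts(quiz):
    """Toy stand-in for a model: count which answers followed each context."""
    model = defaultdict(Counter)
    for context, answer in quiz:
        model[context][answer] += 1
    return model

text = "the cat sat on the mat and the cat sat on the rug"
quiz = make_quiz(text)
model = fit_counts(quiz)

print(quiz[0])                       # first quiz question and its answer
print(model[("sat", "on", "the")])   # answers seen after this context
```

The source text is only used to pose the questions and grade the answers; what is kept afterwards is the table of statistics (here, counts). On a corpus this tiny the table obviously pins down the text almost completely, so the toy says nothing about how much a large model does or doesn't retain.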
Images on your retina form exact copies.
They are scanned and translated into impulses that are then sent to a first set of "neural columns" - that's an exact copy.
This is then connected to the visual cortex by the two highest-bandwidth links in the human body (the "optic nerve"; there are 2 of them of course, always wondered why everybody insists on using the singular). Why would you have that high-bandwidth link unless to create verbatim copies?
The way those columns are structured also very strongly suggests they make carbon copies, which they then make available on the "brain bridge" (which is probably at least vaguely similar to the "attention matrix" of a transformer). If it does work like that, that's also a verbatim copy.
The only way "humans don't make full copies to RAM" holds is that humans don't have separate RAM. The memory is colocated with the processing, even on a microscopic level. You know, what everybody knows is the best way of doing things even in silicon; it's just incredibly impractical if you can't rebuild your circuit every time there's a slight change to the instructions your "computer" carries out. (The brain is not a "von Neumann architecture", except it kind of is when it regrows connections. But in the short term it isn't.)
> Using software almost always involves creating copies, even though many of these copies only exist for a very short time. For example, executing a program means copying it from the hard disk into RAM so that the CPU can interpret the instructions. Because of this, the right to run a program is considered to fall under the copyright of the author.
For comparison, when a human looks at the letters, there is no copying.
Also, models can reproduce text verbatim, which proves that they store it.
So it is unfair when ordinary folks get sued for this and Zuckerberg wants to get away with a violation a million times larger. He must go directly to jail.
The computer model works differently of course, but functionally it's the same idea.