Latents are a compressed representation of the source images, and the images are recoverable from them at near-full fidelity.
If you train a model on a compressed jpg of an image, or on any deterministic transformation of it, you’re still training it on that image.
Any suggestion otherwise is only because someone is trying to put some spin on things.
> Stable Diffusion v2 16-bit is ~3GB of data. It was trained on hundreds of millions of images…
And yet! Remarkably! It can generate pictures of the Mona Lisa!
Here’s a question for you: if you encode the process of drawing an exact copy of an image, does the pure code that implements it mean you have a copy of the image in it?
Have you encoded pixels as code?
Does that mean there’s no copy of the image?
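To make the point concrete, here’s a toy sketch (nothing to do with any real model; the pixel values are made up for the example): an “image” that exists nowhere in the source as data, only as drawing instructions, yet comes back out byte-for-byte.

```python
# Toy example: an "image" stored purely as drawing code, not pixels.
# The 3x3 pixel values below are invented for illustration.
original = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]

def draw():
    """Procedural 'code' that redraws the image exactly."""
    img = [[0] * 3 for _ in range(3)]
    for y in range(3):
        img[y][1] = 255  # vertical bar
    for x in range(3):
        img[1][x] = 255  # horizontal bar
    return img

# draw() contains no pixel array, yet its output is the original image.
assert draw() == original
```

No pixels were stored, and yet an exact copy is in there. That’s the whole question in miniature.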
How about a zip file full of images? It’s just a high-entropy binary blob, right? Yet… remarkably!!! It can be transformed into images by applying an algorithm.
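Same point in two lines, using stdlib zlib on some made-up stand-in pixel bytes: the blob looks nothing like the image, but the image is fully in there.

```python
import zlib

# Stand-in for raw image data; the bytes are invented for the example.
pixels = bytes(range(256)) * 4
blob = zlib.compress(pixels, level=9)  # the "high-entropy binary blob"

assert blob != pixels                   # doesn't look like the image...
assert zlib.decompress(blob) == pixels  # ...but the image is fully in there
```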
I don’t know the answer, but this handwavy “it couldn’t possibly encode them, it’s too small” is…
Pure. Nonsense.
Of course some part of some images is embedded in the model in some form.
Stop trivialising the issue.
The issue here is: Does an algorithm that generates content infringe copyright?
Does a black box that takes the input “a picture of xxx” and a seed and outputs a copyrighted image infringe?
You know that’s possible. Don’t dodge. Technical hand-waving about how “it couldn’t possibly have…” is pure rubbish.
Sure it could. That black box could hold a full-resolution copy of the image.
Of all the training data? Probably not. But of some of it? In compressed latent form? Most definitely.
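And “compressed latent form” doesn’t have to mean lossless. A rough sketch with made-up values (real latents are learned representations, not simple quantisation): a lossy transform that halves the data yet still reconstructs a close approximation, i.e. still recognisably a copy.

```python
# Toy lossy "latent": keep only the top 4 bits of each 8-bit value.
# Values are invented for illustration; real latents are learned, not this.
pixels = [0, 17, 34, 128, 200, 255]

def encode(p):
    return [v >> 4 for v in p]  # 8 bits -> 4 bits: half the data

def decode(latent):
    return [v << 4 | v for v in latent]  # expand back to 8 bits

recon = decode(encode(pixels))
# Not byte-identical, but every value lands within 15 of the original:
assert all(abs(a - b) <= 15 for a, b in zip(pixels, recon))
```

Lossy, smaller, and still obviously the same picture. That’s the sense in which “some of it, in compressed latent form” is entirely plausible.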