How can this possibly be a valid good faith argument? Either they're in breach of authors' copyright which extends to every piece of art that they included in the dataset without permission, or they're in the clear and aren't obligated to respond to removal requests.
This reads like damage control to me in an effort to temporarily silence the loudest critics.
Stable Diffusion's U-Net is trained to remove noise from images in latent space, which the variational autoencoder (VAE) converts to and from pixel space. CLIP embeddings condition the U-Net's denoising step, using correlations between human-language descriptions and image content to guide the removal of latent noise. Neither the U-Net nor the VAE is trained to interpolate or reproduce images from the training set; if that happened, the model would be overfitted and loss would be terrible on the validation set. The VAE is trained to produce a latent space that can accurately encode and decode any pixel image, and the U-Net is trained to remove Gaussian noise from the latent space.
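To make that concrete, here is a minimal sketch of the training objective described above. It is schematic PyTorch, not the actual Stable Diffusion training code; vae, unet, clip_text, and scheduler are stand-ins for the real components:

    import torch
    import torch.nn.functional as F

    def denoising_loss(vae, unet, clip_text, scheduler, image, caption):
        latents = vae.encode(image)                      # pixel space -> latent space
        noise = torch.randn_like(latents)                # Gaussian noise
        t = torch.randint(0, 1000, (latents.shape[0],))  # random diffusion timestep
        noisy = scheduler.add_noise(latents, noise, t)   # corrupt the latents
        text_emb = clip_text(caption)                    # language conditioning
        pred = unet(noisy, t, text_emb)                  # predict the added noise
        return F.mse_loss(pred, noise)                   # target is the noise, not the image

The loss compares the U-Net's prediction to the injected noise; at no point is the model rewarded for reproducing a training image.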
Stable Diffusion v2 16-bit is ~3GB of data. It was trained on hundreds of millions of images (minimum of 170M in the 512x512 step alone). That leaves a maximum of ~20 bytes per image that could conceivably be a copy, which is certainly not enough to directly reproduce either the style or contents of any individual image.
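The back-of-the-envelope arithmetic (using the figures from the comment, not exact model specs):

    model_bytes = 3e9            # ~3 GB checkpoint
    images = 170e6               # at least 170M images in the 512x512 stage
    print(model_bytes / images)  # ~17.6 bytes per image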
There is no artwork included in Stable Diffusion. There is a semantic representation in the latent space of how images are composed of varied subjects, a mapping of pixel probabilities over those subjects to human-language phrases during decoding, and finally a method for removing noise from the semantic representation: starting with a blank or random canvas and interpreting what may be there, iteratively guided by CLIP embeddings. If you give Stable Diffusion an empty CLIP embedding, you get a random human-interpretable image obeying the distribution of the learned latent space.
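You can try that last claim yourself with the open-source diffusers library; this is a sketch, and the model ID and settings are my assumptions:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    ).to("cuda")

    # An empty prompt gives the U-Net no semantic guidance; guidance_scale=1.0
    # disables classifier-free guidance, so random latents are denoised into
    # *some* plausible image drawn from the learned distribution.
    image = pipe(prompt="", guidance_scale=1.0).images[0]
    image.save("unconditional_sample.png")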
You might as well say that there's no artwork included in a .jpg, just data that can be used to recreate a piece of artwork using a carefully crafted interpreter.
Latents are a compressed representation of the source images that are fully recoverable.
If you train a model on a compressed jpg of an image, or on any deterministic transformation of it, you’re still training it on that image.
Any suggestion otherwise is only because someone is trying to put some spin on things.
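For what it's worth, the latent round-trip is easy to try with the standalone SD VAE. This is a sketch with diffusers; the model ID is an assumption, and the random tensor stands in for a normalized image. Note the reconstruction is close but not bit-exact:

    import torch
    import torch.nn.functional as F
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

    x = torch.rand(1, 3, 512, 512) * 2 - 1    # stand-in for a normalized image
    with torch.no_grad():
        z = vae.encode(x).latent_dist.mean    # 3x512x512 pixels -> 4x64x64 latents
        x_hat = vae.decode(z).sample          # latents -> pixels
    print(z.numel() / x.numel())              # ~0.02: a ~48x smaller representation
    print(F.mse_loss(x_hat, x))               # small but nonzero reconstruction error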
> Stable Diffusion v2 16-bit is ~3GB of data. It was trained on hundreds of millions of images…
And yet! Remarkably! It can generate pictures of the Mona Lisa!
Here’s a question for you: if you encode the process of drawing an exact copy of an image, does the code that implements that process contain a copy of the image?
Have you encoded pixels as code?
Does that mean there’s no copy of the image?
How about a zip file full of images? It’s just a high-entropy binary blob, right? Yet… remarkably!!! It can be transformed into images by applying an algorithm.
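The zip analogy in a few lines (file names hypothetical):

    import zipfile

    with zipfile.ZipFile("art.zip", "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write("mona_lisa.jpg")             # hypothetical input image

    with zipfile.ZipFile("art.zip") as zf:
        recovered = zf.read("mona_lisa.jpg")  # byte-for-byte copy of the original

The blob itself looks like noise, yet the algorithm recovers the images exactly.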
I don’t know the answer, but this handwavy “it couldn’t possibly encode them, it’s too small” is…
Pure. Nonsense.
Of course some part of some images is embedded in the model in some form.
Stop trivialising the issue.
The issue here is: Does an algorithm that generates content infringe copyright?
Does a black box that takes the input “a picture of xxx” and a seed and outputs a copyrighted image infringe?
You know that’s possible. Don’t dodge. Technical details about oh “it couldn’t possibly have…” are pure rubbish.
Sure it could. It could have a full resolution copy of a photo of the image in that black box.
Of all the training data? Probably not. But of some of it? In compressed latent form? Most definitely.
These artists' complaints are ridiculous, and are being made by people who don’t understand how things work.
If some other person draws a picture in their “style”, no one has to ask permission. That’s not a thing.
They either don’t understand how it works or they are just upset that a computer can make art as good as (or better than) they can in a fraction of the time.
All knowledge workers and creatives are going to face this in the future. It’s going to suck, but it would be great if we all could try to understand reality first.
More pointedly, how do I keep my GPL'd code from spewing, license-free, out of Copilot?
Try making a comic book with a character that looks like Mickey Mouse and see how well that goes.
> All knowledge workers and creatives are going to face this in the future. It’s going to suck
This is not a given. It's up to us and the copyright law. Real original work should be compensated appropriately unless you're proposing that we accelerate deployment of universal basic income and completely abolish copyright law.
I have a feeling you might not like the violent outcome if you strip original creators of their copyright, give corporations the right to generate effectively infinite profit off the backs of their work, and tell the creators (and other people whose jobs will be automated away) to pound sand when they ask how they're supposed to pay rent from now on.
Notice that it's a legal posture that implicitly condemns Copilot, which ignores explicitly formulated opt-outs in the form of licensing.
I think it is to avoid any common law wrongs related to the publicity rights of the artists. It seems like something that a legal team would flag as an unnecessary risk for the product. Removing their names and images from the training data doesn't impact the usefulness of the model while at the same time creating a much smaller surface area for collecting subpoenas.
They could just as well not do anything and continue on - it's likely this case will be decided in the defendants' favor. Same as how Google can crawl the net, cache data, transform it, etc.
I don't see how more people copying an artist's style would not increase the value of the originals.
A smart artist here should promote the fact that their style is staying in the dataset. It's as good a piece of free publicity as they will ever get.
e.g. Ask 10 random people who wrote the song “Hurt” and 9 out of 10 will probably say Johnny Cash.
This didn't make any sense to me. Without the curated training data (images) how are they making the models?
No matter what, putting images into your machine then selling the output generated with them and not compensating the original creators is going to be seen as problematic. Machines aren't people.
There's no reason why that is the significant detail. Why does it matter? If you can look at millions of images over your lifetime and faithfully reproduce famous works of art by hand, aren't you just as wrong?
Setting aside the question of "is the model a derivative work", running the program cannot create a work that holds copyright. Only humans (and not monkeys, per the monkey-selfie case) can hold a copyright.
And thus, the questions are: "is generating a model based on the data set a derivative work?" and the unasked question, "is asking the model to generate a work in the style of {artist} a derivative work by the person asking the model?"
> It's cheaper to hire 1000 designers to make 100000 images of artistic styles they are going for
I bet that you are massively underestimating the cost of that.
The reason we have copyright for a limited time is to promote the arts, not to give people a monopoly on a specific style so they can milk it. (although that is what seems to be happening)
What about a company where you submit images and it tells you which faces are in them?
If the image is freely viewable (say you can browse to it), and you just look at it, are you violating any rights?
It seems that a violation would only come if you used the model to produce images that are derivative of that original image, the same way a counterfeiter would make a copy of it. Having the skill to copy is not the same as actually copying.
The fundamental issue with this line of argument is that it equates the process of human vision and the consequences of that with that of a computer program ingesting that image and the consequences of that.
This anthropomorphization seems like a form of deep fallacy when considering the nature and impact of AI software. In the case of "seeing" an image, the two processes could not be any more unlike each other, both in content and context.
I'm not putting any weight here on what is good or bad for society, but relying on the idea that humans somehow work in a completely different way from where AI is (and is going) is not going to help.
I do think it will take longer for the AIs to learn all about human contexts, though, so the pairing of a human art director + bulk-gen AI seems to me an obvious near-term tag team that's hard to beat.
I agree that this is a dangerous fallacy. Something that legislatures and culture have agreed is fine for a human to do - limited by human scaling, memory, and skill - may not be fine for a computer to do.
If I read Harry Potter, then turn around and write a book about a wizard with a z-shaped scar? Who works at a school for wizards? With a pet owl? Who is an orphan? At some point I have started to violate intellectual property rules. (Ignoring all the Harry Potter material that was itself lifted from prior public domain art.)
AI systems aren't just reading, they are generating material based on the stuff they have read. They and the people controlling them have to abide by the copyright rules just like any other "author".
Look at the extreme case, then. What if that one image is your only input, and your output is identical to it? What if your output is your input reflected over the x-axis? What if your output just crops the input? What if your output is your input cut into irregular pieces and randomly rearranged? Which outputs violate copyright?
Slightly less extreme: suppose your input is two images, and your output is those two images next to each other in a single image? Or your output is the second image, reduced in size and placed in the center of the first? What if both of the inputs are human figures, and your output is to cut out the face and hands of one image and put it onto the other?
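To make the thought experiment concrete (a Pillow sketch; the file names are hypothetical), each of these outputs is a deterministic transform of its inputs:

    from PIL import Image

    src = Image.open("input.jpg")
    flipped = src.transpose(Image.FLIP_TOP_BOTTOM)          # reflected over the x-axis
    cropped = src.crop((0, 0, src.width // 2, src.height))  # cropped

    a, b = Image.open("a.jpg"), Image.open("b.jpg")         # the two-input variants
    side_by_side = Image.new("RGB", (a.width + b.width, max(a.height, b.height)))
    side_by_side.paste(a, (0, 0))                           # two images, one canvas
    side_by_side.paste(b, (a.width, 0))

The code is trivial in every case; the copyright question is not.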
> images that are derivative of that original image, the same way a counterfeiter would make a copy of it.
Only one of these outputs is anything a counterfeiter would do. Are any of the others copyright-violating?
Would they be able to use your photos for Adobe Stock without permission?
This isn't the kind of question that the lawyers of the defendants are going to ask the court.
They'll more likely ask if it isn't clearly fair use similar to Sony v Universal and Authors Guild v Google and then present evidence of significant non-infringing commercial use.
> It seems that a violation would only come if you used the model to produce images that are derivative of that original image, the same way a counterfeiter would make a copy of it. Having the skill to copy is not the same as actually copying.
Yeah, that's basically how the courts see it these days although for a different reason. They don't ask questions about skills or work or anything like that. They ask questions like, "is this supposed infringing work a replacement in the market for the plaintiff's work?".
The deeper questions about what the hell anyone meant by the words in the Constitution about copyright wait for the highest courts to get involved. That is where we got this nice division between tools and what the tools are used for, which allows for innovative fair use of copying other people's protected works with tools like VCRs, online book search, and large language models.
Those were not cases about 'generators' but about 'aggregators', a completely different class of application.
The problem, as I understand it, is that in all the likely "precedent" cases for this, what was being done with the scraped data was in some identifiable way different from the purpose of the source data itself. In Authors Guild v Google, for instance, the argument was that Google wasn't reproducing whole texts; it just used that data to make the texts searchable. The purpose of Google's consumption of the text was essentially to build a searchable index rather than to reproduce a book, and thus it wasn't harming the authors.
In this case, it would seem a very key difference is that this is Art being consumed and Art being produced, with no different purpose.
In order to create 5 very different illustrations, you need to talk with 5 people. In the end, 5 people get paid when they finish their work.
An AI consumes these artists' past output, and instead of paying the artists, it gathers income for its owner. So by using the output of 5 people who have spent decades perfecting their craft, the AI generates income by stealing their work, and the money flows only to the owner, who gives nothing back to these people.
So in essence, AI in this form kills the income stream for humans, since it gives back nothing.
What specifically is the defining reason that people can learn by copying other people's styles but AI cannot?
Are we supposed to halt technological progress to avoid antiquated job destruction?
This doesn't mean anything. If an unsecured SSH server is connected to the internet that lets anyone who connects to it in and gives them a root shell, it is still illegal to 'hack' that machine. The law cares about intent, not technicalities.
edit: Since HN decided to break with "You're posting too fast. Please slow down. Thanks." again, banning me from replying: This is obviously just an example intended to show that the law cares more about intent than technical measures.
@dang Calm down dude.
With this sort of model's "creation" process, isn't nearly everything it generates derivative of everything it ingested, since had you ingested a different set of images you'd presumably have a different model with different weights?
That's kinda sorta analogous to human creation, but a human can much more actively choose what to think about, what to ignore, what to filter out.
The human process involves an explicit creative judgement step that I don't think the image-generation-by-model process has - and that creative transformation is key, legally, to a derivative work being able to itself be copyrightable and to not be infringing.
Since reasonably simplified information about SD is available, and the plaintiffs could have involved an expert to review their claims, it does raise the question of whether the function of the lawsuit is more about rattling chains than about the merits of the argument, i.e., a deliberate ploy to extract a settlement.
I think one issue is just that of scale. I personally tend to agree that there's something icky with just slurping up literally everyone's content, then producing a tool that will then proceed to put them out of business en masse. But proving that illegal under current law is certainly going to be a challenge.
I have not read the original complaint but it surprises me that the lawsuit doesn't have a much stronger focus on this aspect. Copyright law is very concerned about not destroying the market for a given work through infringement, but this is a case about destroying the market for entire artists at a stroke.
But that's a hard argument in court. There's no legal basis for claiming damages because the entire market itself is being destroyed.
Though I'm not sure it's any weaker than the other claims trying to be made. The basic problem is, this isn't illegal in any sense. I don't just mean "illegal in that it must be banned" but any level of gradation in between, in the licenses, in requiring compensation, in any sort of regulation whatsoever. Technology has simply outrun law again.
Disney can train a model from every frame of their video library as well as whatever they can find which is unambiguously public domain. Then they could hire a few hundred artists to draw whatever the model is still bad at by the end of this process, for fine-tuning.
I hope it returns when they win and get rid of this legal bullying.
Information comes with many different rights: copy-right is the right to make copies; "moral rights" were mentioned in a few of my UK job contracts and that's "the right to be identified as the author of a work"; database rights are for collections of statements of fact that are not eligible for copyright but which were deemed to be worth protecting anyway for much the same reasons.
Even if copyright is totally eliminated from law by the mere existence of these AIs [0], we may well retain the aforementioned "moral rights". And even if it is totally legal, there's also a strong possibility of it being considered gauche to use an AI trained on the works of those that don't like this.
[0] https://kitsunesoftware.wordpress.com/2022/10/09/an-end-to-c...
I don't think it means the author has a right to all similar styles. If I can legally ask somebody to paint me something in the style of a famous (living) artist, that person presumably having seen and studied their famous works for a while, why should I not be able to ask the AI to do the same thing?
(I understand there might be people who think even a human person emulating the style of another artist is morally wrong, but at least that's a consistent argument)
But this isn't enforceable; they cannot be transferred.
Any use of that work without permission (and thus attribution/compensation) is the problem.
Copying an artist's style is legal in every jurisdiction in which Stability operates.
Comics are one example of an area where individual artists might develop a large body of work in a very distinctive style. You probably know what a Tintin comic (by Belgian artist Hergé) looks like. And lots of Manga artists have very specific and instantly identifiable styles. Individual artistry is a little less obvious with popular western comics because the best-known titles tend to be superhero franchises where the characters/story world are owned by a corporation and individual artists come and go.
I've seen the collage tool argument several times, and I don't agree with it. But I can understand why people believe it.
You see, there's a very large number of people who use AI art generators as a tracing tool. Like, to the point where someone who has never touched one might believe that it literally just photobashes existing images together.
The reality is that there are three ways to use art generators:
- You can tell it to generate an image with a non-copyright-infringing prompt, e.g. "a dog police officer holding a gun"
- You can ask it to replicate an existing style, by adding keywords like "in the style of <existing artist>"
- You can modify an existing image. This is in lieu of the random seed image that is normally provided to the AI.
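A sketch of those three modes with the diffusers library (the model IDs are my assumptions, and the style placeholder is kept from the list above):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

    txt2img = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16).to("cuda")

    # 1. Plain prompt
    dog_cop = txt2img("a dog police officer holding a gun").images[0]

    # 2. Style keyword appended to the prompt
    styled = txt2img("a dog police officer holding a gun, "
                     "in the style of <existing artist>").images[0]

    # 3. img2img: a user-supplied image replaces the random starting latents
    img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16).to("cuda")
    modified = img2img(prompt="same scene, oil painting",
                       image=Image.open("user_supplied.png"),
                       strength=0.6).images[0]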
That last one is confusing, because it makes people think that the AI itself is infringing when it's only the person using it. But I could see the courts deciding that letting someone chuck an image into the model gives you liability, especially with all of the "you have full commercial rights to everything you generate" messaging people keep slapping onto these.
Style prompting is one of those things that's also legally questionable, though for different reasons. As about 40,000 AI art generator users have shouted at me over the past year, you cannot copyright a style. But at the same time, producing "new" art that's substantially similar to copyrighted art is still illegal. So, say, "a man on a motorcycle in the style of Banksy" might be OK, but "girl holding a balloon in the style of Banksy" might not be. The latter is basically asking the AI to regurgitate an existing image, or trace over something it's already seen.
I think a better argument would be that, by training the AI to understand style prompts, Stability AI is inducing users to infringe upon other people's copyright.
This is incredibly disheartening. Who knows how long it will take to progress the tech to the point where anyone will be able to train and run models unrestricted, without dealing with lawyer nonsense.
I'm probably on the opposite side of the fence. I do find it disheartening that it's opt-out instead of opt-in. The training set should be limited to public domain and CC-0 until such a time it can comply with attribution; then other CC works could be incorporated.
So many artists' styles could have gone viral and actually brought those artists some work from people who tried the AI commercially and got results that weren't completely satisfactory. Now barely anyone will ever have any contact with their art (relatively speaking, versus the virality scenario).
Basically the only people who win are the lawyers and the handful of artists who were misled by the lawyers' primitive argumentation. Everybody else loses. First and foremost artists and art lovers, but also AI researchers and hardware manufacturers.
People are allowed to view private art, draw inspiration and ideas from it, and execute on those to create new things.
Why should we limit AI any differently?
If the end result is too close to the original - apply the same guidelines you would for any other artist who copied your work.
Otherwise... you're not allowed copyright over a particular style (for damn good reason). While I would like to see artists retain some form of revenue, I don't think this is really the most pressing issue on that front.
This is the crux of the issue for me. It's a different set of rules for AI companies than everyone else. If I started selling pirated copies of Nintendo games they would send an army of lawyers after me and this "opt-out" reasoning would not be a valid defense in court. These AI companies are trying to get away with stealing art and other content with a simple "whoopsie, we promise we won't do it again" when people demand that their own rights be respected.
Yeah, it's disheartening. There's also no good way to fix it; the cost of storing copies of their art is negligible, and AI trains the same whether the material is copyrighted or Creative Commons. If you get Stability AI to omit your art, then Unstable Diffusion will be trained on your likeness. Opt out of that one, and some guy in Nevada personally sponsors a bespoke model for making copies of your art.
So, I agree with the parent. The most tragic part is not the short-term fight, it's the long-term consequences. Artists will have to internalize what software developers realized decades ago: creating takes work, and copying is free.
What we need is enough computation power to run these models on our own computers, on our phones even. Then we'll be able to do whatever we want and there's nothing they can do about it.
These are orthogonal issues at this point.
The one concern I do have is that the “lawyer nonsense” (read: AI companies playing fast and loose with current laws) will stack the regulatory deck against AI technology unnecessarily - essentially because of an unforced error that brings negative attention to the technology.
Put another way, these companies are asking to have a spotlight put on them by being so flippant about copyright and ethics issues. This spotlight could have been avoided with better behavior, and the tech would still appear magical and remain one of the most impactful jumps in tech in decades.
It's an area where there are no existing laws. We're not going to stop AI because some furry DeviantArt artist complains loudly online.
This reminds me of the backlash against the Wacom community on DeviantArt in the early days.
The price floor on art commissions is already very low and AI effectively makes that cost zero, while providing zero compensation to the thousands of artists. Without their work, there's no Stability AI. From an ethical standpoint Stability is in the wrong, and from a legal one I think the class has a very strong case to recover damages.
The ends of having a useful model like Stable Diffusion don't really justify just ignoring the IP rights of tens of thousands of creators who were already having a pretty rough time making ends meet. That's just a shitty thing to do.
Copyright law isn't friendly to small creators, and big creators use it as a cudgel with absolutely no consequences.
I can see a future dispute arising over outpainting (beginning with an existing copyrighted work), but there the infringement and the identity of the infringer (the user, not the toolmaker) are clearer.
Stable Diffusion is equivalent to hip-hop sampling in the 80s and 90s. The outcome is obvious.
Are there specific similarities that make you believe these are equivalent scenarios? Not just “it feels thematically similar”.
Hip-hop originally recorded and transformed vocals, instruments, and beats to create something new from pieces of something old. The practice occurred without permission and obviously ended up in court. Now sampling requires a licensing agreement. The additional cost has fundamentally changed the genre (over the last 40 years).
Hip-hop and tech both ignored IP rights because neither started with a legal framework and both would have found the additional cost prohibitive.
Remember, input is fundamentally required. Without that dataset, Stable Diffusion delivers exactly nothing.
Best outcome in my opinion would be for the output to be judged on a case-by-case basis, like human works are, not for machine learning on data without "proper authorization from the owner" to inherently count as infringement.
Including copyrighted materials in the trained model was a choice, not some obvious fact about the nature of AI. All of this could've been avoided if the data set had not included unlicensed work in the first place.
How does this work? Do they retrain the model from scratch every week? Or is it somehow possible to retroactively remove specific training-set items from the already-trained model?
The Disney protection act rears its head…
As noted in OP, this is an outstandingly bad definition of deep neural networks, and the lawsuit should fail when the court hears an explanation from any competent practitioner.
However, a correct definition would make the lawsuit far more interesting, imo. Diffusion models can be compared to a superhumanly talented artist that can be cloned in unlimited fashion by anyone having the software and hardware means. How does this entity affect social well-being, how should existing laws be modified--if at all-- with the welfare of humanity in mind, etc?
How can you claim with a straight face that this is a better explanation of what an NN is?
An NN is simply an approximation of a multivariate function, whose parameters are adjusted by minimizing the difference between the output of the NN and the output of the real function for a given input. It is much, much closer to "a giant archive of compressed images being used to interpolate between them" (though it's not that) than it is to a "superhumanly talented artist".
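Here's that idea in a dozen lines of PyTorch (a toy sketch, nothing to do with Stable Diffusion specifically):

    import torch

    target = lambda x: torch.sin(3 * x)            # the "real function"
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for _ in range(2000):
        x = torch.rand(256, 1) * 2 - 1             # sample inputs
        loss = ((net(x) - target(x)) ** 2).mean()  # the difference to minimize
        opt.zero_grad(); loss.backward(); opt.step()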
Right, but that equally fits a biological NN if you zoom in that close. You'll need more than Wikipedia to appreciate what deep neural networks are doing here; it's the dimensional space that's key. What DNNs do that is similar to the human brain is order "concepts" in high-dimensional space. Colors, textures, shapes, and hierarchies of the same are organized and cross-referenced with text in an incredibly complex connectome. It would be useless to memorize images with their textual descriptions, as that would be horrendously inefficient and ineffective during inference. Rather, the model must do what we do and understand what makes an image a "landscape" or a "portrait" or a "cartoon". It needs to understand what an artist's style is and how to perform it on a work never before created.
"Understanding" can only mean ordering meaningless letters and pixels in multidimensional space so that they line up with human understanding (and human 'understanding', in turn, can only mean ordering meaningless sensory perceptions in the brain's multidimensional connectome such that reality turns out to be approximately predicted and controlled). The only systems that work this way efficiently are neural networks, biological and artificial.
So how often does this happen? Somehow I'm too cynical to believe that a judge would rule against the intellectual property industry. The whole thing is based on absurd concepts to begin with, concepts that can be reduced to the ownership of unique numbers. Once a society accepts that, what difference do explanations make?
Now, that could work out. Major movie studios and recording companies do file copyright registrations and submit a deposit copy. But few others bother. It seems that you can send a DMCA takedown request without a copyright registration, but you can't enforce it in court without one.[2] This raises the question of, if you as a service receive a DMCA takedown request, should you ask the requestor to send proof of copyright registration, and if they don't, ignore the request?
[1] https://www.copyright.gov/registration/
[2] https://www.traverselegal.com/blog/is-a-registered-copyright...
That would mean that the vast majority of artwork posted online is essentially free to exploit in the USA, since I’m sure most people do not routinely register their works with the copyright office before posting them.
This suggests an online process which looks like this:
* US Service provider offers web page for DMCA notices.
* Web page requests that the user enter copyright registration info.
* If user fails to provide registration info, web page offers links to various national copyright registration sites to register a copyright. A payment receipt for copyright registration is acceptable as temporary proof of registration, but must be followed up within some period of time by actual proof of registration.
* Temporary proof of registration is enough for a takedown, but the material will go back up if full proof is not submitted later.
This would put a big dent in nuisance DMCA claims. The service provider might get sued occasionally, but for big providers, it's probably worth litigating this once or twice. The companies that have valuable IP file copyright registrations. Disney will be able to show a copyright registration on all their movies.
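A sketch of that intake logic (every name here is hypothetical):

    from datetime import datetime, timedelta

    GRACE = timedelta(days=30)  # assumed window to follow up with full proof

    def handle_dmca_notice(notice):
        if notice.get("registration_number"):
            take_down(notice["work_url"])        # full proof: act on it
        elif notice.get("payment_receipt"):
            take_down(notice["work_url"])        # temporary proof
            schedule_restore_unless_proof(       # goes back up without follow-up
                notice["work_url"], deadline=datetime.utcnow() + GRACE)
        else:
            reject(notice, reason="no copyright registration provided; "
                                  "see links to national registration offices")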