Judge pares down artists' AI copyright lawsuit against Midjourney, Stability AI

(reuters.com)

190 points · starshadowx2 · 2y ago · 418 comments

162 comments · 24 top-level

nologic01 · 2y ago · 34 in thread

Can somebody explain how this will not kill any incentive to publish anything?

Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?

This feels like a reversion to medieval times with minimal trade between regions as thieves would ambush traders and steal any goods.

csallen · 2y ago

I despise the underlying belief here: that the only reason people create art is a desire for fame and monetary gain. That is so demonstrably incorrect that it boggles my mind.

This is where centuries of copyright law have gotten us, brainwashing people into thinking ideas are property ("intellectual property") and should be treated as rivalrous goods, and that the only reason to be an artist is to profit. Brainwashing people into thinking the only way we'll have art in the world is if we maximize the profits of commercial artists.

Take one brief look at the internet, music, video, podcasts, a museum, the walls and refrigerators in people's homes, a kindergarten class room, an art class, or hell, this very forum. And you'll see that it's universally true that people like creating stuff because people like creating stuff. For free. Because it's fun and stimulating. That's inherent in us. We do not need laws to prop up an artificial business model for humans to maintain our drive to create.

renegat0x0 · 2y ago

... but you do have rent to pay? You know that artists also have to live somewhere?

Why do we have stars on GitHub? Maybe not for fame, but they are a good indicator of how good somebody is. Fame is important. We do not make GitHub repositories for "stars", but I think they are a good motivator for people to continue what they're doing.

If an author spent years of his life producing some piece of music, shouldn't there be some kind of law protecting his work from theft?

raynr · 2y ago

I didn't get that from the poster you're replying to. I agree with you that people will create because people are people and want to create.

But people also have to eat, need shelter, want kids, have to take care of health issues, and so on. For that they need money. If they can't get money from their creations they'll spend less time creating and more time engaging in activity that generates returns.

nologic01 · 2y ago

You are going on an irrelevant tangent (whether people enjoy being creative, which is obviously true at least for some) instead of answering a very clear and simple question: how, in your evolved and less broken universe, will talented people dedicate their lives to producing something that society does not acknowledge or reward but simply appropriates?

Capricorn2481 · 2y ago

I appreciate the sentiment, but AI is not a panacea for copyright laws; it's a way to hoard ideas whether they're protected or not.

And we still have copyright laws. Corporations are still hoarding IP. So rules for thee, not for me. The harder you make it for artists to get paid, the more people you get promoting Raid Shadow Legends.

throwaway5959 · 2y ago

Do you want to work for free?

moose4400 · 2y ago

Well said.

__loam · 2y ago

Professional artists have to eat. Holy shit.

marcinzm · 2y ago

>Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?

Because they enjoy it? Or do you see artists as some type of corporate drone who hates the very act of making art?

That's like asking why anyone would contribute to MIT or Apache licensed open source.

nologic01 · 2y ago

Excuse me? What moral and economic planet are you living on? Enjoyment is an important drive behind any creative person, so much is true. But at least part of the enjoyment comes from other people appreciating, acknowledging and, yes, remunerating that creative work.

The idea that authors, artists and other creatives will keep pumping out original work as part-time love affairs so that AI bros can grab it and mint a dime is... strange.

Wissenschafter · 2y ago

People like to create things regardless of profit motive, like art, who woulda thunk it?

I don't understand how people think this will suddenly make human art vanish, that is just ridiculous and naive. People will spend their limited lifespan to make art, because that's what humans just do. Cavemen weren't being paid to paint on the walls.

Your viewpoint is honestly insane. The people against AI art are bonkers.

nologic01 · 2y ago

In your infinite sanity you have not answered my question of how a creative person will dedicate their life (starting from long studies) to producing something that society will not reward in any way.

glimshe · 2y ago

You are applying an outdated mental framework to AI, not to mention a fallacious view of the relation between copyright protection and the incentive for art. Keep in mind that millions of humans have historically produced enormous amounts of art with no copyright protection of any kind. The protections you are trying to defend are a relatively recent phenomenon (~200 years); arguably, some of our best art was created well before these protections existed!

Additionally, the beauty of contemporary AI is that it's much more similar to the mechanism of inspiration and learning that humans employ than, let's say, the literal copy of a photo. I think it's reasonable for an artist to limit the visibility of their work and prohibit their images from being shared online and used by AI training - but this must apply across the board. If their image is public in a way that anyone could see it, and be inspired by it, then they need to accept that the AI could be equally "inspired" by it.

If you want an easy, concrete example of the process of inspiration and copying taking place for humans, just look at anime. Styles are imitated by humans left and right, with no concern about original artists losing their livelihood. Human anime artists copy their idols when learning to draw, oftentimes producing literal imitations for years before starting to produce their own original work, which to be honest usually greatly resembles the source of inspiration (PS: I like anime, but most of it is very similar in terms of artistic style).

Do humans ever really "invent" any art? Or are the artistic innovators simply a mix of influence of existing art, the natural images of nature/life plus a spice of randomness? Because that's pretty much how AI art functions.

nologic01 · 2y ago

I am not trying to defend copyright. I am trying to defend the incentives and ability of humans to dedicate their lives to something that might be innate to all to some degree but only comes to fruition after long years of dedication.

Older societies did not have copyright but they, manifestly, had ways to sustain creatives.

People wax philosophical about paradigm changes and other vacuities yet refuse to answer a simple question: how will society reward human creativity that takes a lifetime of cultivation to flourish?

huimang · 2y ago

The problem is these discussions are being had by STEM/tech people who don't respect or value art or the effort behind it, not by artists. They simply do not get the concerns that artists have.

It truly boggles the mind that people equate machines that can output thousands and thousands of images in short time spans in any ingested style... with humans who have to hone styles and can only produce a result every so often.

surgical_fire · 2y ago

This is how technology works. Bulldozers effectively replaced people with shovels. Excel effectively replaced accounting clerks. Generative AI effectively replaces artists (to some capacity).

Most people care only about the output of a system, not about who the system replaces.

ronsor · 2y ago

If you had a conversation with a STEM person, they'd probably say everything is 100% fine, society and all. If you had a conversation with an artist, they'd probably say AI is pure evil theft and society is collapsing.

If you solely listen to either side, you'll be blinded by madness. On that note, how many people respect or value the effort behind software? Most don't, and most artists don't either. That is the nature of life.

hfuyf65 · 2y ago

Is it because it seems the value can be imitated so easily?

bawolff · 2y ago

> Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?

Because many people make art for art's sake.

Besides popular art works always essentially had this yet people still made them. The difference is in scale not kind.

Or to put it in another context - why would anyone work on an open source project when their work can be reused without (explicit) permission, cloned and reused in infinite small variations without any remuneration and essentially no credit? (When was the last time you actually looked at the CREDITS file in an open source project? Have you ever?)

onlyrealcuzzo · 2y ago

~99% of art is extremely derivative.

Why does it matter if some artist uses ChatGPT to knock-off your style indirectly rather than directly?

I mean, sure, ChatGPT is better at it than most artists. Is that the problem? The quality of knock-offs is too good now?

Take any new song - and any music head can list several songs it is just like. Take any new movie - and most screenwriters could go on for hours how it's almost exactly 10 different movies. Etc.

There is nothing new under the Sun.

naasking · 2y ago

> Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission

Creative people will create regardless of financial incentives. Fan fiction and free art is already everywhere, for instance.

pkdpic · 2y ago

Agreed, working with generative AI tools at work and independently over the past year has only made me want to make weirder more personal paintings / drawings with less interest in mass appeal or artistic professional viability. It's felt unexpectedly liberating and inspiring.

nologic01 · 2y ago

In your universe apparently "creative people" don't need to spend a lifetime of study to hone their art, don't need food and shelter every single day etc.

It's amazing how callous tech people have become as they salivate for their unicorns or whatever they are pursuing.

nirav72 · 2y ago

I don't see how this could be any different from, say, a person reading a book and then using what they learned from that book to write another book. Sure, if they're literally copying material from one book and then adding that as their own work into their book, that could be against copyright.

spencerflem · 2y ago

Because a machine can do it millions of times faster

lolinder · 2y ago

This decision isn't "a reversion to medieval times", whatever your opinion on the legal status of these images— this is an entirely procedural decision in which the judge ruled that the specific claims of copyright infringement are invalid because the plaintiff never filed for copyright:

In other words, the story here is mostly that the lawyers screwed up badly in pursuing a copyright lawsuit before ensuring copyright had been filed.

johngher · 2y ago

It's a parlor trick, but I'm going to use it anyway: your own comment refutes your point.

You put work into posting this comment: thought about the situation and crafted sentences you wanted to publish. I've absorbed them, learned from them, they'll inform my own output in the future. And respectfully, I won't remember your name or give credit.

So why did you publish your comment? People can't avoid creating data. We do it passively. And you'll continue doing it, for your entire limited lifespan, even if you get neither laid nor paid for it.

gedy · 2y ago

It's just not that different from people seeing works and learning or being inspired, so how do you "ban AI" without adding more crazy DRM/DMCA stuff for legitimate use?

creer · 2y ago

Plenty of people create with minimal profit effort. Outstanding creation happens with minimal profit all the time.

To profit from creation you have to publish - what alternative do you propose?

So I don't understand this idea that anything will "kill publishing". Copyright changes the economic math around publishing, sure - and most of the time currently not for the better. That will keep evolving but there is no risk of killing creation or publishing.

mock-possum · 2y ago

Oh I can explain it: what you’re describing is already happening, it’s been happening since far far before medieval times, it’s literally how human creativity works: novel recombinations of existing works.

If anything, it will lower the barrier for creation, to allow a greater variety of art by a greater variety of artists. Adding a new tool to an artist’s kit is an incredible step forward for art.

tomjen3 · 2y ago

It may very well kill the incentives. But if new laws are needed, that is a matter for Congress, not the courts.

golergka · 2y ago

Other humans have always been doing exactly that with anything you published.

latexr · 2y ago

Not at this rate, speed, and reliability. Even setting aside the morality and legality of the matter, let’s please stop with the fiction that what these computerised systems do is the same as what other humans do. Scale matters. If someone said “I’m worried about the consequences of machine guns being sold at convenience stores”, it is not a sensible response to say “humans have always been able to kill other humans with knives and handguns”.

nologic01 · 2y ago

Really? Can you point to some example?

People have been putting up with some theft because they could still eke out a living.

This attitude has all the coherency of "some people are thieves, we cannot catch them all, so let's make theft legal".

Unless I hear some sensible argument why this slippery road won't destroy a good fraction of the economy I am assuming that regression to kleptocracy is the shape of things to come.

ballenf · 2y ago · 25 in thread

Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

There are artists that can study a painting for a few minutes and then recreate it from memory. There are artists who study a particular body of work so long that they can create more works indistinguishable in style. If an artist recreates a copyrighted work or creates a derivative too close to the original, then that new work is potentially copyright infringement.

That is, we focus on the output of the process to determine infringement with living artists and ignore the training. But with ML, everyone focuses on the training.

It seems an ML tool could add a filter to the output and refuse to output a work that too closely resembles one or more work under copyright. Isn't that basically what legitimate professional artists do as well?

Thousands of artists are capable of infringement, but we don't take away their brushes based on capability.

dragonwriter · 2y ago

> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

It's legally different because the human brain is not considered a fixed medium under copyright law, so a human experiencing and learning something is not making what is potentially a copy or derivative work under copyright law, and therefore not exercising, potentially without permission, one of the exclusive rights granted to the copyright holder.

> There are artists that can study a painting for a few minutes and then recreate it from memory

Right, and those artists violate copyright at the moment they recreate it in a fixed medium without permission, but encoding it into computer storage media is already a copy, so a machine (or, rather, the person using the machine) hits that threshold before creating an actual visual output.

> That is, we focus on the output of the process to determine infringement

No, we focus on what is set in a fixed medium. If that is done at an intermediate step of the process, rather than being an output, it can still be infringement.

It's just that a human doesn’t always need to make a copy in what is legally a fixed medium until the output, but that is not the same as categorically only treating the output of a process as legally relevant.

jncfhnb · 2y ago

People betting on all “lossy compression” being equivalent will lose. It’s a lame argument. Can you encode images to be basically 1:1 with embeddings? Absolutely.

Is that what these embeddings are doing? No.

If artists try to argue that all copies are equivalent, but are unable to recreate their works from the embeddings, their argument will fall flat.

This argument also only applies to sharing models, which is doubly dumb because we want open source models, not closed source models. It’s a harmful status quo to try and enforce.

koolba · 2y ago

> If an artist recreates a copyrighted work or creates a derivative too close to the original, then that new work is potentially copyright infringement.

I see no reason the same standard cannot be applied to ML generated content. If the evaluation is being performed on the end result, then that is all that matters. The same judges that decide these things for human generated content can continue to do so for ML generated ones.

Even the people submitting and responding to the copyright claims will still be human (with briefs generated by ML…).

What will be more interesting is when the judges themselves get replaced with an “objective” AI to quantify similarity for copyright purposes. If that ever happens, it’ll trigger an arms race to hit the razor’s edge without going over.

raincole · 2y ago

> how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

You just answered it? One is an ML system and one is a human?

I'm really, really baffled why people keep using this argument. Like you guys know machines are not humans, right? ...right?

Humans are special cases in law. Always have been and always will be (until AGI). A pedestrian is treated differently in law than a driver is. The fact that a pair of legs and a car both move you from point A to point B doesn't make them the same. Selling human livers at your local market is very different from selling cow livers, even though biologically they are all organic tissues.

Let me say it again: humans are special cases. AI learning copyrighted materials might be illegal or legal, but it has little to do with "what if a human being does the same".

pmoriarty · 2y ago

Isn't copyright about the final product, not how it was arrived at?

If I independently come up with a song called "Let it Be" that has the same lyrics as the Beatles song and publish it without the permission of the copyright holders, I will have violated their copyright.

It doesn't matter if I heard the song before or not. It doesn't matter if I did it myself or used a computer to do it. What matters is the final product and my publishing of a song close enough to the one that was copyrighted.

AI image generators are just tools, like Photoshop is a tool. Nobody cares if you used a paint brush or Photoshop to create something that looks like a copyrighted image, why should AI image generators be any different?

If the final image is similar enough to a copyrighted work and I publish that image without permission of the copyright holder, then that's a copyright violation.

If the final image is different enough, then it's not.

That an AI was used and how the AI was trained are completely separate issues.

ben_w · 2y ago

> Always have been and always will be (until AGI)

Probably not even then, at least not initially. While some people conflate AGI with personhood, consciousness, qualia, etc. we've got at least 22 different[0] ideas of what consciousness is and no idea how to even determine whether or not a mind has qualia — and even if we did, I see no specific reason to require any of them, as a P-zombie[1] AGI doesn't seem to me like a contradiction in terms.

[0] https://www.nature.com/articles/s41583-022-00587-4

[1] https://en.wikipedia.org/wiki/Philosophical_zombie

theonlybutlet · 2y ago

In law there is such a thing as a legal person as opposed to a natural person. When it comes to commercial law, its provisions tend to relate to legal persons.

panta · 2y ago

It's a matter of scale. No human being can ingest ALL existing images. If the average human artist were able to replicate any other work without effort, we would probably have seen two effects: first, we'd have far fewer works of art (because the gains would have been eliminated, so why bother), and second, copyright law would have been much more restrictive. This is exactly what we should do: avoid applying a law designed for human beings, and create a new, more specific law that is much more restrictive. Otherwise future art created by actual human beings will suffer greatly (not to mention the loss of work and human abilities), to the economic benefit of a very small set of monopolistic players.

theonlybutlet · 2y ago

What you propose is just a different small set of monopolistic players. Copyright has always been a trade off between the creator and society. It should be enforced the exact same way as it currently is. Fair use is fair use. By your same logic, what is the difference between an AI or a very productive human? Where do you draw the line?

thrill · 2y ago

The law does not say it is a matter of scale.

gaganyaan · 2y ago

The only way we end up with monopolistic players being the only ones that benefit from AI is if we misapply the outdated concept of copyright here. Open source models like SD are the path away from monopolization.

kranke155 · 2y ago

Let me change the argument around: Why is it assumed that because an artwork is freely available on the internet, you are allowed to train a machine to reproduce it, whether in its totality or just in details that are used in the creation of new works?

I.e., why isn't it that an artist could say, hey I'm letting you see this painting, but you are not allowed to sit down with a canvas and learn how to reproduce it? Because you can do that in galleries - no photos, no reproductions.

So actually building a machine there, under the cover of darkness, that learns from your work so you can produce new work, why is that allowed in the first place? Certainly wouldn't be at a museum.

The key thing here is - if you want artists' data, you should ask for it. They didn't. This would be the equivalent of training GitHub Copilot on every available piece of code in existence, ever, instead of what they had available. Why should that be allowed? So if I built some toy code in 1996, and happened to post it on Usenet, and it's a great implementation of X, why the heck is Copilot allowed to read it? It's my property.

shkkmo · 2y ago

> why isn't it that an artist could say, hey I'm letting you see this painting, but you are not allowed to sit down with a canvas and learn how to reproduce it? Because you can do that in galleries - no photos, no reproductions.

But you can't stop people from sitting and studying your painting and then painting stuff similar to it.

One of the core assertions that is being decided in this case is if there is any actual reproduction here. Does a model contain a reproduction of every image it was trained on? Can the model actually create a reproduction of any images it was trained on?

If it turns out that there is no reproduction here, then it comes down to how much legal control we give copyright owners to regulate access.

A gallery can reasonably ban cameras and canvases, but it becomes a lot less reasonable if they try to ban artists.

Let's imagine that this isn't just specifically tuned ML but proper General AI that can learn new skills. Is your argument that this AI would be legally prohibited from viewing any images it doesn't have a specific license for?

I think that drawing hard lines around what kind of processing can be done on publicly available images is going to become problematic. It's better to regulate around what can be done with the results of the processing than that processing itself. That's how our existing laws work. Making a reproduction, even just from memory, of a copyrighted work is restricted. Memorizing a copyrighted work is not.

Vvector · 2y ago

"It's my property."

When you publish it, you lose some property rights. While under copyright, there is a short list of things that others are prohibited from doing (reproduce, distribute, etc.). And you lose all your rights once the copyright expires.

creer · 2y ago

The question is about copyright law. You can raise other legal theories or ask Congress to create an entirely new class of intellectual property law. Sure. The lawsuit is about whether copyright applies, it seems to me.

mattigames · 2y ago

This is the typical intentionally misleading argument in favor of AI: comparing software to a human artist while conveniently forgetting that a real artist cannot create millions of pieces every hour. That difference alone makes any direct comparison laughable, because that threshold was an absolute, immutable constant for all of human history until very recently, and that includes, among many other things, the incentives artists had to pursue that career instead of any other. And of course there are the societal problems that displacing so many jobs entails.

picadores · 2y ago

The AI will not throw a Molotov cocktail at you or hang you from a lamppost when it starves?

__loam · 2y ago

The issue is that ML companies are using exact copies of the works in question, and using them to make massive for-profit systems without permission or compensation. Individual artists don't threaten the market in the same way.

frumper · 2y ago

How do they get the works in question to train on?

nextaccountic · 2y ago

> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

Because (fortunately) human thoughts can't be subject to copyright law yet. So when we talk about copying and making derivative works, if you have this

    artistic works -> neural network weights

The end result may or may not be copyrightable (that's for the courts to decide), but this

    artistic works -> human brain

Definitively can't be

CrimsonRain · 2y ago

No difference. Some people are just luddites or have vested interest against automation of their own field (but fine with other fields).

__loam · 2y ago

I don't know. It seems pretty shitty that these systems are literally leveraging their work against them. It also seems shitty that we're trying to automate cultural expression. Even if it's not explicitly illegal, the AI art guys are still assholes.

bergen · 2y ago

An artist cannot professionally scan and incorporate millions of pieces of art into his cortex in a minute for commercial purposes.

_petronius · 2y ago

> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

Three things immediately spring to mind: scale (1), accountability (2), and profit (3).

1. An automated system can train on data at huge volume, in a way that no single human is capable of doing. Setting aside the issue that training an ML model and artists learning by copying techniques of other artists are, I would argue, fundamentally different acts, _even if we take them to be the same_, we have to acknowledge that in a single human lifetime one person can only "train" on so many works. Automated systems have no such limitation.

2. If an artist violates copyright or oversteps norms around artistic professional practice, they can be held accountable. Companies which violate these norms by using automated systems so far hide behind those systems ("the AI is doing it/did it") and so aren't held responsible (they should be: the company has built the system, and is therefore responsible for how it is used and what it does). By building up this false sense of agency on the part of systems (which the marketing term "AI" is designed to bolster), lack of accountability is laundered into the actions being taken at scale.

3. Automated systems are, due to their scale, very profitable. I can generate hundreds or thousands of copyright-violating works that dilute the market for artists, and it is incredibly cheap to do so. Fighting those copyright violations in court has to be done more or less on an individual basis (especially if actions like that in the original article continue to fail), which is extremely slow and expensive. If the cost of violating copyright is tiny, and the cost of enforcing it is huge, then it ceases to be a useful tool except for the most well-resourced organizations.

> It seems an ML tool could add a filter to the output and refuse to output a work that too closely resembles one or more work under copyright. Isn't that basically what legitimate professional artists do as well?

No, because copyright is more complicated than "these two things look a lot alike", and legitimate professional artists don't run into this issue, because they aren't constantly trying to skirt the line of "as close as possible to copyright violation while still getting away with it".

> Thousands of artists are capable of infringement, but we don't take away their brushes based on capability.

But they do get sued when they infringe! Enforcement happens, because (for now) it is still possible for independent artists to enforce their copyrights. The argument being made by artists with regard to these ML models is that _they are already infringing copyright_, not that they hypothetically may in the future.

lofaszvanitt · 2y ago

Have you been sitting in Midjourney rooms and seen the prompts? :DDD

kranke155 · 2y ago · 21 in thread

This will be the greatest act of Intellectual Property theft in history.

All because judges will be befuddled about what to do after hearing terms like “training data” and “compression”. We will, of course get the emails in 10-20 years showing that it’s all lies and that the CEOs of these companies knew exactly what they were doing.

If this continues, AI will be the greatest inequality machine in history. Take data from 1,000,000 individuals, train your AI to replace them, compensate no one.

You can do this in every area: driving, farming, cooking. Music. Just dispossess everyone of all their property by training an AI to copy all their work! What could be easier (and less morally right)…

treyd · 2y ago

Intellectual property never really existed. Copyright is something we made up to extend the logic of commodities to the full value chain for books, which made sense 200 years ago. But it makes no sense to apply the logic of commodities to digitally produced and distributed media. The production of culture has been slowly becoming more distorted as cultural assets that should be and historically were held in common accumulate under the umbrella of massive intellectual property holders (Disney, Universal, etc) after we took the legal concept that was meant to apply to a much narrower context and applied it broadly. They benefit disproportionately more from intellectual property than individual artists do. The (recent) past dominating the present, being ruled by abstractions, and all that.

Is it bad that this will be used to displace individual artists/creatives in the value chain of media production? Of course it is. But we shouldn't be responding to that by clinging harder to schemes that have outlived their usefulness, we should be developing new models for funding production.

rvz2y ago

> Copyright is something we made up to extend the logic of commodities to the full value chain for books, which made sense 200 years ago. But it makes no sense to apply the logic of commodities to digitally produced and distributed media.

Great! So given that your articles are in the public domain on your website, I can make millions out of them without giving you a cent, credit, or attribution, and claim it all as my own then.

harshreality2y ago

If you memorize all of Harry Potter word for word, or some famous solo vocal track, are you committing a copyright violation? Or only if you then recreate it and try to redistribute your copy?

The scenario where AI training is locked down doesn't result in 1,000,000 individuals getting paid. (What would they get paid, and by whom?) It results in Disney, Adobe, etc.—massive companies with existing licenses to use content just about however they want—training their own models and locking everyone else out of the large AI model training game, until AI gets good enough to start generating human-quality creative work on its own (the same kind of progression as alphago/lee to alphago/zero), perhaps with the addition of a small set of purely copyright-free material.

Excluding all copyrighted material would be tying an AI model's metaphorical hands behind its back, since humans, although capable of producing great works through much iterative effort in isolation, all rely on having learned from some copyrighted work. Find an author who hasn't read plenty of recent books as well as older classics, or a musician (other than classical) who hasn't listened to plenty of modern music, or a director or editor who hasn't watched tons of movies and films. Recall Newton, "[I]f I have seen further, it is by standing on the shoulders of giants." Many of those "shoulders" are copyrighted.

cmiles742y ago

Where is this idea that copyrighted material should be excluded from training data coming from?

My understanding is that people want to be compensated when their intellectual property is used as training data for a machine. That strikes me as an entirely reasonable expectation.

One person memorizing Harry Potter for their own amusement, even if they make money doing public appearances where they recite sections of the work verbatim for the amusement of the audience, is not in any way similar to the process of training an LLM or to that LLM's output. The scale alone is so vastly different that it renders the comparison useless and misleading.

kranke1552y ago

Yes, and you know how humans acquire works to learn from?

They pay for it.

They buy the books. They buy tickets to theatre. They buy entrance to the gallery.

The trick that's being done now is hey, we don't have to pay since it's not a person. (to the creator) But hey, it is just like a person when it learns! (legal system)

If AI models require human training data, then they should pay for it. Easy.

olalonde2y ago

Learning is not theft and never has been. I don't care whether it's a human or machine doing it. AI will benefit everyone enormously, even if it won't be equally distributed. The real issue here is that some skills are increasingly becoming obsolete and people have a hard time coping with that. Instead of demanding compensation, which would really be impractical to implement anyways, why not focus on developing new skills?

kranke1552y ago

No that is not what people are upset about. They are upset that their life's work is being used without even asking permission, for someone else to get insanely rich.

That's what they're upset about.

If there were no use for 2D artists, then Stability Ai wouldn't be making an AI to replace them.

Key word here is: replace. 2D artists are not becoming obsolete - they're being replaced by a machine that was trained on their works without permission.

If you want to make an AI that does amazing paintings, and doesn't use human training data, more power to you. I can't compete with that. But if you use MY WORK to make a machine that's going to replace me, you do it under the cover of darkness and without permission - yeah i'll get pretty mad.

What happened to visual artists was more like Logitech announcing Logitech CoPilot and revealing they've extracted code from keylogging for the past 20 years.

cmiles742y ago

It's convenient to refer to the training of the machine as learning but let's not lose sight of the fact that "machine learning" is not at all the same thing as people "learning". Pretending they mean the same thing in this context, in my opinion, is dishonest.

I also take issue with the assumption that AI will "benefit everyone enormously, even if it won't be equally distributed"; I don't see any factual basis for this assumption. On the contrary, it seems much more likely that AI will be used to concentrate wealth even further. Given the high cost, I find it hard to believe it will ever be "equally distributed".

For as long as I can remember big corporations have been merciless in their preservation of "intellectual property". I didn't love it then and I don't love it now. OTOH, the idea that Microsoft can train their LLM on code I've written and then sell access to that LLM for money (sharing no dollars with me) strikes me as outright theft.

Kim_Bruning2y ago

Just because it's on twitter doesn't mean it's true. I think a court setting where things are contemplated in a rigorous and reasoned way has a somewhat better chance of arriving at something resembling the truth.

We've been here before several times: Silhouette painting, Photography, Airbrushing, Pianos, Synthesizers, Sampling, Photoshop, Ray Tracing, and many more. "It's not real art", "they're stealing from us", "we'll go hungry!".

Some of these are already quite old. For others, I've actually been asked the question back when I was in school: "Are you really making music if your instrument has a microprocessor in it?". Um, yes, yes I claim I am making music thank you very much.

First people complain, then they adapt, and then they end up making awesome art with the new tools and/or instruments. Which isn't to say historically it was all rainbows and roses, but it was never the end of the world either. Seeing the newer generation of AI tools and how the tools end up getting integrated into regular workflows, it seems to be going the same direction.

To quote the song, I think it's "all just little bits of history repeating".

CrimsonRain2y ago

There is no "taking data" going on. Nobody is going into your private locker and training on your painting, music or cooking recipes.

If you put your "work" out in the world, anyone who views it, is automatically training their brains on it. Viewing is training.

JoshTriplett2y ago

> If you put your "work" out in the world, anyone who views it, is automatically training their brains on it. Viewing is training.

A perfectly reasonable view for humans, since you shouldn't be able to copyright a brain.

Not at all a reasonable view for a computer, until we also get to freely use all the copyrighted works ourselves. The problem here is that AI training is asymmetric: the people training an AI use works in violation of their licenses, but don't let their own works be used in the same way. For instance, Microsoft uses code on GitHub to train Copilot, but you still don't get to freely use the source code of Windows or GitHub.

I am absolutely in favor of eliminating copyright and patent law. I am not in favor of keeping it around while letting AI become a laundering mechanism to get around it. AI training should not get to uniquely ignore copyright; copyright should cease to exist.

kranke1552y ago

It's funny to me that we haven't reached AGI or anywhere near it, but when we talk about training Diffusion models, suddenly they're "just like a person" for legal reasons.

Same thing that happened on the construction of the "corporation as a person", built on top of rulings made to protect African Americans.

myaccountonhn2y ago

AIs and humans are not the same, why do people keep grouping them together and assume the same logic should apply?

kranke1552y ago

Ok, so reproduce me a Picasso. You've seen one right?

EnergyAmy2y ago

Don't be silly. This was a level-headed ruling that avoids retarding the progress of science and the useful arts.

I'd really like to see people drop the inequality argument. If you actually cared about that instead of virtue signaling, you'd push for a mandatory GPL-style license that forces models to be available to anybody that uses them. That would avoid trying to unsuccessfully put the genie back in the bottle, while also preventing a few companies from benefiting at the expense of everyone else. Just like OpenAI's original mission of making AI available for everyone, before they got dollar signs in their eyes.

smrtinsert2y ago

Is this supposed to be sarcastic? Because this is impossible and you're arguing a strawman. You could also say electricity enabled this great inequality, so we have got to stop electricity.

surgical_fire2y ago

> This will be the greatest act of Intellectual Property theft in history.

Good.

Intellectual Property is a mistake. If AI brings about its end, I welcome it.

Illotus2y ago

Not really good if AI can get around it while for normal people it exists as before.

williamcotton2y ago

Google “expert witness”! Courts are also known to hire their own experts who mediate between the experts on either side.

Also, those emails seem very likely to be ordered to be produced during discovery.

This thing could really go either way at this point but I feel like Stability has the upper hand.

Imagine training a model without any of the plaintiffs' images, then using that side by side with the model that does include them. This could then be used to show the jury that those individual works are of no importance to the system if the images are of the same quality.

They will probably argue that the individual expressions of each work are not copied, rather the abstract ideas of two-dimensional representations present across any and all images.

Expect lots of side by side pictures as Exhibits from both sides! Grandma and her fellows have to weigh in on this one!

This is a fun one!

kranke1552y ago

Stable Diffusion has been known to make virtually identical copies of the images it was "trained" on, afaik.

If the images are REALLY of no importance, they wouldn't have been used anyway.

mock-possum2y ago

Sing it with me now: Copying Is Not Theft

https://youtu.be/IeTybKL1pM4

AndrewKemendo2y ago· 13 in thread

Having done way more corporate court than I want (patents, mergers, liquidation), I’m increasingly convinced that the judicial system is fundamentally flawed.

The reality is that US law in 2023 is so obscure and opaque that judges seem to arrive at their rulings on a total whim, with no actual philosophy other than maintenance of the system.

Further, I’m extremely unimpressed with the competence on display from the vast majority of judges, such that contempt should be the starting position.

The fact that this is how laws are actually made (precedent of applications will always beat the letter) means that nobody without a warchest will be able to actually utilize the system coherently.

As with everything now, courts are ruled by those with the most money.

tiahura2y ago

Not to be too rude, but you’re not an attorney and couldn’t be more wrong.

The law has never been more transparent. The public has nearly complete access to every docket in the country. Moreover, the level of jurisprudence has never been higher.

Moreover, I’ve lost a case or two in my time, but it was never because of a lack of a warchest.

SketchySeaBeast2y ago

> Moreover, the level of jurisprudence has never been higher.

I'm confused as to what this sentence means. The level of {the study/philosophy/science of law} has never been higher?

AndrewKemendo2y ago

In which a person who professionally practices law, has been to law school, has a higher than average IQ, and has a decade of experience describes how easy it is to interpret the legal system

Hey, quick question: my good friend, let's call him Doug, is a high school equivalency graduate, has a few felonies, and currently works as a road flagger.

How does this complete access to every docket in the country help him?

You have explained my point better than I could have. Thanks

corethree2y ago

The final decision described in this article is one I agree with, with or without money, and I have no stake in the game.

Every piece of creation you and I make is the sum total of our experiences, and that includes copyrighted work. Holding an LLM guilty for that is like holding the human brain guilty for memorizing copyrighted work.

nirvdrum2y ago

And yet these systems are incapable of genuine creativity. If they were, they would be taught rules & techniques and set off to their own devices to draw, like humans. But, they can't and they're not. LLMs and humans don't learn or create in the same way. Moreover, there's no reason we should grant LLMs the full rights and privileges of humans.

YurgenJurgensen2y ago

This argument gets repeated often enough that it implies that there are a significant number of people who actually believe it. This is pretty depressing, as the only way you could think that a human is not fundamentally more capable of creativity than an LLM is if you are incapable of imagining anything other than a life of ‘consuming content’.

snovv_crash2y ago

It's ok to give humans rights that we don't give to machines.

woodrowbarlow2y ago

it's interesting that you start from the assumption that a piece of software should be judged by the same measure as a human.

AndrewKemendo2y ago

You seemed to miss the part where the judge said that the only things that can be claimed as copyrighted are those things that were submitted to the U.S. Copyright Office for specific, narrow copyright registration.

"The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable), or that all DeviantArt users’ Output Images rely upon (theoretically) copyrighted Training Images, and therefore all Output images are derivative images"

This displays either ignorance as to how artists work and the extent to which they are involved or can be involved in the legal copyright system, or reflects incoherence around the copyright system.

In this case the Judge chose to say, in effect: unless you have explicitly copyrighted it, it's fair use

That is now a new precedent that negatively impacts individual artists who have no power in the market, and protects giant corporate interests which have tons of power in the market

AlexandrB2y ago

For me this argument will hold water when we can put LLMs in jail if they commit a criminal act. Until then, an LLM is not a human and not entitled to be treated like one.

Moreover, at least in the case of music, people have been successfully sued when their song strongly resembles another copyrighted work. Thus "holding the human brain guilty for memorizing copyrighted work" is actually the status quo.

monkaiju2y ago

The courts are like the doors to the Ritz, open to everyone!

AndrewKemendo2y ago

Oh this is perfectly said thank you

creer2y ago

The court IS about maintaining the system. That's the plan. If you want to change it, then you need to change the law and that's the job of Congress.

raincole2y ago· 11 in thread

My perspective is there are two different main issues about AI (especially Stable Diffusion).

One is how it fits with current law. An ML model is basically a highly lossy compressed data format. If you collect millions of copyrighted images, merge them into one super big image, then compress it into a .jpg, are you allowed to redistribute this .jpg file?

To me, it mostly depends on how lossy (low quality) your .jpg is.

(Note the fact that human brains are also lossy compressed data is completely irrelevant here: you can only compare machine to machine, algorithm to algorithm. You can't say if a human has right to do X, therefore a machine has the same right to do X.)

But this line of thinking, while consistent to me, is dangerous. Because it means open models like Stable Diffusion are more likely to be illegal than a closed one like MidJourney, since they're closer to the source materials. If closed models end up being legal but open models don't, it would be a big loss for our society as a whole.

notnullorvoid2y ago

It's not compression.

Compression implies the input can be reconstructed from the output (lossy or not), in the case of these ml models the input is the training data and the output is the model. You can't reconstruct even a fraction of that training data using the model alone therefore it is not compression even in the most lossy sense.

The model produced, though, can be an efficient compressor/decompressor, which produces a lossy output image when given an input of prompt and/or image.

All that aside, the whole human/machine thing is a dumb argument. It's humans that are using the tool. The question shouldn't be does a machine have rights to do X, but rather do humans have the right to use and build such tools?

raincole2y ago

> Compression implies the input can be reconstructed from the output (lossy or not), in the case of these ml models the input is the training data and the output is the model. You can't reconstruct even a fraction of that training data using the model alone therefore it is not compression even in the most lossy sense.

It's already proven that you can reconstruct at least a small fraction of the training set from diffusion models. It's something quite well known, so could we not die on this hill?

[1] https://twitter.com/Eric_Wallace_/status/1620449942090420224

[2] The paper: https://arxiv.org/abs/2301.13188

[3] Relevant HN thread: https://news.ycombinator.com/item?id=34596187

Kim_Bruning2y ago

So just to be sure: the list of URLs + metadata that gets used for stable diffusion is several terabytes. Not the images. Just the list of URLs alone (and a bit of other metadata).

Stable diffusion itself is just 6+ GB, and fits comfortably on my USB stick.

That's one heck of a lossy compression algorithm, sir!

(this thread has more discussion on this line of thinking https://news.ycombinator.com/item?id=37879938 )

raincole2y ago

> So just to be sure: the list of URLs + metadata that gets used for stable diffusion is several terabytes. Not the images. Just the list of URLs alone (and a bit of other metadata).

> Stable diffusion itself is just 6+ GB, and fits comfortably on my USB stick.

Thanks for sharing this info which I'm aware of. However, this fact is not as significant as it might sound in terms of whether it's a lossy compression algorithm.

In most lossy compression algorithms, the compression rate is arbitrary. For example, with an algorithm based on the Fourier transform, you can choose to keep only the first sine wave, or the first 1000 (a bit of an oversimplification here).

So yes, SD is small. Quite miraculously small, and its size alone implies some important insights on how humans see and read artworks. But this fact doesn't change whether I see it as a lossy compression. (In my previous comment I stated that the human brain stores lossy compressed data too, so you can see I'm using a broad definition of "lossy compression".)
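To make the "first sine wave, or the first 1000" point concrete, here is a minimal sketch (not anything Stable Diffusion does) of Fourier-based lossy compression in plain Python. The function names, the square-wave test signal, and the `keep` parameter are all illustrative assumptions; the only claim is that keeping fewer frequency coefficients gives a smaller representation and a larger reconstruction error.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued list."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse transform; returns the real part of each sample."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def compress(signal, keep):
    """Zero out everything except the `keep` lowest frequencies.

    For a real signal, coefficient n-k mirrors coefficient k, so the
    mirrored entries are kept as well to keep the reconstruction real.
    """
    coeffs = dft(signal)
    n = len(coeffs)
    return [c if (k < keep or k > n - keep) else 0 for k, c in enumerate(coeffs)]


def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical test signal: one period of a square wave, 64 samples.
signal = [1.0 if t < 32 else -1.0 for t in range(64)]

# The "compression rate" is just a dial: keep 2, 8, or all 33 frequencies.
errors = {
    keep: rms_error(signal, inverse_dft(compress(signal, keep)))
    for keep in (2, 8, 33)
}
for keep, err in errors.items():
    print(f"keep={keep:2d} frequencies -> RMS error {err:.4f}")
```

The error shrinks as `keep` grows and is effectively zero once every frequency is retained, which is the sense in which the compression rate of such a scheme is a free parameter.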

harshreality2y ago

That's a difficult question because the boundaries of similarity/derived works for copyright purposes are determined by judges and juries based on their intuitions. There's no mathematical similarity testing, and trying to formulate such a thing would be challenging.

What's similar enough to a pop music theme, that has a grand total of a few lines of unique music, to be a copyright violation? How many bars have to be copied, and what kinds of minor variances do or don't avoid a violation? If you're inspired by a haiku, and change 5 of 17 syllables, is that still a copyright violation? Who knows.

raincole2y ago

> That's a difficult question because the boundaries of similarity/derived works for copyright purposes are determined by judges and juries based on their intuitions.

I believe that's why DALL-E bans some keywords related to living artists. To show they have "no intention to violate copyrights".

And that's why I'm so worried that we're heading to a future where open, uncensored models are illegal and closed source AI-as-a-service services are legal. It's not fearmongering: right now, you can't use GPL code in your closed source apps, but you can use GPL code on your server running a service that provides the exact same functions. I believe it has already hugely undermined the original intent of GPL (written in an era before SaaS became popular).

Some AI proponents say ML is the biggest invention since steam machines. I don't know if it's true, but if we end up stuck in a situation where open models are illegal while AI-as-a-service is legal, then it's the biggest step toward a dystopia since steam machines.

ben_w2y ago

To the extent that Stable Diffusion models are "lossy compression", the main one is somewhere between 1 and 10 bytes per image depending on whose answer I use for the question "how many images was it trained on?" (I assume the cause is 1.5, 2.0 and SDXL having different answers and the reporters conflating them). The geometric mean of those is ~three bytes, which is only enough for one single RGB pixel per image.

For all the legal issues — and the artistic flaws — I still find it quite remarkable how good it is at such a small size.
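The arithmetic above can be checked in a few lines. This is only a sketch: the 1 and 10 byte bounds come from the comment, the 6 GB model size is the "6+ GB" figure quoted elsewhere in the thread, and the image count of 2 billion is a purely hypothetical value chosen for illustration, not a claim about any actual training set.

```python
import math

# Range quoted in the comment: between 1 and 10 bytes of model per training image.
low_bytes, high_bytes = 1.0, 10.0

# Geometric mean of the two bounds: sqrt(1 * 10), roughly 3.16 bytes.
geometric_mean = math.sqrt(low_bytes * high_bytes)

# One uncompressed RGB pixel is 3 bytes (one byte per channel).
RGB_PIXEL_BYTES = 3


def bytes_per_image(model_bytes, image_count):
    """Average number of model bytes available per training image."""
    return model_bytes / image_count


# Assumptions for illustration only: a 6 GB model and a hypothetical
# training set of 2 billion images.
example = bytes_per_image(6e9, 2e9)

print(f"geometric mean: {geometric_mean:.2f} bytes per image")
print(f"equivalent RGB pixels: {geometric_mean / RGB_PIXEL_BYTES:.2f}")
print(f"6 GB over 2e9 images: {example:.1f} bytes per image")
```

Changing the assumed image count moves the result across the 1 to 10 byte range, which would be consistent with different sources reporting different figures.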

raincole2y ago

> the main one is somewhere between 1 and 10 bytes per image depending on whose answer I use for the question "how many images was it trained on?"

Here is a catch tho. It's just several bytes "on average". We can't tell whether some images contribute practically 0 bits to the final result while others contribute more.

(I know the word "contribute" is a little nonsensical in the context of ML. But existing lossy compression algorithms are not that different in this sense: if you compress 1M frames produced by a 3D renderer into an .mpeg video, each frame doesn't contribute the same number of bytes to the final result.)

gpderetta2y ago

Machines do not have rights. The question is whether a human with a specific machine has a certain right, as opposed to a human with a different machine.

hunter2_2y ago

> human with a specific machine

I assume the entire client+server system constitutes the "machine" in this case, correct? So does "human with" refer to the end user (client side) or the sysadmin (server side)? Maybe one is an accomplice? The machine isn't going to infringe without certain prompting by the end user, just as an inkjet printer isn't going to do so.

smrtinsert2y ago

> Ml model is basically a highly lossy compressed data format

This is a pretty incendiary statement for those opposed to generative models, but more importantly, it's not a good interpretation, because the intent is not to store a compressed format for restoring the same image, nor can it.

alphanullmeric2y ago· 11 in thread

Intellectual property shouldn’t be a thing. If you still have it after I’ve supposedly stolen it from you, then it’s not real property. The easiest test of consistency is simply to ask about both piracy and AI training data. If you support IP in one case but not the other then you’re a hypocrite. There is no third option where your support of something depends not on what it is but who it benefits.

czl2y ago

Say someone takes your written work (say your online comments, any articles, blogs etc) and claims it as their own. You still have a copy of your work, but now, for your audience, the authorship is in doubt. Would you be against this happening to you? What are your thoughts about plagiarism? How is this different from "copyright"?

simbolit2y ago

You are straw-manning.

Imagine you encounter a public domain image (which by definition is not protected by copyright), you download it, and put it on your website.

Perfectly fine.

But if you write "I made this image" below it, you are a liar and a fraud. No copyright needed.

alphanullmeric2y ago

Don’t care.

justanotherjoe2y ago

The creation of information is a divine thing; information lasts until humanity itself goes extinct. The very first concept created by our caveman ancestors we still use today. Copying is easy. Creating is hard. Even something as simple as creating an original name is really hard, let alone making entire movies and video games. I actually think intellectual property is the single best thing humanity has done, precisely because otherwise there would be no movies and no games; why would there be? Although I agree it shouldn't last forever.

alphanullmeric2y ago

In case I haven’t already made it clear enough in the comment you replied to - whether I believe in something or not doesn’t depend on who would benefit from it.

But to answer the question, you use proprietary software protected by means other than government force every single day.

gumballindie2y ago

You do realise that if people’s intellectual work is not protected there won’t be any intellectual work left, right? Why would I create something knowing you can just grab it and use it? Communism did the same to physical property, where you didn’t own much and everything belonged to everyone. That didn’t end particularly well because people inherently want to own things, especially the output of their own creation. Sure you can use it, but according to the terms and conditions of the owner. Same goes for owning objects. You can use my car if I let you use my car.

kstrauser2y ago

Thank god copyright came along and gave us Shakespeare, Bach, da Vinci, Chaucer, Beethoven…

And can you imagine life without the wheel? Too bad we didn’t invent patents earlier so that we could’ve gotten a head start inventing it and the spear.

zirgs2y ago

Copyright activists really lost a lot of respect because of those silly music industry lawsuits, the Mickey Mouse Protection Act, software patents on trivial stuff and the like.

This lawsuit is even sillier than the previous ones.

alphanullmeric2y ago

And if there was no slavery we wouldn’t have any pyramids. I don’t care. You don’t have the right to an idea, a sound or a particular arrangement of pixels. That’s not communism because nothing is being taken from you. I don’t owe you any terms and conditions to something you don’t own.

vortegne2y ago

"people inherently want to own things, especially the output of their own creation". That is the founding idea of communism indeed. I'm not sure you understand anything about it.

kmeisthax2y ago

This is how moneyheads think the world works: that everything is a series of monetary incentives to be linked together to make an end result. Most humans don't actually think this way, and in specific a LOT of creative work is made without calculating exactly what the profit is going to be. This doesn't mean that artists don't want to be paid, but that artists focus on making their work first and monetizing it later.

What copyright actually protects is creative industry. By assigning individualized monopolies over copying and reproduction, the publishing industry can persistently lowball the shit out of artists (who themselves undervalue their work, see above) and then reap the profits for themselves. Since the vast majority of creative work would never see market interest, it's cheaper to pay billions of dollars to the handful of known, recognizable, and marketable mega-successes than to pay smaller amounts to a far larger pool of mid-list or unknown artists. This is why unions exist in basically every creative industry: otherwise, nobody below the talent line[0] gets paid.

To put a finer point on it: right now, the unions are doing a way better job of protecting human artists against AI art than copyright is. The argument for training AI being infringing is very weak in the general case where there's no obvious regurgitation. I mean, where does your copyrighted material even 'live' in the model, if the model can't even reproduce it? However, unions can very easily just say "you can't force us to cut corners by using this tool" in their negotiations and actually get that result. Furthermore, those rulings only bind publishers that hire artists. The artists themselves can still use AI when it makes sense in their workflow, rather than when publishers think they can cheap out on shit.

The failures of Soviet communism are complicated, but if you had to boil it down to one factor, I would not summarize it as "communal ownership bad" or "collectivism bad". Collective action has its place. Furthermore, the analogy you're making between copyright and physical property is flawed[1]. The reason why physical property ownership even exists is because of scarcity - the reason why I need permission to use your car is because you can't use your car if I'm also using it.

The irony of your communism analogy is that copyright is specifically used to erode ownership in private property in a way that makes the communism haters cry communism. There's a novel form of copyright misuse as a business model in which you put software in a thing that used to not require software, call it "smart", and then use the software to enforce your own idea of what "owning" the product means, backed up by the same laws that make it illegal to copy DVDs. There are a LOT of people who would like to go back to owning their cars and computers again, and that requires rolling back copyright, not strengthening it.

[0] Hollywood-ism for "people whose contribution to the work is not marketable"

[1] And, I suspect, a by-product of having read a bunch of Ayn Rand nonsense

williamcotton2y ago· 5 in thread

> Orrick also dismissed McKernan and Ortiz's copyright infringement claims entirely.

Well, duh. The judge is helping out the plaintiffs in this case. A jury would have been easily convinced by the defense that no images produced by Stability's systems are visually derivative.

The key is indeed what follows:

> The judge allowed Andersen to continue pursuing her key claim that Stability's alleged use of her work to train Stable Diffusion infringed her copyrights.

So unless there is some kind of summary judgement I would wager that this becomes the focus of both sides as this heads towards trial.

But that's it. As predicted by commentary from legal scholars, the outputs of Stable Diffusion are distinct from the model and are not infringing on copyright... at least for this complaint!

gamblor9562y ago

No, he dismissed McKernan and Ortiz because they didn't register their images for U.S. copyright, which is a foundational prerequisite for any copyright lawsuit (in the U.S.)

This simply means that they need to register their images for copyright before they can re-join the case. (https://www.gibsondunn.com/supreme-court-holds-that-copyrigh...)

EDIT: reading the linked PDF further, and it appears that McK and O's legal counsel stated that the two weren't asserting the copyright claims at all, which is why they were dismissed with prejudice. That means that they can't re-join the case by filing for copyrights for their images...Their lawyer fucked up pretty badly and if I were either of them I'd be filing a malpractice lawsuit.

williamcotton2y ago

I'll check PACER and read the actual ruling when I'm at work tomorrow, but yeah I'm interpreting "dismissed entirely" as "dismissed with prejudice".

You're entirely correct that if it was dismissed without prejudice the complaints on copyright infringement on the outputs could be amended and refiled.

1 more reply

williamcotton2y ago

Re: EDIT

Another interpretation is that the plaintiffs were well aware of how weak their case was with regards to the outputs and basically planned on abandoning it from the start.

There's been more than a bit of showmanship from the plaintiffs' counsel, so I'm not surprised that the actual legal tactics differ from the rhetoric of the blog posts. It's also common to stack the complaint so that when the judge does start focusing on the key issues, maybe a little more ends up at trial than otherwise.

There's winning in the court of public opinion and then there's winning in a Federal court.

starshadowx2OP2y ago

Isn't that linked case because they started to file for copyright and then sued rather than waiting for it to be completed first?

In this case they never filed in the first place, and it was dismissed with prejudice.

michaelbrave2y ago

I'm not convinced the training on copyrighted things argument will hold up either.

brucethemoose22y ago· 3 in thread

Why is Midjourney completely off the hook while Stability AI is not?

I'm trying to pull up the original court document, but the PDF isn't loading.

gamblor9562y ago

The plaintiffs apparently failed to plead sufficient factual allegations to support their infringement claim against MTD, which is a rookie mistake.

Factual allegations at this point don't have to be correct (that's what discovery is for), but they do have to at least satisfy the legal requirements for each prong of a legal claim. In many legal pleadings, the plaintiffs will state, "upon information and belief, we [assert X factual allegation]" since they don't yet have the discovery to support a more specific factual allegation.

starshadowx2OP2y ago

For that count specifically, Stability was directly involved with creating and funding the LAION dataset, whereas Midjourney and DeviantArt were not.

The DeviantArt direct claim is because of how DeviantArt has been using Stable Diffusion for their DreamUp system, but the direct claim against Midjourney has been less clear from the plaintiffs about whether they're going against Midjourney using Stable Diffusion in one model (beta/test/testp) or their use of training data (like LAION)

tick_tock_tick2y ago

Basically the judge said the idea AI images generated are infringing on copyright is so stupid it's thrown out.

The other part of the case is whether the artists' copyright was violated when training the AI, and they have only claimed that Stability used their art to train.

aa_is_op2y ago· 3 in thread

Amazing how copyright law amazingly disappears when it's to the detriment of major tech companies and protecting smaller creators.

Just amazing!

gmerc2y ago

I’m not sure where you are reading that. Have you read the article?

The copyright infringement claim (for training) is left intact. It’s the other claims that had no basis in existing law (e.g. no copyright was registered, etc) that have been thrown out.

flanked-evergl2y ago

> Amazing how copyright law amazingly disappears

Can you elaborate in what way it disappeared in your opinion?

1 more reply

Spivak2y ago

It's really not, copyright is and always has been incredibly political and ebbs and flows with the whims of the existing power structures. And that's mostly to do with the fact that copyright, unlike say murder, is much less defined, has as many interpretations as people, and is something that is designed to be overbroad and selectively enforced where everyone in the world by the letter is constantly violating it.

So this is pretty much expected, and if it swings the other way the US/EU is going to hobble themselves in the face of any locality that gives zero shits about copyright. It's less about the art and more that the art enables these models to do real useful work and is better at it for having access to more data.

TotalCrackpot2y ago· 2 in thread

This is consistent with historically intellectual property being a construct that benefits owners of capital and not actual innovators. That's why I think it should be abolished, this is yet another mechanism to monopolize a space to profit through some kind of rent-seeking procedure.

nness2y ago

What should it be replaced with — a system where no one retains intellectual rights over the works that they create?

TotalCrackpot2y ago

What is an intellectual right? I respect authorship, with the obvious consideration that no intellectual activity happens in a vacuum. As Isaac Newton said: "If I have seen further, it is by standing on the shoulders of giants." I believe that I should never be financially hurt or go to prison because I used another person's thoughts.

1 more reply

gamblor9562y ago· 2 in thread

https://fingfx.thomsonreuters.com/gfx/legaldocs/byprrngynpe/...

The dismissal of Deviant was inappropriate given that the case hasn't reached discovery yet. The dismissal was granted based on a substantive evaluation of the Defendant's assertions which is inappropriate at this early procedural stage of the case. (see e.g. page 10 where the judge evaluates the "plausibility" of alleged facts, and page 12 where he says "I am not convinced" about the plaintiff's theory, even though in a MTD this is not a determination he is supposed to make pre-discovery).

Moreover, even if plaintiff's language was "unclear", the appropriate procedure is to require them to amend their claim and dismiss Deviant if the plaintiff does not amend, not to dismiss a defendant and give the plaintiff leave to amend their claims.

With respect to Midjourney, the Plaintiffs failed to plead sufficient factual allegations to support their claim, so that dismissal was appropriate. (Pre-discovery, it's okay for the alleged/pleaded "facts" to be wrong, you just need to allege sufficient "facts" that you have a legal basis for a court case. Note that "facts" in the MTD context doesn't mean real world facts, it is a legal term of art that actually refers to an allegation of a fact that will later be determined to be true or false at the actual legal proceeding on the merits.)

artninja19882y ago

Interesting. How do you see the Getty v stability lawsuit going? That looks much worse for stability. Do you think they will just settle and stability will pay them some licensing fee?

gamblor9562y ago

Getty has a much stronger case, given that warped versions of the Getty logo have shown up in a number of SD-generated images, so it's obvious that there was impermissible copying.

I'm not sure Stability will agree to a licensing fee, since part of the rationale for the last version of SD was to remove the infringing images from their training sets going forward.

4 more replies

OsrsNeedsf2P2y ago· 2 in thread

> Two of the three artists who filed the lawsuit have dropped their infringement claims because they didn’t register their work with the copyright office before suing. The copyright claims will be limited to artist Sarah Anderson’s works, which she has registered.

The lawsuit is moving forward, but only on copyrighted work. This is (not yet) a story.

thowaway912342y ago

I’m so confused about American copyright law. I was always under the impression that copyright is granted automatically and you didn’t need to “register” it, unlike a trademark, which must be registered and is only valid for its specific industry.

nness2y ago

That was my belief too, but: "Copyright exists from the moment the work is created. You will have to register, however, if you wish to bring a lawsuit for infringement of a U.S. work."

https://www.copyright.gov/help/faq/faq-general.html

(Makes me wonder if, back in the day, every song that was downloaded and then pursued by the RIAA was registered...)

1 more reply

soulofmischief2y ago· 2 in thread

A thought experiment:

Imagine you have a blob of seemingly random data. Nothing in the data contains anything recognizable as illegal or in violation of copyright.

Now imagine that the right input suddenly turns the data into illegal or infringing material, after a transformation operation. And not just a single unique input such as a password which clearly represents a mapping function between two sets of data.

But imagine if there were seemingly infinite possible inputs, each of which transformed the data into a different infringing blob of data. If these inputs exactly represented the novel, copyrightable or illegal aspects, but the blob itself was inert.

What should be illegal here? The blob, which by itself is free of any questionable bits of data, or the inputs which transform it into something tangible? Both? Neither?
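One hypothetical way to make the thought experiment concrete is a one-time-pad style XOR, sketched below in Python. This is only an illustration of "inert blob plus input yields a work"; it is not a description of how Stable Diffusion or any other model works, and the names `blob`, `input_for`, and `transform` are made up for the example.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


BLOB_SIZE = 32
blob = os.urandom(BLOB_SIZE)  # the "inert" blob: uniformly random bytes


def input_for(target: bytes) -> bytes:
    """Derive the input that turns the blob into target (space-padded to BLOB_SIZE)."""
    return xor_bytes(blob, target.ljust(BLOB_SIZE, b" "))


def transform(key: bytes) -> bytes:
    """Apply an input to the blob and strip the space padding."""
    return xor_bytes(blob, key).rstrip(b" ")


# Every target of at most BLOB_SIZE bytes has some input that reproduces it.
for work in (b"a protected drawing", b"a protected song"):
    key = input_for(work)
    assert transform(key) == work
```

In this toy version all of the information about the work lives in the input and none in the blob, which is the extreme case the thought experiment describes; a trained model would sit somewhere between the two ends.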

Well, it has never been illegal to draw or paint something representing CSAM, for example. And it has never been illegal to draw or paint Mickey Mouse in your own home.

What's often illegal is publishing said data. Ignoring the free speech debate around artificially produced CSAM, publishing it is already illegal in many territories. It is also illegal to violate copyright in many countries when publishing information.

What's interesting is that it is not illegal to trace a drawing and hang it up on your wall, instead of buying the real drawing from its rights-holder. It's also not illegal to reproduce a tracing done by a friend. But the recording and film industries have been more successful in convincing us that it should be illegal to do the same for a song or film. That you should not be able to "trace" the data at home, and that you should not be able to share it with me, that I should not be able to trace over your tracing and bring home a copy for myself.

I can understand, and support a copyright system which regulates the publishing of copyrighted material. Even copyleft paradigms lean on regulation for enforcement. But the film and music industry actively try to restrict individual freedoms in the name of corporate profits, while still screwing over their clients and employees with respect to profit-sharing.

Back to the point: That blob should never be illegal. The activation functions should never be illegal. That is a basic extension of free speech. But publishing, that is a different story, and we already have laws offering such protections both with respect to illegally-produced or copyrighted content. Any attempt to regulate what kind of model I am allowed to run at home is a massive infringement on my rights as an individual, and is borne either out of gross ignorance of current copyright law from the same people crying, "But think of the copyrights!", or direct, insidious corporate greed.

You can adjust this thought experiment so that instead of dealing with a magic blob, we are dealing with a program that makes it really easy to produce illegal or copyrighted works after a bit of human interaction. Is there a claim here now? Are we basing the law on how much human involvement was needed to create the output? We've faced similar arguments around technological leaps such as the printing press or mechanical loom. Did we, as a society, reject these advances in technology in order to protect loom workers and scribes?

Bottom line. You can pry my models out of my cold, dead or handcuffed hands. Times like these really shine a light on who is complicit in the system, and who suffers from it.

If you are in the creative industry, you need to understand how things are going to change. As an engineer with decades of investment into my craft, I also have to face the rude awakening that is ahead in my own industry as automation creates a gap between highly-skilled professionals and newcomers. Being a paid software engineer might become as hard of work as becoming a famous professional artist. Lots of connections, insane specialization and a lifetime devoted to the craft. A lot of people in school for engineering right now might struggle to find employment in 20 years or less if they cannot cross this gap in time. Artists aren't the only tribe experiencing a huge industry shake-up over a technology that will one day be so ubiquitous that it's inside of your toaster.

Aerroon2y ago

>Imagine you have a blob of seemingly random data. Nothing in the data contains anything recognizable as illegal or in violation of copyright.

The set of Real numbers contains every positive whole number. This is already the magical blob.

Eg the decimal number 65101114114111111110 is "Aerroon" in ASCII.

Edit3: real numbers are better than natural numbers or whole numbers for this. They have zero and they solve the "0005" problem.
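A minimal sketch of the encoding described above, assuming Python and printable ASCII only (codes 32 to 126). The helper names `encode` and `decode` are hypothetical. Decoding relies on the fact that, in that range, three-digit codes (100 to 126) always start with "1" and two-digit codes (32 to 99) never do.

```python
def encode(text: str) -> int:
    """Concatenate the decimal ASCII codes of text into one integer."""
    return int("".join(str(ord(ch)) for ch in text))


def decode(number: int) -> str:
    """Invert encode() for printable ASCII (codes 32..126)."""
    digits = str(number)
    chars = []
    i = 0
    while i < len(digits):
        # Codes 100..126 are three digits and start with "1";
        # codes 32..99 are two digits and start with 3..9.
        width = 3 if digits[i] == "1" else 2
        chars.append(chr(int(digits[i:i + width])))
        i += width
    return "".join(chars)


print(encode("Aerroon"))             # the number quoted in the comment
print(decode(65101114114111111110))  # recovers the original text
```

The same digits mean nothing on their own; it is the agreed decoding rule that turns the number into a name.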

mithr2y ago

This feels like a bit of a naive interpretation of the situation. At its core — regardless of specific lawsuits, etc — the questions here are (1) should copyright laws be adapted to the new reality of generative AI, (2) should artists be able to control how their work is used given generative AI is a reality, and (3) do we as a society think people should be able to make a living as artists, and what are the implications of that either way when it comes to AI models and their use.

Until this point, an artist who has developed their own personal, recognizable style could be somewhat confident that it is difficult for someone else to generate a new piece of art exactly mimicking their style. That is to say, it was never impossible (there have certainly always been other artists out there who are capable of taking artwork and creating something new in that style), but there were some barriers to getting there, including that those artists aren’t easily and instantaneously accessible to every human being on the planet, that they generally don’t work for free, and that they would need some time to produce their work. The combination of these factors resulted in a system wherein, for the most part, if you really wanted to create something in the style of a specific artist, you would need to commission them, thereby supporting their ability to live and continue creating art. And/or they sold merchandise with their art, or collections, etc.

Now, on the other hand, it is incredibly easy to go to an image generator and have it generate art in the style of a specific (sufficiently well-established) artist quickly, easily, and freely. The barriers have, overnight, gone from being reasonably protective to pretty much nonexistent. As a result, artists are asking themselves how they can continue to live and create art. This is something a sufficiently well-established professional artist used to be able to do before generative AI came into the picture, because other than the odd copycat (which again took time and effort and an actual human with the right ability), they were the only ones who could produce images in their own styles, and this ability was thus a valuable resource that people paid for. If anyone can now produce identical images independently and for free, then this ability may no longer be a resource other people will pay for.

Part of what these court cases are trying to determine is exactly whether any copyright does apply to generated images. You wrote that “publishing, that is a different story, and we already have laws offering such protections both with respect to illegally-produced or copyrighted content”, but those laws are exactly what’s being tested here: artists (and organizations like Getty) are seeing what they claim are AI-generated copies of their copyrighted works in use out in the world (so these have been “published” by some definition — they are not only being printed out and hung in people’s garages for them and their friends to look at in private), and are suing to stop that.

But aside from that, I think there is a real philosophical discussion here. If you’ve trained as an artist your entire life, have worked hard to develop a unique style, and are one of the relatively few artists who have been successful doing so — should a company be able to wait until you became popular, then just take all of your work, and use it to train a model that can produce works exactly in your style easily and without any effort, which it can then provide to people freely or for a subscription?

This also isn’t as much about the output, as about how the output was obtained. If the model did not actually ingest your images, but someone wrote a prompt that involved a super-detailed description of what made your style unique, going into color palettes, line thicknesses, art styles, influences, etc etc, and you would have to get all of that right in order to generate something that looked like your art, then I think most folks would be generally ok with that. But when (1) your prompt can just be “give me art that looks like soulofmischief made it” and it’ll give you just that, and (2) you know that your art was used to train the model in order for it to be able to do that, then there is a question of whether fair use laws should be adjusted to prohibit this behavior and protect your ability to live off of your work.

I also think that regardless of the outcome of these lawsuits, no one is really coming for your own models and your ability to tinker in your garage. It may not be legal today to duplicate a copyrighted image and hang it in your office, but no one will ever know (or care enough to do anything about it) if you do. Similarly, even if this use becomes restricted by copyright, nothing will practically stop you from building your own large model that includes any copyrighted images you want, for your own personal use, in your own garage. But if you then turn around and try to profit off of that model, or if you want someone else to produce a model (thus stepping more into the publishing realm), that’s where a line may be drawn. I personally think that’d be fair.

Finally, zooming all the way out, I believe that it should be possible to make a living as an artist, and I think when we have discussions like these, we should keep reminding ourselves to think about how our technical or legal arguments affect that outcome.

ndiddy2y ago· 1 in thread

I’m impressed that their legal team was incompetent enough that they didn’t bring this up as an issue before filing the lawsuit.

OsrsNeedsf2P2y ago

What makes you think the legal team didn't know? The plaintiffs wanted to sue, so they did

sofixa2y ago· 1 in thread

I'm really looking forward to the EU framework around "AI". It's definitely a better approach than having individual artists sue and get dismissed on technicalities (that don't even apply in most of the EU - e.g. in France, if you release something by default you get copyright on it, so the judge's reasoning couldn't apply here) and judges deciding based on their interpretation of vague laws crafted in an age when "AI" was little more than niche science fiction if that.

flanked-evergl2y ago

> I'm really looking forward to the EU framework around "AI"

After GDPR and the cookie pop-ups my expectations for things coming out of the EU are quite low. Every company I have worked at has a different and often conflicting interpretation of GDPR, some places use it to play politics, and governments of individual EU countries are not doing their part to clarify how things should be interpreted. It's a dumpster fire IMO.

cmiles742y ago· 1 in thread

It seems like they focused too much on the details of how the model works and how data is encoded by the model.

"In his dismissal of infringement claims, Orrick wrote that plaintiffs’ theory is “unclear” as to whether there are copies of training images stored in Stable Diffusion that are utilized by DeviantArt and Midjourney. He pointed to the defense’s arguments that it’s impossible for billions of images “to be compressed into an active program,” like Stable Diffusion."

Perhaps future litigation will be more successful if they treat the model as a black box. Could an argument be made that a person's intellectual property was used to train the model without compensation and _that_ is the illegal act? From there one would only have to demonstrate that the output from the model is similar to a person's body of work.

soco2y ago

Maybe even the data for training should be opt-in, then at least this case would have been easier solved. The outputs are then another story - I can agree to training but I'm not eager to see knock-offs of my work being outputted and spread.

ptx2y ago· 1 in thread

> Orrick spends the rest of his ruling explaining why he found the artists’ complaint defective, which includes various issues, but the big one being that two of the artists — McKernan and Ortiz, did not actually file copyrights on their art with the U.S. Copyright Office. [...] The other problem for plaintiffs is that it is simply not plausible that every Training Image used to train Stable Diffusion was copyrighted (as opposed to copyrightable)

What? I thought everything was copyrighted by default under the Berne Convention?

That's the reason for the existence of CC0 [0], after all. Their FAQ says: "Copyright and other laws throughout the world automatically extend copyright protection to works of authorship and databases, whether the author or creator wants those rights or not."

[0] https://wiki.creativecommons.org/wiki/CC0_FAQ#What_is_CC0.3F

silverlight2y ago

In the U.S. you have to actually file for a copyright with the U.S. Copyright Office if you actually want to bring a copyright suit against someone.

minimaxir2y ago

This lawsuit was always weird because it was a much, much weaker case than the GitHub Copilot lawsuit by the same firm: at least with text you can point out exact infringement, but the Stable Diffusion lawsuit (https://stablediffusionlitigation.com/) seems mostly based on inaccurate technical memes like "diffusion is just compression" without examples.

The HN discussion back when this lawsuit was first announced was correctly pessimistic: the top comment was "Where are the copies?". https://news.ycombinator.com/item?id=34377910

laylower2y ago

This is the first paragraph...

"The contentious issue of whether AI art generators violent copyright — since they are by and large trained on human artists’ work, in many cases without their direct affirmative consent, compensation, or even knowledge — has taken a step forward to being settled in the U.S. today."

Is it human-generated? Violent copyright?

Topfi2y ago

Here is a direct link to the motion for those interested: https://scribd.com/document/681174239/Order-on-motion-to-dis...

sinuhe692y ago

Orrick dismissed McKernan and Ortiz's copyright claims because they had not registered their images with the U.S. Copyright Office, a requirement for bringing a copyright lawsuit.

That is the key.

artninja19882y ago

Good. Most claims got dismissed (although with leave to amend) with only the infringement on the input side really remaining. This lawyer is a clown

0dayz2y ago

Hasn't this always been a precarious road? With say fair use for instance.

Not only that but I really wish we could just redo copyright to be more flexible but ultimately empowering the creator with conclusive licenses for others to use (like in AI, other creative work, streaming, etc.) and the creator is paid either monthly or per generated image/song.

hankchinaski2y ago

This is like Michelangelo suing Caravaggio because he copied or better, was inspired by his work


nologic012y ago· 34 in thread

Can somebody explain how this will not kill any incentive to publish anything?

This feels like a reversion to medieval times with minimal trade between regions as thieves would ambush traders and steal any goods.

csallen2y ago

I despise the underlying belief here. That belief that the only reason people create art is based on desires for fame and monetary gain. Which is so demonstrably incorrect that it boggles my mind.

renegat0x02y ago

... but you do have a rent to pay for? You know that artists also have to live somewhere?

If there is an author who spent years of his life into producing some kind of music piece, should not there be some kind of laws protecting his work from theft?

2 more replies

raynr2y ago

I didn't get that from the poster you're replying to. I agree with you that people will create because people are people and want to create.

1 more reply

nologic012y ago

2 more replies

Capricorn24812y ago

I appreciate the sentiment but AI is not a panacea to copyright laws, it's a way to hoard ideas whether it's protected or not.

throwaway59592y ago

Do you want to work for free?

1 more reply

moose44002y ago

Well said.

__loam2y ago

Professional artists have to eat. Holy shit.

marcinzm2y ago

Because they enjoy it? Or do you see artists as some type of corporate drone who hates the very act of making art?

That's like asking why anyone would contribute to MIT or Apache licensed open source.

nologic012y ago

The idea that authors, artists and other creatives will keep pumping original work as part-time love affairs so that AI bros can grab it and mint a dime is... strange.

4 more replies

Wissenschafter2y ago

People like to create things regardless of profit motive, like art, who woulda thunk it?

Your viewpoint is honestly insane. The people against AI art are bonkers.

nologic012y ago

In your infinite sanity you have not answered my question of how a creative person will dedicate their life (starting from long studies) producing something that society will not reward in any way.

1 more reply

glimshe2y ago

nologic012y ago

Older societies did not have copyright but they, manifestly, had ways to sustain creatives.

People wax philosophical about paradigm changes and other vacuities yet refuse to answer a simple question: how will society reward human creativity that takes a lifetime of cultivation to flourish.

1 more reply

huimang2y ago

The problem is these discussions are being had by STEM/tech people who don't respect or value art or the effort behind it, not by artists. They simply do not get the concerns that artists have.

surgical_fire2y ago

This is how technology works. Bulldozers effectively replaced people with shovels. Excel effectively replaced accounting clerks. Generative AI effectively replaces artists (in some capacity).

Most people care only about the output of a system, not about who the system replaces.

2 more replies

ronsor2y ago

hfuyf652y ago

Is it because it seems the value can be imitated so easily?

bawolff2y ago

> Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission, approximated algorithmically (at least on the surface) and reused in infinite possible small variations without any attribution or remuneration whatsoever?

Because many people make art for art's sake.

Besides popular art works always essentially had this yet people still made them. The difference is in scale not kind.

onlyrealcuzzo2y ago

~99% of art is extremely derivative.

Why does it matter if some artist uses ChatGPT to knock-off your style indirectly rather than directly?

I mean, sure, ChatGPT is better at it than most artists. Is that the problem? The quality of knock-offs is too good now?

Take any new song - and any music head can list several songs it is just like. Take any new movie - and most screenwriters could go on for hours how it's almost exactly 10 different movies. Etc.

There is nothing new under the Sun.

naasking2y ago

> Why would any human spend their limited lifespan to create a piece of work that will be grabbed without permission

Creative people will create regardless of financial incentives. Fan fiction and free art is already everywhere, for instance.

pkdpic2y ago

nologic012y ago

In your universe apparently "creative people" don't need to spend a lifetime of study to hone their art, don't need food and shelter every single day etc.

It's amazing how callous tech people have become as they salivate for their unicorns or whatever they are pursuing.

2 more replies

nirav722y ago

spencerflem2y ago

Because a machine can do it millions of times faster

1 more reply

lolinder2y ago

In other words, the story here is mostly that the lawyers screwed up badly in pursuing a copyright lawsuit before ensuring copyright had been filed.

johngher2y ago

It's a parlor trick, but I'm going to use it anyway: your own comment refutes your point.

gedy2y ago

It's just not that different from people seeing works and learning or being inspired, so how do you "ban AI" without adding more crazy DRM/DMCA stuff for legitimate use?

1 more reply

creer2y ago

Plenty of people create with minimal profit effort. Outstanding creation happens with minimal profit all the time.

To profit from creation you have to publish - what alternative do you propose?

mock-possum2y ago

If anything, it will lower the barrier for creation, to allow a greater variety of art by a greater variety of artists. Adding a new tool to an artist’s kit is an incredible step forward for art.

tomjen32y ago

It may very well kill the incentives. But if new laws are needed, that is a matter for congress, not the courts.

golergka2y ago

Other humans have always been doing exactly that with anything you published.

latexr2y ago

nologic012y ago

Really? Can you point to some examples?

People have been putting up with some theft because they could still eke a living.

This attitude has all the coherency of "some people are thieves, we cannot catch them all, so let's make theft legal".

Unless I hear some sensible argument why this slippery road won't destroy a good fraction of the economy I am assuming that regression to kleptocracy is the shape of things to come.

1 more reply

ballenf2y ago· 25 in thread

Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

That is, we focus on the output of the process to determine infringement with living artists and ignore the training. But with ML, everyone focuses on the training.

Thousands of artists are capable of infringement, but we don't take away their brushes based on capability.

dragonwriter2y ago

> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

> There are artists that can study a painting for a few minutes and then recreate it from memory

> That is, we focus on the output of the process to determine infringement

No, we focus on what is set in a fixed medium. If that is done at an intermediate step of the process, rather than being an output, it can still be infringement.

jncfhnb2y ago

People betting on all “lossy compression” being equivalent will lose. It’s a lame argument. Can you encode images to be basically 1:1 with embeddings? Absolutely.

Is that what these embeddings are doing? No.

If artists try to argue that all copies are equivalent, but are unable to recreate their works from the embeddings, their argument will fall flat.

This argument also only applies to sharing models, which is doubly dumb because we want open source models, not closed source models. It’s a harmful status quo to try and enforce.

koolba2y ago

> If an artist recreates a copyrighted work or creates a derivative too close to the original, then that new work is potentially copyright infringement.

Even the people submitting and responding to the copyright claims will still be human (with briefs generated by ML…).

raincole2y ago

> how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

You just answered it? One is an ML system and one is a human?

I'm really, really baffled why people keep using this argument. Like you guys know machines are not humans, right? ...right?

Let me say it again: humans are special cases. AI learning copyrighted materials might be illegal or legal, but it has little to do with "what if a human being does the same".

pmoriarty2y ago

Isn't copyright about the final product, not how it was arrived at?

If the final image is similar enough to a copyrighted work and I publish that image without permission of the copyright holder, then that's a copyright violation.

If the final image is different enough, then it's not.

That an AI was used and how the AI was trained are completely separate issues.

ben_w2y ago

> Always have been and always will be (until AGI)

[0] https://www.nature.com/articles/s41583-022-00587-4

[1] https://en.wikipedia.org/wiki/Philosophical_zombie

theonlybutlet2y ago

In law there is such a thing as a legal person, as opposed to a natural person. When it comes to commercial law, its provisions tend to relate to legal persons.

panta2y ago

theonlybutlet2y ago

thrill2y ago

The law does not say it is a matter of scale.

gaganyaan2y ago

kranke1552y ago

So actually building a machine there, under the cover of darkness, that learns from your work so you can produce new work, why is that allowed in the first place? Certainly wouldn't be at a museum.

shkkmo2y ago

But you can't stop people from sitting and studying your painting and then painting stuff similar to it.

If it turns out that there is no reproduction here, then it comes down to how much legal control we give copyright owners to regulate access.

A gallery can reasonably ban cameras and canvases, but it becomes a lot less reasonable if they try to ban artists.

Vvector2y ago

"It's my property."

creer2y ago

mattigames2y ago

picadores2y ago

The AI will not throw a Molotov cocktail at you or hang you from a lamppost when it starves?

__loam2y ago

frumper2y ago

How do they get the works in question to train on?

nextaccountic2y ago

> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

Because (fortunately) human thoughts can't be subject to copyright law yet. So when we talk about copying and making derivative works, if you have this

    artistic works -> neural network weights

The end result may or may not be copyrightable (that's for the courts to decide), but this

    artistic works -> human brain

Definitively can't be

CrimsonRain2y ago

No difference. Some people are just luddites or have vested interest against automation of their own field (but fine with other fields).

__loam2y ago

bergen2y ago

An artist cannot professionally scan and incorporate millions of pieces of art into his cortex in a minute for commercial purposes.

_petronius2y ago

> Can someone explain again how an ML system scanning and training on a copyrighted work is different from a highly skilled artist doing the same?

Three things immediately spring to mind: scale (1), accountability (2), and profit (3).

> Thousands of artists are capable of infringement, but we don't take away their brushes based on capability.

lofaszvanitt2y ago

Have you been sitting in Midjourney rooms and seen the prompts? :DDD

kranke1552y ago· 21 in thread

This will be the greatest act of Intellectual Property theft in history.

If this continues, AI will be the greatest inequality machine in history. Take data from 1,000,000 individuals, train your AI to replace them, compensate no one.

treyd2y ago

rvz2y ago

harshreality2y ago

cmiles742y ago

Where is this idea that copyrighted material should be excluded from training data coming from?

My understanding is that people want to be compensated when their intellectual property is used as training data for a machine. That strikes me as an entirely reasonable expectation.

kranke1552y ago

Yes, and you know how humans acquire works to learn from?

They pay for it.

They buy the books. They buy tickets to theatre. They buy entrance to the gallery.

The trick that's being done now is hey, we don't have to pay since it's not a person. (to the creator) But hey, it is just like a person when it learns! (legal system)

If AI models require human training data, then they should pay for it. Easy.

olalonde2y ago

kranke1552y ago

No that is not what people are upset about. They are upset that their life's work is being used without even asking permission, for someone else to get insanely rich.

That's what they're upset about.

If there were no use for 2D artists, then Stability AI wouldn't be making an AI to replace them.

Key word here is: replace. 2D artists are not becoming obsolete - they're being replaced by a machine that was trained on their works without permission.

What happened to visual artists was more like Logitech announcing Logitech CoPilot and revealing they've extracted code from keylogging for the past 20 years.

cmiles742y ago

Kim_Bruning2y ago

To quote the song, I think it's "all just little bits of history repeating".

CrimsonRain2y ago

There is no "taking data" going on. Nobody is going into your private locker and training on your painting, music or cooking recipes.

If you put your "work" out in the world, anyone who views it, is automatically training their brains on it. Viewing is training.

JoshTriplett2y ago

> If you put your "work" out in the world, anyone who views it, is automatically training their brains on it. Viewing is training.

A perfectly reasonable view for humans, since you shouldn't be able to copyright a brain.

kranke1552y ago

It's funny to me that we haven't reached AGI or anywhere near it, but when we talk about training Diffusion models, suddenly they're "just like a person" for legal reasons.

Same thing that happened on the construction of the "corporation as a person", built on top of rulings made to protect African Americans.

myaccountonhn2y ago

AIs and humans are not the same, why do people keep grouping them together and assume the same logic should apply?

kranke1552y ago

Ok, so reproduce me a Picasso. You've seen one right?

EnergyAmy2y ago

Don't be silly. This was a level-headed ruling that avoids retarding the progress of science and the useful arts.

smrtinsert2y ago

Is this supposed to be sarcastic? Because this is impossible and you're arguing a strawman. You could also say electricity enabled this great inequality, so we have got to stop electricity.

surgical_fire2y ago

> This will be the greatest act of Intellectual Property theft in history.

Good.

Intellectual Property is a mistake. If AI brings about its end, I welcome it.

Illotus2y ago

Not really good if AI can get around it while for normal people it exists as before.

williamcotton2y ago

Google “expert witness”! Courts are also known to hire their own experts who mediate between the experts on either side.

Also, those emails seem very likely to be ordered to be produced during discovery.

This thing could really go either way at this point but I feel like Stability has the upper hand.

They will probably argue that the individual expressions of each work are not copied, rather the abstract ideas of two-dimensional representations present across any and all images.

Expect lots of side-by-side pictures as exhibits from both sides! Grandma and her fellows have to weigh in on this one!

This is a fun one!

kranke1552y ago

Stable Diffusion has been known to make virtually identical copies of the images it was "trained" on, afaik.

If the images are REALLY of no importance, they wouldn't have been used anyway.

mock-possum2y ago

Sing it with me now: Copying Is Not Theft

https://youtu.be/IeTybKL1pM4

AndrewKemendo2y ago· 13 in thread

Having done way more corporate court than I want (patents, mergers, liquidation), I’m increasingly convinced that the judicial system is fundamentally flawed.

The reality is that US law in 2023 is so obscure and opaque that judges seem to reach their rulings on a whim, with no actual philosophy other than maintenance of the system.

Further, I’m extremely unimpressed with the competence on display from the vast majority of judges, such that contempt should be the starting position.

As with everything now, courts are ruled by those with the most money.

tiahura2y ago

Not to be too rude, but you’re not an attorney and couldn’t be more wrong.

The law has never been more transparent. The public has nearly complete access to every docket in the country. Moreover, the level of jurisprudence has never been higher.

Moreover, I’ve lost a case or two in my time, but it was never because of a lack of a warchest.

SketchySeaBeast2y ago

> Moreover, the level of jurisprudence has never been higher.

I'm confused as to what this sentence means. The level of {the study/philosophy/science of law} has never been higher?

AndrewKemendo2y ago

In which a person who professionally practices law, has been to law school, has a higher than average IQ, and has a decade of experience describes how easy it is to interpret the legal system

Hey, quick question: my good friend, let's call him Doug, is a high school equivalency graduate, has a few felonies, and currently works as a road flagger.

How does this complete access to every docket in the country help him?

You have explained my point better than I could have. Thanks.

corethree2y ago

The final decision described in this article is one I agree with, with or without money, and I have no stake in the game.

nirvdrum2y ago

YurgenJurgensen2y ago

snovv_crash2y ago

It's ok to give humans rights that we don't give to machines.

woodrowbarlow2y ago

it's interesting that you start from the assumption that a piece of software should be judged by the same measure as a human.

AndrewKemendo2y ago

You seemed to miss the part where the judge said that the only things that can be claimed as copyrighted are those things that were submitted to the USPTO for specific narrow copyright.

This displays either ignorance as to how artists work and the extent to which they are involved or can be involved in the legal copyright system, or reflects incoherence around the copyright system.

In this case the Judge chose to say, in effect: unless you have explicitly copyrighted it, it's fair use

That is now a new precedent that negatively impacts individual artists who have no power in the market, and protects giant corporate interests which have tons of power in the market

AlexandrB2y ago

For me this argument will hold water when we can put LLMs in jail if they commit a criminal act. Until then, an LLM is not a human and not entitled to be treated like one.

monkaiju2y ago

The courts are like the doors to the Ritz, open to everyone!

AndrewKemendo2y ago

Oh this is perfectly said thank you

creer2y ago

The court IS about maintaining the system. That's the plan. If you want to change it, then you need to change the law and that's the job of Congress.

raincole2y ago· 11 in thread

My perspective is there are two different main issues about AI (especially Stable Diffusion).

To me, it mostly depends on how lossy (low quality) your .jpg is.

notnullorvoid2y ago

It's not compression.

The resulting model, though, can be an efficient compressor/decompressor that produces a lossy output image when given an input prompt and/or image.

raincole2y ago

It's already proven that you can reconstruct at least a small fraction of the training set from diffusion models. It's something quite well known, so could we not die on this hill?

[1] https://twitter.com/Eric_Wallace_/status/1620449942090420224 [2] The paper: https://arxiv.org/abs/2301.13188 [3] Relevant HN thread: https://news.ycombinator.com/item?id=34596187

Kim_Bruning2y ago

So just to be sure: the list of URLs + metadata that gets used for stable diffusion is several terabytes. Not the images. Just the list of URLs alone (and a bit of other metadata).

Stable diffusion itself is just 6+ GB, and fits comfortably on my USB stick.

That's one heck of a lossy compression algorithm, sir!

(this thread has more discussion on this line of thinking https://news.ycombinator.com/item?id=37879938 )

raincole2y ago

Thanks for sharing this info which I'm aware of. However, this fact is not as significant as it might sound in terms of whether it's a lossy compression algorithm.

harshreality2y ago

raincole2y ago

> That's a difficult question because the boundaries of similarity/derived works for copyright purposes are determined by judges and juries based on their intuitions.

I believe that's why DALL-E bans some keywords related to living artists. To show they have "no intention to violate copyrights".

ben_w2y ago

For all the legal issues — and the artistic flaws — I still find it quite remarkable how good it is at such a small size.

raincole2y ago

> the main one is somewhere between 1 and 10 bytes per image depending on whose answer I use for the question "how many images was it trained

Here's the catch, though: it's several bytes only "on average". We can't tell whether some images contribute practically 0 bits to the final result while others contribute far more.
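
For what it's worth, the "bytes per image" figure is plain division, and a rough sketch shows why the answer swings by an order of magnitude depending on whose image count you use. The 6 GB model size is the figure quoted upthread; the two training-set sizes below are assumptions picked only for illustration, not numbers from this thread.

```python
def bytes_per_image(model_size_bytes: float, num_training_images: float) -> float:
    """Average number of model bytes per training image (a plain ratio)."""
    if num_training_images <= 0:
        raise ValueError("num_training_images must be positive")
    return model_size_bytes / num_training_images


# "6+ GB" as quoted upthread, rounded down to 6e9 bytes.
MODEL_SIZE_BYTES = 6e9

# Hypothetical training-set sizes, chosen only to show the spread.
ASSUMED_IMAGE_COUNTS = (600e6, 5e9)

for count in ASSUMED_IMAGE_COUNTS:
    ratio = bytes_per_image(MODEL_SIZE_BYTES, count)
    print(f"{count:,.0f} images: about {ratio:.1f} bytes per image")
```

This is only the average. The division says nothing about how unevenly individual images end up represented in the weights, which is the point being made above.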

gpderetta2y ago

Machines do not have rights. The question is whether a human with a specific machine has a certain right, as opposed to a human with a different machine.

hunter2_2y ago

> human with a specific machine

smrtinsert2y ago

alphanullmeric2y ago· 11 in thread

czl2y ago

simbolit2y ago

You are straw-manning.

Imagine you encounter a public domain image (which by definition is not protected by copyright), you download it, and put it on your website.

Perfectly fine.

But if you write "I made this image" below it, you are a liar and a fraud. No copyright needed.

alphanullmeric2y ago

Don’t care.

justanotherjoe2y ago

alphanullmeric2y ago

In case I haven’t already made it clear enough in the comment you replied to - whether I believe in something or not doesn’t depend on who would benefit from it.

But to answer the question, you use proprietary software protected by means other than government force every single day.

gumballindie2y ago

kstrauser2y ago

Thank god copyright came along and gave us Shakespeare, Bach, da Vinci, Chaucer, Beethoven…

And can you imagine life without the wheel? Too bad we didn’t invent patents earlier so that we could’ve gotten a head start inventing it and the spear.

zirgs2y ago

Copyright activists really lost a lot of respect because of those silly music industry lawsuits, the Mickey Mouse Protection Act, software patents on trivial stuff, and the like.

This lawsuit is even sillier than the previous ones.

alphanullmeric2y ago

vortegne2y ago

"people inherently want to own things, especially the output of their own creation". That is the founding idea of communism indeed. I'm not sure you understand anything about it.

kmeisthax2y ago

[0] Hollywood-ism for "people whose contribution to the work is not marketable"

[1] And, I suspect, a by-product of having read a bunch of Ayn Rand nonsense

williamcotton2y ago· 5 in thread

Orrick also dismissed McKernan and Ortiz's copyright infringement claims entirely.

Well, duh. The judge is helping out the plaintiffs in this case. A jury would have been easily convinced by the defense that no images produced by Stability's systems are visually derivative.

The key is indeed what follows:

The judge allowed Andersen to continue pursuing her key claim that Stability's alleged use of her work to train Stable Diffusion infringed her copyrights.

So unless there is some kind of summary judgement I would wager that this becomes the focus of both sides as this heads towards trial.

But that's it. As predicted by commentary from legal scholars, the outputs of Stable Diffusion are distinct from the model and are not infringing on copyright... at least for this complaint!

gamblor9562y ago

No, he dismissed McKernan and Ortiz because they didn't register their images for U.S. copyright, which is a foundational prerequisite for any copyright lawsuit (in the U.S.)

This simply means that they need to register their images for copyright before they can re-join the case. (https://www.gibsondunn.com/supreme-court-holds-that-copyrigh...)

williamcotton2y ago

I'll check PACER and read the actual ruling when I'm at work tomorrow, but yeah I'm interpreting "dismissed entirely" as "dismissed with prejudice".

You're entirely correct that if it was dismissed without prejudice the complaints on copyright infringement on the outputs could be amended and refiled.

williamcotton2y ago

Re: EDIT

Another interpretation is that the plaintiffs were well aware of how weak their case was with regards to the outputs and basically planned on abandoning it from the start.

There's winning in the court of public opinion and then there's winning in a Federal court.

starshadowx2OP2y ago

Isn't that linked case because they started to file for copyright and then sued rather than waiting for it to be completed first?

In this case they never filed in the first place, and it was dismissed with prejudice.

michaelbrave2y ago

I'm not convinced the training on copyrighted things argument will hold up either.

brucethemoose22y ago· 3 in thread

Why is Midjourney completely off the hook while Stability AI is not?

I'm trying to pull up the original court document, but the PDF isn't loading.

gamblor9562y ago

The plaintiffs apparently failed to plead sufficient factual allegations to support their infringement claim against MTD, which is a rookie mistake.

starshadowx2OP2y ago

For that count specifically, Stability was directly involved with creating and funding the LAION dataset, whereas Midjourney and DeviantArt were not.

tick_tock_tick2y ago

Basically the judge said the idea AI images generated are infringing on copyright is so stupid it's thrown out.

The other part of the case is whether the artists' copyright was violated when training the AI, and they have only claimed that Stability used their art to train.

aa_is_op2y ago· 3 in thread

Amazing how copyright law amazingly disappears when it's to the detriment of major tech companies and protecting smaller creators.

Just amazing!

gmerc2y ago

I’m not sure where you are reading that. Have you read the article?

The copyright infringement claim (for training) is left intact. It’s the other claims that had no basis in existing law (e.g. no copyright was registered, etc) that have been thrown out.

flanked-evergl2y ago

> Amazing how copyright law amazingly disappears

Can you elaborate in what way it disappeared in your opinion?

Spivak2y ago

TotalCrackpot2y ago· 2 in thread

nness2y ago

What should it be replaced with — a system where no one retains intellectual rights over the works that they create?

TotalCrackpot2y ago

gamblor9562y ago· 2 in thread

https://fingfx.thomsonreuters.com/gfx/legaldocs/byprrngynpe/...

artninja19882y ago

Interesting. How do you see the Getty v stability lawsuit going? That looks much worse for stability. Do you think they will just settle and stability will pay them some licensing fee?

gamblor9562y ago

Getty has a much stronger case, given that warped versions of the Getty logo have shown up in a number of SD-generated images, so it's obvious that there was impermissible copying.

I'm not sure Stability will agree to a licensing fee, since part of the rationale for the last version of SD was to remove the infringing images from their training sets going forward.

OsrsNeedsf2P2y ago· 2 in thread

The lawsuit is moving forward, but only on copyrighted work. This is (not yet) a story.

thowaway912342y ago

nness2y ago

That was my belief too, but: "Copyright exists from the moment the work is created. You will have to register, however, if you wish to bring a lawsuit for infringement of a U.S. work."

https://www.copyright.gov/help/faq/faq-general.html

(Makes me wonder if, back in the day, every song that was downloaded and then pursued by the RIAA was registered...)

soulofmischief2y ago· 2 in thread

A thought experiment:

Imagine you have a blob of seemingly random data. Nothing in the data contains anything recognizable as illegal or in violation of copyright.

What should be illegal here? The blob, which by itself is free of any questionable bits of data, or the inputs which transform it into something tangible? Both? Neither?
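
A toy version of the thought experiment, for anyone who wants something concrete: XOR a random blob with a second input and a recognizable byte string falls out. This is only an illustration of "blob plus inputs" using a one-time-pad style trick; it is not a claim about how diffusion models store or produce images, and the `target` string is a made-up placeholder.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Placeholder for "something tangible"; any byte string works.
target = b"a recognizable work"

# The blob is random bytes generated without looking at the target.
blob = secrets.token_bytes(len(target))

# The "input" that turns the blob into the target.
key = xor_bytes(blob, target)

# Blob and key together reproduce the target.
recovered = xor_bytes(blob, key)
print(recovered == target)
```

The blob is generated independently of the target, so by itself it carries nothing recognizable. Here all the information about the target ends up in the key, which is what makes the "which piece is the problem?" question awkward.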

Well, it has never been illegal to draw or paint something representing CSAM, for example. And it has never been illegal to draw or paint Mickey Mouse in your own home.

Bottom line. You can pry my models out of my cold, dead or handcuffed hands. Times like these really shine a light on who is complicit in the system, and who suffers from it.

Aerroon2y ago

>Imagine you have a blob of seemingly random data. Nothing in the data contains anything recognizable as illegal or in violation of copyright.

The set of Real numbers contains every positive whole number. This is already the magical blob.

Eg the decimal number 65101114114111111110 is "Aerroon" in ASCII.

Edit3: real numbers are better than natural numbers or whole numbers for this. They have zero and they solve the "0005" problem.
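
The number above can be checked with a few lines of Python. A minimal sketch, assuming the encoding is just the decimal ASCII codes of each character written one after another (the helper name `ascii_digits` is made up for this example):

```python
def ascii_digits(text: str) -> str:
    """Concatenate the decimal ASCII code of each character in text."""
    return "".join(str(code) for code in text.encode("ascii"))


digits = ascii_digits("Aerroon")
number = int(digits)
print(digits)

# The "0005" problem: converting a digit string to an integer
# drops any leading zeros.
print(int("0005"))
```

Going the other way needs an agreed convention, since the codes are two or three digits long and the digit string alone doesn't say where one character ends and the next begins.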

mithr2y ago

ndiddy2y ago· 1 in thread

I’m impressed that their legal team was incompetent enough that they didn’t bring this up as an issue before filing the lawsuit.

OsrsNeedsf2P2y ago

What makes you think the legal team didn't know? The plaintiffs wanted to sue, so they did

sofixa2y ago· 1 in thread

flanked-evergl2y ago

> I'm really looking forward to the EU framework around "AI"

cmiles742y ago· 1 in thread

It seems like they focused too much on the details of how the model works and how data is encoded by the model.

soco2y ago

ptx2y ago· 1 in thread

What? I thought everything was copyrighted by default under the Berne Convention?

[0] https://wiki.creativecommons.org/wiki/CC0_FAQ#What_is_CC0.3F

silverlight2y ago

In the U.S. you have to register your copyright with the U.S. Copyright Office before you can bring a copyright suit against someone.

minimaxir2y ago

The HN discussion back when this lawsuit was first announced was correctly pessimistic: the top comment was "Where are the copies?". https://news.ycombinator.com/item?id=34377910

laylower2y ago

This is the first paragraph...

Is it human-generated? Does it violate copyright?

Topfi2y ago

Here is a direct link to the motion for those interested: https://scribd.com/document/681174239/Order-on-motion-to-dis...

sinuhe692y ago

Orrick dismissed McKernan and Ortiz's copyright claims because they had not registered their images with the U.S. Copyright Office, a requirement for bringing a copyright lawsuit.

That is the key.

artninja19882y ago

Good. Most claims got dismissed (although with leave to amend) with only the infringement on the input side really remaining. This lawyer is a clown

0dayz2y ago

Hasn't this always been a precarious road? With say fair use for instance.

hankchinaski2y ago

This is like Michelangelo suing Caravaggio because he copied, or better, was inspired by his work.
