0 points · echelon · 2y ago · 0 comments

> As someone who works in the field

What's your read on what's going to happen with AI?

Will companies be allowed to train on copyrighted works? Seems like we'll fall behind international competition or supercharge monopolies if we don't allow it.

Japan and China permit training on copyrighted works. China goes a step further and allows AI outputs to be copyrighted.

Really interested in what insiders think or know about this.

10 comments · 1 top-level

mod50ack · 2y ago · 9 in thread

Well, to be clear, I am mainly a public domain and copyright theory guy. I do keep up to date with everything here. But I also don't write laws; if I did, they would look different.

Companies are allowed to train on copyrighted works - or, to be more precise, there is no prohibition in copyright law on them doing so. On the other hand, there are no particular protections.

The real question is to what extent an AI generator can return its copyrighted training data as a non-de minimis output. In other words, can I get one of the existing copyrighted works by putting in a prompt? This is something that AI developers are trying to avoid, but it's actually a pretty tricky problem.

Allowing AI output to be copyrighted is one of the worst ideas in copyright. Thankfully, the US Constitution as interpreted by the courts only allows for copyright to inhere in works of human creativity.

By the way, while I'm very excited about certain "AI" things, I have a very poor opinion of the merit of generative AI, and I mean in theory as well as how it stands today.

Aloha · 2y ago

What would your ideal copyright regime look like?

Mine would be different from what we have now: 20 years or the artist's lifetime, whichever is shorter. After that, 10-year renewals would be possible, but the cost of the renewal would ratchet up with each renewal.

I've also considered using a percentage of revenue for the work - basically a tax on the revenue from that work, as a condition for the right of monopoly on it - which would also ratchet upwards with each renewal.

I'd also consider a use-it-or-lose-it strategy for copyright, like trademark, meaning that if you are not making the work available for purchase/license on reasonable terms within the copyright renewal period, you lose the ability to renew it.

Mine is mostly designed to deal with orphaned works, ensuring they enter the public domain in a predictable way. I think the biggest issue with our existing copyright system isn't enriching Disney (they're still putting those works out there, making them available). It's all the works being lost to the sands of time.

hedora · 2y ago

Disney regularly censors or modifies classic films in its catalog, and also simply pulls things out of production. If you want to read more, look up the Disney Vault strategy.

mod50ack · 2y ago

> What would your ideal copyright regime look like?

I would have to actually write something blog-length about this.

BeefDinnerPurge · 2y ago

AI models have an amazing ability to approximately memorize any training data. It's just that that memorization is useless unless it memorizes something real as opposed to random (randomized ImageNet labels as a simple example).

So as much as I want there to be a fair use case here, the artists have a real point. If someone can break the memorization without losing significant validation/test set performance, that might go a long way.

But even then, artists don't want their style copied either, and that's problematic to me in that if a human does it, that's OK, but if an AI does it, it's not? Yes I get the ease of asking an AI to do it vs a 10K+ hours artist, but, well, more or less the same to me on a geological time scale.

In the next year, I'm hoping to Patreon/Kickstart a project that offers two major funding tiers. Hitting the lowest tier means it will use AI to create assets, and hitting the higher tier means it will use humans instead. My response to this brouhaha is to throw the controversy right back at the people creating it in the first place and ask if they're willing to walk their fancy talk on this subject with their wallets.

dahart · 2y ago

> Companies are allowed to train on copyrighted works

I’m curious what you mean by this. If I’m not allowed under copyright law to make a personal copy of the latest Pixar movie or watch it without permission or payment, even if I’m not sharing it with anyone else, what under the law allows a company to make a copy and train on it? I thought I understood copyright law to not only prohibit redistribution of copyrighted works without permission, but also to prohibit consumption of copyrighted works without permission? Is that accurate? In that sense, I would have thought copyright law does prohibit companies from training on copyrighted works.

brookst · 2y ago

You’re hitting on the distinction between duplication and training.

I own hundreds of paperback books. Copyright law does not limit what I can learn from them.

It may be that assembling a corpus for training is illegal, but if so, that would be true even if it was never used for training. The act of training an AI is orthogonal to the collection of the corpus.

BlueTemplar · 2y ago

> I’m not allowed under copyright law to make a personal copy of the latest Pixar movie or watch it without permission or payment, even if I’m not sharing it with anyone else.

You aren't, but in many cases it doesn't matter because, even for USians, you're protected by the fair use exception. In some other cases, such as for educators and archivists, it extends even further.

When we get new laws for the training of neural networks, I would expect their spirit to be based on the state of this exception.

xorcist · 2y ago

> Allowing AI output to be copyrighted is one of the worst ideas in copyright

What does that mean, in more specific language?

If I create a poster in Photoshop, is it under copyright? What about if I use a smart fill plugin? What about if I use a prompt plugin?

bee_rider · 2y ago

“Find me a prompt to generate this image,” seems like an interesting problem to toss at an AI, I wonder if anyone knows of work in that direction?

Hypothetically, if such a system existed where you passed in a copyrighted image and then got a prompt to generate it, would that be sufficient to show some kind of infringement?
