What color are your bits? (2004)

(ansuz.sooke.bc.ca)

40 points · tomodachi94 · 4mo ago · 11 comments

dahart · 4mo ago · 2 in thread

In the 2024 preface:

> Copyright holders worry about how to exercise control over the use of "their" creative material for training models; but that begs the question of whether copyright holders ever had, or should have, a right to any such control. If a human can read a book and learn from it, and then write their own books, why shouldn't a computer?

There’s a small amount of irony in an article that discusses copyright and the invisible but critical context of information, then dismisses the context of copying when it comes to copyright and confuses what copyright protects. I’m certain the author knows that copyright does not protect ideas and does not protect “colour”; it deliberately protects only the “bits”. In US copyright law this is called the “fixation” of a work. The Berne Convention uses similar terminology: “works shall not be protected unless they have been fixed in some material form.”

AI’s “learning” has a different colour than human learning. This has been debated at length on HN and elsewhere, and in the courts, but it’s definitely wildly misleading to compare ChatGPT training on all books ever written and then being distributed (for a profit) to everyone, to one human reading one book and learning something from it.

aftbit · 4mo ago

More interesting to me is the "derivative work" concept. If a human sits down with a novel, reads it cover to cover, then writes their own novel which broadly has the same characters following the same plot in the same setting, but with slight differences in names and word choice, is that new work a derivative of the first for copyright purposes? What if they do the same thing for code? What if an AI does either or both of those?

IP courts will have some truly novel questions before them this century.

dahart · 4mo ago

Copyright does not cover ideas, period. If you write your own novel and use your own word choices, even if you copy the plot structure exactly and reuse the same character names, it’s not even considered a derivative work under the law; it’s a new work. Copyright covers copying the fixed work itself. You aren’t in violation of copyright unless you copy the words themselves.

The flip side is that this is why the article’s discussion about randomness and monkeys on typewriters is irrelevant to copyright law. It’s a copyright violation to produce the same “fixation” no matter how you do it. If you generated a random sequence of characters and it happened to match a NYT best-selling book, you would violate the book author’s copyright, and claiming it was random isn’t a viable defense. Intent to copy can make it worse, but lack of intent does not absolve you. There is precedent for people independently coming up with the same song and one of them being successfully sued.

Do note that other laws might cover plagiarism of ideas, trademarks, code, etc.; copyright isn’t the only consideration, but it seems to be often misunderstood. We definitely have some novel questions because of the scale of AI’s copying, the nature of training and the provenance of the training data, and AI’s growing ability to skirt copyright law while actually copying.

nathan_compton · 4mo ago · 1 in thread

It has been very interesting to watch how discourse on copyright and IP has evolved over the years. In the end, however, what seems to happen is that copyright always "works" when powerful people benefit from it working and "doesn't work" when it's less powerful people being victimized. Or vice versa, really. It seems like we've entered a period of history where any little legal or cultural power differential is very rapidly exploited to produce profit, and I'm curious (to say the least) how we are going to fix this.

everyone · 4mo ago

Well, at least vote with your wallet: support FOSS and DRM-free stuff. Never support DRM (if you still want access to DRM-containing stuff, just pirate it).

jasode · 4mo ago · 1 in thread

The "past" link for previous submissions won't work in this case because the title was submitted as "color" instead of the original spelling of "colour". Search with "colour" to see the previous discussions: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

captn3m0 · 4mo ago

I’d have thought Algolia would cover this in their default synonyms.
