Is the content that LLMs produce enough to rise to the level of copyright infringement? Is the fact that a company trained its LLM on your data, knowing that data would shape its outputs (and therefore its profits), enough that every one of those outputs should be considered influenced, at least to a minuscule degree, by your work? And how would ChatGPT's "training" differ from, say, another journalist reading the NYT and subconsciously drawing on it to do better work?
None of us can answer these questions definitively. That the courts would eventually hear these sorts of arguments was a foregone conclusion. I think a lot of the large LLM makers (certainly OpenAI's competitors) will breathe a sigh of relief that this is happening sooner rather than later, so they know where the legal lines are going to be drawn.
Call me jaded, but I can’t help doubting that the _actual_ content creators, the writers themselves, will see any of the money should The Times win or settle the case.
After all, The Times already owns those articles; the writers were paid once for producing them. Subsequently selling those works to AI companies (or extracting compensation from them for past use) is an emergent revenue stream.
I suppose the NYT isn’t legally obligated to share that revenue fairly with the authors, but it’d be awfully nice if they did.
Increasingly, the distinction between core model training and fine-tuning may blur: lightweight techniques like LoRA adapters already make it cheap to layer new knowledge onto a frozen base model. If that holds, we might see custom 'add-ons' for AI models become commoditized. Imagine simply downloading a "New York Times" pack to bolt onto your unofficial "pirate" language model (something like the sketch below).
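To make the "pack" idea concrete, here is a minimal sketch of what loading such an add-on could look like, using Hugging Face's transformers and peft libraries. The base model choice is arbitrary, and the `./nyt-pack` adapter directory is purely hypothetical; nothing like it actually exists.

```python
# Hypothetical sketch: bolting a third-party "content pack" (a LoRA adapter)
# onto an open-weights base model with Hugging Face's peft library.
# The adapter path "./nyt-pack" is invented purely for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "mistralai/Mistral-7B-v0.1"  # any open-weights base model
ADAPTER_DIR = "./nyt-pack"                # hypothetical downloaded "pack"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# A LoRA adapter is a small set of low-rank weight deltas; applying it on
# top of the frozen base weights is cheap compared with retraining.
model = PeftModel.from_pretrained(base, ADAPTER_DIR)

prompt = "Summarize today's front-page coverage:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Notably, an adapter like that would be a small file of weight deltas, not a verbatim copy of any article, which is part of why the legal lines here are so hard to draw.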
Any news or speculation on these cases?