0 points · warkdarrior · 1y ago · 0 comments

(IANAL)

Model weights could be treated the same way phone books, encyclopedias, and other collections of data are treated. The copyright is over the collection itself, even if the individual items are not copyrightable.

6 comments · 2 top-level

TMWNN · 1y ago · 4 in thread

>phone books, encyclopedias, and other collections of data are treated

Encyclopedias are copyrightable. Phone books are not.

skissane · 1y ago

> Encyclopedias are copyrightable. Phone books are not.

It depends on the jurisdiction. The US Supreme Court ruled that phone books are not copyrightable in the 1991 case Feist Publications, Inc., v. Rural Telephone Service Co. However, that is not the law in the UK, which generally follows the 1900 House of Lords decision Walter v Lane, which found that mere "sweat of the brow" is enough to establish copyright – that case upheld a publisher's copyright on a book of speeches by politicians, purely on the grounds of the human effort involved in transcribing them.

Furthermore, under its 1996 Database Directive, the EU introduced the sui generis database right, which is a legally distinct form of intellectual property from copyright, but with many of the same features, protecting mere aggregations of information, including phone directories. The UK has retained this after Brexit. However, EU directives give member states discretion over the precise legal mechanism of their implementation, and the UK used that discretion to make database rights a subset of copyright – so, while in EU law they are a technically distinct type of IP from copyright, under UK law they are an application of copyright. EU law only requires database rights to have a term of 15 years.

Do not be surprised if in the next couple of years the EU comes out with an "AI Model Weights Directive" establishing a "sui generis AI model weights right". And I'm sure the US Congress will be interested in following suit. I expect OpenAI / Meta / Google / Microsoft / etc. will be lobbying for them to do so.

ronsor · 1y ago

Encyclopedias may be collections of facts, but the writing is generally creative. Phone books are literally just facts. AI models are literally just facts.

margalabargala · 1y ago

> AI models are literally just facts.

Are they, or are they collections of probabilities? If they are probabilities, and those probabilities change from model to model, then it seems they might be copyrightable.

If Google, OpenAI, Facebook, and Anthropic each train a model from scratch on an identical training corpus, they would wind up with four different models that had four differing sets of weights, because they digest and process the same input corpus differently.

That indicates to me that they are not a collection of facts.
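
A rough way to see the "same corpus, different weights" point on a toy scale: the sketch below trains the same tiny network twice on identical data and changes only the random seed used for initialization. Everything here is made up for illustration (the `train_tiny_net` helper, the sine-curve "corpus", the network size), and the seed is only a stand-in for the many ways different labs would process the same input. It is not a claim about how any real model is trained.

```python
import math
import random


def train_tiny_net(seed, data, hidden=3, steps=500, lr=0.1):
    """Train a 1-input, 1-output tanh network with full-batch gradient descent.

    Hypothetical helper for illustration only. Returns all weights as a flat list.
    """
    rng = random.Random(seed)  # the seed is the only thing that varies between runs
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias
    n = len(data)
    for _ in range(steps):
        gw1 = [0.0] * hidden
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x, y in data:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            pred = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = 2.0 * (pred - y) / n  # derivative of mean squared error w.r.t. pred
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                dh = err * w2[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                gw1[j] += dh * x
                gb1[j] += dh
        for j in range(hidden):
            w1[j] -= lr * gw1[j]
            b1[j] -= lr * gb1[j]
            w2[j] -= lr * gw2[j]
        b2 -= lr * gb2
    return w1 + b1 + w2 + [b2]


# The identical "training corpus" for both runs: points on a sine curve.
corpus = [(i / 10.0, math.sin(3 * i / 10.0)) for i in range(-10, 11)]

weights_a = train_tiny_net(seed=1, data=corpus)
weights_b = train_tiny_net(seed=2, data=corpus)

# Largest difference between corresponding weights of the two runs.
max_gap = max(abs(a - b) for a, b in zip(weights_a, weights_b))
print(max_gap)
```

Rerunning with the same seed reproduces the same weights exactly, so in this toy the data and the code do not pin down the result by themselves; the initialization does too. Whether that difference has any bearing on copyrightability is the legal question the thread is arguing about, and the sketch does not settle it.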

roywiggins · 1y ago

What if I train an AI model on exactly one copyrighted work and all it does is spit that work back out?

e.g. if I upload Marvels_Avengers.mkv.onnx and it reliably reproduces the original (after all, it's just a fact that the first byte of the original file is 0xF0, etc.)

PittleyDunkin · 1y ago

Who gives a damn about copyright when this is clearly profiting off of someone else's work without compensation? Sometimes the law is inadequate and that's ok—the law just needs to change.
