Who owns the code Claude Code wrote?

(legallayer.substack.com)

557 points · senaevren · 1mo ago · 530 comments

253 comments · 69 top-level

Arcuru · 1mo ago · 38 in thread

Personally, I think that the human directing the agent owns the copyright for whatever is produced, but the ability for the agent to build it in the first place is based off of stolen IP.

I'm concerned about the copyright 'washing' this enables though, especially in OSS, and I think the right thing for OSS devs to do is to try to publish resulting code with the strongest copyleft licensing that they are comfortable with - https://jackson.dev/post/moral-ai-licensing/

nadermx · 1mo ago

Funny how the copyright industry was able to spin copyright infringement into the pejorative "stealing". If you still have the item, what was stolen?

Dowling v. United States, 473 U.S. 207 (1985): The Supreme Court ruled that the unauthorized sale of phonorecords of copyrighted musical compositions does not constitute "stolen, converted or taken by fraud" goods under the National Stolen Property Act.

tensor · 1mo ago

I still find the idea that "learning" from code is "stealing" kind of ridiculous.

NewsaHackO · 1mo ago

Everybody has had a complete 180 in terms of copyright protections. Before, nobody cared about downloading music, movies, TV shows, or pirating games. Now, when the copyright law is affecting them, they are gung-ho about protecting these billion-dollar companies' copyrights.

Neywiny · 1mo ago

I don't think it's unreasonable to consider it stolen potential profit, but agreed that's not how they spin it.

blks · 1mo ago

“Stolen” as in “profited on IP against terms and conditions of the license”.

Aerroon · 1mo ago

Copyright isn't some natural state of being though, it's something that's granted to people by the government to "promote the progress of science and useful arts". If copyright hinders things then I think it's reasonable that exceptions would be made.

hxtk · 1mo ago

This analysis yields very different results under utilitarianism vs rule utilitarianism.

Under the former, you could argue, "What I'm doing is a science or useful art, so if copyright exists to advance those things then taking a more permissive interpretation of copyright to allow my efforts to succeed is in the spirit of the law."

Under the latter, you could argue, "Works get published because as a rule, researchers and artists know they have lawful recourse through copyright if the work gets used without their consent. The absence of that rule incentivizes safeguarding works by treating them as secret and each disclosure as a matter of personal trust, so the existence of that rule promotes the sciences and useful arts."

rectang · 1mo ago

If the LLM generates output that a court decides is sufficiently derivative, and especially (but not necessarily) if the LLM was trained on the source material being infringed, then whoever redistributes the derivative output is going to be liable for copyright infringement.

Creation of the LLM itself is transformative, but LLM output which infringes is not.

2ndorderthought · 1mo ago

Is it true, then, that if someone stole an entire codebase from a vibe-coded app built on a non-permissively licensed project, and claimed that it was derived from an LLM and not stolen at all, that person is not a thief because it came from the same place? Or are they a thief because someone else copyrighted it? How do vibe coders protect themselves, not knowing who else has the same derivative code or who holds the copyright first? Or can't they?

KallDrexx · 1mo ago

Do you think that the human directing the agent owns copyright for any legal reason?

The case Community for Creative Non-Violence v. Reid (https://en.wikipedia.org/wiki/Community_for_Creative_Non-Vio...) solidifies a Supreme Court opinion that contracting a work and directing an author does not grant authorship to the commissioner of the work; it grants authorship to the person actually doing the work.

The author can grant authorship and copyright to the commissioner with a contract, but the monkey picture (and others) have solidified that only humans can be granted copyright. Since LLMs aren't human they can't hold copyright, and if the LLM doesn't have legal copyright then they don't have legal rights to assign copyright to you.

zarzavat · 1mo ago

It depends on what level of creative control you had over the code.

Code is protected by copyright as a literary work. The method is not protected by copyright, that would be the domain of patents. What's protected are the words.

If you say "Claude, build me a website about X" then you do not have any creative control over the literary work Claude is producing. You just told a machine to write it for you. Nor, like a compiler, is it derivative of any other work that you wrote.

If, on the other hand, you are working jointly with Claude to make specific changes to the code on a line-by-line basis, then you will have no problem claiming copyright over the code. Claude in this case is acting as a tool, but there's still a human making decisions about the code.

In the case where you wrote a bunch of markdown and then told Claude to generate the corresponding code but didn't have any involvement in writing the code itself, you could perhaps claim that the code is a derivative work of the markdown. A court would have to handle that on a case-by-case basis and evaluate how much control you exerted over the work.

marcus_holmes · 1mo ago

Interesting, though, that ownership of the code can still be transferred to the employer. So it's in the public domain (because not human authored) but owned by the employer (because the human and/or LLM was employed by the employer)? I don't really understand how this works.

Animats · 1mo ago

> only humans can be granted copyright.

No, a copyright application can be filed with a corporation listed as the author. Watch for the copyright notice at the end of the next major movie you see.

CWuestefeld · 1mo ago

> but the ability for the agent to build it in the first place is based off of stolen IP.

I honestly don't understand why the attitude that underlies this is so prevalent.

When I write code, what I write and how I write it is informed by having read countless source code files over my education and my career. Just as I ingest all that experience to fine-tune how my later code is written, so does the LLM from the code it's seen.

The immediate retort to that is that the LLM is looking at code that wasn't its to read. But I don't think that's a valid objection. Pretty much by definition, everything I've learned from has a copyright on it, and other than my own code on my own time, that copyright is owned by someone else. Much of the code that's built up my understanding has been protected by NDA, or even defense-department classifications: it wasn't mine in any way. But it still informs how I do all my future coding.

By analogy: I'm also an artist, especially since my retirement. My approach to photography was influenced by Ansel Adams, and countless other artists whose works I've seen displayed in museums, or in publications and online. My current approach to painting was inspired by Bob Ross and others, and the teachers who have helped me develop. I've taken pieces of what I've seen in all their work, and all of that comes out in my photos and paintings, to varying degrees.

I've taken ideas from others in code and in art, and produced something (hopefully!) different by combining those bits with my own perspective. I don't think anyone has a claim on my product because of this relationship.

Likewise, I know that many of my successors have learned from my code (heck, I led teams, wrote one book about software development!). And I hope that someday my artwork has developed to the point where there's something in it that's worth someone else's attention to assimilate. I've never for a minute - even decades before the advent of LLMs - hoped or even imagined that my work would remain locked up with me, and that the ideas would follow me to the grave.

As they say, we are all standing on the shoulders of giants. None of us would be able to achieve the tiniest fraction of what we have, without assimilating what has come before us. Through many layers of inheritance it's constantly being incorporated in subsequent works.

In a few decades at best, I'll be dead. It probably won't be very long after that when people even forget my name. But the idea that something I've done - my work in developing software systems, or in my photography and painting - will continue to have ripples through time, inspires me and gives me hope that I'll have some tiny shred of immortality beyond my personal demise.

demorro · 1mo ago

Humans should have more legal privileges than machines, just as individuals should have more legal privileges than corporations. It's really as simple as that. I don't want to gripe around making up justifications, that's how the law should be and if it turns out not to be that, I'm going to be nettled.

I live in the UK, and most US law is based upon English common law; it's not some immutable code given to us from above. It's based upon assumptions and capabilities of the entities participating in the system at the time the law was codified. It can and should change to make more sense if those assumptions and capabilities shift massively.

missingcolours · 1mo ago

In many of those examples, there is payment to the creator of the works that others are learning from. Authors are paid for their books, when we listen to music on the radio the musician is paid royalties, etc. When you lead a team and mentor junior engineers you're being paid for your time.

The nature of the source material matters though. Training a model on open source software seems perfectly fair - it has explicitly been released to the public, and learning from the code has never been a contested use.

IMO the questions around coding models should be seen as less about LLMs and more as a subset of the conversation about large companies driving immense profits from the work of volunteers on open-source projects, i.e. it's more about open source than AI.

jacquesm · 1mo ago

Scale and the ability to generate a livelihood of your creations and/or the ability to control how what you have created is used, for instance, to demand attribution.

gspr · 1mo ago

> When I write code, what I write and how I write it is informed by having read countless source code files over my education and my career. Just as I ingest all that experience to fine-tune how my later code is written, so does the LLM from the code it's seen.

You are presumably human. We have granted humans specific exemptions in copyright law. We have not granted that to LLMs. Why are we so eager to?

atleastoptimal · 1mo ago

The attitude is derived from a general animus many have towards AI companies. They resent the efficacy of AI because it devalues individual expertise.

I can't imagine it's really justifiable to say that training on data is the same as "stealing" when that same claim (that learned information a person could retain and reproduce constitutes copyright infringement) is the subject of many dystopian narratives, like this one, where once your brain is uploaded to the cloud you have to pay royalties based on every media product you remember.

https://www.youtube.com/watch?v=IFe9wiDfb0E

rspeele · 1mo ago

For another human being to look at my open source code, learn from it, get inspired by it, appreciate what I did, and let it influence their own creativity would bring me joy. That's why I open sourced it in the first place.

Few people ever actually read open source code, but I'd like to think on the rare occasions they do, they share a connection with the author. I know when I read somebody else's code, for me to understand it I have to be thinking about the problem the same way they were when they wrote it. I feel empathy with them and can sometimes picture the struggle, backtracking, and eureka moments they went through to come up with their solution.

Somehow I don't get the same warm fuzzy feelings about a machine powered by investor money ingesting my work automatically, in milliseconds, and coldly compressing it down to a few nudges on a few weights out of trillions of parameters. All so the machine can produce outputs on-demand for lazy users who will never know of me or appreciate my little contribution, and ultimately for the financial benefit of some billionaires who see me as an obsolete waste of space.

I guess I'm just irrational that way.

blks · 1mo ago

You’re confusing yourself with a commercial product. You’re not a product that was created by other human beings based on someone else’s IP.

jmyeet · 1mo ago

You can think that's how it should be. But that's not necessarily how it is. I'm reminded of the famous monkey selfie copyright dispute [1]. A photographer set up a camera and gave it to a monkey but after a legal dispute, courts decided nobody owned the copyright.

I can totally see this applying here as well.

Now this doesn't resolve the issue of AIs being trained on copyrighted works they had no rights to. The counterargument is that this is a derivative or transformative work, but I don't believe that's settled law at all.

[1]: https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...

varispeed · 1mo ago

I find the idea that code could be copyrightable weak. There are only so many ways to write a for loop. Similarly, you can't copyright schematics (apart from the exact visual representation as a form of art). Code is just a schematic.

gspr · 1mo ago

Let me get this straight: Since there are only so many ways to write a for loop, you doubt that for loops are copyrightable. From this you conclude that code, in general, isn't copyrightable?

That's like saying "there's only so many ways to greet your neighbor, so any text that simply greets your neighbor isn't copyrightable – and therefore no text is copyrightable".

alok-g · 1mo ago

Note: IANAL

Copyrights already preclude short phrases for the same reason -- there are only so many ways in which short phrases could be produced. The moment a work becomes larger (large enough; AFAIK, the threshold is not precisely defined), the reasoning you applied fails to apply.

The Google-Oracle lawsuit did not decide whether APIs (when large in number) are copyrightable or not.

ako · 1mo ago

I've created my own DSL, and instruct Claude Code how to generate code for this DSL using skills.

Since this is a new language, and not documented on the web nor on GitHub, Claude's ability is not based off of stolen IP. At best it's trained on other language concepts, just like we can train ourselves on code on GitHub.

Maybe a good reason to create a new programming language?

alok-g · 1mo ago

Interesting, but I still do not think this is as easy. The AI model is still trained on some existing works, and it is generating code in the new DSL or programming language still based on some higher-level ideas and expressions it has consumed during training. You have added just one more level of indirection. The output can no longer be a verbatim copy of some existing work or non-short snippets; however, the output may still carry "expression" that is substantially similar to something pre-existing.

Note: IANAL. The above is just from my current understanding.

jongjong · 1mo ago

This interpretation makes sense. I think even the 'fair use' clause in the US doesn't protect LLMs. One argument I've heard often is that LLMs synthesize their training set to produce novel output in the same way as a human would... That may be the case, but legally an LLM isn't a human. You can't look at the output of an LLM and say that it's 'fair use' with respect to its training set; it hasn't been established that AI has the same 'fair use' right as a human does; it's already pushing it that companies have this right (let alone an AI agent). Anyway, that's just one problem... Also, this is ignoring the fact that the researchers who compiled the training set COPIED the original copyrighted data in order to produce that training set. They either copied the entire work into the training set or they fed the entire work directly into the LLM; in either case, at some point, the entire work was copied verbatim into the LLM's input layer before it was ingested by the AI. The researchers copied the copyrighted content without permission.

Also, when it comes to code, the case is even more damning because the vast majority of the code which LLMs are trained on was not only copyrighted but subject to an MIT license (at best), and even the MIT license, which is the most permissive license in existence, still says clearly:

"Permission is hereby granted, free of charge, to any person obtaining a copy of this software"

The word 'person' is used very intentionally here.

I think there should be several kinds of AI taxes which should be distributed to all copyright holders. There should be a tax to go to writers (and book authors), a tax to go to open source developers and a tax for the general population to distribute as UBI to account for small-form content like comments and photography...

People invested a lot of time building their entire careers around the assumption of copyright protection; so for it to be violated on such a scale would be a massive betrayal.

dredmorbius · 1mo ago

That's not what's been established to date in US caselaw:

THALER v. PERLMUTTER (2023). "[T]his case presents only the question of whether a work generated autonomously by a computer system is eligible for copyright. In the absence of any human involvement in the creation of the work, the clear and straightforward answer is the one given by the Register: No."

<https://caselaw.findlaw.com/court/us-dis-crt-dis-col/1149169...>.

amarant · 1mo ago

I could possibly see an argument for the owner being whoever paid for the tokens used, but honestly I think the argument for that is weaker than what you're suggesting; I'm merely playing devil's advocate here.

I don't think there's even a valid argument for any other ownership model, or at least none that I can think of.

jmaw · 1mo ago

I see the argument for whoever paid for the tokens. Or in the case of a free AI usage, the person who sent the prompt (or whoever they are acting on behalf of, i.e. the company they are working for at the time).

The primary issue being that it's all built on stolen data in the first place.

cess11 · 1mo ago

The LLM is just a database. It's like saying 'I own the copyright to what comes out of an API because I crafted the query' or 'I own the copyright to the responses I get from the bots on the Starship Titanic because I crafted the message they respond to'.

jacquesm · 1mo ago

No, that human owns the copyright on the prompt, not on the work product.

keithba · 1mo ago

That’s not how it works. The human using the tool (like Claude Code, etc.) owns the copyright of the code generated.

alok-g · 1mo ago

If that were true, a developer might own copyright over the source code but nothing on the compiled binaries, and I could download practically all software available as compiled binaries and use it for free.

kridsdale1 · 1mo ago

So I’m responsible for pushing the giant boulder at the top of the hill.

The humans at the bottom who were crushed should blame the boulder, which happened to be moving.

saadn92 · 1mo ago

I agree with this sentiment, because the person directing the agent can still direct it in a way where it'll produce a better or worse output than another person directing it.

amelius · 1mo ago

I wonder what OSS licenses would have looked like if we saw all of this coming.

semiquaver · 1mo ago · 24 in thread

  > The US Copyright Office confirmed this in January 2025, and the Supreme Court declined to disturb it in March 2026 when it turned away the Thaler appeal. Works predominantly generated by AI without meaningful human authorship are not eligible for copyright protection, and that rule is now settled at the highest judicial level available.

Misstates the law. Denial of certiorari can happen for many reasons unrelated to the merits and does not settle the issue nationwide.

PaulDavisThe1st · 1mo ago

From TFA:

> When the Supreme Court declined to hear the Thaler appeal in March 2026, it did not endorse the lower court's reasoning or settle the question nationally. Cert denial means the Court chose not to hear the case, nothing more. What it does mean is that the DC Circuit's ruling stands, the Copyright Office's position is intact, and no court has yet gone the other way.

Your quoted text is no longer in TFA.

jibal · 1mo ago

Because the author acted on that comment.

semiquaver · 1mo ago

c.f. OP’s comments in this thread.

graemep · 1mo ago

It also contradicts everything else I have read about Thaler. AFAIK the ruling was that the AI could not hold copyright. Thaler waived any claim to be the copyright holder himself.

The last two bullet points on this page cover this:

https://www.authorsalliance.org/2025/03/19/thaler-v-perlmutt...

The site also explains the qualifications and experience in copyright law of the author of the above - unlike the article here.

greensoap · 1mo ago

Also, I don't think there is any example testing the conclusion. There is no case to point at that any of the factors they listed are sufficient to convey authorship. Would love to be pointed to a case where rejecting decisions and redirecting to a different approach was deemed human authorship. What we do know is that you can disclaim the part of the code a human didn't author. In fact, the Copyright Office requires you disclose and disclaim. If anyone out there has more factual and citable sources please share.

KallDrexx · 1mo ago

It's in fact the opposite from what I've read. In one of the Supreme Court cases cited by the Copyright Office itself in its opinion on AI works (https://en.wikipedia.org/wiki/Community_for_Creative_Non-Vio...) it was held that merely advising someone else who does the work for you, giving criticisms and revisions, isn't enough for authorship or co-authorship.

While it's not code related, the Copyright Office's opinion is a good read and I don't see any reason to believe its opinion is different for works of text vs works of physical art: https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...

senaevren · OP · 1mo ago

You are right that no court has yet ruled that a specific set of human contributions to AI-assisted work was sufficient to establish authorship. What exists is the inverse: the Copyright Office has granted partial registrations where human-authored elements were separated from AI-generated elements, as in Zarya of the Dawn, where the human-written text was protected but the Midjourney images were not. The Allen v. Perlmutter case pending in Colorado is the first direct judicial test of whether iterative prompting and editing can constitute authorship. Until that decision, the positive threshold is genuinely unknown. The piece reflects this in the calibration section at the end, though your point is worth adding to the authorship discussion more explicitly.

senaevren · OP · 1mo ago

Fair and correct. Cert denial means the Court declined to hear the case, not that it endorsed the lower court's reasoning or settled the question nationally. The DC Circuit ruling stands and the Copyright Office's position is consistent, but that is stable doctrine rather than Supreme Court-settled law. Updated the piece to reflect this distinction accurately.

sowbug · 1mo ago

Since this is a tech audience... the Supreme Court uses a bounded priority queue. An unbounded queue would risk growing impractically large.

There are some kinds of cases where the Court has "original jurisdiction," meaning they must hear them, but those are very rare.

WillPostForFood · 1mo ago

Furthermore, we shouldn't even be looking to the Supreme Court at all for this. Congress needs to define the laws around AI and copyright. The Supreme Court is likely avoiding cases in the hope that the legislature gets its act together.

semiquaver · 1mo ago

100%. This is the real fix, we have new situations, we need new laws. Unfortunately Congress is currently broken.

jmyeet · 1mo ago

The Supreme Court declining to take up an issue is taking a position.

Now different circuits can take a different view of the same issue. This is a common reason why the Supreme Court will grant cert: to resolve a circuit split. Appeals court judges know this and have at times (allegedly) intentionally split to force an issue to the Supreme Court.

Even without settling the issue appeals courts will look at how other circuits have ruled and be guided by their reasoning, generally. The fact that the Supreme Court declined to grant cert actually carries weight.

semiquaver · 1mo ago

  > The Supreme Court declining to take up an issue is taking a position.

No it is not.

  > “The denial of a writ of certiorari imports no expression of opinion upon the merits of the case, as the bar has been told many times.”

United States v. Carver, 260 U. S. 482, 490 (1923).

Moreover, SCOTUS does not decide issues, they decide cases.

  > “We are acutely aware, however, that we sit to decide concrete cases, and not abstract propositions of law.”

Upjohn Co. v. United States, 449 U. S. 383, 386 (1981).

greensoap · 1mo ago

The real issue is that the Thaler case was a different question: "Can AI be an author?" The lower court said no and SCOTUS left it alone. But the question of "what is enough for the human to be the author" wasn't even part of the case. That is completely unchecked.

senaevren · OP · 1mo ago

Fair point and worth being precise about. Cert denial is not meaningless: it leaves the lower court ruling intact, it signals the Court did not find the issue urgent enough to resolve now, and as you note, other circuits will look at the DC Circuit's reasoning. What it does not do is bind other circuits or establish Supreme Court precedent. The distinction matters here because if a Ninth Circuit case involving AI-generated code reaches a different conclusion, that circuit split would be live law regardless of the Thaler cert denial.

21asdffdsa12 · 1mo ago

Let's hire humans as pAIrrots? They see it, they rearrange it, they rename variables and then they "authored" it. What a job to start in as a junior, but if you understand what's happening, you may augment the AI's code by giving "feedback" with enough time.

streetfighter64 · 1mo ago

Free water but not electricity? I'll just hook up a generator to the shower...

These sorts of simplistic loopholes rarely work. Imagine if you could get copyright for the linux kernel by just rearranging it and renaming a few variables.

consp · 1mo ago

Ah, the infamous "no I wrote it myself" submission in university coursework. Usually gets you a free visit to the guidance counselor and a bonus free mark (on your three-strikes-and-you-are-out plagiarism form).

DrewADesign · 1mo ago

But it means that the appellate decision will retain precedence, no? Wouldn’t losing precedence be the primary legal effect of overturning that decision? All case law that hasn’t touched the Supreme Court could theoretically be challenged, but most of it isn’t, and it’s considered the law until it isn’t anymore, right? How would this be any different?

semiquaver · 1mo ago

The decision is binding only within the jurisdiction of the Court of Appeals for the D.C. Circuit.

So it’s not correct to say “because SCOTUS denied cert, Thaler is now binding national copyright law.”

Practically speaking, it is binding on the US Copyright office (one of the parties in the case) in CADC. And that’s important. But copyright litigation happens all across the country, while this ruling only directly constrains the relatively small number of cases within CADC.

freejazz · 1mo ago

It does settle the law insofar as it maintains the status quo.

matheusmoreira · 1mo ago

> meaningful human authorship

How is this defined? Is my code review "meaningful"? Are my amendments and edits to the generated code "human authorship"?

cooper_ganglia · 1mo ago

From the article:

> Specifying an objective to the model is not enough. Directing how the work is constructed is what counts.

wayeq · 1mo ago

read the article?

jugg1es · 1mo ago · 14 in thread

I want this question to have an interesting answer, but everyone knows that if this question ever goes to the courts, ownership will go to the people in charge with the money. The idea that Anthropic may not own Claude Code just because Claude wrote it is wishful thinking.

senaevren · OP · 1mo ago

The work-for-hire doctrine actually supports your intuition more than the AI authorship question does. The reason Anthropic likely owns Claude Code has little to do with whether Claude wrote it and everything to do with the employment contracts of the engineers who directed it. The DMCA takedown question is genuinely interesting though because DMCA requires the claimant to assert copyright ownership in good faith. If a court later found the codebase was predominantly AI-authored and therefore not copyrightable, the 8,000 takedowns could be challenged as bad faith DMCA claims. That is a different and more tractable legal question than the ownership one.

rasz · 1mo ago

Work-for-hire doctrine doesn't automagically absolve you from IP law. Microsoft and Intel already learned this in the nineties when they paid San Francisco Canyon Company to steal Apple code.

https://en.wikipedia.org/wiki/San_Francisco_Canyon_Company

LLMs are just code stealers, will gladly generate Carmack's inverse for you with original comments.

gpm · 1mo ago

I have trouble believing that the DMCA claims would be found to be in bad faith when they were made at a time when the question of what degree of human input is required to acquire copyright on AI-generated code hasn't been resolved at all.

It doesn't seem like bad faith to think that copyright is stronger than the courts end up thinking, just being mistaken.

CWuestefeld · 1mo ago

I can't see how that can work.

As a developer, the fact that my source code passed through a compiler - an automated tool - doesn't give the author of the compiler any claim on my executable code.

As an artist, the fact that I used, e.g., Rebelle to paint a digital painting, or that I used Lightroom (including generative AI to fill, or other ML/AI tools to de-noise and sharpen my image) in editing a photograph, doesn't give EscapeMotion, Adobe, or Topaz, any claims to my product.

Why, then, would there be any chance that use of a tool like Claude - a tool that's super-advanced to be sure, but at the end of the day operates by way of mathematical algorithms - would confer any claims to Anthropic?

> If a court later found the codebase was predominantly AI-authored and therefore not copyrightable

Is figuring out the appropriate prompts to use in directing Claude qualitatively different than using a (much) higher-level abstraction in coding? That is, there was never any talk as we climbed the abstraction layer from machine code to assembly to Fortran or C to 4GLs to Rust etc., that the assembler/compiler/IDE builder would have any ownership claim on the produced executable. In what sense can Anthropic et al assert that their tool, which just transforms our directives to some lower-level representation, creates ownership of that lower-level representation?

embedding-shape · 1mo ago

Best part is, it's likely to have a different answer in every country, who knows what'll happen, not every country implicitly sides with the ones with the most money.

MarsIronPI1mo ago

Well, eventually it'll probably be added to the Berne Convention agreement or some such.

adrianN1mo ago

Depends on where they pay their taxes generally.

beej711mo ago

I love that genAI art will not be copyrightable and genAI code will be. The power of the Almighty Dollar at work.

conartist61mo ago

It's not wishful thinking, and ownership isn't a foregone conclusion.

Sure, the courts could mint a communist society with a few weird decisions about property rights, but this being the US, do you really suppose that's likely?

There's really no legal question of any kind that models aren't people and therefore cannot own property (and also cannot enter into a legal contract, as would be required to reassign the intellectual property they don't and can't own).

wongarsu1mo ago

The catch-22 is that the fact that models aren't people is only relevant if you treat them similarly to a person, like the US Copyright Office's opinion, which treats the model similarly to a freelancer. If you treat the LLM as a machine similar to a camera, with the author expressing their existing intent through the tools of this machine, ownership is back on the table and more or less how it was before LLMs.

helterskelter1mo ago

I'm not sure Anthropic would appreciate the liability that ownership would imply.

helterskelter1mo ago

Too late to edit, but OpenAI certainly doesn't want ownership or liability, for the CSAM they've produced. They certainly don't want ownership/liability of code which does $ONLYAWFULTHING.

dfxm121mo ago

They won't want to own code that is malicious/illegal/used in crime, although it's really weird to me that no one (in LEO) seems to care that, for example, grok generates CSAM, revenge porn, probably other illegal things, so they'll probably get to have their cake and eat it too.

bombcar1mo ago

Those things have precise legal definitions, and it may not be entirely clear that an LLM can even generate something that meets them - especially in the USA, where the First Amendment covers things that many would think illegal (and are illegal in other countries).

cestith1mo ago· 11 in thread

I find it distasteful and disturbing that copyright infringement by the people training the LLM in violation of a license is considered contamination by the licensed code. It’s not contamination. The code didn’t seep into your codebase. If the LLM was trained in such a way that it reproduces portions of code long enough to be protectable, then the license was violated by humans. The liability for the problem doesn’t lie on the shoulders of the contributors to the originally licensed code. It lies on the people inserting it into your codebase without following the terms of the license.

The article also singles out the GPL repeatedly as a source of contamination. It doesn’t mention source-available proprietary licenses. It doesn’t mention code put online with no clear license, which according to the Berne Convention and the laws in at least the United States is automatically copyright protected with no license for use by others at all. It doesn’t talk about attribution for BSD-style or CC-SA-Attribution licenses. There’s no mention of leaked proprietary code. It just singles out GPL as some sort of unique problem.

This seems quite shoddy and biased for an article by someone who’s writing about the law.

dehrmann1mo ago

> training the LLM in violation of a license

Bartz v. Anthropic found that this is fair use, so the license doesn't play into it.

cestith1mo ago

If the trained LLM spits out large, recognizable portions of licensed code and you use it in your product don’t count on that case to keep you from defending yourself in court. The court found in Bartz v. Anthropic that training was fair use. They also found that pirating content to train against was not fair use, and Anthropic paid $1,500,000,000 in a settlement.

There are licenses on most software source code. If you redistribute works derived from that code, you must abide by those licenses or you are violating the copyright. That’s what’s meant by “piracy” here.

Now if you have an LLM that has trained on code and learned to actually write new software, only small snippets too short to be protected by copyright should be identical between the training material and the output. However, if you’re getting output that is substantial in size and recognizably derivative from the original that’s an issue that hasn’t yet as far as I’m aware been settled in court. One would hope the major player LLMs don’t copy and paste large functional chunks of existing programs.

It would certainly seem to me that the code you sell after using an LLM should meet the same standards for difference in implementation as if it was written by a human. That should apply to both copyright protection and patent protection.

vbarrielle1mo ago

I thought fair use was decided on a case by case basis, and could not be guaranteed? If true, wouldn't that mean that in other cases it could be ruled differently?

panzi1mo ago

Other than putting something into the public domain I don't really know any open source licence that doesn't require at least attribution. One can assume that 99.9% of training data had some sort of license requirements, so just blindly using it is a copyright violation. People just don't seem to care.

vablings1mo ago

It is probably fair to say that a huge share of code that is FOSS is licensed under GPL, much larger than the share of source-available proprietary licensed code.

d0mine1mo ago

Here are GitHub statistics from 2015: https://github.blog/open-source/open-source-license-usage-on...

MIT is used by more projects than GPL.

kube-system1mo ago

There is a lot of code on the internet that isn't accompanied by a FOSS license or any license that permits reuse, or any license at all.

cestith1mo ago

Is GPL a larger share of source out there than BSD, MIT, ISC, CC, BSL, Apache, and source available combined? Enough bigger that it is repeatedly mentioned as a singular issue without so much as the words “or other licenses”?

numbsafari1mo ago

I would have assumed the opposite is true. Do you have any data to back that up?

charonn01mo ago

They probably focused on the GPL because of its viral copyleft features.

cestith1mo ago

Do you suspect that an LLM that would recreate a substantial portion of a licensed work would honor any license? Even a 2-clause BSD one?

bko1mo ago· 11 in thread

This is all well and good as an intellectual exercise, but in real life none of this matters. Almost no one thinks their code is copyrightable or seriously thinks their code is a moat. I've written the same chunks of code for a number of employers, as has every engineer. We've all taken chunks from Stack Overflow and other places without carefully considering attribution.

This comes up in a few places as a kind of vindictive battle. One example is Oracle suing Google for too closely mimicking their API in Android. Here is an example:

    private static void rangeCheck(int arrayLen, int fromIndex, int toIndex) {
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
                                               ") > toIndex(" + toIndex + ")");
        if (fromIndex < 0)
            throw new ArrayIndexOutOfBoundsException(fromIndex);
        if (toIndex > arrayLen)
            throw new ArrayIndexOutOfBoundsException(toIndex);
    }

And it was deemed fair use by the Supreme Court. Other times high frequency hedge funds sued exiting employees, sometimes successfully. In America, anyone can sue you for any reason, so sure, you'll have Ellison take a feud up with Page and Brin all the way up to the Supreme Court.

In 99.9% of instances none of this matters. Sure, there's the technical letter of the law, but in practice, and especially now, none of this matters.

https://www.supremecourt.gov/opinions/20pdf/18-956_d18f.pdf

freedomben1mo ago

> Almost no one thinks their code is copyrightable or seriously thinks their code is a moat.

You'd be surprised! Among non-software management types, they often think of the code as extremely valuable IP and a trade secret. I'm a CTO and I've made comments before to non/less technical peers about how the code (generally speaking) isn't that big of a secret, and I routinely get shocked expressions. In one case the company almost passed on a big contract because it required disclosure of the source code (with an NDA). When I told them that was a silly reason and explained why, they got it, but the old way of thinking still permeates and is a hard habit to break.

Edit: Fixed errant copy pasta error. Glad that wasn't a password :-)

mbesto1mo ago

Totally agreed.

I work in M&A. Nearly every lawyer, accountant, investor, and software business owner thinks their code is solely valuable and a trade secret. I find it hilarious and try to be as diplomatic as possible about why it's not. They also will willingly give their client list to a potential acquirer but get super cagey the moment a third party provider asks for their code to be scanned.

This argument easily gets shut down when I ask why Twitch, a $1B business, didn't crater to their competition when their full codebase was leaked.

bko1mo ago

You're right, I guess maybe I mean in any serious actionable way. Senior, non technical people leave plenty of money on the table by thinking they're protecting something valuable or they have some kind of secret sauce. It's all silly is what I meant to say, and digging into the technicalities of whether your code is truly copyrightable is kind of pointless. It's all vibes.

BobbyTables21mo ago

I’ve worked at too many places where I mused that if someone gave the source code to the competitors, it’d likely drive the competitors out of business as they tried to use it.

Keeping it proprietary probably has the greatest value in preserving the company’s reputation…

hackingonempty1mo ago

Maybe LLM coding agents change the equation by making it much easier to adapt and use foreign and probably incomplete code. Getting you closer to competing with the original authors in a shorter amount of time than generating new code from scratch.

Nursie1mo ago

> Almost no one thinks their code is copyrightable

I think this is an unusual opinion.

Code may not be copyrightable in as small chunks as you put there, but in terms of larger pieces I think companies and individuals very often labour under the belief that code is intellectual property under copyright law.

If code isn't copyrightable, from where comes the GPL?

And why does anyone care if (for instance) some Microsoft code might have accidentally ended up in ReactOS, causing that project to need to go into a locked-down review mode for months or years? For that matter why do employers assert that they own the copyright in contracts?

I think it's the opposite - almost everyone thinks their code is copyrightable, outside of APIs and interop stuff, or things so simple as to be trivial.

croes1mo ago

> Almost no one thinks their code is copyrightable

Then why does reverse engineered code need to be a clean room implementation?

Ask any emulator developer or the developers of ReactOS

https://reactos.org/forum/viewtopic.php?t=21740

conartist61mo ago

Nobody ever talks about convergence.

You, right now, are talking about convergence.

If there is no artwork, there can be no copyright. If every character of the code to write is basically predetermined by the APIs you need to call, there is no artwork and no copyright.

Build a novel new API, and you'll be protected though.

sarchertech1mo ago

> Almost no one thinks their code is copyrightable

Every open source license is built on the premise that code is copyrightable.

adrian_b1mo ago

No.

It is based on the premise that if the proprietary licenses are valid, then also the open source licenses are valid.

So what is held as true is only the implication stated above and not the truth value of the claims that either kind of licenses are valid.

If the proprietary licenses are not valid, then it does not matter that also the open source licenses are not valid.

The open source licenses are intended as defenses against the people who would otherwise attempt to claim ownership of that code and apply a proprietary license to the code, i.e. exactly what now Anthropic and the like have done, together with their corporate customers.

Of course, if it is accepted that the code generated by an AI coding assistant is not copyrightable, then using it would not really be a violation of the original open source licenses. The problem is that even if this principle is the one accepted legally, at least for now, both Anthropic and their corporate customers appear to assume that they own the copyright for this code that should have been either non-copyrightable or governed by the original licenses of the code used for training.

Rietty1mo ago

Why were the HFT firms suing employees?

p0w3n3d1mo ago· 9 in thread

That's quite an impressive approach from the companies' perspective. Let's first use claude code and then we'll think about who the code belongs to.

I think that the gold rush approach happening right now around me (my company EMs forcing me to work with claude as fast as possible) shows real short-sightedness from all the management people.

First - I lose my understanding of the code base by relying too much on claude code.

Second - we drop all the good coding practices (like XP, code review etc.) because claude is reviewing claude's code.

Third - we just take a big smelly dump on the teamwork - it's easier and cheaper to let one developer drive the whole change from backend to frontend, despite there are (or were) two different teams - one for FE, one for BE.

Fourth - code commenting was passé, as the code was supposed to be the documentation itself... Unless... there is a problem with the context (which there is). So when people were writing the code, failing to understand over-engineered code was their own fault. But now we take a step back for our beloved claude because it has a small context... It's unfair treatment.

I could go on and on. And all those cultural changes are because of money. So I dub this "goldrush", open my popcorn and see what happens next.

nicoburns1mo ago

> Third - we just take a big smelly dump on the teamwork - it's easier and cheaper to let one developer drive the whole change from backend to frontend, despite there are (or were) two different teams - one for FE, one for BE.

Agree with your other points, but IMO this one has always been better. You often need to design the backend and frontend to work with each other, and that requires a lot more coordination when it's separate teams.

ryandrake1mo ago

One of the few things I do kind of like about LLM-assisted coding is that it's helping to bring back "lone wolf" programming. We currently default to using massive teams to build massive software because of all the work involved, but teams have a huge communication/documentation cost, and a lot can leak and be lost the more communication has to happen to get things done. Code assistants cut down on the "all the work involved" part, and I think will help to bring one-man shops back into fashion.

yason1mo ago

On the other hand, separating FE and BE between two teams, necessitating proper interfaces, can often be considered a feature.

bearjaws1mo ago

I rarely see #3 yield better solutions, it's usually better to collaborate as a team on requirements and gotchas, but let one person own implementation.

p0w3n3d1mo ago

But both backend and front-end? Does everyone have to be full stack?

senaevrenOP1mo ago

The fourth point about code commenting is the one that connects directly to the ownership question. When developers write comments to explain intent, those comments are evidence of human creative direction. When Claude writes the code and the comments, and the developer merges without adding their own explanation of the architectural decisions, the record of human authorship disappears along with the institutional knowledge. The documentation problem and the copyright problem are the same problem.

sebastianconcpt1mo ago

Also, it's supremely easy to do the wrong abstractions long term and compromise premature internal designs that will start to starve of human mental modeling, hence explaining with accountability how things work and what the plans are when an incident happens. Also, if the wrong generalizations are introduced, coded correctly and reviewed and approved by AIs, then who's even driving really?

eddyfromtheblok1mo ago

people quickly have forgotten: when copilot was announced, there were warnings not to use it for company code because of the license attribution problem. so what's changed? that anthropic is willing to defend and indemnify?

refulgentis1mo ago

I opened my popcorn for the unholy trinity of HN x law x AI, your comment was one of my faves, love the purple prose. :)

_flux1mo ago· 9 in thread

I think it should be pretty clear that if you provided the tool the specification for the code you want, you have already provided creative input.

After all, is this not what happens with compilers as well? LLM agents are just quite advanced compilers that don't require the specification to be as detailed as with traditional compilers.

yodon1mo ago

>it should be pretty clear that if you provided the tool the specification for the code you want, you have already provided creative input.

If you provided a human contractor with the specifications for the code you want, the courts have repeatedly made clear you have not provided the creative input from a copyright perspective, and the contractor needs to explicitly assign those rights to you if you want to own the copyright on the code.

_flux1mo ago

Let's say we didn't have assemblers, but instead we would have three professions:

- Specifiers, who make the specification for the system

- Programmers, who write C code

- Machine encoders, that take that C code and write machine code for a CPU

Would it be that the copyright would then belong to programmers, if no other explicit assignments would be made?

---

Thinking about it, probably yes: copyright of the spec belongs to specifiers, copyright of the C belongs to programmers, and copyright of machine code to machine encoders. Or would it depend on the amount of optimizations the machine encoders would do, i.e. is it creative or not? And then does this relate to the task and copyrightability of C compiler output, where optimizations can sometimes surprise the developer?

anikom151mo ago

LLMs aren’t human.

senaevrenOP1mo ago

The compiler analogy is the right one to reach for and the Copyright Office addressed it directly: the question is not whether you provided input, it is whether the creative expression in the output reflects human authorship. With a traditional compiler, the programmer authors every expression in the source. With an LLM, the programmer authors the intent and the model makes the expressive decisions about structure, naming, pattern, and implementation. Whether that distinction matters legally is what Allen v. Perlmutter is working through right now. The summary judgment briefing completed in early 2026 and it may be the next landmark ruling on exactly this question.

everforward1mo ago

Specifications are not necessarily creative input. Eg if I write a prompt that just says “write a rate limiter in Python”, there’s really no creative input. I didn’t decide on the API, or the algorithm to bucket requests, or where to store counters, or etc. I just gave it statements of fact, which are inherently not creative.
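To make that concrete, here is a minimal sketch of one possible answer to the prompt "write a rate limiter in Python". Everything in it is a hypothetical choice made for illustration and none of it comes from the thread: the class name `TokenBucketLimiter`, the `allow(key)` method, the token-bucket algorithm, and the in-memory dict. The comments mark the decisions the one-line prompt never made.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """One of many possible answers to 'write a rate limiter in Python'."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Choice 1 (algorithm): token bucket rather than a fixed or sliding window.
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        # Choice 2 (storage): an in-process dict rather than Redis or a database.
        # Maps key -> (tokens remaining, time of last update).
        self._buckets: Dict[str, Tuple[float, float]] = {}

    # Choice 3 (API): a single allow(key) -> bool method, keyed per caller.
    def allow(self, key: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_second)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return True
        self._buckets[key] = (tokens, now)
        return False


# Usage with a fake clock so the behaviour is repeatable.
fake_now = [0.0]
limiter = TokenBucketLimiter(capacity=2, refill_per_second=1.0,
                             clock=lambda: fake_now[0])
burst = [limiter.allow("alice") for _ in range(3)]  # three calls at t=0
fake_now[0] = 1.0                                   # one second later
after_refill = limiter.allow("alice")
```

A different run of the same prompt could just as plausibly return a sliding-window counter backed by Redis with a decorator API, which is the sense in which the prompt alone settles none of these choices.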

Compilers are different in that the resulting binaries are not separately copyrighted. They are the same object to the Copyright Office because one produces the other, in the same way that converting an image to a PDF is still the same copyright.

LLMs don’t do that. The stuff coming in may not be copyrighted, and may not be copyrightable. The stuff that comes out is not a rote series of transformations, there are decisions being made. In common use, running a prompt 10 times might yield 10 meaningfully different results.

I’m dubious the outcome will be “any level of prompting is enough creativity”.

d01001mo ago

The trick is to constrain the LLM to program in a very defined coding style

If I make the LLM generate code that follows my own code architecture and style, that should be enough creative input

pocksuppet1mo ago

Fine then that's not copyrightable at all. Just like hello world isn't copyrightable, whether in source form or compiled form.

xmcp1231mo ago

This is actually the opposite of what the copyright office has said. Directly addressing AI generated code/prompts, they compared it to someone who is commissioning art, describing to the artist what they want.

The copyright falls to the artist, not the person commissioning it.

Complicated in this case, because there is no artist.

hypercube331mo ago

To me this is like asking who owns the binary files a compiler generates.

jhbadger1mo ago· 9 in thread

This is of course assuming you take AI-generated code unchanged. But you don't, in my experience. And that generates a new work fully copyrightable even if the original wasn't. Just like how the fad a decade or so ago of taking Tolstoy and Jane Austen works and adding new elements -- "Android Karenina" and "Sense and Sensibility and Sea Monsters" are copyrighted works even if the majority of the text in them was from public domain sources.

FartyMcFarter1mo ago

The article addresses this explicitly:

> Works predominantly generated by AI without meaningful human authorship are not eligible for copyright protection

Note the word "predominantly", and the discussion that follows in the article about what the courts and the copyright office said.

Luker881mo ago

No such assumption is made in the article.

Nor does it give a single answer.

Mere prompting is still not enough for copyright, and the problem is unsolved on how much contribution a human needs to make to the generated code.

In the case for generated images copyright has been assigned only to the human-modified parts.

Even worse, it will be slightly different in other nations.

The only one that accepts copyright for the unchanged output of a prompt is China.

conartist61mo ago

I'm sure it's not quite that simple. Only the parts of those knock-off works that aren't public domain could be copyrightable. If you only own the copyright to ten lines in a 10k line codebase, then it's probably fair use for someone else to just take the whole thing.

Plus what if Anna Karenina was GPL?

brianwawok1mo ago

You use humans to edit AI code? When you level up you are just using AI to write, AI to review, AI to edit, AI to test. Not a lot of steps left for meat bags.

exe341mo ago

> This is of course assuming you take AI-generated code unchanged. But you don't, in my experience. And that generates a new work fully copyrightable even if the original wasn't.

That's not how copyright works. The modified version is derivative. You can't just take the Linux kernel, make some changes, and slap a new license on it.

throwatdem123111mo ago

Ok, what about all the Anthropic engineers who say they don’t write code at all and it’s 100% AI-generated?

gchamonlive1mo ago

> This is of course assuming you take AI-generated code unchanged.

How much code do you need to change in order for it to be original? One line? 10%? More than 50%?

That's an arbitrary and quite unproductive convo to be honest.

6stringmerc1mo ago

Wrong. This territory was heavily covered in music before this code concept - it has to be “transformative” in the eyes of the law. Even going in and cleaning up code or adding 10-25% new code won’t pass this threshold. Don't bother arguing with me on this, just accept reality and deal with it.

mzl1mo ago

If you modify the work, that creates a derived work from whatever copyright the original work has, not a new work that is fully copyrightable.

As the article says in the TL;DR at the top, the code may be contaminated by open source licenses:

> Agentic coding tools like Claude Code, Cursor, and Codex generate code that may be uncopyrightable, owned by your employer, or contaminated by open source licenses you cannot see

alienll1mo ago· 7 in thread

This is the same shape as the image cases.

Zarya of the Dawn already settled it for Midjourney output: human-written elements were protected, AI-generated images were not. The character design didn't get copyright even though the human picked, prompted, and curated. Code isn't different. Prompting Claude to produce a function is closer to prompting Midjourney to produce a frame than to writing the function yourself.

The reason it feels different to engineers is that we're used to thinking of the compiler as the analogy. But a compiler is deterministic — same input, same output. An LLM isn't. That's the line the Copyright Office is drawing, and image cases got there first.

protocolture1mo ago

Depends on the scale of LLM involvement. The Copyright Office left a pretty big carve-out for things that are human sourced and then modified by LLM, or the reverse, LLM output that's modified by human intention. (They had to do this because there are already pseudo-random elements to digital artwork, like, say, render clouds and render noise, that might otherwise poison an artwork.) In fact I don't think this has been tested with "highlight area > prompt a change to this area of the image" workflows.

They also mention in the same document that were LLMs to more closely approximate deterministic tools, they would be open to reevaluating. That is, requesting X gets X without substantial wiggle room.

I don't think that last part has been tested with an extremely large set of prompts and human-generated input to create a more deterministic output. Even outside of code, where you see large prompts, creative writing LLM tools (NovelAI or Sudowrite, for instance) can have pages and pages of spec for the LLM, sometimes close to 50% of the size of the final output.

Then there's testing, review etc, human processes confirming that the output meets spec, updating it where needed intelligently.

There are also foreign courts, with similar rules about human intention, that have found in favor of prompts only, where it could be demonstrated that multiple rounds of prompts were used to refine the image.

I wouldn't call this settled at all tbh. And to be honest, a lot of this doesn't require exposure. You don't need to own up to LLM use in a lot of settings; proving LLM use is so difficult it's easy to jump up the ladder from LLM (100%) to LLM (50%) and ultimately claim ownership.

The people who will get busted for this are basically just super lazy leaving ChatGPT responses in, failing to pay an editor, failing to modify images for anything more than layouts.

FrostKiwi1mo ago

> But a compiler is deterministic — same input, same output. An LLM isn't.

Temperature 0 determinism is subject to active research. NVIDIA tried but failed so far, DeepSeek V4 seems to have done it. I hope judges won't be swayed by this and AI-generated code will be classified as uncopyrightable, just like images are.
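A toy sketch of the distinction being discussed, assuming a made-up three-token logit table (`toy_logits`) and a hypothetical helper `sample_next_token`; no real model or vendor API is involved. At temperature 0 the choice collapses to an argmax, while any positive temperature samples from a softmax. The sketch deliberately ignores the floating-point and batching effects that make temperature-0 determinism hard in real serving systems.

```python
import math
import random
from typing import Dict


def sample_next_token(logits: Dict[str, float], temperature: float,
                      rng: random.Random) -> str:
    """Pick one token from a toy logit table (illustrative only)."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token.
        return max(logits, key=lambda tok: logits[tok])
    # Softmax with temperature: a higher temperature flattens the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# Hypothetical scores for naming a loop variable.
toy_logits = {"i": 2.0, "idx": 1.8, "index": 1.5}

# Twenty differently seeded runs at each temperature.
greedy = {sample_next_token(toy_logits, 0, random.Random(seed)) for seed in range(20)}
sampled = {sample_next_token(toy_logits, 1.0, random.Random(seed)) for seed in range(20)}
```

Here `greedy` contains a single name regardless of seed, while `sampled` contains several. Either way the name was chosen by the scoring table rather than by the person prompting, which is the point alienll's reply below turns on.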

alienll1mo ago

Fair point on temp-0. But I don't think determinism is what the courts will hang it on. A deterministic LLM still makes the expressive choices — naming, structure, control flow — that the human didn't make. The image cases didn't turn on whether you could re-roll the same Midjourney frame. They turned on who made the creative decisions. Same logic should hold for code.

Onavo1mo ago

But is there anything stopping a human from applying for copyright in their own name? Does the fact that somebody can recreate the prompt invalidate their claim?

SlinkyOnStairs1mo ago

What you're asking is, "could someone do fraud" and "would being found out invalidate their copyright". To both of which the answer is generally, yes.

It'd be a form of plagiarism, just with different consequences to the most common form.

alienll1mo ago

Filing isn't the gate, registration is.

Copyright Office requires you to disclose AI involvement and disclaim the AI-generated parts. Zarya of the Dawn is the example — applicant filed for the whole graphic novel, got partial registration on the human-written text, refused on the Midjourney images. The reproducibility of the prompt isn't really the test. The test is whether a human made the expressive choices.

JAlexoid1mo ago

AFAIK: Even the slightest modification of the work is transformative and will produce copyrighted material.

It does not have to be substantial transformation.

reorder96951mo ago· 5 in thread

The whole thing with GPL code seems like a mess and surely couldn't be set as actual precedent, right? It is totally infeasible for me to check every single GPL project on every code hosting platform to see if the code Claude etc produced is too similar. If a set of training data used for the model was released to check against that would be one thing, but you can't honestly expect someone to check every repo available from all time to see if a model (that you are not informed of what it was trained on and therefore could reproduce) might've reproduced code from it.

That's not at all like checking the dependency chain of a dependency or anything as you can just read the licence of anything you're choosing to use. Surely the precedent would have to be that a model trained on GPL code has itself been infected by GPL, and therefore must have all source/weights released too if the assumption here is that it can have embedded the code well enough to be able to reproduce it?
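For a sense of what a "too similar" check could look like against one known file, here is a rough sketch using token shingles. The function names, the shingle size of 5, and the sample strings are all assumptions for illustration. The ratio is not a legal test for infringement, and it does nothing about the harder problem raised above of not knowing which repositories to compare against.

```python
import re
from typing import List, Set, Tuple


def token_shingles(source: str, size: int = 5) -> Set[Tuple[str, ...]]:
    """Break source into overlapping runs of `size` tokens, ignoring whitespace."""
    tokens: List[str] = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def overlap_ratio(generated: str, known: str, size: int = 5) -> float:
    """Fraction of the generated code's shingles that also appear in the known file."""
    gen = token_shingles(generated, size)
    if not gen:
        return 0.0
    return len(gen & token_shingles(known, size)) / len(gen)


# Hypothetical inputs: a known snippet, a reformatted copy, and unrelated code.
known = "if (fromIndex > toIndex) throw new IllegalArgumentException();"
reformatted_copy = "if (fromIndex   >   toIndex)\n    throw new IllegalArgumentException();"
unrelated = "return a + b * c - d / e;"

copy_score = overlap_ratio(reformatted_copy, known)
unrelated_score = overlap_ratio(unrelated, known)
```

Whitespace changes do not lower `copy_score`, but renaming every identifier would, so a real check would need something sturdier, and it would still only cover the files you thought to compare.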

akersten1mo ago

> Surely the precedent would have to be that a model trained on GPL code has itself been infected by GPL, and therefore must have all source/weights released

I don't see how this follows, unless we also agree that humans who have ever read any GPL code are themselves permanently tainted and therefore cannot produce anything that isn't influenced even slightly by said code.

Is it just because we think the robot does a better job at learning than we do? It's an impossible line to draw, I agree, but I don't agree that the answer is "well then everything must be considered tainted," I say the answer is "ignore a vestigial concern of a bygone era."

tremon1mo ago

Duplicating BSD-licensed code without copyright attribution and mention of the original license is just as much a violation of the original copyright -- that applies regardless of additional copyleft requirements imposed by the GPL. A different but no less serious restriction applies to all the code examples on MSDN: the license disallows using the samples in production code.

LLMs are effectively copyright laundering machines, and barring any indemnification clauses in the ToS (of course there are none), full liability lies with the user.

cozzyd1mo ago

There's an easy solution... release your code as GPL :)

(but that doesn't protect you against GPL-incompatible copyleft licenses, I guess)

charonn01mo ago

> It is totally infeasible for me to check every single GPL project on every code hosting platform to see if the code Claude etc produced is too similar.

I would say that choosing a tool that makes it infeasible doesn't actually excuse you from doing it.

perlgeek1mo ago

> but you can't honestly expect someone to check every repo available from all time to see if a model [...] might've reproduced code from it.

Well, if you care about not violating any licenses, you could buy services from an LLM provider that was only trained on code in the Public Domain (or code that the LLM provider licensed for that purpose), and/or buy some kind of legal guarantee from the LLM provider that the code produced is "clean".

Of course, that'd be much more expensive than current offerings, but it would reflect the real cost of software development, not just YOLOing it, from a legal perspective.

When I wrote a book, part of the contract with my publisher was that I had to attest that I actually wrote the book myself, that quotes were properly attributed etc. If you buy code-writing services, why shouldn't it contain similar clauses?

fsckboy1mo ago· 4 in thread

it's well known that recipes cannot be copyrighted. But recipes still are protected intellectual property by trade secret law if they are treated as a secret by the holder of the recipe.

Claude code itself is a trade secret, and it is not open source, so its own copyrightability is moot till you get your hands on a copy of it with clean hands.

Recipes cannot be copyrighted because they are not expressions of human creativity. Software written by AIs are also not expressions of human creativity, so the balance is tilted in favor of AI generated copy not being copyrightable.

The Supreme Court or legislation could change this, and I'd guess there will be a movement to go in that direction, but till something like that succeeded it's not so.

lelanthran1mo ago

> But recipes still are protected intellectual property by trade secret law if they are treated as a secret by the holder of the recipe.

Trade secrets aren't very well protected, though.

You can sue the person who leaked/stole your secret, but if others keep sharing it once it is leaked you can do nothing to them.

fsckboy1mo ago

i wasn't advocating for trade secrets as "equal" or "the way to go", i was trying to explain in simple terms how to think about copyright issues in concordance with the existing legal structures

people here who have not much experience were intellectually trying to reinvent wheels and I wanted to save them time in structuring their arguments. I have been exposed to various tips of the legal iceberg and was thrilled to learn what I learned and trying to pass it on.

ndiddy1mo ago

> Claude code itself is a trade secret, and it is not open source, so its own copyrightability is moot till you get your hands on a copy of it with clean hands.

In this case Anthropic published the Claude Code source map file on npm themselves. https://venturebeat.com/technology/claude-codes-source-code-...

Culonavirus1mo ago

> Software written by AIs are also not expressions of human creativity

I mean I'm not the biggest fan of AI on the planet by any means (which I think my post history would prove, lol), but isn't prompt design and steering the AI "human creativity"? In one of my AI-assisted projects I spent like a week in unending threads of posts trying to make the AI do stuff the way I wanted, testing the output, finding a bazillion of bugs and "basic bitch" solutions, asking for more robust this and edge case that. It felt like I wrote a novel. How is that not creativity (Crayon-eater or Picasso, creativity is creativity)?

palata1mo ago· 4 in thread

One question I have is this: if an employee produces code predominantly generated by AI, it means that it is not copyrightable. Does that mean that the employee can take that code and publish it on the Internet?

Or is it still IP even if it is not copyrightable? That would feel weird: if it's in the public domain, then it's not IP, is it?

senaevrenOP1mo ago

That is exactly the right question and the answer is genuinely strange. Uncopyrightable work falls into the public domain, which means anyone can use it, copy it, or build on it freely. The employer can still call it a trade secret and protect it through confidentiality obligations in employment contracts, but that protection is contractual rather than property-based. A trade secret loses protection the moment it is disclosed. So the employer's claim over purely AI-generated code is essentially: "you cannot share this" rather than "we own this." Those are meaningfully different legal positions, and most companies have not thought through which one they actually have.

BlackFly1mo ago

A recipe isn't copyrightable but is still protected under trade secret law. I imagine that the same would apply. I think the major difference with software copyright is that I can just decompile your binary or copy a binary and give it to other people. For SAAS companies that don't distribute binaries, I imagine they basically have the same protections against rogue employees.

ModernMech1mo ago

Presumably company policy would be implicated here, not copyright law. Whether or not it's copyrightable, what you create using AI is work product.

cillian641mo ago

To look at it another way, just because some code I work on at my job is derived from open source MIT-licensed code doesn't mean I personally have the right to distribute it if my company doesn't want me to. I'd guess this comes under some generic "confidential information" clause in the employment contract.

tiku1mo ago· 4 in thread

So by this logic my auto complete function before AI also wrote 50% of my code, and that code is not made by me because I didn't type it.

What should matter is intent, the human that gives the orders.

gspr1mo ago

> What should matter is intent, the human that gives the orders.

I'd like to hear more nuance with regards to this line of reasoning. Can you conceive of a model that contains highly non-trivial representations of IP owned by others than yourself? Can you conceive that you might "order" the model to "produce" that IP? What happens then?

Try this both for "open source code" as the IP, and "the novel I wrote", and "latest Hollywood movie". The model does not have to be a real model currently available. It's just a thought experiment.

Try also to elaborate on the sliding scale between "an AI model" and "a compression system".

perlgeek1mo ago

The auto complete function doesn't really change what you write, so it doesn't remove the human creativity.

panny1mo ago

>What should matter is intent, the human that gives the orders.

If you are instructed by your professor to write an application, do you own the copyright or the professor?

Suddenly, you think you own the copyright again. In fact, in every case, you think you own the copyright. Because of your feelings. That's a common opinion here on HN too. You don't have this opinion by any logical stance. Nor by any legal doctrine.

The fact is: Copyright law applies to human authors. AI is not a human.

https://www.congress.gov/crs-product/LSB10922

pjmlp1mo ago

Well, have you actually read the license for the auto complete function?

Example,

https://marketplace.visualstudio.com/items/VisualStudioExptT...

metalcrow1mo ago· 4 in thread

"if Claude was trained on the LGPL-licensed codebase and its output reflects patterns learned from that code, can the output be treated as license-free? The emerging legal consensus is probably not, and assuming it can creates significant liability for anyone shipping that code commercially."

Is there any citation for this "legal consensus"? I was not aware there were any evidence-backed stances on this topic as of yet.

onlyrealcuzzo1mo ago

This sounds like a problem that's pretty easy to get around.

CC does not need LGPL code. There's more than enough BSD and Apache code to go around.

And they can generate synthetic data that is better than LGPL for their training.

It's also a problem that does not seem feasible to meaningfully enforce.

It's easy to generate CC code and lie and say you didn't. It would be hard to prove that you did, especially if you took any precautions to make it even slightly difficult to show that you did.

NoMoreNicksLeft1mo ago

With sufficient obfuscation (which models seem to provide intrinsically), how would anyone know to sue? On top of that, only the most major sorts of litigation have the legal force to pierce even the flimsiest of obfuscation... this is likely all moot.

If some GPL-licensed group were to sue some commercial software project that they do not have the source code for, what would even give it away? But say they throw $1 million at a lawyer who can at least get it to the discovery phase somehow, and the source code is provided. It looks to be shit, but maybe an expert witness would come along and say "that looks inspired by the open source project". Where does it go from there? The model is a black box, but maybe you've got a superhero lawyer who manages to rope in Anthropic or OpenAI, and you can see how it produced the code given those prompts. What now? Are there any expert witnesses who both could say and would say that it was "bulk copying-pasting code"? And if it were, what jury is going to go for that theory of the crime? Copying-and-pasting, but the code doesn't match, except in short little strings that any code might match. This isn't a slam dunk, and it's not going to proceed very far unless it's another Google-vs-Oracle shitfest.

senaevrenOP1mo ago

The chardet dispute is the closest thing to an active test case on this specific question, and you are right that it has not resolved into settled law. "Emerging legal consensus" was imprecise. The more accurate framing is: the legal community's working assumption, based on how copyright doctrine treats derivative works, is that training-data provenance travels with the output. That assumption has not been tested definitively in court yet.

senaevrenOP1mo ago

thanks for this; it's definitely a fair point. I updated the piece to reflect this

qsera1mo ago· 3 in thread

The more interesting question is "Who wants to own it"...

The answer is probably "Nobody"!

nine_k1mo ago

Depending on the scale. If you ask Claude to one-shot an app from a nebulous description, you get a prototype which you would understandably be loath to own the code of. If you plan carefully and limit the scope, you get code that you understand, can approve of, and are okay owning further down the line.

jumploops1mo ago

At what point is liability the only "job" left for humans?

onlyrealcuzzo1mo ago

Presumably, every company that has non-LGPL CC code in production wants to own it...

gorgoiler1mo ago· 3 in thread

Three things matter when it comes to eating my breakfast sandwich:

1/ Was the pork in my sausage reared on a farm that meets agricultural standards?

2/ Was the food handled safely by the kitchen that cooked my food?

3/ Does the owner of the diner pay kitchen wages in accordance with labor law?

By contrast, I have no idea what went into the models I use, what system prompts have prejudiced it, and whose IP has been exploited in pursuit of my answer.

That’s being charitable, really. In practice the open secret of the AI industry is that the vast majority of training data, for want of a better word even if it is likely to be the most precise description, is stolen data.

amelius1mo ago

Probably, yes, but the burden of proof is with us not them.

I'm already glad some companies have the guts to open their models because proving it for open models is probably a lot easier than for a model behind a service.

devsda1mo ago

The media industry loves to quote ridiculous numbers on lost revenue due to piracy etc. Maybe some rough ballpark numbers will get them to do something about this theft.

Can someone put a rough estimate on the potential revenue loss (direct and incidental) from AI training, with an industry-wise breakdown?

ap991mo ago

What's an example of data that might have been stolen?

dang1mo ago· 3 in thread

Could you please stop posting generated comments to HN? It's not allowed here, and it looks like you've done it over 30 times already.

(Of course, there's no way to be certain of this, but it's what our software thinks, and the overall pattern is pretty convincing.)

See https://news.ycombinator.com/newsguidelines.html#generated and https://news.ycombinator.com/item?id=47340079

woolion1mo ago

On that matter, wouldn't an AI flag for submissions help hn? I wouldn't flag a submission for LLM style as it is too harsh, but I don't want to read them -- if only because I don't like LLM prose.

There are so many submissions where most of the discussion is about whether the content has any human effort behind it, or whether the LLM just played a purely assistive role like translating. It's really devaluing hn, IMO. Not sure how much an AI flag would help, or introduce new issues, given how difficult the problem is, though.

senaevrenOP1mo ago

You are definitely right to flag it, and I apologize for that. I used an AI assistant for the replies, and I will make sure not to use one going forward.

simonebrunozzi1mo ago

Curious: how do you exactly detect an AI-generated comment?

ottah1mo ago· 2 in thread

In my opinion, copyright has mattered very little in the corporate world. Copyright is effectively meaningless with SaaS, and compiled software run on your machine is protected more by technical controls and EULAs. A world where copyright didn't exist for software would look nearly the same for the commercial world. Trade secrets, NDAs, and employment contracts bind workers more than copyright. The only place where the question of copyright has real-world impact is open source, and even then only for more restrictive licenses such as the GPL.

thyrsus1mo ago

What is being licensed by the End User License Agreement (EULA) is the copyright on the code and its artefacts (executable bytes, etc.) - you can't have an EULA without having the copyright to license.

pocksuppet1mo ago

Plus companies just violate GPL everywhere billions of times with impunity (see: every phone ever) and nothing happens to them.

daishi551mo ago· 2 in thread

I’m no lawyer but I feel that Meta, my employer, wouldn’t be letting us go hog-wild with Claude Code if they weren’t completely confident that they fully owned the outputs, whether we change it or not.

senaevrenOP1mo ago

Meta's confidence almost certainly rests on the employment contracts and IP assignment clauses, not on a legal theory that AI output is inherently copyrightable. The enterprise agreement with Anthropic assigns outputs to the licensee. The employment contract assigns work product to Meta. Those two documents together give Meta a defensible ownership position regardless of the authorship question. The interesting gap is for developers using personal accounts or consumer plans on side projects, where neither of those documents exists.

sarchertech1mo ago

There’s so much FOMO right now around AI that no one is thinking clearly. I wouldn’t be so confident in your company.

heysoup1mo ago· 2 in thread

Claude don't write code? The LLM writes code. Claude loops the LLM into writing consistent code. Humans loop Claude into consistently looping the LLM.

Who owns the code? Who owns a potato? If the code is the produce of the LLM and that costs tokens, the owner of the code is the one who paid for the tokens. Money, time or attention: whoever pays for the tokens owns the code.

PokemonNoGo1mo ago

>Who owns a potato?

I don't get what this analogy is trying to tell me but I know nothing about potato law. Is this about the Belgian potato surplus?

cousinbryce1mo ago

I pay to listen to songs. Do I own them?

heikkilevanto1mo ago· 2 in thread

Ownership is one question. IMO, a more interesting question is who is responsible when the code does some real-life damage.

mock-possum1mo ago

Why should it be any different than it ever was? If a release manager checked it but didn’t catch the vulnerability, they have some culpability. If the developer shipped the code without checking it, they have some culpability too. Ultimately, if they both work under an organization that they report to, they’re responsible to that organization, which is, in turn, accountable to its customers (and investors perhaps.)

LLMs really change nothing about this.

ACCount371mo ago

No one. The usual.

skadge1mo ago· 2 in thread

This seems to be grounded in US law. Does anyone know if the same rules would apply in, e.g., EU law?

nairboon1mo ago

Copyright law kind of transcends national borders through certain international treaties like the Berne Convention. Which is why the US copyright holders could enforce their "wouldn't steal a car" threats in Europe.

zvr1mo ago

Most of this is based on the copyright legal framework, which is surprisingly homogeneous around the world. The discussions about ownership of AI-generated material are exactly the same in the EU.

smashed1mo ago· 2 in thread

The "if you generated the code at work using company tools, it's owned by your employer" affirmation in the article makes no sense to me?

If computer generated code is not copyrightable, ownership cannot be reassigned either.

croes1mo ago

How is it for human developers now if the company tool is a cloud tool and not running on company servers?

conartist61mo ago

It is copyrightable. A *human* can copyright code they wrote.

DeathArrow1mo ago· 2 in thread

I have a wood cutting machine and some wood. Who owns the timber?

bell-cot1mo ago

Sadly, IP "ownership" and copyright law are vastly more complex than ownership of physical stuff.

Or were you planning to reproduce the (say) Ford Motor Company's trademarked symbol in wood? If so, you're right back in the stinkin' swamp.

croes1mo ago

What is the wood in your example?

This is like a machine you ask for timber and you get timber but you didn’t need to provide any wood

e12e1mo ago· 1 in thread

Seems to gloss over other kinds of contamination, beyond GPL code. Code from pirated text books, the problem with the entire language model being trained on copyrighted data, and the possibility of the training data containing various copyrighted code.

embedding-shape1mo ago

> Code from pirated text books

Anthropic "solved" this by intermingling the texts extracted from pirated books (illegal) with texts extracted from the physical books they bought and destroyed (legal), so no one can clearly say if the copyrighted material it spits out came from a legal source or not. Everyone rejoiced.

hackingonempty1mo ago· 1 in thread

Nobody disputes that I own the copyright in a sound recording I made just by pushing the red button on my recorder. So it is a mystery to me that copyright to any sort of human conditioned machine generation is in dispute.

senaevrenOP1mo ago

The sound recording analogy breaks down at the point where the recorder makes no creative decisions. Pressing record captures what is already there. Prompting Claude generates something that did not exist, through decisions the model makes about structure, naming, pattern, and implementation. The closer analogy is hiring a session musician and telling them the key and tempo. You own the recording under work-for-hire if they signed the right contract, but the creative expression in the performance is theirs unless explicitly assigned. The button you push to start the model is not the same button as the one on the recorder.

TheFirstNubian1mo ago· 1 in thread

The elephant in the room, of course, is what constitutes “meaningful human authorship.” However, I cannot shake off the feeling that all user interactions with these AI models are being logged. Perhaps this may turn out to be the bigger concern in a potential legal battle than code authorship.

senaevrenOP1mo ago

The meaningful human authorship question is the elephant, agreed, and the regulators have deliberately refused to quantify it for exactly the reason you describe: any bright-line number becomes a target to game rather than a standard to meet.

The logging point is sharper than it might appear. In a copyright dispute over AI-assisted code, interaction logs could cut both ways. A plaintiff trying to establish human authorship would want the logs to show substantial architectural redirection, multiple rejections of Claude output, and documented reasoning for structural decisions. A defendant challenging that authorship claim would subpoena the same logs to show verbatim acceptance of output without modification.

The practical implication here, I guess, is that developers who want to preserve a copyright claim over AI-assisted code should treat their prompt history as a legal document from the start. It seems that all over the world the logs are the evidence. Whether they help or hurt depends entirely on what they show.

tommy29tmar1mo ago· 1 in thread

Maybe the useful test is not “who wrote this line?” but “can you show how it went from requirement/prompt/context to diff to human review/tests?” If you can’t, ownership is only one issue. You also can’t tell what was accepted as engineering work versus just copied output.

senaevrenOP1mo ago

This is actually closer to how the Copyright Office thinks about it than the article makes clear. The registration guidance that emerged from the Thaler proceedings specifically asks applicants to describe the human creative contributions and how the AI was used. A documented workflow showing requirement, architectural decision, rejection of AI output, human restructuring, and review creates a paper trail that maps directly onto what the Office looks for. The "can you show how it got here" test you are describing is the practical version of the legal standard.

reliablereason1mo ago· 1 in thread

This is like asking:

"Who owns the text microsoft word helped you write?"

Claude code is a software tool not a legal entity.

freejazz1mo ago

Not if claude does the writing. MS doesn't write things for you, and if it did, you would not be entitled to a copyright in whatever it wrote for you.

padmabushan1mo ago· 1 in thread

First answer who owns the model built with public data

senaevrenOP1mo ago

The model ownership question and the output ownership question run on separate legal tracks and the piece focuses on the second deliberately. On the first: the model weights are owned by Anthropic under work-for-hire from their engineers regardless of what the training data contained. Training data copyright infringement is a separate tort claim against Anthropic, not a basis for anyone else to claim ownership of the model. The Bartz settlement resolved the pirated books claim without disturbing Anthropic's ownership of the weights. Owning the training data does not give you ownership of the model trained on it, any more than owning the paint gives you ownership of the painting.

kouru2251mo ago· 1 in thread

IMO this is the greatest argument against AI as technofascism. The general public seems to believe that AI will usher in technofascism by claiming corporate ownership of AI output: the independent entrepreneur will be unable to compete against the corporations' compute, every piece of data about you will be stolen and monetized by AI, and you will own nothing.

But AI might in fact do the exact opposite and reverse the privatization trend that the West has been going through for the last 400 years. All of our copyright laws rely on the idea that there is a human consciousness behind the copyright. The more AI has input, the less we can claim ownership. If AI returns everything to the commons, then it results in a much more egalitarian world.

Hilariously, many people, especially artists, see the return of the commons as an assault against them. They’re so captured by copyright that they assume any infringement on their copyright is inherently fascist. It’s ridiculous. Copyright is a corporation’s number 1 weapon when it comes to creating a moat and keeping the masses out.

The original intent of copyright, in fact, was an incentive to return an idea to the commons. Experts used to hide their discoveries in order to keep them for themselves. Copyright provided an opportunity to release this knowledge and still profit. There were even several cases where it was established that those who claimed copyright could retain copyright even if the idea had been previously discovered. This created a huge incentive: release the knowledge or risk having your process copyrighted by the opposition. But that system worked because copyright could only exist for so long (14 years, doubled if they filed again.)

Now copyright is a lifelong sentence at almost 100 years. The entire purpose of it has been undermined. Corporations own all your childhood and by the time you can profit off of it, it’s outdated.

A world where the mainstream is primarily a commons seems to me like an egalitarian world. I’d like to live in that world.

senaevrenOP1mo ago

The original bargain you describe, limited term in exchange for public disclosure, is exactly what makes the current situation strange. If AI-generated output falls into the public domain immediately, that is actually closer to the original intent of copyright than 95-year terms. The legal question is whether that outcome happens by design or by accident, and what it means for the people building products on top of AI-generated codebases right now.

Isamu1mo ago· 1 in thread

Copyright has a lot to do with what we as a society want to protect and encourage. We want to protect an author that put the hours into creating a book, as opposed to the person creating a copy of that work. The person copying can claim they put in work too but the claim is not strong enough to override our preference to protect original authors.

Part of the problem with generated works is that it is lower effort like the person copying something. It’s not an activity that demands special protection like original authorship. I believe this is a large part of the reasoning.

torben-friis1mo ago

AI is a monster to our current copyright system - monster in the philosophical sense, that is, an example that destroys the concept.

First, its creation is (claimed to be) extremely useful for society, but in order to be created it requires ignoring copyright for pretty much everything ever written. Something we kinda shrugged under the table.

Then, it introduces an extreme jump down in creation effort - so if the focus is protection of effortful creation, nothing with AI use qualifies. But of course, you'd want society to benefit from effortlessness in general, spending more effort than needed in a task is the opposite of efficiency.

gspr1mo ago

I'm still flabbergasted that people – and big, visible companies with big targets on their backs – choose to keep on using the output of LLMs without having an answer to these questions.

And I'm worried that once that has been sufficiently normalized, laws and interpretations of them will adapt to whatever best suits those users. Which will mean copyrightwashing of FOSS. My only hope then is that surely if free software can be copyright-washed by the big guys, then so can the little guy copyright-wash the big guys' blockbuster movies or whatever, which might lead to some sort of reckoning.

jerleth1mo ago

> What to preserve: Commit messages that describe what you changed and why, not just what the AI generated. “Restructured Claude’s module architecture, rejected initial state management approach, rewrote error handling from scratch” is evidence. “Add rate limiting module” is not.

> The second commit message versus the first is the difference between a defensible authorship claim and a clean “Claude wrote this” record.

That makes no sense to me, as the commit message is probably LLM generated as well. (and even easier to generate as it doesn't have to compile or pass automated tests).

bearjaws1mo ago

Article is incredibly fear mongering.

Twice in my career the owners of a company have wanted to sue competitors for stealing their "product" after poaching our staff.

Each time, the lawyers came in and basically told us that suing them for copyright is suicide, will inevitably be nearly impossible to prove, and money would be better spent in many other areas.

In fact, we ended up suing them (and they settled) for stealing our copyrighted clinical content, which they copied so blatantly they left our own typos and customer support phone number in it.

Go ahead, try to sue over your copyrighted code, 10 years and 100M later you will end up like Google v Oracle. What if the code is even 5% different? What about elements dictated by external constraints: hardware, industry standards, common programming practices? These aren't copyrightable.

Then you have merger doctrine, how many ways can we really represent the same basic functions?

Same goes with the copyleft argument, "code resembling copyleft" is incredibly vague, it would need to be verbatim the code, not resembling. Then you have the history of copyleft, there have been many abuses of copyleft and only ~10 notable lawsuits. Now because AI wrote it (which makes it _even harder_ to enforce), we will see a sudden outburst of copyleft cases? I doubt it.

Ultimately anyone can sue you for any reason, nothing is stopping anyone right now from suing you claiming AI stole their copyleft code.

joshka1mo ago

If you want to go much deeper, https://www.copyright.gov/ai/ is particularly good at least on the side of comprehensiveness.

zuzululu1mo ago

I think it's pretty clear cut, whoever is paying for your agentic coding tool subscription is part of the litmus test.

I use my own computer, I pay for my own subscription and I build my open source projects then the code belongs to me.

If I use my company's computer, they pay for my subscription and we work on the company's projects then the code belongs to the company.

In any step of the way if some copy-left or any other form of exotic open source license is violated, who pays for discovery? Is it someone in Russia who created a popular OSS library that is now owed? How will it be enforced?

bandrami1mo ago

This is a big question that makes my employer nervous about using LLM-generated code, along with the even-more-unresolved question "what happens if the LLM outputs an algorithm that is protected by patent?" (particularly worrying because we know the base training included patent descriptions.) Questionable copyright can often be worked around (particularly since we don't distribute source) but infringing on a patent can destroy a company.

randyrand1mo ago

Normally this is solved with an employment contract: "Anything you write, the copyright is transferred to your employer"

dash21mo ago

I wrote an R library doing some simple regressions using the GPU, with Claude. I asked it to provide the same API as lm, glm and some other base R functions. It copied their code wholesale without mentioning it to me. So, now my library is GPL… which is not a big deal in this context, but it was quite a shock.

hmokiguess1mo ago

Tangential but I find this an interesting parallel from a few years ago:

https://www.vice.com/en/article/musicians-algorithmically-ge...

kazinator1mo ago

> Code that Claude Code or Cursor generated and you accepted without meaningful modification may not be copyrightable by anyone.

Except if it happens to regurgitate a significant excerpt of some existing work, then the authors of that can assert their copyright; i.e. claim that it infringes.

giancarlostoro1mo ago

Did Claude Code not start out as human input? Would it not be safe to say that a reasonable amount of it is still human input? But also, just because it's mysteriously "not theirs" doesn't mean they magically have to give you the code.

mensetmanusman1mo ago

It’s the same as photography. No photographer built the multibillion dollar supply chain for the optics train in a camera, nor did they build the city scape they are enjoying as a background, they simply set the stage and push a button.

raggi1mo ago

Lawyers I have spoken to have stated strongly that they believe collective works doctrine will provide strong protections for most mature and sizable software. I see no mention of these considerations here.

GhostDriftInc1mo ago

The documentation advice is practical, but commit messages and prompt logs are self-reported. "Meaningful human authorship" needs a verifiable evidentiary chain, not attestations.
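
To illustrate what a "verifiable chain" could look like in practice, here is a minimal sketch of a hash-chained log, where each entry's hash covers the previous entry's hash so that later edits become detectable. The function names (`append_entry`, `verify`) and the entry fields are hypothetical, chosen only for this example. Note the limit: a chain like this only shows the log was not altered after the fact. It does not prove who actually wrote what, so it would still need some external anchor (a timestamping service, a third-party repository) to count as more than attestation.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (an assumption of this sketch)


def _digest(author, note, prev):
    """Hash the entry body deterministically (sorted keys keep the JSON stable)."""
    body = {"author": author, "note": note, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(chain, author, note):
    """Append an entry whose hash covers the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"author": author, "note": note, "prev": prev,
                  "hash": _digest(author, note, prev)})
    return chain


def verify(chain):
    """Return True only if every entry links to its predecessor and its hash matches."""
    prev = GENESIS
    for entry in chain:
        expected = _digest(entry["author"], entry["note"], entry["prev"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

Editing any earlier entry changes its hash, which breaks the link stored in every later entry, so `verify` fails from that point on.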

sidewndr461mo ago

Note to anyone reading this: the author is actively reading the comments and updating the piece based off reported issues. As a result, no meaningful discussion will take place here.

jillesvangurp1mo ago

Good overview of the issues. I'm sure there are a few nits to pick with that.

But something that is overlooked is that the world is bigger than the US and it's an absolute zoo out there in terms of copyright laws in different countries. Anything you think you might understand about this topic goes out of the window if you have international customers or provide software services outside the US. Or are not actually based there to begin with. And there are treaties between countries to consider as well.

Courts tend to try to be consistent with previous rulings, interpretations, etc. When it comes to copyright, there are a few centuries of such rulings. A commonly held opinion among developers who aren't lawyers is that AI is somehow different. And of course since the law hasn't actually changed, the simple legal question then becomes "How?". And the answer to that seems to involve a lot of different notions.

For example, "AIs are not people, and therefore any content produced by them isn't covered by copyright to begin with" is one of the notions brought up in the article. A lawyer might have some legal nits to pick with that one but it seems to broadly be the common interpretation. So AIs don't violate copyright by doing what they do. In the same way you can't charge a Xerox machine with copyright infringement. Or Xerox. But you could go after a person using one.

And another notion is that any content distributed by a human can be infringing on somebody else's copyright and that party can try to argue their case in a court and ask for compensation. Note that that sentence doesn't involve the word AI in any way. How the infringing party creates/copies the content is actually irrelevant. Either it infringes or it doesn't. You could be using AI, a Tibetan Monk copying things by hand, trained monkeys hitting the keyboard randomly, a photo copier, or whatever. It does not really matter from a legal point of view. All that matters is that you somehow obtained a copy of an apparently copyrighted work. AI is just yet another way to create copies and not in any way special here.

There are of course lots of legal fine points to make to how models are trained, how training data is handled, etc. But if you break each of those down it boils down to "this large blob of random numbers doesn't really resemble the shape or form of some copyrighted thing" and "Anthropic used dodgy means to get their hands on copies of copyrighted work". I actually received a letter inviting me to claim some money back from them recently, like many other copyright holders.

mlmonkey1mo ago

On a related note, another question: who owns the paper that Claude (or OpenAI) wrote? Should such paper submissions in conferences call out the model(s) used to write the paper itself?

lofaszvanitt1mo ago

This is a non issue, since any complex thing needs a lot of human oversight, otherwise it's nothing more than a multitentacled monstrosity.

pfortuny1mo ago

You don't, but nevertheless you bear the responsibility of making it public (whether in source or binary form). That is what Anthropic would like.

6d6b731mo ago

LLMs are just tools we use. If I program an app in C++, do I not own the rights to the executable because my compiler wrote machine code for me?

unholiness1mo ago

> Here is the legal baseline, in plain terms:

This particular AI-ism really encapsulates what annoys me about some AI-isms. I don't mind the delves and the em-dashes that just give away the AI source of what otherwise might be good text. But these structural pieces just feel fundamentally not for the reader. Part of it is blatant pick-me language for the human feedback ("hey look you wanted plain language I did that") and part of it feels like it's just helping the future token stream (thinking-like tokens polluting the actual text).

The not-this-but-that, the sycophancy, the symbolizing-vague-significance, they all have this flavor of serving a process that's no longer there as I now need to read it. It gives a similar sickening feeling to the one I get seeing something designed by committee.

briandw1mo ago

Your employer can claim your code if you use their tools to produce it. Nothing new here. This has nothing to do with AI tooling.

everdrive1mo ago

Well I don't own anything I write while working at my company. Maybe my company and Claude can fight over who owns it.

jMyles1mo ago

There is no such thing as ownership of a pattern of information. It has been an illusion, and that illusion is now fading.

mock-possum1mo ago

I do. I used a tool to create it. I own the things I create.

Anything else is just bullshit equivocation.

mifydev1mo ago

Missed opportunity for a tongue twister:

Who coded the code Claude Code code?

ikrenji1mo ago

the entire US economy rides on AI. no ruling throwing a wrench into the multi trillion engine is ever going to be permitted to happen

rnxrx1mo ago

The idea that the provenance of a given tool's code inherently pollutes the material it's used with seems kind of illogical. Wouldn't it follow from this premise that any code written using open source IDEs and debugged with open source debuggers and other tooling would itself then be considered copyleft? Are works written with LibreOffice not copyrightable?

There's obviously a huge issue with the legitimacy and ownership of training data being fed to LLMs. That seems like an issue between the owners of that IP and the people training the models and selling them as services more than the people using the tool. Isn't this just another flavor of SCO trying to extort money out of companies using Linux?

po1nt1mo ago

Who owns the code my keyboard wrote?

aakresearch1mo ago

It seems the author unironically advises writing your commit messages like this: "Restructured Claude’s module architecture, rejected initial state management approach, rewrote error handling from scratch", to have a chance at a defense in a potential court hearing. I find it funny, if vindicating for my personal approach. If the expectation is to "restructure, reject, rewrite" what "AI" spits out, why use "AI" at all at this point???

threepts1mo ago

Whoever pays for the tokens.

pelasaco1mo ago

So as I understand it, the GPL doesn't cover code written by agents?

teeray1mo ago

What if no meaningful thought was put into the code (entirely vibe-coded slop), but it’s made for your employer? Shouldn’t the work be uncopyrightable?

nicman231mo ago

i do, all of it. sorry

whattheheckheck1mo ago

Ask ChatGPT Deep Research to cite court cases and it shows that dark-factory SWE code is not copyrightable under current precedents.

Even steering it with prompts isn't enough. The guy couldn't copyright the image he made with ai, code is no different.

Maybe prompts written by humans are copyrightable.

Can't wait for the Billionaires to entrench in court they can steal everything for these machines and claim it as their own and maybe even reach for anything that it helps produce. Fuck that

giannicmptr10001mo ago

yo Mama

-Claude

theteapot1mo ago

That was a rather unhelpful TL;DR.


Arcuru1mo ago· 38 in thread

Personally, I think that the human directing the agent owns the copyright for whatever is produced, but the ability for the agent to build it in the first place is based off of stolen IP.

nadermx1mo ago

Funny how the copyright industry was able to spin copyright infringement into the pejorative "stealing". If you still have the item, what was stolen?

tensor1mo ago

I still find the idea that "learning" from code is "stealing" kind of ridiculous.

10 more replies

NewsaHackO1mo ago

4 more replies

Neywiny1mo ago

I don't think it's unreasonable to consider it stolen potential profit, but agreed that's not how they spin it

blks1mo ago

“Stolen” as in “profited on IP against terms and conditions of the license”.

Aerroon1mo ago

hxtk1mo ago

This analysis yields very different results under utilitarianism vs rule utilitarianism.

rectang1mo ago

Creation of the LLM itself is transformative, but LLM output which infringes is not.

2ndorderthought1mo ago

1 more reply

KallDrexx1mo ago

Do you think that human directing the agent owns copyright for any legal reason?

zarzavat1mo ago

It depends on what level of creative control you had over the code.

Code is protected by copyright as a literary work. The method is not protected by copyright, that would be the domain of patents. What's protected are the words.

1 more reply

marcus_holmes1mo ago

3 more replies

Animats1mo ago

> only humans can be granted copyright.

No, a copyright application can be filed with a corporation listed as the author. Watch for the copyright notice at the end of the next major movie you see.

3 more replies

CWuestefeld1mo ago

> but the ability for the agent to build it in the first place is based off of stolen IP.

I honestly don't understand why the attitude that underlies this is so prevalent.

demorro1mo ago

2 more replies

missingcolours1mo ago

jacquesm1mo ago

Scale and the ability to generate a livelihood of your creations and/or the ability to control how what you have created is used, for instance, to demand attribution.

gspr1mo ago

You are presumably human. We have granted humans specific exemptions in copyright law. We have not granted that to LLMs. Why are we so eager to?

4 more replies

atleastoptimal1mo ago

The attitude is derived from a general animus many have towards AI companies. They resent the efficacy of AI because it devalues individual expertise.

https://www.youtube.com/watch?v=IFe9wiDfb0E

2 more replies

rspeele1mo ago

I guess I'm just irrational that way.

1 more reply

blks1mo ago

You’re confusing yourself with a commercial product. You’re not a product that was created by other human beings based on someone else’s IP.

1 more reply

jmyeet1mo ago

I can totally see this applying here as well.

[1]: https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...

varispeed1mo ago

gspr1mo ago

Let me get this straight: Since there are only so many ways to write a for loop, you doubt that for loops are copyrightable. From this you conclude that code, in general isn't copyrightable?

That's like saying "there's only so many ways to greet your neighbor, so any text that simply greets your neighbor isn't copyrightable – and therefore no text is copyrightable".

alok-g1mo ago

Note: IANAL

The Google-Oracle lawsuit did not decide whether APIs (when large in number) are copyrightable or not.

ako1mo ago

I've created my own DSL, and instruct Claude Code how to generate code for this DSL using skills.

Maybe a good reason to create a new programming language?

alok-g1mo ago

Note: IANAL. The above is just from my current understanding.

jongjong1mo ago

"Permission is hereby granted, free of charge, to any person obtaining a copy of this software"

The word 'person' is used very intentionally here.

People invested a lot of time building their entire careers around the assumption of copyright protection; so for it to be violated on such a scale would be a massive betrayal.

dredmorbius1mo ago

That's not what's been established to date in US caselaw:

<https://caselaw.findlaw.com/court/us-dis-crt-dis-col/1149169...>.

amarant1mo ago

I don't think there's even a valid argument for any other ownership model, or at least none that I can think of.

jmaw1mo ago

The primary issue being that it's all built on stolen data in the first place.

1 more reply

cess111mo ago

jacquesm1mo ago

No, that human owns the copyright on the prompt, not on the work product.

keithba1mo ago

That’s not how it works. The human using the tool (like Claude Code, etc.) owns the copyright of the code generated.

1 more reply

alok-g1mo ago

2 more replies

kridsdale11mo ago

So I’m responsible for pushing the giant boulder at the top of the hill.

The humans at the bottom who were crushed should blame the boulder, which happened to be moving.

1 more reply

saadn921mo ago

I agree with this sentiment, because the person directing the agent can still direct it in a way where it'll produce a better or worse output than another person directing it.

amelius1mo ago

I wonder what OSS licenses would have looked like if we saw all of this coming.

semiquaver1mo ago· 24 in thread

  > The US Copyright Office confirmed this in January 2025, and the Supreme Court declined to disturb it in March 2026 when it turned away the Thaler appeal. Works predominantly generated by AI without meaningful human authorship are not eligible for copyright protection, and that rule is now settled at the highest judicial level available.

Misstates the law. Denial of certiorari can happen for many reasons unrelated to the merits and does not settle the issue nationwide.

PaulDavisThe1st1mo ago

From TFA:

Your quoted text is no longer in TFA.

jibal1mo ago

Because the author acted on that comment.

semiquaver1mo ago

c.f. OP’s comments in this thread.

graemep1mo ago

It also contradicts everything else I have read about Thaler. AFAIK the ruling was that the AI could not hold copyright. Thaler waived any claim to being the copyright holder himself.

The last two bullet points on this page cover this:

https://www.authorsalliance.org/2025/03/19/thaler-v-perlmutt...

The site also explains the qualifications and experience in copyright law of the author of the above - unlike the article here.

greensoap1mo ago

KallDrexx1mo ago

senaevrenOP1mo ago

sowbug1mo ago

Since this is a tech audience... the Supreme Court uses a bounded priority queue. An unbounded queue would risk growing impractically large.

There are some kinds of cases where the Court has "original jurisdiction," meaning they must hear them, but those are very rare.
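
For anyone who wants the analogy spelled out, here is a minimal sketch of a bounded priority queue. It is only an illustration of the data structure, not a model of how cert petitions are actually handled. The class name `BoundedPriorityQueue`, its methods, and the idea of a numeric priority are all assumptions made for the example.

```python
import heapq


class BoundedPriorityQueue:
    """Keep at most `capacity` items; when full, the lowest-priority item is dropped."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._heap = []  # min-heap of (priority, name); the root is the weakest entry

    def offer(self, priority, name):
        """Try to add an item. Return the dropped item, or None if nothing was dropped."""
        entry = (priority, name)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return None
        # heappushpop pushes the new entry, then pops the smallest one,
        # so a low-priority newcomer is turned away immediately.
        return heapq.heappushpop(self._heap, entry)

    def docket(self):
        """Return the retained items, highest priority first."""
        return sorted(self._heap, reverse=True)
```

The point of the bound is the same as in the comment above: an item being dropped says nothing about whether it was "right", only that something else ranked higher while the queue was full.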

WillPostForFood1mo ago

semiquaver1mo ago

100%. This is the real fix, we have new situations, we need new laws. Unfortunately Congress is currently broken.

jmyeet1mo ago

The Supreme Court declining to take up an issue is taking a position.

semiquaver1mo ago

  > The Supreme Court declining to take up an issue is taking a position.

No it is not.

  > “The denial of a writ of certiorari imports no expression of opinion upon the merits of the case, as the bar has been told many times.”

United States v. Carver, 260 U. S. 482, 490 (1923).

Moreover, SCOTUS does not decide issues, they decide cases.

  > “We are acutely aware, however, that we sit to decide concrete cases, and not abstract propositions of law.”

Upjohn Co. v. United States, 449 U. S. 383, 386 (1981).

greensoap1mo ago

2 more replies

senaevrenOP1mo ago

21asdffdsa121mo ago

streetfighter641mo ago

Free water but not electricity? I'll just hook up a generator to the shower...

These sorts of simplistic loopholes rarely work. Imagine if you could get copyright for the linux kernel by just rearranging it and renaming a few variables.

1 more reply

consp1mo ago

1 more reply

DrewADesign1mo ago

semiquaver1mo ago

The decision is binding only within the jurisdiction of the Court of Appeals for the D.C. Circuit.

So it’s not correct to say “because SCOTUS denied cert, Thaler is now binding national copyright law.”

2 more replies

freejazz1mo ago

It does settle the law in as far as maintaining the status quo.

matheusmoreira1mo ago

> meaningful human authorship

How is this defined? Is my code review "meaningful" ? Are my amendments and edits to the generated code "human authorship" ?

cooper_ganglia1mo ago

From the article:

> Specifying an objective to the model is not enough. Directing how the work is constructed is what counts.

3 more replies

wayeq1mo ago

read the article?

1 more reply

jugg1es1mo ago· 14 in thread

senaevrenOP1mo ago

rasz1mo ago

Work-for-hire doctrine doesn't automagically absolve you from IP law. Microsoft and Intel already learned this in the nineties when they paid San Francisco Canyon Company to steal Apple code.

https://en.wikipedia.org/wiki/San_Francisco_Canyon_Company

LLMs are just code stealers, will gladly generate Carmack's inverse for you with original comments.

1 more reply

gpm1mo ago

It doesn't seem like bad faith to think that copyright is stronger than the courts end up thinking, just being mistaken.

1 more reply

CWuestefeld1mo ago

I can't see how that can work.

As a developer, the fact that my source code passed through a compiler - an automated tool - doesn't give the author of the compiler any claim on my executable code.

If a court later found the codebase was predominantly AI-authored and therefore not copyrightable

embedding-shape1mo ago

Best part is, it's likely to have a different answer in every country, who knows what'll happen, not every country implicitly sides with the ones with the most money.

MarsIronPI1mo ago

Well, eventually it'll probably be added to the Berne Convention agreement or some such.

1 more reply

adrianN1mo ago

Depends on where they pay their taxes generally.

beej711mo ago

I love that genAI art will not be copyrightable and genAI code will be. The power of the Almighty Dollar at work.

conartist61mo ago

It's not wishful thinking, and ownership isn't a foregone conclusion.

Sure the courts could mint a communist society with a few weird decisions about property rights, but this being the US do you really suppose that's likely?

wongarsu1mo ago

1 more reply

helterskelter1mo ago

I'm not sure Anthropic would appreciate the liability that ownership would imply.

helterskelter1mo ago

Too late to edit, but OpenAI certainly doesn't want ownership or liability, for the CSAM they've produced. They certainly don't want ownership/liability of code which does $ONLYAWFULTHING.

dfxm121mo ago

bombcar1mo ago

cestith1mo ago· 11 in thread

This seems quite shoddy and biased for an article by someone who’s writing about the law.

dehrmann1mo ago

> training the LLM in violation of a license

Bartz v. Anthropic found that this is fair use, so the license doesn't play into it.

cestith1mo ago

3 more replies

vbarrielle1mo ago

I thought fair use was decided on a case by case basis, and could not be guaranteed? If true, wouldn't that mean that in other cases it could be ruled differently?

1 more reply

panzi1mo ago

vablings1mo ago

It is probably fair to say that a huge share of FOSS code is licensed under the GPL, much larger than the share of source-available, proprietary-licensed code.

d0mine1mo ago

Here are GitHub statistics from 2015: https://github.blog/open-source/open-source-license-usage-on...

MIT is used by more projects than GPL.

1 more reply

kube-system1mo ago

There is a lot of code on the internet that isn't accompanied by a FOSS license or any license that permits reuse, or any license at all.

cestith1mo ago

numbsafari1mo ago

I would have assumed the opposite is true. Do you have any data to back that up?

1 more reply

charonn01mo ago

They probably focused on the GPL because of its viral copyleft features.

cestith1mo ago

Do you suspect that an LLM that would recreate a substantial portion of a licensed work would honor any license? Even a 2-clause BSD one?

1 more reply

bko1mo ago· 11 in thread

This comes up in a few places as a kind of vindictive battle. One example is Oracle suing Google for too closely mimicking their API in Android. Here is an example:

    private static void rangeCheck(int arrayLen, int fromIndex, int toIndex) {
        if (fromIndex > toIndex)
            throw new IllegalArgumentException("fromIndex(" + fromIndex +
                                               ") > toIndex(" + toIndex + ")");
        if (fromIndex < 0)
            throw new ArrayIndexOutOfBoundsException(fromIndex);
        if (toIndex > arrayLen)
            throw new ArrayIndexOutOfBoundsException(toIndex);
    }

In 99.9% of instances none of this matters. Sure there's the technical letter of the law but in practice, and especially now, none of this matters.

https://www.supremecourt.gov/opinions/20pdf/18-956_d18f.pdf

freedomben1mo ago

> Almost no one thinks their code is copyrightable or seriously thinks their code is a moat.

Edit: Fixed errant copy pasta error. Glad that wasn't a password :-)

mbesto1mo ago

Totally agreed.

This argument easily gets shut down when I ask why Twitch, a $1B business, didn't crater to their competition when their full codebase was leaked.

bko1mo ago

1 more reply

BobbyTables21mo ago

I’ve worked at too many places where I mused that if someone gave the source code to the competitors, it’d likely drive the competitors out of business as they tried to use it.

Keeping it proprietary probably has the greatest value in preserving the company’s reputation…

hackingonempty1mo ago

Nursie1mo ago

> Almost no one thinks their code is copyrightable

I think this is an unusual opinion.

If code isn't copyrightable, from where comes the GPL?

I think it's the opposite - almost everyone thinks their code is copyrightable, outside of APIs and interop stuff, or things so simple as to be trivial.

croes1mo ago

> Almost no one thinks their code is copyrightable

Then why does reverse engineered code need to be a clean room implementation?

Ask any emulator developer or the developers of ReactOS

https://reactos.org/forum/viewtopic.php?t=21740

1 more reply

conartist61mo ago

Nobody ever talks about convergence.

You, right now, are talking about convergence.

If there is no artwork, there can be no copyright. If every character of the code to write is basically predetermined by the APIs you need to call, there is no artwork and no copyright.

Build a novel new API, and you'll be protected though.

sarchertech1mo ago

> Almost no one thinks their code is copyrightable

Every open source license is built on the premise that code is copyrightable.

adrian_b1mo ago

No.

It is based on the premise that if the proprietary licenses are valid, then also the open source licenses are valid.

So what is held as true is only the implication stated above and not the truth value of the claims that either kind of licenses are valid.

If the proprietary licenses are not valid, then it does not matter that also the open source licenses are not valid.

1 more reply

Rietty1mo ago

Why were the HFT firms suing employees?

p0w3n3d1mo ago· 9 in thread

That's quite an impressive approach from the companies' perspective. Let's first use Claude Code and then we'll think about who the code belongs to.

I think that the gold rush approach happening right now around me (my company's EMs forcing me to work with Claude as fast as possible) shows real short-sightedness from all the management people.

First - I lose my understanding of the code base by relying too much on claude code.

Second - we drop all the good coding practices (like XP, code review etc.) because claude is reviewing claude's code.

I could go on and on. And all those cultural changes are because of money. So I dub this "goldrush", open my popcorn and see what happens next.

nicoburns1mo ago

ryandrake1mo ago

yason1mo ago

On the other hand, separating FE and BE between two teams, necessitating proper interfaces, can often be considered a feature.

bearjaws1mo ago

I rarely see #3 yield better solutions, it's usually better to collaborate as a team on requirements and gotchas, but let one person own implementation.

p0w3n3d1mo ago

But both backend and front-end? Does everyone have to be full stack?

senaevrenOP1mo ago

sebastianconcpt1mo ago

eddyfromtheblok1mo ago

refulgentis1mo ago

I opened my popcorn for the unholy trinity of HN x law x AI, your comment was one of my faves, love the purple prose. :)

_flux1mo ago· 9 in thread

I think it should be pretty clear that if you provided the tool the specification for the code you want, you have already provided creative input.

After all, is this not what happens with compilers as well? LLM agents are just quite advanced compilers that don't require the specification to be as detailed as with traditional compilers.

yodon1mo ago

>it should be pretty clear that if you provided the tool the specification for the code you want, you have already provided creative input.

_flux1mo ago

Let's say we didn't have assemblers, but instead we would have three professions:

- Specifiers, who make the specification for the system

- Programmers, who write C code

- Machine encoders, that take that C code and write machine code for a CPU

Would it be that the copyright would then belong to programmers, if no other explicit assignments would be made?

---

1 more reply

anikom151mo ago

LLMs aren’t human.

senaevrenOP1mo ago

everforward1mo ago

I’m dubious the outcome will be “any level of prompting is enough creativity”.

d01001mo ago

The trick is to constrain the LLM to program in a very defined coding style

If I make the LLM generate code that follows my own code architecture and style, that should be enough creative input

1 more reply

pocksuppet1mo ago

Fine then that's not copyrightable at all. Just like hello world isn't copyrightable, whether in source form or compiled form.

xmcp1231mo ago

The copyright falls to the artist, not the person commissioning it.

Complicated in this case, because there is no artist.

hypercube331mo ago

To me this is like asking who owns the binary files a compiler generates.

jhbadger1mo ago· 9 in thread

FartyMcFarter1mo ago

The article addresses this explicitly:

> Works predominantly generated by AI without meaningful human authorship are not eligible for copyright protection

Note the word "predominantly", and the discussion that follows in the article about what the courts and the copyright office said.

1 more reply

Luker881mo ago

No such assumption is made in the article.

Nor does it give a single answer.

Mere prompting is still not enough for copyright, and the problem is unsolved on how much contribution a human needs to make to the generated code.

In the case for generated images copyright has been assigned only to the human-modified parts.

Even worse, it will be slightly different in other nations.

The only one that accepts copyright for the unchanged output of a prompt is China.

1 more reply

conartist61mo ago

Plus what if Anna Karenina was GPL?

1 more reply

brianwawok1mo ago

You use humans to edit AI code? When you level up you are just using AI to write, AI to review, AI to edit, AI to test. Not a lot of steps left for meat bags.

3 more replies

exe341mo ago

> This is of course assuming you take AI-generated code unchanged. But you don't, in my experience. And that generates a new work fully copyrightable even if the original wasn't.

That's not how copyright works. The modified version is derivative. You can't just take the Linux kernel, make some changes, and slap a new license on it.

throwatdem123111mo ago

Ok what about all of Anthropic’s engineers who say they don’t write code at all and it’s 100% AI-generated?

gchamonlive1mo ago

> This is of course assuming you take AI-generated code unchanged.

How much code do you need to change in order for it to be original? One line? 10%? More than 50%?

That's an arbitrary and quite unproductive convo to be honest.

1 more reply

6stringmerc1mo ago

1 more reply

mzl1mo ago

If you modify the work, that creates a derived work from whatever copyright the original work has, not a new work that is fully copyrightable.

As the article says in the TL;DR at the top, the code may be contaminated by open source licenses

> Agentic coding tools like Claude Code, Cursor, and Codex generate code that may be uncopyrightable, owned by your employer, or contaminated by open source licenses you cannot see

alienll1mo ago· 7 in thread

This is the same shape as the image cases.

protocolture1mo ago

They also mention in the same document that were LLMs to more closely approximate deterministic tools, they would be open to reevaluating. That is, requesting X gets X without substantial wiggle room.

Then there's testing, review etc, human processes confirming that the output meets spec, updating it where needed intelligently.

The people who will get busted for this are basically just super lazy leaving ChatGPT responses in, failing to pay an editor, failing to modify images for anything more than layouts.

FrostKiwi1mo ago

> But a compiler is deterministic — same input, same output. An LLM isn't.

alienll1mo ago

Onavo1mo ago

But is there anything stopping a human from applying for copyright in their own name? Does the fact that somebody can recreate the prompt invalidate their claim?

SlinkyOnStairs1mo ago

What you're asking is, "could someone do fraud" and "would being found out invalidate their copyright". To both of which the answer is generally, yes.

It'd be a form of plagiarism, just with different consequences to the most common form.

alienll1mo ago

Filing isn't the gate, registration is.

2 more replies

JAlexoid1mo ago

AFAIK: Even the slightest modification of the work is transformative and will produce copyrighted material.

It does not have to be substantial transformation.

reorder96951mo ago· 5 in thread

akersten1mo ago

> Surely the precedent would have to be that a model trained on GPL code has itself been infected by GPL, and therefore must have all source/weights released

1 more reply

tremon1mo ago

LLMs are effectively copyright laundering machines, and barring any indemnification clauses in the ToS (of course there are none), full liability lies with the user.

cozzyd1mo ago

There's an easy solution... release your code as GPL :)

(but that doesn't protect you against GPL-incompatible copyleft licenses, I guess)

charonn01mo ago

> It is totally infeasible for me to check every single GPL project on every code hosting platform to see if the code Claude etc produced is too similar.

I would say that choosing a tool that makes it infeasible doesn't actually excuse you from doing it.
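
As a rough idea of what "checking" could even mean mechanically, here is a minimal sketch that scores two code snippets by the overlap of their token 5-grams (Jaccard similarity). Everything here is an assumption for illustration: the function names `shingles` and `jaccard`, the crude regex tokenizer, and the window size `k=5`. A high score only flags text for a human to look at. It says nothing about whether something is legally "substantially similar", and it would not scale to every repository without an index.

```python
import re


def shingles(source, k=5):
    """Split code into crude tokens and return the set of k-token windows."""
    # Identifiers, integer literals, or any single non-space character.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b, k=5):
    """Return |A ∩ B| / |A ∪ B| over k-token shingles, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0  # both snippets too short to compare
    return len(sa & sb) / len(sa | sb)
```

Because the comparison is on tokens rather than characters, whitespace and line-break changes do not lower the score, while renaming identifiers does.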

perlgeek1mo ago

> but you can't honestly expect someone to check every repo available from all time to see if a model [...] might've reproduced code from it.

Of course, that'd be much more expensive than current offerings, but it would reflect the real cost of software development, not just YOLOing it, from a legal perspective.

fsckboy1mo ago· 4 in thread

it's well known that recipes cannot be copyrighted. But recipes still are protected intellectual property by trade secret law if they are treated as a secret by the holder of the recipe.

Claude code itself is a trade secret, and it is not open source, so its own copyrightability is moot till you get your hands on a copy of it with clean hands.

The Supreme Court or legislation could change this, and I'd guess there will be a movement to go in that direction, but till something like that succeeded it's not so.

lelanthran1mo ago

> But recipes still are protected intellectual property by trade secret law if they are treated as a secret by the holder of the recipe.

Trade secrets aren't very well protected, though.

You can sue the person who leaked/stole your secret, but if others keep sharing it once it is leaked you can do nothing to them.

fsckboy1mo ago

i wasn't advocating for trade secrets as "equal" or "the way to go", i was trying to explain in simple terms how to think about copyright issues in concordance with the existing legal structures

ndiddy1mo ago

> Claude code itself is a trade secret, and it is not open source, so its own copyrightability is moot till you get your hands on a copy of it with clean hands.

In this case Anthropic published the Claude Code source map file on npm themselves. https://venturebeat.com/technology/claude-codes-source-code-...

Culonavirus1mo ago

> Software written by AIs are also not expressions of human creativity

palata1mo ago· 4 in thread

Or is it still IP even if it is not copyrightable? That would feel weird: if it's in the public domain, then it's not IP, is it?

senaevrenOP1mo ago

BlackFly1mo ago

ModernMech1mo ago

Presumably company policy would be implicated here, not copyright law. Whether or not it's copyrightable, what you create using AI is work product.

cillian641mo ago

tiku1mo ago· 4 in thread

So by this logic my autocomplete function before AI also wrote 50% of my code, and that code is not made by me because I didn't type it.

What should matter is intent, the human that gives the orders.

gspr1mo ago

> What should matter is intent, the human that gives the orders.

Try this both for "open source code" as the IP, and "the novel I wrote", and "latest Hollywood movie". The model does not have to be a real model currently available. It's just a thought experiment.

Try also to elaborate on the sliding scale between "an AI model" and "a compression system".

perlgeek1mo ago

The auto complete function doesn't really change what you write, so it doesn't remove the human creativity.

panny1mo ago

>What should matter is intent, the human that gives the orders.

If you are instructed by your professor to write an application, do you own the copyright or the professor?

The fact is: Copyright law applies to human authors. AI is not a human.

https://www.congress.gov/crs-product/LSB10922

pjmlp1mo ago

Well, have you actually read the license for the auto complete function?

Example,

https://marketplace.visualstudio.com/items/VisualStudioExptT...

metalcrow1mo ago· 4 in thread

Is there any citation for this "legal consensus"? I was not aware there were any evidence-backed stances on this topic as of yet.

onlyrealcuzzo1mo ago

This sounds like a problem that's pretty easy to get around.

CC does not need LGPL code. There's more than enough BSD and Apache code to go around.

And they can generate synthetic data that is better than LGPL for their training.

It's also a problem that does not seem feasible to meaningfully enforce.

It's easy to generate CC code and lie and say you didn't. It would be hard to prove that you did, especially if you took any precautions to make proving it even slightly difficult.

NoMoreNicksLeft1mo ago

senaevrenOP1mo ago

thanks for this; it's definitely a fair point. I updated the piece to reflect this

qsera1mo ago· 3 in thread

A more interesting question is "Who wants to own it"...

The answer is probably "Nobody"!

nine_k1mo ago

jumploops1mo ago

At what point is liability the only "job" left for humans?

onlyrealcuzzo1mo ago

Presumably, every company that has non-LGPL CC code in production wants to own it...

gorgoiler1mo ago· 3 in thread

Three things matter when it comes to eating my breakfast sandwich:

1/ Was the pork in my sausage reared on a farm that meets agricultural standards?

2/ Was the food handled safely by the kitchen that cooked my food?

3/ Does the owner of the diner pay kitchen wages in accordance with labor law?

By contrast, I have no idea what went into the models I use, what system prompts have prejudiced it, and whose IP has been exploited in pursuit of my answer.

amelius1mo ago

Probably, yes, but the burden of proof is with us not them.

I'm already glad some companies have the guts to open their models because proving it for open models is probably a lot easier than for a model behind a service.

devsda1mo ago

The media industry loves to quote ridiculous numbers on lost revenue due to piracy etc. Maybe some rough ballpark numbers will get them to do something about this theft.

Can someone put a rough estimate on the potential revenue loss (direct and incidental) from training AI, with an industry-wise breakdown?

ap991mo ago

What's an example of data that might have been stolen?

dang1mo ago· 3 in thread

Could you please stop posting generated comments to HN? It's not allowed here, and it looks like you've done it over 30 times already.

(Of course, there's no way to be certain of this, but it's what our software thinks, and the overall pattern is pretty convincing.)

See https://news.ycombinator.com/newsguidelines.html#generated and https://news.ycombinator.com/item?id=47340079

woolion1mo ago

On that matter, wouldn't an AI flag for submissions help HN? I wouldn't flag a submission for LLM style, as that is too harsh, but I don't want to read them, if only because I don't like LLM prose.

senaevrenOP1mo ago

You are definitely right to flag it, and I apologize for that. I used an AI assistant for the replies, and I will make sure not to use one going forward.

simonebrunozzi1mo ago

Curious: how exactly do you detect an AI-generated comment?

ottah1mo ago· 2 in thread

thyrsus1mo ago

pocksuppet1mo ago

Plus companies just violate GPL everywhere billions of times with impunity (see: every phone ever) and nothing happens to them.

daishi551mo ago· 2 in thread

senaevrenOP1mo ago

sarchertech1mo ago

There’s so much FOMO right now around AI that no one is thinking clearly. I wouldn’t be so confident in your company.

heysoup1mo ago· 2 in thread

Claude doesn't write code? The LLM writes code. Claude loops the LLM into writing consistent code. Humans loop Claude into consistently looping the LLM.

PokemonNoGo1mo ago

>Who owns a potato?

I don't get what this analogy is trying to tell me but I know nothing about potato law. Is this about the Belgian potato surplus?

cousinbryce1mo ago

I pay to listen to songs. Do I own them?

heikkilevanto1mo ago· 2 in thread

Ownership is one question. IMO, a more interesting question is who is responsible when the code does some real-life damage.

mock-possum1mo ago

LLMs really change nothing about this.

ACCount371mo ago

No one. The usual.

skadge1mo ago· 2 in thread

This seems to be grounded in US law. Does anyone know if the same rules would apply in, e.g., EU law?

nairboon1mo ago

zvr1mo ago

Most of this is based on the copyright legal framework, which is surprisingly homogeneous around the world. The discussions about ownership of AI-generated material are exactly the same in the EU.

smashed1mo ago· 2 in thread

The "if you generated the code at work using company tools, it's owned by your employer" affirmation in the article makes no sense to me?

If computer generated code is not copyrightable, ownership cannot be reassigned either.

croes1mo ago

How is it for human developers now if the company tool is a cloud tool and not running on company servers?

conartist61mo ago

It is copyrightable. A *human* can copyright code they wrote.

DeathArrow1mo ago· 2 in thread

I have a wood cutting machine and some wood. Who owns the timber?

bell-cot1mo ago

Sadly, IP "ownership" and copyright law are vastly more complex than ownership of physical stuff.

Or were you planning to reproduce the (say) Ford Motor Company's trademarked symbol in wood? If so, you're right back in the stinkin' swamp.

croes1mo ago

What is the wood in your example?

This is like a machine that you ask for timber and you get timber, but you didn’t need to provide any wood.

e12e1mo ago· 1 in thread

embedding-shape1mo ago

> Code from pirated text books

hackingonempty1mo ago· 1 in thread

senaevrenOP1mo ago

TheFirstNubian1mo ago· 1 in thread

senaevrenOP1mo ago

tommy29tmar1mo ago· 1 in thread

senaevrenOP1mo ago

reliablereason1mo ago· 1 in thread

This is like asking:

"Who owns the text Microsoft Word helped you write?"

Claude Code is a software tool, not a legal entity.

freejazz1mo ago

Not if Claude does the writing. MS doesn't write things for you, and if it did, you would not be entitled to a copyright in whatever it wrote for you.

padmabushan1mo ago· 1 in thread

First answer who owns the model built with public data

senaevrenOP1mo ago

kouru2251mo ago· 1 in thread

Now copyright is a lifelong sentence at almost 100 years. The entire purpose of it has been undermined. Corporations own all your childhood and by the time you can profit off of it, it’s outdated.

A world where the mainstream is primarily a commons seems to me like an egalitarian world. I’d like to live in that world.

senaevrenOP1mo ago

Isamu1mo ago· 1 in thread

torben-friis1mo ago

AI is a monster to our current copyright system - monster in the philosophical sense, that is, an example that destroys the concept.

gspr1mo ago

I'm still flabbergasted that people – and big, visible companies with big targets on their backs – choose to keep on using the output of LLMs without having an answer to these questions.

jerleth1mo ago

> The second commit message versus the first is the difference between a defensible authorship claim and a clean “Claude wrote this” record.

That makes no sense to me, as the commit message is probably LLM generated as well. (and even easier to generate as it doesn't have to compile or pass automated tests).

bearjaws1mo ago

Article is incredibly fear mongering.

Twice in my career the owners of a company have wanted to sue competitors for stealing their "product" after poaching our staff.

Each time, the lawyers came in and basically told us that suing them for copyright is suicide, will inevitably be nearly impossible to prove, and money would be better spent in many other areas.

In fact, we ended up suing them (and they settled) for stealing our copyrighted clinical content, which they copied so blatantly they left our own typos and customer support phone number in it.

Then you have merger doctrine, how many ways can we really represent the same basic functions?

Ultimately anyone can sue you for any reason, nothing is stopping anyone right now from suing you claiming AI stole their copyleft code.

joshka1mo ago

If you want to go much deeper, https://www.copyright.gov/ai/ is particularly good at least on the side of comprehensiveness.

zuzululu1mo ago

I think it's pretty clear cut, whoever is paying for your agentic coding tool subscription is part of the litmus test.

If I use my own computer, pay for my own subscription, and build my own open source projects, then the code belongs to me.

If I use my company's computer, they pay for my subscription and we work on the company's projects then the code belongs to the company.

bandrami1mo ago

randyrand1mo ago

Normally this is solved with an employment contract: "Anything you write, the copyright is transferred to your employer"

dash21mo ago

hmokiguess1mo ago

Tangential but I find this an interesting parallel from a few years ago:

https://www.vice.com/en/article/musicians-algorithmically-ge...

kazinator1mo ago

> Code that Claude Code or Cursor generated and you accepted without meaningful modification may not be copyrightable by anyone.

Except if it happens to regurgitate a significant excerpt of some existing work, then the authors of that can assert their copyright; i.e. claim that it infringes.

giancarlostoro1mo ago

mensetmanusman1mo ago

raggi1mo ago

GhostDriftInc1mo ago

The documentation advice is practical, but commit messages and prompt logs are self-reported. "Meaningful human authorship" needs a verifiable evidentiary chain, not attestations.

sidewndr461mo ago

Note to anyone reading this: the author is actively reading the comments and updating the piece based off reported issues. As a result, no meaningful discussion will take place here.

jillesvangurp1mo ago

Good overview of the issues. I'm sure there are a few nits to pick with that.

mlmonkey1mo ago

On a related note, another question: who owns the paper that Claude (or OpenAI) wrote? Should such paper submissions in conferences call out the model(s) used to write the paper itself?

lofaszvanitt1mo ago

This is a non issue, since any complex thing needs a lot of human oversight, otherwise it's nothing more than a multitentacled monstrosity.

pfortuny1mo ago

You don't, but nevertheless you bear the responsibility for making it public (whether in source or binary form). That is what Anthropic would like.

6d6b731mo ago

LLMs are just tools we use. If I program an app in C++, do I not own the rights to the executable because my compiler wrote machine code for me?

unholiness1mo ago

> Here is the legal baseline, in plain terms:

briandw1mo ago

Your employer can claim your code if you use their tools to produce it. Nothing new here. This has nothing to do with AI tooling.

everdrive1mo ago

Well, I don't own anything I write while working at my company. Maybe my company and Claude can fight over who owns it.

jMyles1mo ago

There is no such thing as ownership of a pattern of information. It has been an illusion, and that illusion is now fading.

mock-possum1mo ago

I do. I used a tool to create it. I own the things I create.

Anything else is just bullshit equivocation.

mifydev1mo ago

Missed opportunity for a tongue twister:

Who coded the code Claude Code code?

ikrenji1mo ago

the entire US economy rides on AI. no ruling throwing a wrench into the multi trillion engine is ever going to be permitted to happen

rnxrx1mo ago

po1nt1mo ago

Who owns the code my keyboard wrote?

aakresearch1mo ago

threepts1mo ago

Whoever pays for the tokens.

pelasaco1mo ago

So, as I understand it, the GPL doesn't cover code written by agents?

teeray1mo ago

What if no meaningful thought was put into the code (entirely vibe-coded slop), but it’s made for your employer? Shouldn’t the work be uncopyrightable?

nicman231mo ago

i do, all of it. sorry

whattheheckheck1mo ago

Ask ChatGPT deep research, citing court cases, and it shows that dark factory SWE code is not copyrightable under current precedents.

Even steering it with prompts isn't enough. The guy couldn't copyright the image he made with AI, and code is no different.

Maybe prompts written by humans are copyrightable.

Can't wait for the Billionaires to entrench in court they can steal everything for these machines and claim it as their own and maybe even reach for anything that it helps produce. Fuck that

giannicmptr10001mo ago

yo Mama

-Claude

theteapot1mo ago

That was a rather unhelpful TL;DR.
