I like sharing too, but couldn't permissive-only licenses backfire? GPL emerged in an era when proprietary software ruled and companies weren't incentivized to open source. GPL helped ensure software stayed open, which helped it become competitive against the monopoly proprietary giants resting on their laurels. The restriction helped innovation, not the supposedly free market.
He is totally in on AI and that quote of his is self-serving. Can't we go back to flaming Unicode in Python?
they are arguments against any licence, not just LGPL. By his own logic, I could literally plagiarise all his work, claim it's my own "clean-room" implementation, and not give him so much as a mention
and in his own words, he's "not interested" in the morality of it
odd
No doubt, GPL had some influence. But I would hardly single it out as the force that ensured software stayed open. Software stayed open because "information wants to be free" [2], not because some authors wield copyright law like a weapon to be used against corporations.
[1]: https://opensource.com/article/19/4/history-mit-license
[2]: A popular phrase based on a fundamental idea that predates software.
The GPL’s significance was that it changed the default outcome. At a time when software was overwhelmingly proprietary, it created a mechanism that required improvements to remain available to users and developers downstream.
GCC was a massive deal, and a big part of the reason compilers are free today, for example
As I understand it, the US Supreme Court has just this week ruled exactly this. LLM output cannot be copyrighted, so the only part of any piece of software that can be copyrighted is that part that was created by a human.
If you vibe-code the entire thing, it's not copyrightable. And if it can't be copyrighted that means it is in the public domain from the instant it was created and can't be licensed.
Your understanding is incorrect. The case was about whether an LLM can be an author; it did not address whether the person using it can be (which will be the case). https://news.ycombinator.com/item?id=47260110
https://pluralistic.net/2026/03/03/its-a-trap-2/
Quoting from that post:
> At the core of the dispute is a bedrock of copyright law: that copyright is for humans, and humans alone. In legal/technical terms, "copyright inheres at the moment of fixation of a work of human creativity."
Similarly, the operator of the LLM is the holder of the copyright of the LLM’s output.
I don't think this follows? If I vibe code something and never post it anywhere public, I can still license that code to a company and ask them to pay me for using the code?
So as a corollary, the business model of providing software where you can choose either a free (as in beer) but restrictive license (e.g. GPL), or pay money and get a permissive, business-compatible license, will cease to exist.
I think that's a shame actually, because it has been a good way of providing software that does something useful but where large companies that earn money from the use will have to pay the software creator.
There might be a path to this business model via Trade Secrets (you register your source code as a Trade Secret, and sell only binaries).
And, of course, you can still sell support as the paid-for service, which has worked for a lot of people.
I believe you can do that with public domain/copyright free material in general. There is no requirement to tell someone that the material you license them is also available under a different one or that your license is not enforceable.
I don't vibe code; I am firmly in charge of the architecture and code style of my projects, and I frequently give detailed instructions to the AI tools I use. But, to me, this is leading to a weird place. Why would the result of using a tool to create something new not be copyrightable simply due to the specific tool used?
I think this whole hullabaloo is self-inflicted. Code or any other creative work should stand on its merits. There is no issue with copyright and no issue with the ship of Theseus. The current copyright approach is still applicable: code (or any other creative work) that appears to be lifted verbatim from another work could be a copyright violation. Work that is sufficiently original (irrespective of how it was created) is likely not a copyright violation.
I can see there's going to be some huge court fights over this in the next ten years - there's no way some of the big media companies are going to be OK with their content being public domain, and no way are they going to just miss out on being able to produce it so cheaply with an LLM.
My understanding is that only human creativity can be copyrighted. So if you sketched out the plot and got the LLM to write all the words, then only the plot is copyrightable. So someone else can copy all the words, as long as they don't copy your plot.
However, as you point out, someone has to determine which bits the LLM created and which bits you created. If you wrote the whole book, and a tool incorrectly flags your writing as LLM writing, and then someone copies chunks of your book because they believed the tool and assumed they could (and assuming you filed a DMCA claim and they denied it using the tool's output as proof) then there's going to have to be a court case.
I suspect there's going to be a few court cases about this.
These are fascinating, if somewhat scary, times.
https://reorchestrate.com/posts/your-binary-is-no-longer-saf...
Even SaaSS isn't safe from that type of process.
I don't think real AI is around the corner but plenty of people believe it is & they also think they only need a few more data centers to make the fiction into a reality.
So by "real AI" you actually mean artificial superintelligence.
This is not always true, for an extreme example see Indistinguishability obfuscation.
I guess it depends on your intention, but eventually I'm not sure it'll even be possible to keep it "fully proprietary and closed" in the hopes of no one being able to replicate it, which seems to be the main motivation for many to go that road.
If you're shipping something, making it available, others will be able to use it (duh) and therefore replicate it. The barrier to replicating things like this, either together with LLMs or by letting the LLM straight up do it itself with the right harness, seems to be getting lowered real quick; there's a massive difference in just a few years already.
Right now you can point claude at any program and ask it to analyse it, write an architecture document describing all the functionality. Then clear memory and get it to code against that architecture document.
You can't do that as easily with closed source software. Except, if you can read assembly, every program is open source. I suspect we're not far away from LLMs being able to just disassemble any program and do the same thing.
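As a toy analogy using Python's stdlib `dis` module (this is bytecode, not real machine-code disassembly, and the `secret` function is made up for illustration): even when you only have the compiled form, the disassembly listing exposes enough of the operations and constants to rewrite the function from scratch.

```python
import dis
import io

def secret(x):
    # Pretend this is a "closed-source" function shipped only in compiled form.
    return x * 2 + 1

# Dump the bytecode "assembly" of the compiled function into a string.
buf = io.StringIO()
dis.dis(secret, file=buf)
listing = buf.getvalue()

# The listing reveals the argument load and the literal constants 2 and 1 --
# enough to reconstruct the function without ever seeing the source file.
print("LOAD_FAST" in listing)
print("(2)" in listing and "(1)" in listing)
```

An LLM reading a listing like this is doing nothing conceptually different from a human reverse engineer, just much faster.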
Is there a driver in windows that isn't in linux? No problem. Just ask claude to reverse engineer it, write out a document describing exactly how the driver issues commands to the device and what constraints and invariants it needs to hold. Then make a linux driver that works the same way.
Have an old video game you wanna play on your modern computer? No problem. Just get claude to disassemble the whole thing. Then function by function, rewrite it in C. Then port that C code to modern APIs.
It'll be chaos. But I'm quite excited about the possibilities.
I think it’s entirely reasonable to release a test suite under a license that bars using it for AI reimplementation purposes. If someone wants to reimplement your work with a more permissive license, they can certainly do so, but maybe they should put the legwork in to write their own test suite.
And if anything can be reimplemented and there’s no value in the source any more, just the spec or tests, there’s no public-interest reason for any restriction other than completely free, in the GPL sense.
It doesn't if Dan Blanchard spends some tokens on it and then licenses the output as MIT.
LLM companies and increasingly courts view LLM training as fair use, so copyright licensing does not enter the picture.
Even prior to this, relatively simple projects licensed under share alike licenses were in danger of being cloned under either proprietary or more permissive licenses. This project in particular was spared, basically because the LGPL is permissive enough that it was always easier to just comply with the license terms. A full on GPLed project like GCC isn't in danger of an AI being able to clone it anytime soon. Nevermind that it was already cloned under a more permissive license by human coders.
Bikeshedding to eventually come full circle to understand why those decisions were made.
In a world where the large OEMs and bigcorps are increasingly locking down firmware, bootloaders, kernels and the internet, I would think a reappraisal of more enforcement that benefits the USER is paramount.
Instead we have devs looking to tear down the few user protections FLOSS provides and usher in a locked-down, hacker-unfriendly future.
The short version is that chardet is a dependency of requests which is very popular, and you cannot distribute PyInstaller/PyOxidizer builds with chardet due to how these systems bundle up dependencies.
[1]: https://velovix.github.io/post/lgpl-gpl-license-compliance-w...
As I recall, there were some similar situations regarding licences for distro builders with graphics drivers and even MP3 decoders, where there was a song and dance the end user had to go through to legally install them during/after setup.
Or better yet, to make a truly API-compatible re-implementation to use under the license that they want, since what they have done, I surmise, would fall under derivative work. So they haven't really accomplished what they wanted, and instead introduced an unacceptable amount of risk for whoever uses the library going forward.
Kinda reminds me of what the Internet Archive did during the pandemic with the digital lending library: pushing the boundaries to test them and establish precedent. In any case, let's see how it plays out.
IP sounds good in theory but enables things like "patent trolling" by large corps, creating all kinds of goofy barriers and arbitrary questions like the one here: whether re-implementations of ideas are "really ours"
(maybe they were never anyone's in the first place, outside of legally created mentalities)
ideas seem to fundamentally not operate like physical things so asserting they can be considered "property" opens the door for all kinds of absurdities like as pondered in the OP
The problem with IP laws and the US is that the big companies already do what IP is supposed to protect against, and the US refuses to legislate effectively against them.
I don't think Stallman has a real proposal to how innovation can be incentivized and compensated.
Take the example of medical innovations, sure big pharma is bad, but if they don't get to monetize their inventions, how will R&D get funded?
If you destroy IP and allow everyone to clone whatever, you will have a great result in the short term, then no one will continue R&D
By taking the public money that already goes to medical R&D, increased if need be, and hiring scientists to research medical tech in the interest of public wellbeing, not profit.
IP has always had awkward things like, what if you discover the sole treatment for a disease and can restrict people from making use of it... kind of weird, especially when people can "independently" draw the same conclusions so they truly obtain an idea that is "their own" but which then they are legally restricted from making use of in such an example
i would like to see a system of publicly funded R&D.
Let's see it!
Good heavens, that's incredibly unethical. I suppose I should expect nothing more from a profession that has shied away from ethics essentially since its conception.
> I think society is better off when we share
Me too.
> and I consider the GPL to run against that spirit by restricting what can be done with it.
The GPL explicitly allows anyone to do anything with it, apart from not sharing it.
You want me to share with you, but you don't want to share with me.
That's not how copyright works. It doesn't require exact copies. You also can't just rephrase an existing book from scratch when the ideas expressed are essentially the same. Same with music.
You cannot (*) use LLMs to generate code that you then license, whether that license is GPL, MIT or some proprietary mumbo-jumbo.
(*) unless you just lie about this part.
You can't copyright a work that is only generated by a machine: "In February 2022, the Copyright Office’s Review Board issued a final decision affirming the refusal to register a work claimed to be generated with no human involvement"
But human direction of machine processes can be copyright:
"A year later, the Office issued a registration for a comic book incorporating AI-generated material."
and
"In most cases, however, humans will be involved in the creation process, and the work will be copyrightable to the extent that their contributions qualify as authorship. It is axiomatic that ideas or facts themselves are not protectible by copyright law and the Supreme Court has made clear that originality is required, not just time and effort. In Feist Publications, Inc. v. Rural Telephone Service Co., the Court rejected the theory that “sweat of the brow” alone could be sufficient for copyright protection. “To be sure,” the Court further explained, “the requisite level of creativity is extremely low; even a slight amount will suffice."
See https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...
But it will be a shitshow either way.
It's not clear to me how much code you would need to modify by hand to qualify for copyright this way, but that's not an impossible avenue.
Not saying there's a legal precedent for that right now, but it's the only thing that makes any sense to me. Either that or retain the models on only MIT/similarly licenced code or code you have explicit permission to train on.
Let's be honest about what's happening here.
In practice, well ... you saw what's been going on with the Epstein files, etc... we are far from being ourselves in a world that's fair and honorable.
(I'm not condoning it, I think it's massively trashy to steal code like this then pretend you're the good guy because of some super weird mental gymnastics you're doing)
You can do anything rotten, as long as you throw enough money at it.
also how would you prove it was in the training set? re: your last sentence, the licensed work wasn't in the input in the chardet example ("no access to the old source tree")
Also, for comparison, both GPL and LGPL, when applied to software libraries (in the C sense of the word), assert that creating an application by linking with the library creates a derived work (derived from the library), and then they both give the terms that govern that "derived work" (which are reciprocal for GPL but not for LGPL). IANAL but I believe those terms are enforceable, even if the thing made by linking with the library does not meet a legal threshold for being a derived work.
Kinda surprised nobody commented on this
e.g. Somebody wrote a library, and then you had an LLM implement it in a new language.
You didn't come up with the idea for whatever the library does, and you didn't "perform" the new implementation. You're neither writer nor performer, just the person who requested a new performance. You're basically a club owner who hired a band to cover some tunes. There's a lot involved in running a club, just like there's a fair bit involved in operating a LLM, but none of that gives you rights over the "composition". If you want to make money off of that performance, you need to pay the writer and/or satisfy whatever terms and conditions they've made the library available under.
IANAL, so I don't even know what species of worms are inside this can I've opened up. It seems sensible, to me, that running somebody else's work through a LLM shouldn't give you something that you can then claim complete control over.
---------
Edit: For the sake of this argument, let's pretend we're somewhere with sensible music copyright laws, and not the weird piano-roll derived lunacy that currently exists in the U.S..
- one for the composition, the musical idea, music, lyrics.
- one for the recording, the music taking shape in a format that someone can listen to
I don't think this is how software licenses work, as they cover the code itself, rather than the ideas (the specific recording rather than the composition, in the music example), but it's an interesting way to frame why using LLM this way is, if not illegal, at least unethical.
This is a head-spinning argument. The whole point of GPL is to force more things out into the open. You'd think someone who espouses open source would cheer the GPL. The only practical difference between MIT and GPL is that the former allows more closed-source code.
This feels analogous to the paradox of freedom. Truly unlimited freedom would include the freedom to oppress others, so "freedom maximalism" is an unsound philosophy (unless applied solipsistically).
When I publish, I tend to do so under MIT. I also write plenty of closed-source code. And I do generally believe in open source. But I don't use that as a justification for preferring MIT. If anything, I like MIT despite believing in open source, not because. Mainly because I want people to actually use what I wrote.
In the example given and discussed here over the last couple of days, the process seems more akin to having an AI create a cast of the pre-existing work and fill it for the new one.
How would I defend myself against hostile entities and societal norms that make it OK to steal from me and my effort without compensation? I will close my doors, put up walls, and distrust more often.
That's clearly the trend the world is going towards, and I don't see that changing until we find some way to make it cheaper to detect deception and parasitic behavior, along with holding said entities accountable. Since our world leaders have a history of unaccountable leadership, and they are the ones who model this behavior, I have difficulty seeing the norms change without drastic worldwide leadership change.
Imagine doing the same with vehicle engines. Less fuel consumption, less pollution, less weight and who knows how many more benefits.
Just letting the A.I. do it by itself is sloppy, though. The real benefit comes only when the resulting port is of equal or better quality than the original. It needs a more systematic approach, with a human in the loop and good tools to index and select data from both codebases, the original and the ported one. The tools haven't been invented yet, but we will get there.
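A minimal sketch of the kind of indexing tool that comment imagines, using only the Python stdlib and entirely hypothetical source snippets: compare the function inventories of the original and the ported codebase so a human can review what the port still lacks.

```python
import ast

def function_names(source: str) -> set:
    """Collect the names of all functions defined in a source string."""
    tree = ast.parse(source)
    return {node.name for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))}

# Hypothetical stand-ins for the original and the ported codebase.
original = """
def detect(data): ...
def feed(chunk): ...
def close(): ...
"""
ported = """
def detect(data): ...
def close(): ...
"""

# Functions the port has not covered yet -- a human reviews this list.
missing = function_names(original) - function_names(ported)
print(sorted(missing))  # ['feed']
```

A real tool would of course index far more than names (signatures, call graphs, test coverage), but even this crude diff gives the human in the loop something concrete to check.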
What if you ask the tool “come up with an idea and build it” and it makes you an (obviously) derivative app? Or what if (closer to this post) you say “copy this thing, but differently so we don’t get into legal trouble”? Is any of those an “original thought” worthy of ownership of the output?
Further, what if this tool can reproduce these forbidden things almost or completely verbatim and the user of the tool has no way to verify it?
Think of software development as finding a structural path from point A to point D.
1. The Foundational Gateway (A → B): You are correct that AI tools are an amalgam of existing data. This foundational layer (A-B) represents the "Prior Art" or the existing IP that serves as a necessary gateway for any further development. If the path starts here, the rights of the original creators must be respected through the established legal framework of Intellectual Property Offices.
2. The Innovative Branch (F → D): However, if an orchestrator uses a tool to forge a new path via a distinct architecture (F) to reach the destination (D), that specific "delta" is a unique intellectual asset. Even if the tool "borrows" the bricks, the topological map of the new architecture belongs to the thinker who directed it.
3. The Necessity of Cross-Licensing: This is where the true core of IP exists. If the owner of the foundation (A-B) wishes to utilize the superior, optimized results of the new path (ABFD), they must respect the IP of the FD architecture. Conversely, the FD creator must acknowledge the base.
We aren't just talking about 'verbatim reproduction' of code; we are talking about the Systemic Design that justifies the existence of IP offices worldwide. The future isn't about "cleaning" licenses through AI, but about a more sophisticated world of Cross-Licensing where the foundational layer and the innovative layer recognize each other's functional logic.
Assuming that you are a programmer, when you think back to your contract, you will have noticed something like "The employee agrees that any works created during employment will be solely owned by $company_name"
Copyright _should_ be about allowing workers to make money from the non-physical stuff that they produce.
Google spent many, many millions undermining that so they could run YouTube, the news service and Google Books (amongst other things).
Disney bought most of congress to do the opposite.
At its heart, copyright is a tool that allows you and me to make a living. However, it's evolved into a system that allows large corporations to make and hold monopolies.
Now that large corporations can see an opportunity to cut employees out of the system entirely, they are quite happy with AI companies undermining copyright, just so long as they can keep charging for auto generated content.
TLDR: copyright is automatically assigned to the creator of the specific work, not the thinker.
ie thinker: "build me a box with two yellow rabbit ears"
The text is copyright of the "thinker"
maker: builds a box with yellow rabbit ears. Unless the yellow rabbit ears are a specific and recognisable element of the thinker's work, it's not infringement.
> © Copyright 2026 by Armin Ronacher.
Oooohkaaaay?
Good term.
For myself, I tend to have a similar view as the author (I publish MIT on most of my work), but it’s not really something I’m zealous about, and I’m not really into “slopforking” the work of others. I tend to prefer reinventing the wheel.
Not ship of Theseus, but a "new implementation from the ground up".
Evidently, the author prefers MIT (https://github.com/chardet/chardet/issues/327#issuecomment-4...), and seems OK with slop-coding.
It also just feels a little nefarious. There isn't much reason to change between those licenses in question beyond to allow it to be more tightly integrated into something commercial and closed-source. In which case, having an LLM write a compatible rewrite _in a new project_ seems reasonable at the current moment in time. It's this intentional overriding of the original intentions, seemingly _for profit_ as well, that is the grossest part, because the alternatives are just so easy and common.
Interestingly, that's also the exact same spot I stopped reading.
The dilution of morals weakens societies. We ignore them at our own peril, the planet and most certainly any god figure doesn’t care.
And thus we arrive at the absolute shit state the world is in. We keep putting morality aside for something “more interesting” then forget to consider it back in when making the final point.
“Have you tried: “kill all the poor?””
It's not only likely; it is in fact the current position, at least in the US.