Cessation of public development of Kefir C compiler

(kefir.protopopov.lv)

141 points · f311a · 22d ago · 134 comments

62 comments · 17 top-level

kator · 22d ago · 14 in thread

> Yet, this shift made me re-evaluate the open source code publishing. Prior to that, I have been positive about free and open software, and considered this to be the default mode for work such as kefir. I did not require any justifications from myself to publish something. Now, however, I feel more and more that the main beneficiaries of my unpaid work are companies scraping the internet to train large language models. Currently accepted status quo in this area goes against my own intentions in licensing this work under GNU GPLv3. Publication has ceased to be the "null hypothesis" for me, and requires explicit mental justification which I am not able to provide.

I feel this pain, one of my small donation driven sites has been destroyed by crawlers who just ignore robots.txt and burn the site into the ground.

Sort of jokingly I proposed an update to the "spam fax" law:

https://www.karlbunch.com/random/website-protection-act/

account42 · 22d ago

This is essentially the digital world transforming from a high trust society into a low trust one. Sad to see.

pibaker · 21d ago

I don't think the digital world was ever high trust. I mean, everyone above a certain age is trained to never click on the biggest download button on a page and to uncheck any checkboxes during an installation. A certain open source forge used to bundle malware in downloads. You can't walk two steps without hitting cloudflare. All email providers consider random VPS IP ranges to be spam farms. All web servers with public IPs must be up to date or you get pwned instantly and assimilated into a bot farm.

I can go on and on about how many safety measures we have been taking online for ages and how little trust we have for anything that comes through an Ethernet port. I have never needed such levels of vigilance in real life even though I live somewhere with higher crime rates compared to the national average.

oooyay · 22d ago

Not even just digital; much of the world is shifting from high trust to low trust as well: https://social.desa.un.org/sites/default/files/inline-files/...

lenerdenator · 22d ago

There are currently a lot of people in the upper echelons of our society who repeatedly and vigorously abuse the high-trust digital world.

We based all of this on gentlemen's agreements and handshakes. That let quite a few people get only very wealthy, instead of hyper-wealthy. Thus those agreements have to be shredded.

AP mentions this in the link:

> Section 227(g)(4). Enforcement. Statutory damages of not less than $500 per server request made in violation of this section, consistent with the per-violation damages established under the original Act for unsolicited facsimile transmissions.

While this is at least something, it's not going to dissuade a startup from doing this sort of thing. They'll find ways to hide the origin of traffic, or just soak up the costs with more VC money.

You need to start throwing people in prison for long periods of time (10+ years) for this sort of thing to stick.

throw10920 · 22d ago

This is kind of crazy. The digital world has never been a "society" except perhaps for the first few years after ARPANET was invented, and it certainly hasn't been a high-trust one for almost as long - we've had spam filters, user account registration required to comment, various authentication methods, moderation, and various things you get in a low-trust environment for decades now. To think otherwise is a bit delusional.

Gormo · 22d ago

To whom would you attribute the greater part of that reduction in trust: the people using FOSS to train LLMs, or the people trying to block them?

jagged-chisel · 22d ago

> The sender pays, not the receiver.

You have a hole here. Your web server is sending the response and the bot is receiving.

Fix that and … profit? :-)

wizzwizz4 · 22d ago

I'm trying to compose a better wording, but my attempts aren't working. The best I've got is:

> The initiator of the communication pays, not the server operator.

kator · 22d ago

oh good point got that backwards… OMG my fax brain didn’t even think about it.

malwrar · 22d ago

Really hate to say it, but I’ve stopped publishing my work too for this reason. I spend most of my time now building my own little software ark, and I aspire to no longer think of programming in the next few years. I feel like the creative economy in general will be unrecognizable in the near future, maybe nonexistent. I wonder what modes of collaboration on ideas might form in the next few years.

irdc · 22d ago

Here is what the purveyors of AI don't seem to realise. You can bend copyright law all you want in order to train your models on whatever you can grab, but in the absence of genuine protection of their creative work authors are simply not going to be publishing at all.

kator · 22d ago

The sad thing is I feel trapped on all sides of the debate, I wrote a book about LLMs and human creativity (spoiler Humans win for a long time) but I was going to do it as a blog series, instead I published https://www.amazon.com/dp/B0GXCSY4W8 because I felt at least I might get a bit back for literally 100’s of hours of my life I poured into the book and my editor and friends who read and provided reviews.

And I push a lot of open source code including a ton for the SWGEmu project, but now I’m of mixed mind to stop pushing anything public. I can’t decide, am I talking out of both sides of my mouth, it’s a confusing time to navigate for sure.

lelanthran · 22d ago

> Really hate to say it, but I’ve stopped publishing my work too for this reason.

Me too; not that I've published a lot, but definitely more than most. That won't be happening anymore.

garaetjjte · 22d ago

Incredibly rich to complain about LLM scraping with an LLM-generated article.

bjourne · 22d ago · 12 in thread

People taking your work and not giving anything back was ALWAYS the risk you took when writing free software. LLM training doesn't change that much. That the US military is no doubt using GCC to compile embedded software for its ICBMs no doubt irks the GNU people. But you can't have it any other way. "You can only use my software for good things" just is not consistent with "free software".

Gormo · 22d ago

Yeah, I really can't comprehend these sentiments as anything other than an "I don't like AI" argument. FOSS has always been about just writing code and putting it out into the world where others can do as they please with it.

I see a lot of risks involved in people surrendering their own decision-making to LLMs, but that's a question of how they're used, not how they're trained. The idea that using FOSS software to train LLMs is somehow a violation of FOSS norms just doesn't seem valid.

imtringued · 20d ago

That's just the licensing part. The license says something, but a license doesn't turn people into slaves. The desire or decision to produce software has to come first and only then does code with a license exist.

Before AI and in the early days of FOSS, people assumed that the primary recipients of code sharing were other FOSS enthusiasts, in the form of developers and users.

Then there was a wave of permissive licensing, which obviously brought with it corporate interests, however, this was easily foreseeable and many people who favored permissive licensing intentionally did so to appeal to corporate users, so the risk of them quitting due to perceived abuse was slim.

Now that LLMs are a thing, the primary recipient of a lone developer working on his project isn't really another human being. This human connection is now lost. Instead, your project is now laundered through the model and the model vendor can get away with ignoring your terms and conditions and let others write proprietary software.

In this transition period there were developers who thought that there was always going to be a human connection (even if part of a corporation), but then things changed and they realized their world view was wrong. Given the arrival of this new information, they obviously change their behavior in accordance with how the world actually is.

lelanthran · 22d ago

> FOSS has always been about just writing code and putting it out into the world where others can do as they please with it.

That is wrong. How can you write that with a straight face? There are projects that are put into the public domain (one major one comes to mind), but the clear majority of FOSS projects have strings attached which make the intention of the authors absolutely clear.

IOW, if you're not happy with what the cost of the product is, then just don't use it.

xigoi · 22d ago

> FOSS has always been about just writing code and putting it out into the world where others can do as they please with it.

Not true. Most FOSS licenses require attribution and many require derivatives to be released under the same license.

TheOtherHobbes · 22d ago

There's an almost intergalactic level of irony in the extent to which open source has benefited giant corporations and the military at the expense of individuals, and ultimately contributed to the commercialised enclosure of software IP.

I suppose you could argue it also indirectly led to the empowerment of non-developers to create their own vibe coded solutions. But we're not quite there yet.

And the AI IP that makes that possible is still enclosed rather than open.

bjourne · 22d ago

Sure, Free Software hasn't been the vehicle for societal change that RMS and others certainly hoped. I remember being flamed out in a user group for suggesting that our conference shouldn't be held in a "non-free" country such as Morocco, Turkey, or China because it's counter-productive to freedom. Very few people actually got it. But it's orthogonal to LLM trainers also using free software in "non-approved" ways.

Gormo · 22d ago

> There's an almost intergalactic level of irony in the extent to which open source has benefited giant corporations and the military at the expense of individuals, and ultimately contributed to the commercialised enclosure of software IP.

Could you perhaps explain that irony a bit more explicitly?

Can you provide any examples of "commercialized enclosure of software IP" somehow backwashing into the FOSS ecosystem and closing things up that are already open?

nine_k · 22d ago

Aren't open-weight models sort of returning the favor?

LtWorf · 22d ago

> it also indirectly led to the empowerment of non-developers to create their own vibe coded solutions.

Nobody is empowered to do that because the models to do that aren't free.

fragmede · 22d ago

> But we're not quite there yet.

Judging from the number of projects I've seen from people who aren't software developers, we're there enough.

xigoi · 22d ago

Before LLMs, you could use the GNU GPL or other copyleft licenses to protect your code from being used to develop non-free software. Unfortunately, the courts have decided that LLMs are free to ignore licenses.

bjourne · 22d ago

Copyleft is about republishing. You can't prevent anyone from using your compiler or text editor to develop non-free software.

rgoulter · 22d ago · 7 in thread

Seems to me LLMs have changed some things. I'm not sure how it's best put, but it used to be:

- Seeing code (or a blogpost or whatever) was the result of effort where thought had gone into it. The writer paid effort so the reader didn't have to.

- There'd be some level of attachment to what you've put effort into.

With LLMs, that's undermined: it's easy to produce thoughtless imitations. Code or comments where thought didn't go into it. So, seeing some result isn't an indication of skill, but also not even an indication thought went into it.

I guess there's still something lost if someone isn't going to share code they've put thought into. -- But on the other hand, if it's just for me & I don't have to share it with a wider audience, getting LLMs to write out code isn't so expensive.. so code itself isn't necessarily something to value so much.

irdc · 22d ago

But LLMs don’t seem particularly good at inventing new ways to code (or write, or…). It’s literally all derivative. So what happens in 10 years? Are we headed for a great stagnation?

rgoulter · 22d ago

> But LLMs don’t seem particularly good at inventing new ways to code (or write, or…). It’s literally all derivative.

I think the key part is how much thought goes into something.

Optimistically, LLMs are good at taking unstructured input, and (probably) producing the intended output from that. -- This allows for an interesting new way of coding: a set of instructions doesn't need to be as rigorous as a shell script, but can be natural language.

That part surely extends creativity. An LLM will be familiar with domain ideas I'm not, even if an LLM is completely disinterested in doing things.

Pessimistically, I think it's still not clear what the right way of interacting online with all of this is (other than clear expectations of "no AI")... in some sense LLM output is worthless to share, in the sense that I'm just as capable of asking the LLM to output something as anyone else is.

mswphd · 21d ago

Looking at LLMs' applications to math might be instructive. A year ago when they had some preliminary claims/results, people would hypothesize they had the answer implicit in their training data (and so were being "better search", but fundamentally doing derivative work). This may have been substantiated sometimes, I forget.

Recently the tune has changed somewhat, say with LLMs' approaches to Erdős problems (and in particular the unit distance problem. The LLM solution here spurred progress on another large problem, namely https://arxiv.org/abs/2605.28781 ). There have been no claims that the LLM's work on the unit distance problem was derivative, and I've seen mathematicians claim it would have been accepted to a top journal (say Annals).

In spite of this, the capabilities of LLMs within mathematics are still limited. LLMs seem decent at

1. "constructions", e.g. where you claim \exists object with certain properties. It can help if the verification that the object has these properties is efficiently computable, but I don't believe this sort of verification was used for the unit distance problem.

There are other areas of math that LLMs so far are less adapted to, for example

2. impossibility results, or showing \lnot \exists object with certain properties, or

3. "abstraction building". Often in math results become much easier to obtain if you have "the right definition". Grothendieck was famous for this, as is e.g. Scholze currently.

These claims are based off of current public results via LLMs. It's possible capabilities will develop further. But also, in hindsight, it is natural that LLMs would be better at the thing they ended up being good at.

I'm unsure if there is a way of extracting from this insights to programming/writing. Plausibly, you could see LLMs' development of PoC exploits as similar to (1) but for computer science. It is a concrete "construction" that is efficiently verifiable. (2) would suggest trivial observations that it would be hard for an LLM to show that a program does not have vulnerabilities. I'm not sure if there are less trivial observations. Finally, (3) might be what you're bemoaning. In simple language, it would currently be surprising if LLMs could create useful, novel design patterns/abstractions.

uncircle · 22d ago

Let LLMs ingest their own output; everything past 2022 will be increasingly hallucinatory self-regurgitation.

multjoy · 22d ago

That’s because they cannot invent anything. They’re reductive, not creative.

dzhiurgis · 22d ago

It’s like arguing that nobody is going to invent new ways to ride horses in the age of the automobile.

f6v · 22d ago

I don't know... I've been writing code for a good twenty years (15 professionally).

First, I think it's the best time to write software since so much boring stuff can be automated. I can put my thoughts into what I'm trying to achieve instead of how. To put it otherwise, I think about big picture much more than about mundane details like dealing with particularities of a programming language.

Second, most people were using SO to solve just about any issue they had. The number of developers producing truly original code was minimal even 10 years ago.

Rochus · 22d ago · 4 in thread

I have many GPL projects (e.g. https://github.com/rochus-keller/Oberon, https://github.com/rochus-keller/Luon, https://github.com/rochus-keller/Micron) and spend a significant amount of time in them. GPL has always explicitly permitted commercial use; that's a feature, not a bug, dating back to Stallman's original vision. Any person or company can use my code (or Kefir code) under the terms of the GPL, as I use code given away by companies under GPL or even more liberal licences for free. That's the deal. GPL is a license explicitly designed to maximize use, so it doesn't make sense to object to a specific form of use. The claim that AI companies are somehow violating GPL by training on GPL code is legally baseless (I studied law here in Switzerland and had lectures about international IP law); also the FSF itself has not claimed otherwise; even if it were prohibited, it would be a copyright enforcement problem, and not a reason to stop publishing. I don't know Kefir, but it looks like a great (even optimizing) compiler. So it's really a pity that its development is no longer open source.

ergonaught · 22d ago

The GPL, unlike the BSD and such, intends to prevent the closing of distributed derivative works. LLMs trained on GPL code can produce derivative works without any enforcement mechanism.

You may be fine with that, but the GPL is not a public domain license, and LLM training treats all things as if they were public domain.

Rochus · 22d ago

> LLMs trained on GPL code can produce derivative works

This confuses two completely separate things. GPL governs distribution of derivative works. An LLM trained on GPL code does not distribute that code. The model weights are not a copy, a derivative, or a distribution of the training data in any legally recognizable sense; "influenced by" is not "derived from". The enforcement argument is a non sequitur; the GPL has never had a technical enforcement mechanism; it's always been legally enforced after the fact by copyright holders who discover violations. So if the LLM would indeed produce output sufficiently similar to my code and someone would publish it in violation of GPL, I have the same legal means to enforce my rights as if the code was copied by a human.

myrmidon · 20d ago

> GPL is a license explicitly designed to maximize use

I feel this is a misrepresentation. GPL rather seems designed to maximize source availability for users.

But mandatory public source availability does make selling software products more difficult ("why would anyone pay if they can just use the source"), which is why most commercial software products still sell and ship binaries when they can.

Rochus · 20d ago

> designed to maximize source availability

Right. It depends on what you mean by "use"; GPL maximizes use in the sense that it prevents anyone from taking the code proprietary and thereby restricting future users' access. But it doesn't touch my actual point, which is that GPL explicitly permits commercial use, broad distribution, and also LLM training (none of which are restricted by the license). The source availability requirement is the condition, not a restriction on who can use the code.

> why would anyone pay if they can just use the source

Red Hat, Qt, and countless others have built commercial businesses on GPL code. So apparently there is a business and people willing to pay even if the source code is available. But that was not my point anyway.

Max-Ganz-II · 22d ago · 3 in thread

I put my site behind a username/password wall, to block LLM bots.

Xirdus · 22d ago

Spambots learned to autoregister 30 years ago. Do LLMs not do that? Crazy.

Max-Ganz-II · 22d ago

User has to email me for access.

krystalgamer · 22d ago

same, not worth having 100GB of content getting scraped every other day.

binaryturtle · 22d ago · 1 in thread

I'm also very hesitant to release any new works (code, artworks, etc.) to the public. I usually release code under the GPL or AGPL, but I don't think any of those choices are properly respected by the AI crawlers, and subsequent "mixing into" those models.

Multiple times I got partially broken "citations" of GPL licensed code out of the models as answers to basic research questions (aka prompts) w/o any mention of the original license applied to the code. Just adding some random bugs every 10th line doesn't make it not a direct derivative. Image generators happily generated Sonics or Bart Simpsons (w/o directly prompting for that either). No mentions that those are copyrighted characters either.

Lerc · 21d ago

I have gone the other way, I used to release things under MIT licence, but have switched to public domain or unlicenced.

I mostly make things because I felt they should be made. I am fine with what I produce being used by others provided they don't take it away from anyone else.

I was never very happy with the selfishness of the GPL, which is why I tended to prefer MIT, but the stances taken by people in recent years made me realise that nobody owns ideas, and even attribution is commoditised.

I am ok with voluntary attribution so that it may be used as a means to confirm additional information. I don't like the idea that if I think of something, someone else is not allowed to think about it without my permission.

Citation farming is a problem that happened because the value of the idea was placed on the names attached to it. That generated motivation to attach names to ideas as a way to gain power or prestige. Taking credit for someone else's idea can only occur because people have put the credit value onto the person and not the idea. Many of those names are of no use when it comes to verifying if the idea is sound; it's creating a denial of service attack on the ability to validate.

I understand the realities of commerce and academia that put these things in place, and how those who work within those frameworks have to do so in a way that is compatible with them.

I don't like it though, I think it makes the world less informed and less free. I don't have to create under those frameworks myself, so I made the decision to make any idea I have to not be bound to my will or identity.

rurban · 22d ago · 1 in thread

One of the very few small compilers which passes the full gcc torture tests. But for me kefir is good enough as the reference small compiler. Not as fast as tcc, but more correct

paufernandez · 22d ago

I've been taking a look at the source and it's a work of art :O

altmanaltman · 22d ago · 1 in thread

What a well-rounded, nicely written announcement that touches on all parts of the argument without any rage baiting or flex etc. It would be easy to just ramble against AI and how it's the end of the world etc., but the author focused on a point that's not even related to use or misuse of AI in software but rather how we have made it acceptable that large corporate companies can skirt copyright without any issue and make rivers of money with it. This problem extends not only to coding but other industries as well.

snarfy · 22d ago

Instead of a derivative work we have a machine that creates derivative works. I fail to see how this is fair use.

RetroTechie · 22d ago · 1 in thread

So how big is the community around this project?

If a one-person show, closing it up would effectively kill it? Or (re?)turn it into a hobby project developed at a snail's pace.

If some community exists: fork coming up?

tocariimaa · 22d ago

One person show. Effectively, it is dead since now it became the proprietary toy of its author. The author is entitled to do what he wants with his own creation, however.

sneak · 22d ago · 1 in thread

> I also do not want my future work to be exploited for naught in commercial purposes.

Other people using your code to enrich their lives or businesses doesn't exploit you in any way, as it doesn't cost you a thing. This is irrational.

CamperBob2 · 22d ago

Also irrational because just as others benefit from his code, he benefits from theirs. LLMs fulfill the promise of Open Source, they don't violate it.

As long as they are universally available, that is. That's the part people should be concerned about.

keyle · 22d ago

> This project in particular has been unconcerned with new coding practices so far, primarily, because I derive pleasure from hand-written implementations of my ideas, and believe that overcoming challenges the hard way is the main value I get from it.

This is 100% the same for me. Outside of work where speed is more important than quality, and I work with people that use AI, I don't use AI at all on my own projects. It poisons the mind and the soul. Ok that sounds dramatic, but I felt down up until the point where I started hand writing everything again. Software engineering is still fun and powerful, and the hell with where the world is going.

genxy · 22d ago

Surprised no one has yet linked to the source https://sr.ht/~jprotopopov/kefir/

nianderwallace · 22d ago

People in other professions are jumping on this bandwagon - Tony Gilroy decided not to publish Andor TV show scripts to prevent AI companies using them for training.

see https://variety.com/2025/tv/news/andor-creator-refuses-publi...

turtleyacht · 22d ago

It was nice hearing about it. If this is a healthy direction for the project, then so be it. At least source to previous versions is still available.

kazinator · 22d ago

I'm finding it hard to be motivated to continue on language dev work. I feel it may also have to do with AI. Not so much the predatory aspect of it, like this author, but something else: shall we say, certain revelations about the nature of the target audience.

fithisux · 22d ago

Same situation some time ago with Solar assembler

ryanshrott · 22d ago

The gcc torture tests are no joke. I skimmed them once thinking I’d write a toy C compiler. Thousands of test cases covering edge cases I’d never even thought about. Respect to anyone who gets through the full suite.

j / k navigate · click thread line to collapse

134 comments

62 comments · 17 top-level

kator22d ago· 14 in thread

I feel this pain, one of my small donation driven sites has been destroyed by crawlers who just ignore robots.txt and burn the site into the ground.

Sort of jokingly I proposed an update to the "spam fax" law:

https://www.karlbunch.com/random/website-protection-act/

account4222d ago

This is essentially the digital world transforming from a high trust society into a low trust one. Sad to see.

pibaker21d ago

oooyay22d ago

Not even just digital; much of the world is shifting from high trust to low trust as well: https://social.desa.un.org/sites/default/files/inline-files/...

lenerdenator22d ago

There are currently a lot of people in the upper echelons of our society who repeatedly and vigorously abuse the high-trust digital world.

We based all of this on gentlemen's agreements and handshakes. That let quite a few people get only very wealthy, instead of hyper-wealthy. Thus those agreements have to be shredded.

AP mentions this in the link:

While this is at least something, it's not going to dissuade a startup from doing this sort of thing. They'll find ways to hide the origin of traffic, or just soak up the costs with more VC money.

You need to start throwing people in prison for long periods of time (10+ years) for this sort of thing to stick.

throw1092022d ago

Gormo22d ago

To whom would you attribute the greater part of that reduction in trust: the people using FOSS to train LLMs, or the people trying to block them?

2 more replies

jagged-chisel22d ago

> The sender pays, not the receiver.

You have a hole here. Your web server is sending the response and the bot is receiving.

Fix that and … profit? :-)

wizzwizz422d ago

I'm trying to compose a better wording, but my attempts aren't working. The best I've got is:

> The initiator of the communication pays, not the server operator.

kator22d ago

oh good point got that backwards… OMG my fax brain didn’t even think about it.

malwrar22d ago

irdc22d ago

3 more replies

kator22d ago

1 more reply

lelanthran22d ago

> Really hate to say it, but I’ve stopped publishing my work too for this reason.

Me too; not that I've published a lot, but definitely more than most. That won't be happening anymore.

garaetjjte22d ago

Incredibly rich to complain about LLM scraping with LLM generated article.

bjourne22d ago· 12 in thread

Gormo22d ago

imtringued20d ago

Before AI and in the early days of FOSS, people assumed that the primary recipient of code sharing were other FOSS enthusiasts, in the form of developers and users.

lelanthran22d ago

> FOSS has always been about just writing code and putting it out into the world where others can do as they please with it.

IOW, if you're not happy with what the cost of the product is, then just don't use it.

1 more reply

xigoi22d ago

> FOSS has always been about just writing code and putting it out into the world where others can do as they please with it.

Not true. Most FOSS licenses require attribution and many require derivatives to be released under the same license.

1 more reply

TheOtherHobbes22d ago

I suppose you could argue it also indirectly led to the empowerment of non-developers to create their own vibe coded solutions. But we're not quite there yet.

And the AI IP that makes that possible is still enclosed rather than open.

bjourne22d ago

Gormo22d ago

Could you perhaps explain that irony a bit more explicitly?

Can you provide any examples of "commercialized enclosure of software IP" somehow backwashing into the FOSS ecosystem and closing things up that are already open?

nine_k22d ago

Don't open-weight models sort of returning the favor?

LtWorf22d ago

> it also indirectly led to the empowerment of non-developers to create their own vibe coded solutions.

Nobody is empowered to do that because the models to do that aren't free.

fragmede22d ago

> But we're not quite there yet.

Judging from the number of projects I've seen from people who aren't software developers, we're there enough.

xigoi22d ago

bjourne22d ago

Copyleft is about republishing. You can't prevent anyone from using your compiler or text editor to develop non-free software.

rgoulter22d ago· 7 in thread

Seems to me LLMs have changed some things. I'm not sure how it's best put, but it used to be:

- Seeing code (or a blogpost or whatever) was a result from effort where thought had gone into it. The writer paid effort so the reader didn't have to.

- There'd be some level of attachment to what you've put effort into.

irdc22d ago

But LLMs don’t seem particularly good at inventing new ways to code (or write, or…). It’s literally all derivative. So what happens in 10 years? Are we headed for a great stagnation?

rgoulter22d ago

> But LLMs don’t seem particularly good at inventing new ways to code (or write, or…). It’s literally all derivative.

I think the key part is how much thought goes into something.

That part surely extends creativity. An LLM will be familiar with domain ideas I'm not, even if an LLM is completely disinterested in doing things.

mswphd21d ago

In spite of this, the capabilities of LLMs within mathematics are still limited. LLMs seem decent at

There are other areas of math that LLMs so far are less adapted to, for example

2. impossibility results, or showing \lnot \exists object with certain properties, or

3. "abstraction building". Often in math results become much easier to obtain if you have "the right definition". Grothendiek was famous for this, as is e.g. Scholze currently.

uncircle22d ago

Let LLMs ingest their own output, and everything past 2022 will be increasingly hallucinatory self-regurgitation.

multjoy22d ago

That’s because they cannot invent anything. They’re reductive, not creative.

dzhiurgis22d ago

It’s like arguing that nobody is going to invent new ways to ride horses in the age of the automobile.


f6v22d ago

I don't know... I've been writing code for a good twenty years (15 professionally).

Second, most people were using SO to solve just about any issue they had. The number of developers producing truly original code was minimal even 10 years ago.

Rochus22d ago· 4 in thread

ergonaught22d ago

The GPL, unlike the BSD and such, intends to prevent the closing of distributed derivative works. LLMs trained on GPL code can produce derivative works without any enforcement mechanism.

You may be fine with that, but the GPL is not a public domain license, and LLM training treats all things as if they were public domain.

Rochus22d ago

> LLMs trained on GPL code can produce derivative works


myrmidon20d ago

> GPL is a license explicitly designed to maximize use

I feel this is a misrepresentation. GPL rather seems designed to maximize source availability for users.

Rochus20d ago

> designed to maximize source availability

> why would anyone pay if they can just use the source


Max-Ganz-II22d ago· 3 in thread

I put my site behind a username/password wall, to block LLM bots.

Xirdus22d ago

Spambots learned to autoregister 30 years ago. Do LLMs not do that? Crazy.

Max-Ganz-II22d ago

User has to email me for access.


krystalgamer22d ago

same, not worth getting 100GB of content scraped every other day.

binaryturtle22d ago· 1 in thread

Lerc21d ago

I have gone the other way, I used to release things under MIT licence, but have switched to public domain or unlicenced.

I mostly make things because I felt they should be made. I am fine with what I produce being used by others provided they don't take it away from anyone else.

I understand the realities of commerce and academia that put these things in place, and how those who work within those frameworks have to do so in a way that is compatible with them.

rurban22d ago· 1 in thread

One of the very few small compilers which passes the full GCC torture tests. But for me kefir is good enough as the reference small compiler. Not as fast as tcc, but more correct.

paufernandez22d ago

I've been taking a look at the source and it's a work of art :O

altmanaltman22d ago· 1 in thread

snarfy22d ago

Instead of a derivative work we have a machine that creates derivative works. I fail to see how this is fair use.

RetroTechie22d ago· 1 in thread

So how big is the community around this project?

If it's a one-person show, would closing it up effectively kill it? Or (re?)turn it into a hobby project developed at a snail's pace?

If some community exists: fork coming up?

tocariimaa22d ago

One-person show. Effectively, it is dead, since it has now become the proprietary toy of its author. The author is entitled to do what he wants with his own creation, however.

sneak22d ago· 1 in thread

> I also do not want my future work to be exploited for naught in commercial purposes.

Other people using your code to enrich their lives or businesses doesn't exploit you in any way, as it doesn't cost you a thing. This is irrational.

CamperBob222d ago

Also irrational because just as others benefit from his code, he benefits from theirs. LLMs fulfill the promise of Open Source, they don't violate it.

As long as they are universally available, that is. That's the part people should be concerned about.

keyle22d ago

> This project in particular has been unconcerned with new coding practices so far, primarily, because I derive pleasure from hand-written implementations of my ideas, and believe that overcoming challenges the hard way is the main value I get from it.

genxy22d ago

Surprised no one has yet linked to the source https://sr.ht/~jprotopopov/kefir/

nianderwallace22d ago

People in other professions are jumping on this bandwagon - Tony Gilroy decided not to publish Andor TV show scripts to prevent AI companies using them for training.

see https://variety.com/2025/tv/news/andor-creator-refuses-publi...

turtleyacht22d ago

It was nice hearing about it. If this is a healthy direction for the project, then so be it. At least source to previous versions is still available.

kazinator22d ago

fithisux22d ago

Same situation some time ago with Solar assembler

ryanshrott22d ago
