Did Claude increase bugs in rsync?

(alexispurslane.github.io)

512 points · logicprog · 17d ago · 567 comments

217 comments · 68 top-level

thorum16d ago· 19 in thread

Unfortunately for the people mad about this, I predict the only thing they will accomplish by pressuring the rsync maintainers is to discourage everyone else from responsibly disclosing their use of AI. You’re just going to make people disable Claude attribution on their commits to avoid drama.

zzyzxd16d ago

I never care about AI usage disclosure, because I don't believe that human produced code is necessarily better than AI produced code, unless it's someone I personally know.

People need to be responsible for code they commit and push anyways. This has never changed. Whether the code is written by hand, by their cat walking over keyboard, or by AI, is not my concern.

A project's code quality can decline for all kinds of reasons. I don't think it's productive to laser-focus on whether it's produced by AI or not. That's a distraction. If a person just wants to find an excuse to criticize AI, and another person wants to fight back and defend AI, sure, go for it. But that's not how you would want to assess a project's code quality.

calvinmorrison16d ago

something as simple as requiring sign-offs like the DCO may be relevant to people who care. I do think the drive-by stuff may get smaller. People don't need to get stuff upstream. I have lots of patches I am keeping downstream, and instead have a trigger system: when new package updates drop into Debian, I rebuild the package with my patches on top using quill. Other systems like Gentoo basically always supported this flow.

So - why bother forking or going upstream? Maybe it's selfish. I think publishing the patches is cool, but I feel less of a need to force other people into doing what I want or even writing every possible configuration or solution. I just hack it for me.

delusional16d ago

> People need to be responsible for code they commit and push anyways.

Well the GPL (which rsync is licensed under) says: "This program comes with ABSOLUTELY NO WARRANTY" so actually nobody is responsible for anything.

matheusmoreira16d ago

> You’re just going to make people disable Claude attribution on their commits to avoid drama.

People should be doing this regardless of drama. No reason to provide free advertising for trillion dollar corporations. Generated-by trailers are only relevant when contributing to third party projects, in that case disclosure is polite.

Aurornis16d ago

The value of the Claude attribution is that you can tell at a glance who used AI.

I don't care about the advertising angle. We all know Claude by now. I want some indicator that AI was used.
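
Telling "at a glance" can also be done mechanically, since the attribution lives in a commit trailer. A minimal sketch, assuming the trailer takes the common `Co-authored-by: Name <email>` form; the sample commit message and the address in it are hypothetical, and the helper names are made up for illustration:

```python
import re

# Matches one "Co-authored-by: Name <email>" trailer line (case-insensitive).
TRAILER = re.compile(
    r"^Co-authored-by:\s*(?P<name>[^<]+?)\s*<(?P<email>[^>]+)>\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def co_authors(message):
    """Return (name, email) pairs for every co-author trailer in a commit message."""
    return [(m["name"], m["email"]) for m in TRAILER.finditer(message)]

def mentions_claude(message):
    """True if any co-author trailer names Claude."""
    return any("claude" in name.lower() for name, _ in co_authors(message))

# Hypothetical commit message, not taken from the rsync history.
sample = (
    "Tidy up allocation helpers\n"
    "\n"
    "Co-Authored-By: Claude <noreply@anthropic.com>\n"
)
print(co_authors(sample), mentions_claude(sample))
```

As the rsync maintainer's own comment later in the thread shows, a trailer like this only says the tool touched the commit, not who wrote the change.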

julianeon16d ago

If Claude is actually good enough to commit to rsync, of course I'm going to look at that and think "it's good enough for my side project too." And (benefit to companies aside) that is info it is useful to know, if it's true.

trwired16d ago

Is that a bad thing? I mean from the perspective of Anthropic's marketing department sure, but if agents are just another type of tool in developer's tool belt - as I see people recently like to claim - attribution feels kinda weird. In the end it is the developer who is responsible for their commits.

eli16d ago

Yeah I think it's a bad thing. It's context about how open source code was written that is lost.

And I guess maybe there's no such thing as bad press, but at least in this case it doesn't seem like effective marketing for Anthropic.

eschaton16d ago

“Don’t get mad at people for doing something unethical or immoral, or they’ll do something unethical or immoral!”

Disabling attribution of LLM-generated code is fraud, because you’re saying you wrote the code.

Of course that fits right in with the use of an LLM to generate code in the first place, since what it’s actually doing is regurgitating its inputs stripped of any license and copyright notice.

UebVar16d ago

I'm very certain that this is not fraud, across multiple legal systems, both Roman and common law. In both cases fraud requires that a person is deprived of a material good. Neither the defrauded person nor their material loss is present in this case. Maybe there is an oddball legal system somewhere in the world where fraud is something entirely different, but I doubt it. "Fraud", just like "Decorator Pattern", is a well-established and pretty simple concept, even if there are edge cases. This does not fit at all.

In academia this is misattribution; outside of academia this does not exist.

This is clearly not copyright infringement either, as LLMs do not claim copyright, nor could they. Just like the photograph taken by the monkey, or pictures drawn by crows. LLM output is not a creative work either.

Whether this is unethical or immoral is a totally different question. I really don't think so, and I don't think you argue that position well.

jhack16d ago

"Disabling attribution of LLM-generated code is fraud, because you’re saying you wrote the code."

Should there be attribution for Google or Stack Overflow copy/paste? Who should we bully about this?

Leynos16d ago

Outside of situations where it is required by contract, attributing AI usage is a courtesy, nothing more.

infamouscow16d ago

It's only fraud if a person signed their name stating such.

Their name being attached to the commit is itself irrelevant, as there is no way to submit a patch otherwise. You could use a fake name, but you're just moving this fraud problem around.

You're going to have a hard time convincing anyone that using a tool constitutes fraud. Frankly, it's silly, if not genuinely stupid.

Film photographers in the early 2000s routinely called digital "not real photography" and Photoshop "cheating" because you could delete bad shots and fix everything later. Traditional musicians and critics dismissed drum machines, synthesizers, and autotune as soulless tools.

Unit32716d ago

This argument gets trotted out every time but it doesn't convince me of anything. Yes, calling things out creates an incentive for people to hide them, but so what?

Setting aside the whole AI = bad argument, let's do a metaphor. Tax evasion is bad and unethical and you should call it out where you see it. But wait, that creates an incentive for people to hide it! So I'd better not call it out, it's best to just keep my mouth shut.

mohamedkoubaa16d ago

I'd be willing to bet that an undisclosed LLM disclosure will follow a developer around for the rest of their career

eschaton16d ago

That kind of fraud absolutely should. (I suspect you mean “undisclosed LLM use.”)

Daishiman16d ago

I'm willing to bet that in two years that's going to be completely irrelevant because the amount of code written by hand will drop to less than 10%.

overgard16d ago

I mean, I don't think commits are the place for tool attributions. I want to know what the change was, I'm not really interested in your tool selection (put that in the PR if it's relevant). It'd be just as irrelevant to see "written on my macbook in neovim"

hnav16d ago

Depends on what the claude attribution actually means. A lot of people will just get the thing building and then ship. To me that attribution is generally a red flag.

aesthesia16d ago· 13 in thread

I don't have a dog in this fight, but a few points that look a little suspicious:

- The release with the highest number of attributed bugs is the release _right before_ the first release with Claude-coauthored commits, released in January; is there a chance that unattributed LLM-authored commits made it into this release?

- The release attribution methodology is not great, since it will tend to attribute bugs introduced in a minor version update to the longest-lived patch release of that minor version. I doubt that 3.4.1 actually introduced a lot of bugs, but since it was released a day after 3.4.0, bugs that were introduced in that release get attributed to 3.4.1.

- Relatedly, more recent releases have had less time to have bugs filed against them, so there may be a bit of a bias toward evaluating recent releases as less buggy.
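
The second point can be sketched in a few lines. This is a hypothetical illustration, not the article's actual methodology: it assumes each bug is attributed to the newest release that existed when the bug was reported. The 3.4.1 and 3.3.0 dates are the ones quoted elsewhere in the thread, 3.4.0 is taken as one day before 3.4.1, and the bug report dates are invented.

```python
from bisect import bisect_right
from collections import Counter
from datetime import date

# Release dates as quoted in the thread (3.4.0 assumed one day before 3.4.1).
RELEASES = [
    ("3.3.0", date(2024, 4, 6)),
    ("3.4.0", date(2025, 1, 15)),
    ("3.4.1", date(2025, 1, 16)),
]

def attribute(report_date, releases=RELEASES):
    """Attribute a bug to the newest release published on or before report_date."""
    dates = [released for _, released in releases]
    idx = bisect_right(dates, report_date) - 1
    return releases[idx][0] if idx >= 0 else None

# Invented report dates for bugs that were all introduced by 3.4.0.
reports = [date(2025, 1, 15), date(2025, 1, 20), date(2025, 2, 3), date(2025, 3, 9)]
counts = Counter(attribute(d) for d in reports)
print(counts)
```

Under that assumption only the bug reported on release day lands on 3.4.0; everything reported later is charged to 3.4.1, even though 3.4.1 introduced none of it.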

theteapot16d ago

Agree. From the article:

> Here's my favorite part, though. Digging into the data, one of the first things that jumped out at me with blinding clarity was that the worst release, by far, in rsync history was entirely prior to the introduction of Claude ... And yet nobody noticed.

Language really does suggest the article's author does have a dog in this fight and is cloaking opinion in fancy statistics jargon. "Blinding clarity"? All you have to do is draw a plot. And anyway, v3.4.1 was 2025-01-16, technically well within the AI assisted coding era and before attribution was becoming standard practice.

iandinwoodie16d ago

Also from the article:

> "Claude clearly made things worse" &emdash; the main claim

This article was clearly generated by AI, yet I found no mention/attribution of that by author.

How likely is it that someone who vibe codes articles would also vibe code the underlying analysis and be eager to accept an outcome that is highly validating of that person’s workflow? I’d say very.

OptionOfT16d ago

You can use LLMs in multiple ways, from very hands on to make local changes to completely hands-off.

I've seen plenty of code that was LLM generated but the commit message itself did not have the co-author attached to it. This only seems to happen when someone's interface to the codebase is completely through Claude/Codex/..., and those are usually the most verbose commits, and yet they say the least, because they just summarize the code changes, not the why.

On the other hand I've seen developers using Claude as a tool. They have VSCode open and a terminal window with Claude and go back and forth, ensuring they write correct code, and leave the plumbing to Claude.

So maybe the author of the code started off small and it grew over time?

hparadiz16d ago

I would expect a mature code base like rsync to have a lot of unit tests and integration tests, and frankly, if there aren't enough of them to catch such bugs, that should be your first use of LLMs: to set up some deterministic guidelines before you start making changes to your actual code.

I have been experimenting with both aforementioned styles with interesting results.

logicprogOP16d ago

Your first and second points seem to contradict each other, because if all of the bugs for 3.4.1 should be attributed to 3.4.0, that pushes back even further the date by which unattributed LLM commits would have to have been landing in the project, which just makes your point even more absurd.

Which brings me to my overall response, which is that there is absolutely no evidence, and nothing even intimating this hypothesis, that LLM commits were secretly being added to earlier releases before they were attributed, and that's why the rate of bugs is higher. There's no reason to think that it's a reasonable thing to think, and there's no evidence for that whatsoever unless you beg the question and assume that higher bug counts must automatically indicate AI involvement, which is just circular reasoning. You're essentially just making up a hypothesis out of thin air to preserve your point.

Regarding your third point, that one's fair, but I've done the analysis and I can put it up if you want, as to how long it usually takes to find bugs and how far through the release cycle we are for each version.

aesthesia16d ago

Sorry, I should have said this explicitly in the original comment: I think you're likely _correct_ that there isn't a clear increase in the rate of bugs attributable to LLM-authored code in rsync. Your analysis provides evidence in this direction; these are just the things that made me go "hmm". They're not accusations or claims that the conclusion is invalid. But they're definitely things to be curious about.

Regarding unlabeled LLM-authored commits, I don't think it's unreasonable in general to think that an open-source project might have had unlabeled LLM-authored commits at some point before 2026. Looking more closely at rsync's recent commit history, I think it's less likely in this case. There's just a low number of commits in general, _until_ large batches of Claude-authored commits start showing up early this year. But this then raises some questions about the bugs-per-commit metric; it does correct for something like "size of release", but also obscures a significant shift in commit velocity that may be downstream of adding LLM development tools to the workflow.

Like I said, I don't have a dog in this fight, and I try not to approach these sorts of questions from a position of explicit advocacy. I do think it's an interesting question, though, and we should try to understand what the data is actually telling us.

jonquark16d ago

Isn't the metric that you've used "bugs per commit ~ per new line of code" going to miss the issue?

All code is technical debt.

If rsync releases used to have 500 lines changed and 5 bugs in and AI-powered rsync releases have 50000 lines and 500 bugs, it's the same bugs/line but much worse experience for the user?

I've not looked into the details of this case and I do use AI assistance coding at work but in my experience, the problem is that it's too easy to write lots of code and therefore hard to review the huge volumes of code and this analysis will ignore that?

edit: actually your table shows there weren't unusually large numbers of commits in this release, so perhaps my initial skepticism shows a bias I have?
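
The arithmetic behind that worry, using only the hypothetical numbers from the comment (they are not rsync data):

```python
def bug_rate(bugs, lines_changed):
    """Bugs per changed line for one release."""
    return bugs / lines_changed

# The commenter's hypothetical releases, not real rsync figures.
old_release = {"lines": 500, "bugs": 5}
new_release = {"lines": 50_000, "bugs": 500}

old_rate = bug_rate(old_release["bugs"], old_release["lines"])
new_rate = bug_rate(new_release["bugs"], new_release["lines"])
extra_bugs = new_release["bugs"] - old_release["bugs"]

print(f"old rate {old_rate:.2%}, new rate {new_rate:.2%}, extra bugs shipped: {extra_bugs}")
```

The normalized rate is identical in both cases, while the user of the larger release meets 100 times as many bugs, which is why a per-line or per-commit metric is worth reading next to the absolute counts.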

hariseldom16d ago

I started to look into the same thing considering releases are quite infrequent. To avoid the issue of unattributed LLM-authored commits, in my opinion the analysis should include a comparison to bug severity before and after release v3.3.0 (date April 6th, 2024)

PunchyHamster16d ago

Let's start with the most outright alarming error: the Claude statistics are drawn from a whole 2 data points

logicprogOP16d ago

That's sort of the point. There isn't enough data to extrapolate, and yet that's exactly what those outraged about AI were doing, and when you do do the very minimal types of analyses (permutation tests, and looking at distributions, mostly) that are actually valid, safe, standard, and useful to do on such low amounts of data, again, no evidence for the outrage shows up, and the two releases look so normal that it sort of shows no one would've cared if they hadn't known or found out that Claude was involved.

I really think this is a much better standard of evidence, limited though it is, than outrage-fueled cherry-picked anecdotes, which is what has been driving this whole thing. If you disagree, and think the outrage should go on when I've shown there's an absence of evidence entirely for it (although of course, that's not evidence of absence; maybe I'll have to eat my words 5 releases down the line, but appealing to that now feels like a Russell's Teapot), would you care to explain why?
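
For readers unfamiliar with the permutation tests mentioned above, here is a minimal sketch of an exact one. It is not the article's code: the function name, the choice of eight releases, and every bugs-per-commit figure are invented for illustration.

```python
from itertools import combinations
from math import comb

def exact_permutation_p(values, treated_idx):
    """One-sided exact permutation p-value for the difference in means
    (treated minus untreated), enumerating every possible relabelling."""
    def mean_diff(idx):
        chosen = set(idx)
        treated = [v for i, v in enumerate(values) if i in chosen]
        rest = [v for i, v in enumerate(values) if i not in chosen]
        return sum(treated) / len(treated) - sum(rest) / len(rest)

    observed = mean_diff(treated_idx)
    diffs = [mean_diff(c) for c in combinations(range(len(values)), len(treated_idx))]
    # Count relabellings at least as extreme as the observed one
    # (the small tolerance guards against floating-point noise).
    extreme = sum(d >= observed - 1e-12 for d in diffs)
    return extreme / len(diffs)

# Invented bugs-per-commit figures for 8 releases; the last two stand in
# for the Claude-assisted releases.
bugs_per_commit = [0.10, 0.25, 0.05, 0.40, 0.15, 0.20, 0.12, 0.18]
p_value = exact_permutation_p(bugs_per_commit, (6, 7))
min_p = 1 / comb(len(bugs_per_commit), 2)  # smallest p-value attainable
print(f"p = {p_value:.3f}, smallest attainable p = {min_p:.3f}")
```

With two treated releases out of eight there are only 28 possible relabellings, so the test cannot report anything below 1/28 (about 0.036) however bad those two releases are. That is one way of seeing how little two data points can say in either direction.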

runarberg16d ago

The interpretation of the p-value is also alarming. One of the first things they teach you in statistics class is: “an absence of evidence is not evidence of absence”.

This analysis showed that there is indeed an absence of evidence, but it concludes there is evidence of absence.

Traditional p-hacking is done by oversampling and overtesting. If you do 20 analyses, on average one will show p < 0.05 by random chance. This analysis is doing the inverse of that: under-sampling, and concluding with p > 0.05.
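
The "20 analyses" figure can be checked directly. A small sketch, assuming the 20 tests are independent and every null hypothesis is true, so each p-value is uniform on [0, 1); the trial count and seed are arbitrary choices:

```python
import random

ALPHA = 0.05
N_TESTS = 20

# Expected number of false positives across 20 independent null tests.
expected_false_positives = N_TESTS * ALPHA

# Probability that at least one of the 20 tests comes out "significant".
p_at_least_one = 1 - (1 - ALPHA) ** N_TESTS

def simulate(trials=10_000, seed=0):
    """Monte Carlo estimate of p_at_least_one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Under the null, each p-value is a uniform draw.
        if any(rng.random() < ALPHA for _ in range(N_TESTS)):
            hits += 1
    return hits / trials

print(expected_false_positives, round(p_at_least_one, 3), simulate())
```

So one spurious hit is the expectation, and the chance of seeing at least one is roughly 64%.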

its-summertime16d ago

If one asks "Is the house on 123 Road Street, NJ, taller than the statistical average", then there is only 1 data point for the house on 123 Road Street, NJ. Which is also 100% of the houses on 123 Road Street, NJ.

kelnos16d ago

You can apply that to the outrage too: the people pissed off about this are going off 2 measly data points.

jarym16d ago· 13 in thread

I've been coding for over 2 decades. I love it, I've always loved it and I likely always will.

I was an AI skeptic some months ago but truly Claude and Codex have changed my development style and velocity in a way I never imagined would ever be possible. With that, yes, I produce more code and am finding more bugs.

So looking over at comments in HN articles the amount of polarising hate to anything produced with AI is quite surprising. Just because some AI helped or even produced entirely doesn't suddenly make a project 'vibe coded' as if that's meant to be some insult levelled at users of LLMs.

It reminds me a lot of when offshore outsourcers started getting more software development work from the mid-90s with all the derogatory remarks made towards 'Indian developers'. Now we're in the mid 2020s and similar remarks are made towards AI.

I don't get it. I really don't. What I do know for sure is more and more code will be AI generated with or without the detractors.

jiggawatts16d ago

I work with outsourced code all the time and it is a tyre fire without exception. I just spent a week scrubbing a codebase where some dev “did the needful” and committed an on-by-default flag to bypass authentication checks because he didn’t know how to set up his local work environment.

People report the same “took a shortcut” issue with AI vibe coding, and I can confirm that I’ve had to rewrite practically everything the AI generated for me, despite using a frontier model dialed up to 11 thinking levels.

Having said that, AI is very useful for other activities like PR review, security vulnerability analysis, typo hunting, reverse engineering, etc.

I’m probably going to have to increase my subscription to the next tier but at the same time I still can’t use any of the code it generates.

If even one person can simultaneously experience "very useful, need to pay more for it" and "useless output code quality" then of course you'd expect a variety of opinions amongst the general user base.

albedoa16d ago

> I work with outsourced code all the time and it is a tyre fire without exception. I just spent a week scrubbing a codebase where some dev “did the needful” and committed an on-by-default flag to bypass authentication checks because he didn’t know how to set up his local work environment.

OP knows this but finds himself in the strange position of having to defend India slop in order to defend AI slop, totally unnecessarily and unprompted. It's baffling to you and me.

int_19h16d ago

I was similarly an AI skeptic 3 years ago. When GPT-4 was the state of the art, I thought we're going to plateau soon because of context size limits (remember back when you had to pay insane money just to get 32K)?

Last year was the first time I saw an AI agent actually debug and fix a non-trivial bug in a satisfactory way. Even then, trying to use it on larger tasks made it clear that it wasn't something I could just hand over the issue tracker to.

Now? I've been using Codex for the past several months to work on a nontrivial project. Which was prototyped in C++ (for library reasons mostly), then had the initial version written in Haskell, and more recently I got it ported to Rust to keep memory use in check on mobile.

These things are not trouble-free, but the sheer amount of progress made in just the last year alone is astounding. Skepticism is well and good, but healthy skepticism ought to yield to tangible evidence.

Joel_Mckay16d ago

Good code is a living document that shows intent, and allows ease of maintainability.

Most people feel more productive with chat bots, but often end up wasting more time chasing self-inflicted issues. Same clown-car of Dev-ops proponents no doubt billing by the hour. =3

nomel16d ago

I've always noticed, within any subject involving tools, there are people who like the tools, and some people who like to use the tools to do something else.

With programming, I've always been in the latter: it's a tool that allows me to do what I actually love, which is problem solving, system level thinking, and providing some nice solution to that problem, that happens to be through software.

So, I have an absolute blast with AI, because it helps do the more boring bits. And, seeing my non-programming colleagues get excited to see their vibe coded ideas become reality has been so much fun.

I'm genuinely curious to hear the perspective of someone anti-AI, who works in software. Perhaps the impending doom/skill shift of our profession?

CapsAdmin16d ago

I'm not anti-AI but something I've been thinking about is the discipline it requires. As you said, it's a tool that allows you to rename a variable name on one end and do complete vibe coding on the other end. Developers may say that we should stay somewhere left on that spectrum, because that's where humans are more involved.

But developers also say good practices should be followed when talking to each other, and while some may do, reality is often very different.

It requires discipline, which varies a lot between developers, between projects, current mood, and so on.

In the beginning you might be careful doing small changes, but after a while you might get more tempted to accept the output for what it is, because ultimately that's much easier.

So the way I see it; the left side is harder work and potentially bigger but delayed dopamine hits, the right side is quick dopamine hits. How do we (at least those who struggle with discipline) resist just slipping to the right?

I started out carefully myself and slipped more into vibe coding, but I don't feel particularly proud of it for some reason.

yw341016d ago

I am anti-vibe coding if that meets your criteria?

Reviewing vibe-coded PRs and features has been utterly exhausting over the past few months.

I work on critical, mature software - a small change in behaviour can mean data loss or non-compliance with regulations for our customers. The biggest problem with AI PRs is the sheer amount of churn, extra code and lack of intent with the PRs it generates.

The only way I can describe the latter is that an AI-only PR feels to me like a painting where everything is high detail - and you have to comb over each part before you understand why it's there because so much is superfluous. A well written human PR on the other hand, is painted such that your eye naturally follows the thought process of the author so you can just nod along during the review, as if the solution was obvious.

Also when I'm _using_ the agent; at least 50 percent of my time is spent telling it to stop with its approach so it doesn't go down a useless rabbit hole and waste tokens.

lelanthran16d ago

> So, I have an absolute blast with AI, because it helps do the more boring bits.

So... you're vibing? Not looking at the code at all?

Joel_Mckay16d ago

Personally, it would still bother me if some lazy bro hit a code-generator and people end up dead.

For context search, I find LLM quite useful... still wrong 20% of the time... but it has some utility.

Here is a thought experiment: If "AI" will eventually generate your work, then what actual value do you bring to the table? =3

tom_16d ago

I just really hate talking to the computer in human language.

albedoa16d ago

> It reminds me a lot of when offshore outsources started getting more software development work from the mid-90s with all the derogatory remarks made towards 'Indian developers'.

What was the impetus of the derogatory remarks?

kelnos16d ago

Some of it was indeed driven by sub-par work from the outsourcing firms, as the style of work was new and people on both sides hadn't developed the right skill set and processes to do the work well.

Some of it was genuine cultural differences. It's hard to work with people and get the results you want when you don't understand their culture, and how they communicate. (For example, people from some cultures just can't say "no" or "I don't know"; you need to learn how to communicate with them in a different way to get the understanding you need.)

Some of it was certainly a form of jingoistic or xenophobic protectionism.

Joel_Mckay16d ago

LLM are good for context search, and template output.

However, you also get the lowest common salient answer guaranteed, uncopyrightable work (differs from public domain), and potential legal peril from copyright bleed-through.

We are in the golden Napster age of isomorphic plagiarism. =3

GodelNumbering16d ago· 12 in thread

Was just looking at commits and came across a commit and its revert

original commit: https://github.com/RsyncProject/rsync/commit/d046525de39315d...

```
-	if (!ptr)
-		ptr = malloc(num * size);
-	else if (ptr == do_calloc)
+	if (!ptr || ptr == do_calloc)
 		ptr = calloc(num, size);
```

Written with claude. This is a good example of what slips through LLM attention. It forces all allocations to be calloc as if it is a strict upgrade. For large and recursive allocations, this becomes a significant cost.

reverted in https://github.com/RsyncProject/rsync/commit/7db73ad9a1b8721...

if you read the description of the revert half carefully, it's easy to tell that even that was written by an LLM.

I can understand the sentiment of whoever posted the original thread.

wolletd16d ago

Also the number of commits is suspicious. In the last two months, rsync had about as many commits as in the last two years before that. Most of them written with claude. And then stuff like this is in there.

That's exactly what I'd expect when someone is excited about AI usage and becomes... well, sloppy.

logicprogOP16d ago

Tridge already explains this:

"Like many developers of open source packages I’ve been hit by a flood of security reports lately in my role as the rsync maintainer. Many of those reports are AI generated (not all though, there are some notable ones with very careful and high quality manual analysis).

As this flood started to get more intense I realised I needed to raise the defences on rsync a lot — we needed much more thorough test suites, code coverage analysis, CI testing on a lot more platforms, deliberate and thorough scanning for possible security issues (so I find at least some of them before other people!) and the addition of a whole lot of defence-in-depth hardening techniques. This is all a huge amount of work. "

https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

whateveracct16d ago

mythical man month only gets more prescient as time passes

lokar16d ago

I would expect a 10x change rate, even carried out by clones of the existing maintainers, to result in more bugs.

gravypod16d ago

> Also the number of commits is suspicious. In the last two months, rsync had about as many commits as in the last two years before that.

I wonder if the data looks worse or better when not doing per-10-commit and instead doing per-commit.

echelon16d ago

Seems like someone could use Claude to port rsync to Rust and the whole enterprise would be safer from things like this.

Start with unsafe then gradually convert into idiomatic Rust.

CaliforniaKarl16d ago

> Written with claude.

No.

The reversion commit references https://github.com/RsyncProject/rsync/issues/959. In that GitHub issue is this comment:

> It got a claude co-authored tag on it as I got it to do some tidy ups of a series of commits, and that is just what it does when it makes any modification. It doesn't mean the change was written by claude. It was written by me.

scottlamb16d ago

> This is a good example of what slips through LLM attention. It forces all allocations to be calloc as if it is a strict upgrade.

I wouldn't assume Claude made that decision; it's not as if that was some incidental thing that it snuck into a large commit. The commit message starts with "zero all new memory from allocations", and that's exactly what the commit does. What do you imagine the prompt was?

It seems totally plausible to me that a human initially thought this was an improvement, then rethought after discovering the RSS regression. And it's not a law of nature anyway that this change has to increase RSS; calloc could special-case the case in which memory was freshly returned from the OS, knowing fresh memory mappings are zeroed anyway.

I blame AI for these regressions mostly in the sense that it caused a flurry of vulnerability reports. Those led to a flurry of quick fixes. Sometimes quick fixes cause other problems.

delusional16d ago

You don't really have to guess. The guy told us the AI didn't suggest this specific change:

> The change to zero memory was my idea and my change. It was a reaction to a security report I got which caused use of an element past the end of an array. By zeroing the allocation I could ensure that misuse of that memory if a similar bug came up in the future could only cause a null ptr deref, which is better than the chance of a valid pointer. It got a claude co-authored tag on it as I got it to do some tidy ups of a series of commits, and that is just what it does when it makes any modification. It doesn't mean the change was written by claude. It was written by me.

https://github.com/RsyncProject/rsync/issues/959#issuecommen...

tom_16d ago

AI multiplied by Linux overcommit. What times we live in!

(My own view: 10.8 GB is nothing these days. Your sprintf buffers are probably larger than that. (And if they aren't: they should be. That, or you should start using snprintf...))

baq16d ago

sprintf() should be a longer way to write abort(), change my mind

alfiedotwtf16d ago

AI is fine, and in fact fun to use... committing AI written code without understanding Every. Single. Line. Of. Changes is on the committer. You can't LGTM for vibe code ffs

RustyRussell16d ago· 9 in thread

For those commenting, I suggest you read the post linked by the rsync author:

https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

(Disclosure: while I haven't talked with him in years, Tridge was my colleague and mentor for many years. I feel it is worth considering his view before joining a crusade)

jorvi16d ago

> I thought it would be a good idea to do the core structure for the new test suite in public on master first though given all the rage that has generated maybe that was a bad idea.

I don't entirely understand what this is saying. People wouldn't have been outraged if only the tests had been updated and/or he pushed solely on master - but he pushed breaking changes onto the release branch(es) too. Breaking workflows that have worked for years is a prime way to get people irate, and then seeing "Claude" in the commits just pours gasoline onto the fire.

RustyRussell16d ago

It seems that wasn't the Claude part, though I haven't seen a full analysis of exactly what broke. I also only saw one report: are there multiple, or do you just perceive that?

Rsync has many options: I can totally believe that fixing a bug in one place broke someone's usage, to be fair.

jpalomaki15d ago

"yes, there were regressions in some use cases of rsync in the 3.4.3 release. I quite deliberately tried to err on the side of fixing security issues for that release, and there were some valid (but unusual) use cases that got caught up in the changes"

matheusmoreira16d ago

This should be the top comment.

I think it's pretty sad that he even had to write it. Quite a lot of judgement from people who aren't paying his bills.

Laurel123416d ago

Yeah a big reason you see so much pushback on clanker slop is that it's having (and there was certainly the expectation of it having) a negative impact on the ability of plenty of people to pay their bills.

dnnddidiej16d ago

The title at least sounds less like judgement and more analysis and more about AI assistance (and claude in particular) than rsync. Maybe I am too used to postmortems!

guilhas14d ago

> Now if any of the people posting the rage stuff want to actually review any of the code I’ve published and make constructive criticisms then that would be great!

When you quickly churn out more lines of code in a few days than you changed in months, and then release them as a normal release, I'm not sure you're expecting "constructive criticism".

Also if I suspect the project is just slopping high amount of code without proper thought, I probably won't invest my time into reading those changes

advael12d ago

I get that there's a lot of loud nonsense flying around about AI, both positive and negative, and I echo the sentiment that people should have some damn perspective when talking to FOSS maintainers, but I think writing a bunch of AI-assisted code that causes regressions and then responding to that by throwing out a strawman about how critics (with PhDs no less!) are telling him these things can't do anything at all and can't possibly understand how literally everything has fundamentally changed in the last few months sounds way more like a guy who has a motivated (and understandable - he's retired ffs) reason to... a little bit buy into the hype

I think he makes a lot of good points here, but also think that kind of statement is unlikely to assuage the real concerns of people using the software. I think people are more likely to fork rsync now rather than rely on a more diverged earlier alternative implementation though

nullc16d ago

I think that's an extremely well done response on his part.

dang16d ago· 8 in thread

[stub for offtopicness]

[see https://news.ycombinator.com/item?id=48416020 for how all this happened in the first place]

logicprogOP17d ago

Some notes on this:

- I used GLM 5.1 to help with the coding and math for this.

- However, I explicitly dictated where the data should be pulled from (GitHub, Bugzilla, mailing list), how it should be tagged and grouped, and what data to look at (e.g. bugs instead of regressions)

- Additionally, I consulted with my wife, who has a master's degree in statistics from Penn State University for what sort of statistical methodology would be justified for this very limited data set, while still giving as much information as possible.

- I know the website looks like we stereotypically consider vibe-coded websites to look, but I actually explicitly asked for that. The original HTML design looked like a website from 1995, and I just prefer how this looks. It's pretty!

ex-aws-dude16d ago

So the original unfounded claim has 400+ comments because it's perfect HN ragebait

The author provides evidence to the contrary and the HNers won't even engage with it instead just talking about the writing of the article in classic HN bikeshedding fashion.

How about after that we talk about the formatting of the website and the colors?

This site is really going downhill

Where is the accountability for your own opinions?

Are you guys only upvoting things that confirm your existing gripes?

roywiggins17d ago

> A simple distributional analysis of every rsync release with bug data. No model. No assumptions. Just placement.

If you want me to read your analysis, you are going to have to make it not read like Claude wrote it. What does "placement" even mean here?

dang16d ago

This submission was heavily flagged, presumably because the article sounded like genai. But the article now says the following:

> After posting this on Hacker News and receiving almost no substantive input, discussion, or response on the actual content of the article, I decided to rewrite all of the prose in my own voice.

I've therefore turned off the flags and hopefully people can actually now discuss the claims/findings being reported.

mschuster9117d ago

This article reeks of LLM "assistance" at the very least.

Please, why can't people write stuff by hand themselves any more? It's a good analysis but how can I trust it without reviewing everything myself?!

tappio17d ago

A lot of people are criticizing this because it's heavily written with an LLM, but I mean, if someone produced this piece pre-LLM, would they criticize it? Is the critique due to the use of an LLM or due to the content being truly hard to follow? I read it and I would say there are some problems with the writing, but it's not a bad piece.

Of course this is a bigger problem, as it's now harder to distinguish content that is "AI slop" from "content co-authored with AI that is carefully reviewed" at a quick glance, and the "AI smell" is quite off-putting. My initial reaction was also negative, but after skimming it and reading the summaries, I found it a decent summary, which also... speaks of this thread, of the content of the blog post and everything about the discussion and the strong feelings people have developed around the use of LLMs.

Anyhow, it would be good to disclose the repo with the code for the statistics & use of LLM in the writing right up front. Which model, and why it was used to do the writing, etc. It's enough to say "I think it writes better than I do" or "I was in a hurry, sorry" or whatever, but it really should be disclosed. It reads as more honest.

ps. really... that sideways scroll? plz fix it.

sfink17d ago

Wow.

I am pretty insensitive to AI writing. I have never commented before about something sounding like AI, because mostly I don't notice. But this was so over the top that I spent the whole article trying to decide whether it was an intentional parody of AI writing style.

This article's language is not en-US. It's not en-BR. It's en-SLOP.

Yes, that was my clumsy attempt at AI parody. Here's another: this article doesn't just have AI tells. It is AI tells.

Every sentence is saturated with AI style. Perhaps the author is so AI-indoctrinated that they can't see this? It doesn't read as even vaguely plausible human writing. Which is mightily ironic given the thesis of "AI generated stuff is just fine, m'kay?" The writing style does more to defeat its conclusion than the analysis itself.

As for the substance of the analysis, it seems pretty good to me but I see some flaws that weaken it a bit.

The presence of "The Outlier Nobody Noticed" proves nothing and deserves no more than a passing mention. A random release introduced way more bugs than the Claude-containing releases. That provides evidence that Claude doesn't introduce more bugs only if your hypothesis is a very naive "AI is the only thing that can ever increase bug introduction rates."

The whole analysis has very limited data. It's necessarily based off a single pair of releases at the very end of the chronological timeline. You would never be able to reject a null hypothesis based only on that, so it's even less sound to present it as proving the null hypothesis. (By the same token, it would be incorrect for critics to claim that it proves their point. Did anyone claim this, though? The heated complaints seemed more based on priors about AI code.)

"The critics' claim is a simple comparison: did the rate go up?" That's reductive. For one, these releases are known to be in reaction to a flood of (AI-discovered!) security reports, which is a novel situation and in fact is a huge confound to anyone arguing about what those two releases mean -- they're both heavily AI-written, but in response to an unusual situation. When the samples are only drawn from a distinct scenario, statistical analysis can only speak to the quality of code in that scenario.

Also, another reasonable hypothesis could be: AI-written code has bugs of a different flavor that bothers users more. It's optimized for passing tests and convincing people and AIs that security holes are closed, which means other considerations like preserving functionality can more easily be regressed as compared to if humans were doing it. (If true, it still doesn't support the claim that depending on AI code is a catastrophe, fwiw.)

I'm not arguing the conclusion is wrong. I'm saying the analysis proves far less than it claims to. As for whether it's a debacle for rsync to become dependent on AI code generation, I think that's a reasonable debate to have but it's not going to be resolved this reductively.

duk3luk317d ago

This article is unfortunately unreadable because all of the prose is unfiltered LLM slop.

ch_fr16d ago· 7 in thread

This article is a rant disguised as data analysis.

I don't know how to word this in a non-confrontational, respectful way, but this article just feels like ammo for your next "debate with your anti-AI enemies" where you get to say "look, I proved with data that those people had a disproportionate reaction and have double standards, therefore anyone who dislikes LLMs or their impact are the same!". Like, sorry, I know that sounds really reductive, but this really is the vibe I get when reading this and your other replies where you repeatedly talk about "showing the hypocrisy and double standards".

The global LLM discourse has grown massive, it spans trillions of dollars in promises and investments and affects pretty much everyone, so it's the easiest thing in the world for both sides to just find some people being assholes in the other camp and say "look, here's how [other camp] behaves".

The irrational, extreme, and heinous reactions are partly bandwagoning, and you can go about your day thinking that anyone who reacts like that is evil. But if you wanna dig a bit further, you'll notice that the entire media sphere has been screaming in everyone's ears for a few years now, that they're expendable, low-value-human-capital. All the money in the world (exaggerating a little) is being spent on making sure to remind anyone who opens a computer, opens a website, looks at a billboard, or turns on his tv... that their boss really really really wants to replace them.

Now you'll say that the friendly rsync contributor has nothing to do with any of this and... well yeah he doesn't. You don't need to agree with an emotional response to understand where it's coming from, and even if you're still dead set on considering them "the enemy", then understanding why the anti-AI crowd reacts like that is STILL a positive for you.

pie_flavor15d ago

Why is the guy being rigorous worthy of criticism, but the guys being idiots aren't? Did you post any similar calm-down comments in either of the HN threads on the original attacks?

ch_fr15d ago

I am more inclined to be critical of AI boosters, so what? Am I supposed to crumble under the weight of immense cognitive dissonance because I have... a stance in the discourse?

These guys on the github thread aren't my friends, I have no concern for them embarrassing themselves or leaving a bad digital footprint by drawing ms paint gore. I also have no concern for OP, but it just so happened to be the post I found, and I just so happened to be in the mood to leave a comment.

Engaging in LLM discourse is already a waste of my time, I'm not going to waste more of it just to avoid fallacious accusations of double standards because I didn't "do the same for the other side".

simianwords15d ago

Strange thing to say. The post does a good job of showing that there is no evidence Claude had anything to do with the regressions.

Your problem is that this was shown? You don't value epistemics -- you care about the ideology more than truth. Even if you don't like AI you should still do it in the right way.

Your comment comes across as more un-self-aware and more destructive. Let's keep this place truth first and ideology second.

runarberg15d ago

This article merely showed that one particular sample (two releases of rsync) are not statistically significantly different (p > 0.05) on one particular metric (bugs per commit). All that says is that you cannot generalize group difference over the population using that sample.

This still leaves the anecdotal evidence. And anecdotal evidence is still evidence, and in the absence of better evidence, it is perfectly rational to react based on the evidence you do have.

ch_fr15d ago

Yeah yeah the usual "look, these anti-ai people are so EMOTIONAL and HYSTERICAL while we're very logical and fact-based", I've lurked for a while so I read that one plenty of times.

OP spends a lot of time doing statistics, but when another person replies "hey, 2 claude-authored features is not really statistically significant", the author literally agrees and says "my point was to show that you can't draw conclusions". Direct quote:

> I'm only trying to show there's no evidence for the anti-AI hypothesis

---

And here's another thing. How exactly is it self-unaware to say "Hey, I get your frustration with people being assholes, I'm not excusing it, but it can never hurt to understand why some of them have become so extreme in their vitriol, here's a few reasons for their feelings".

Is "feelings" a curse word or something? What's so wrong with understanding the emotional component of the AI discourse?

Talking about "emotions" is not destructive when the topic at hand is literally people being driven by emotion under a github thread.

What the article is saying is "these people are acting irrational because they're evil and the enemy, so here's how I prove them wrong with statistics!", and my comment to the article was "hey, you seem to go with the assumption that this is all based on pure evil, here are a few reasons why people might get tired, and then angry, about this whole thing".

You quite literally exemplify my point when I said that the analysis is mostly just ammo for your next "debate with your anti-AI enemies", it's a tool that allows you to not engage and dismiss any argument as "not on the side of truth because not on the side of the numbers".

All of this even though, and I need to state this again, I never once rejected the analysis or the results that OP came to, all I did was point out that OP is also engaging in "us vs them" think with the occasional "wink wink, CLASSIC AI hater amirite?" sprinkled in the article.

bwfan12315d ago

> "look, here's how [other camp] behaves"

Motivated reasoning in both camps.

There are folks incentivized by AI: engineers and managers working for AI related companies who justify their beliefs with selective facts. And then there are engineers who are threatened by AI and are extra-sensitive to slop.

The article is missing the point that once camps have made up their minds - no amount of analysis is going to change that.

davrosthedalek15d ago

While it is probably and regrettably true that few people in camps will change their mind, data analysis can help people who haven't been captured yet to either stay away from the camps or at least fall into the "more correct" one.

Stay out of camps, people!

faitswulff17d ago· 7 in thread

> The analysis uses a single metric: bugs per 10 commits (bugs/10c).

Bugs per commit as a metric papers over severity, both in terms of security severity as well as the effect on the user. A mislabeled button has the same weight as the entire app crashing in this framework.

germanjoey16d ago

IMO "bugs per commit" is even worse than that, because, in addition to what you say, it also hides the extraordinary spike of commit activity of a project that had previously been stable. [0]

It is the exact metric you'd choose if you wanted to make the current situation of rsync look like not a big deal.

[0] https://github.com/RsyncProject/rsync/graphs/commit-activity

logicprogOP16d ago

Yes, but we know why there was an "extraordinary spike," and it has nothing to do with rsync being "vibe coded." The maintainer has directly addressed this.

ex-aws-dude16d ago

Why don't you prove the bugs increased then?

Why is it that some unfounded claim is made and the onus is suddenly on the project maintainer to prove it beyond all doubt?

It should be on the person making the claim to prove it

logicprogOP16d ago

I've now resolved this. The new version, which should be live on GH Pages soon, uses — what I think is — a pretty good methodology for assigning severity to each bug, normalizes it to 0.0-1.0, sums that, and treats that as the total severity weighted bugs, then does the analysis based on that. It did not change the analysis in any material way.
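
A minimal sketch of the kind of computation described here, for readers who want to see the arithmetic. The 1-to-4 severity scale, the min-max normalization, the sample release, and the function names are all assumptions made for illustration; the article's actual scoring scheme may differ.

```python
def normalize(severity, lo=1, hi=4):
    """Map a raw severity score in [lo, hi] onto 0.0-1.0.

    The 1-4 scale and min-max scaling are assumptions for this sketch.
    """
    return (severity - lo) / (hi - lo)


def weighted_bugs_per_10_commits(severities, commits):
    """Sum the normalized severities and express them per 10 commits."""
    total = sum(normalize(s) for s in severities)
    return 10 * total / commits


# Hypothetical release: three bugs of severity 4, 2 and 1 across 30 commits.
rate = weighted_bugs_per_10_commits([4, 2, 1], commits=30)
print(f"severity-weighted bugs per 10 commits: {rate:.3f}")
```

Under this scheme a lowest-severity bug contributes nothing to the total, which is one of the design choices a different normalization would change.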

bsza16d ago

No Claude, it still makes zero sense as a metric.

A commit is a measure of nothing. Severity weighted bugs per unit of nothing? What does that even mean? In any repo it's trivial to achieve a sev/10c that's arbitrarily close to zero while completely ruining everything.
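
The objection about commits as a denominator can be illustrated with a small sketch. The numbers are hypothetical: the same changes and the same bug total, recorded once as 20 large commits and once as 2,000 tiny ones.

```python
def rate_per_10_commits(bug_score, commits):
    """Bug score (plain count or severity-weighted sum) per 10 commits."""
    return 10 * bug_score / commits


bug_score = 6.0  # hypothetical bug total, identical in both histories

squashed = rate_per_10_commits(bug_score, commits=20)
split = rate_per_10_commits(bug_score, commits=2000)

# Same code, same bugs, but the metric differs by a factor of 100.
print(squashed, split)
```

Nothing about the shipped software differs between the two histories; only the commit granularity does.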

I suggest you practice some humility and update your conclusion instead of updating the mental gymnastics you used to arrive at the same conclusion.

skeledrew17d ago

There was no analysis of severity in all of the rage posting that occurred. The single point being pushed was "use of an LLM led/leads to more bugs". The author specifically states that's what they're addressing (blunt accusation -> blunt response).

atmavatar16d ago

The specific problems mentioned were all reasonably severe. The original post itself described a show-stopping bug:

    So my systems recently updated to rsync 3.4.3, and as soon as that happened my backup system - which does incremental backups using multiple --compare-dest= arguments - started to fail on anything but a full backup.

Incremental backups is perhaps the primary use of rsync, and they were broken for this person. That's pretty severe.

The second reply is similar:

    i wondered why my 3d printers were running like sh*t and at 100% cpu; turns out log2ram uses rsync.

This one I took with a grain of salt, since it read more like a dogpile than an actual bug report. However, if it's genuine, it's also reasonably severe.

Later in the comments, someone attempted to provide a list of issues that had been added: https://github.com/RsyncProject/rsync/issues/929#issuecommen.... The list included several failures to build or run rsync that appear to have resulted from broken backward compatibility. That seems reasonably severe. If intentional, I would have expected mention in the release notes about the removal of backwards compatibility, but none was made.

The issue comments already degraded into a lot of unnecessary vitriol even before the above mentioned comment and only gets worse from there, so I stopped. But, the fact remains that the whole issue started with a severe bug.

I applaud the attempt at dispassionately analyzing whether the recent LLM releases of rsync were normal or outliers as far as bugs are concerned, but I don't think you can do so properly without analyzing severity.

xmddmx16d ago· 6 in thread

There's a meta-level of irony here that's important to note.

TFA is defending the use of AI, and it very clearly (to me) used AI to analyze the data and present the results.

In doing so, the author used statistics in a way they do not appear to understand, and ended up making numerous false claims (you can see the thread discussing these here https://news.ycombinator.com/item?id=48417626 )

In short, the study doesn't have sufficient statistical power, and is making "no difference" claims that aren't justified.
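
One way to see how much a two-release sample can support is to look at the floor on the p-value. In an exact one-sided permutation test on the difference in means, with all values distinct, the smallest achievable p-value depends only on the group sizes. This is related to, but not the same as, a power calculation. The release counts below are hypothetical, not taken from the article, and `min_one_sided_p` is a name made up for this sketch.

```python
from math import comb


def min_one_sided_p(n_treated, n_control):
    """Smallest p-value an exact one-sided permutation test can return.

    Assumes all observations are distinct, so exactly one of the
    C(n, n_treated) relabelings is the most extreme.
    """
    return 1 / comb(n_treated + n_control, n_treated)


# Hypothetical: 2 Claude-assisted releases against 4 or 10 earlier releases.
for n_control in (4, 10):
    print(n_control, round(min_one_sided_p(2, n_control), 4))
```

With only a handful of comparison releases the floor sits above 0.05, so no outcome could be called significant at that level; with more comparison releases the floor drops below it.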

The meta-irony is this: the author used an LLM to interpret data in this study, and seems to have made the same category of mistake (confidently asserting falsehoods) that the study was supposed to be investigating (confidently submitting bad commits to the rsync project).

simianwords15d ago

The meta-meta level irony is that the reaction to this post is based on vibes and misunderstands the point of the article to wage ideological warfare -- much the same way the original github issue was written.

classified16d ago

AI is so much like a religion. There is nothing you can say to a believer that will make them question their beliefs. Or more generally, you cannot reason anyone out of something that they want to believe.

newsoftheday15d ago

AI is nothing like religion. People behave similarly to AI when debating their favorite sports team, or for Java coders, Checked vs Runtime exceptions.

Religion is about faith and what people feel and sense as much as believe.

Joel_Mckay16d ago

It gets pretty dark if you pull that thread of reason. =3

https://en.wikipedia.org/wiki/The_True_Believer

logicprogOP16d ago

The statistical methodology I used is mine. As is the interpretation. Completely. To the degree that I misunderstood statistics (and it is under debate even in the thread you link, and the people accusing me of misunderstanding statistics there are universally misrepresenting my point, which is to point out a total absence of evidence for any difference, not to prove the null hypothesis) that's on me

MichaelDickens16d ago

FWIW I understood your point just fine. It seemed to me that you made a clear enough distinction between "evidence that Claude didn't increase bugs" and "no good evidence either way".

wookmaster17d ago· 5 in thread

Claude is just a tool? The developers who merged that code and didn't properly test increased the bugs.

everdrive17d ago

"Did cars increase traveling deaths?"

"Cars are just a tool. The drivers who piloted the vehicles and weren't careful enough [are responsible for the deaths.]"

roywiggins17d ago

If something's a bad tool that misleads people into doing bad work, it would be good to know that.

ebiederm17d ago

Please read the article.

The unsolicited security reports are the issue.

Angostura17d ago

This tool is claimed to be able to find and fix bugs.

runarberg16d ago

Feels like something a bad (and potentially dangerous) tool would say.

lbrito16d ago· 4 in thread

Wait, how is any of this relevant if there were only 2 Claude commits? My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

logicprogOP16d ago

Depends on the methods you use. If you're trying to fit curves and so on, yes. The methods I use were designed for very low amounts of data, and are generally okay for that, specifically and especially when you're just trying to show a lack of evidence for some non-null hypothesis.

And again, that's kind of the point. There's exactly zero actual evidence, however you slice it, that "Claude broke rsync" except cherry-picked anecdata, and the whole point of my analysis is to demonstrate the total lack of any such trend/evidence at all, and just how in-distribution/normal these releases are, to show that if people hadn't known Claude was involved in them, they wouldn't have remarked on them.

kelnos16d ago

It wasn't 2 Claude commits. It's 2 releases where the (many) commits were largely co-authored by Claude.

> My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

That cuts both ways. If we say that the author here can't claim any conclusion because there are only 2 Claude-authored releases, then we must also say that the people claiming "Claude broke rsync" have no statistical basis to draw that conclusion, either.

matheusmoreira16d ago

> My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

There is no fixed number. Sample size depends on the size of the set you're sampling, desired margin of error and confidence interval.

If your total set has a million items, you need ~16600 samples to draw conclusions with 99% ±1% certainty.
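
A rough sketch of the arithmetic behind a figure like this, assuming the standard sample-size formula for estimating a proportion (worst case p = 0.5) with an optional finite population correction. The function name and parameters are chosen for illustration.

```python
from statistics import NormalDist


def sample_size(confidence, margin, population=None, p=0.5):
    """Sample size needed to estimate a proportion within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2, then applies the finite
    population correction when a population size is given.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z * z * p * (1 - p) / margin ** 2
    if population is None:
        return n0
    return n0 / (1 + (n0 - 1) / population)


n_infinite = sample_size(0.99, 0.01)
n_million = sample_size(0.99, 0.01, population=1_000_000)
print(round(n_infinite), round(n_million))
```

This gives roughly 16,600 before the correction and roughly 16,300 after it for a population of one million. Note that this formula is for estimating a proportion by sampling, which is a different problem from comparing two small groups of releases.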

wlonkly16d ago

It's not uncommon to have small amounts of data come out of experiments. These are appropriate tests for the size of the data. These tests failed to disprove the null hypothesis.

scsh17d ago· 3 in thread

> It does not control for commit complexity, security intensity, or bug severity. It does not distinguish between a one-line typo fix and a CVE patch. It is a blunt instrument. But the critics' accusation is also blunt: "Claude is making things worse." A blunt instrument is the fairest response.

If by fairest you mean to say that this analysis and response is sufficient, then I'm sorry but I have to disagree. We really need to understand if the nature of the bugs is worse from a user's perspective. Even if the rate stayed unchanged, if the result is that the perceived quality of the software declined, then I would personally consider that worse, especially if I were a project maintainer.

That's not meant to be wholly dismissive either. But in general, I don't think quantitative analysis alone is enough to fully answer this type of question.

skeledrew17d ago

But it is fair. Up to this point I have yet to see anyone say they did an analysis of the code and found X regressions of Y severity. All they say is "there are more bugs because LLM". This analysis, which you can verify yourself if you wish, says "the bugs [number of] are pretty average even with LLM", which is a direct response to that. If you'd like a more nuanced analysis you're welcome to do one and share the result, if you're so inclined.

MostlyStable16d ago

That which is asserted without evidence can be dismissed without evidence. This is more evidence, and of greater rigor, than was used to make the assertions. That's good enough for me. If someone wants to actually do the work to support the original claims with better evidence, great. I'd love to see it. Until then, I'm going to not worry about this issue.

ex-aws-dude16d ago

The burden of proof is on the one making the claim?

cobertos16d ago· 3 in thread

This post just gives me more questions than answers and I'm unable to form a decision:

* Why was v3.4.1 the most buggy, right before the Claude commits? Why did "nobody notice"? It's way too strange to just say welp, it must be human error.

* Why does v3.4.2 have 0 bugs, or 0 bug score? And why was such an outlier (no other commit seemingly has this??) allowed to mix into aggregate statistics and bring all the "is Claude buggy?" scores down? Tbh idk how that _wasn't_ a red flag in the author's analysis...

This article feels like half of an analysis presented as a highly complex finished product due to all the advanced stats they're running.

logicprogOP16d ago

> Why was v3.4.1 the most buggy, right before the Claude commits? Why did "nobody notice"? It's way too strange to just say welp, it must be human error.

Why wouldn't it be except question begging priors assuming it couldn't be?

> Why does v3.4.2 have 0 bugs, or 0 bug score. And why was such an outlier (no other commit seemingly has this??) allowed to mix into aggregate statistics and bring all the "is Claude buggy?" scores down.

My original metrics which didn't filter out feature requests and questions had it at four bugs and prior to that it was even higher and it didn't make much of a difference to the overall analysis (fell well within the IQR, the lower end of it too). Also, removing one outlier just because it looks kind of funny to you, especially when we only have two Claude releases at all, would be worse in my opinion and more arbitrary.

cobertos16d ago

> Why wouldn't it be except question begging priors assuming it couldn't be?

A multitude of reasons? A change in maintainer. A change in the mental state of a maintainer. A sudden focus by the community on a given undesirable behavior. Someone else here suggested use of Claude AI before it was disclosed. The framing implies that it was human-produced coding error, but my point is it could be _any other human error_ or even just some odd benign human behavior (a stampede of bug submitters), affecting the data. Which does not lead to the conclusion that AI code > human code. Not looking at these potentials is so unsatisfying.

> My original metrics which didn't filter out feature requests...

It still feels like a lot of weight of the phrase "If that doesn't look like a red flag to you, you'd be right." hinges on the fact that one of the versions has 0 bugs and it really killed the weight of that statement for me, because the oddity of there being 0 bugs just wasn't explained.

---

Could you please post the duckdb file that has the raw bug -> severity + version mapping to the GitHub repo? I have a desire to dig into this myself

Laurel123416d ago

> Tbh idk how that _wasn't_ a red flag in the author's analysis...

Because he didn't analyze shit, just asked a clanker to rationalize his "clankers are great" conclusion.

geraneum17d ago· 3 in thread

> But the critics' accusation is also blunt: "Claude is making things worse." A blunt instrument is the fairest response.

So the criticism was bad, and that somehow makes it ok to use a bad metric?

logicprogOP17d ago

That's not what I'm saying. What I'm saying is that if the criticism is referring to a broad set of metrics like bugs per release and number of commits that were made by Claude, then it's correct to look at precisely those things because that's what the claim is about.

abirch16d ago

AI + Interest != Expertise

I come to hn because I get very nuanced, informed information and glorious puns.

epolanski16d ago

What would be a better one?

dvt16d ago· 3 in thread

It's always the most insufferable people that make the biggest hullabaloo about a project they have nothing to do with and have never contributed to. People with literally zero skin in the game using the AI boogeyman to push some agenda or some anti-agenda. OSS has become so incredibly toxic in the past decade, and consumers of OSS have become extremely entitled.

I run a smallish project with ~1k stars and I've stopped maintaining it last year because people feel like they're absolutely owed features or bug-fixes or whatever. It's tiring and a complete shame that the author has to make such an insane deep dive into a random accusation that just caught on social media. I want to emphasize that this has nothing to do with AI, it's just tech tourists, consumers (as opposed to creators), and engagement farmers that have taken over. AI slop probably doesn't help, but the underlying issue has been brewing for at least a decade.

Also, the "making soup for the homeless & pissing in it" is not only an off-base analogy (software is pretty low on Maslow’s Hierarchy of Needs), but also somehow looks down on both people in need and the volunteers that help them. Just absolutely gross.

matheusmoreira16d ago

Absolutely agree. Quite a lot of judgement from people who benefited from this guy's software for over 20 years, probably without ever helping him pay his bills even once.

Panino16d ago

> It's always the most insufferable people that make the biggest hullabaloo about a project they have nothing to do with and have never contributed to.

Agreed, and similarly, as a hobbyist programmer who loves Rust and Go, I've always felt that the people who command others to "rewrite it in xyz" are not themselves developers, they're "ideas people." There's a mass of these people whose main interactions with the world are through the dramatic forcing of their correct opinions.

> I run a smallish project with ~1k stars and I've stopped maintaining it last year because people feel like they're absolutely owed features or bug-fixes or whatever.

That's a bummer and it's something I'm fearful of. I post some code on my website, not on a github type site, and don't interact with people about it. It's nice and plenty of people do it. Is that something you'd consider?

unphased16d ago

haha, that analogy says more about whoever wrote it than it ever could to get the intended point across!

TZubiri16d ago· 3 in thread

I haven't used this thing for like 10 years, when my modus operandi was googling my question and installing whatever stackoverflow suggested.

Can someone explain why one would ever use rsync (pre vibecode version) instead of cp and dd?

Can't we just 'apt remove rsync' and save ourselves the time even spent on evaluating this dependency?

Thanks

Joel_Mckay16d ago

If you deal with large numbers of files, the ability to dynamically skip compressing media and zipped files for transfer can be extremely handy.

While stuff like sshfs is great for a few small files (and win11), it will be an order of magnitude slower than an rsync task.

Most smart folks automate backup/recovery scripts, and only sometimes edit them with a new OS install. =3

Arcuru16d ago

> rsync (remote sync) is a utility for transferring and synchronizing files between a computer and a storage drive and across networked computers by comparing the modification times and sizes of files.

https://wikipedia.org/wiki/Rsync

int_19h16d ago

Because cp will copy everything, while rsync will copy only the things that actually need copying, and also delete the things that should be gone?
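
A minimal sketch of that idea, assuming only the size-and-modification-time check from the Wikipedia quote above. The names `needs_copy`, `sync_file` and `demo` are made up for illustration; this is not rsync's actual implementation, which also has a delta-transfer algorithm and optional deletion on the receiving side:

```python
import os
import shutil
import tempfile


def needs_copy(src: str, dst: str) -> bool:
    """Return True when dst is missing or differs from src in size or mtime."""
    try:
        dst_stat = os.stat(dst)
    except FileNotFoundError:
        return True
    src_stat = os.stat(src)
    return (
        src_stat.st_size != dst_stat.st_size
        or int(src_stat.st_mtime) != int(dst_stat.st_mtime)
    )


def sync_file(src: str, dst: str) -> bool:
    """Copy src to dst only when the quick check says so; return True if copied."""
    if not needs_copy(src, dst):
        return False
    shutil.copy2(src, dst)  # copy2 also carries over the modification time
    return True


def demo() -> list:
    """Sync one file three times: fresh copy, unchanged, then after an edit."""
    with tempfile.TemporaryDirectory() as root:
        src = os.path.join(root, "src.txt")
        dst = os.path.join(root, "dst.txt")
        with open(src, "w") as f:
            f.write("hello")
        results = [sync_file(src, dst), sync_file(src, dst)]
        with open(src, "w") as f:
            f.write("hello, world")  # size changes, so the check fires again
        results.append(sync_file(src, dst))
        return results
```

A plain `cp` would do the copy on all three calls; here the second call is skipped because nothing changed.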

pushcx16d ago· 3 in thread

    What followed was extraordinary: 329 comments and counting, ranging from thoughtful concern to outright harassment.
    The thread did not stop at words. One user posted My Little Pony drawings of themselves strangling the "project janitor that pushed vibecoded commits":
    It spread to Hacker News and Lobsters, generating hundreds more comments.

This is false, it did not appear on Lobsters. Here is the function in the codebase that prohibits this kind of brigading: https://github.com/lobsters/lobsters/blob/main/app/models/st...

Please correct your article.

tptacek16d ago

It is neat that Lobsters has this feature (and HN should too), and I'm glad you took a beat to explain it. I think you didn't need the last sentence, though.

logicprogOP16d ago

I have done so! that was a misremembering on my part. first mention of Lobsters is now here:

> On Lobste.rs, in response to the Medium essay Tridge himself posted in response, finally some users like boramalper begin to actually ask for evidence one way or another:

pushcx12d ago

Thanks, I appreciate you sorting out the timeline on such a heated issue.

aarjaneiro16d ago· 2 in thread

> "Claude clearly made things worse" &emdash; the main claim

Even this report is full of claude-introduced bugs

runarberg16d ago

For those that don’t know, the HTML entity is &mdash;, not &emdash;, although I think in modern codebases people usually just type — directly.

This mistake does exist in the wild though: https://github.com/search?q=%26emdash%3B&type=code

If I was more ambitious I would plot the dates of the blames of these results in a histogram and see if there is a significant increase in these mistakes (over a baseline &mdash;) correlating with the release of some models.
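
The entity name is easy to check with Python's standard library. This is a small sketch; the helper name `decodes_to_em_dash` is made up for illustration, and `html.unescape` is the only library call used:

```python
import html

EM_DASH = "\u2014"  # the em dash character itself


def decodes_to_em_dash(entity: str) -> bool:
    """Return True when the HTML5 entity table decodes `entity` to an em dash."""
    return html.unescape(entity) == EM_DASH


checks = {
    "&mdash;": decodes_to_em_dash("&mdash;"),    # the real named entity
    "&#8212;": decodes_to_em_dash("&#8212;"),    # numeric form of the same character
    "&emdash;": decodes_to_em_dash("&emdash;"),  # not a defined entity, left as-is
}
```

An unknown name such as `&emdash;` is passed through unchanged, which is why it shows up literally on the rendered page.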

logicprogOP16d ago

That one was on me. I always mess that up.

vlovich12316d ago· 2 in thread

If the author is this concerned about security, I’m curious why rsync doesn’t just build with fil-c by default and skip the noise. Those who need the extra perf to do more than 1 gigabit/s can build it in “unsafe” mode.

saagarjha16d ago

Because Fil-C is not a serious project

int_19h16d ago

If you make claims like that, you need to expand on them or at least provide some references.

PunchyHamster16d ago· 2 in thread

The fact last few commits were attributed to claude doesn't mean previous ones didn't use it.

Also if you write a paper where you get statistical conclusions out of whole 2 datapoints you'd be laughed out of the room

logicprogOP16d ago

> Also if you write a paper where you get statistical conclusions out of whole 2 datapoints you'd be laughed out of the room

I'm using methods appropriate to that low amount of data, first of all. Second of all, since I'm only trying to show there's no evidence for the anti-AI hypothesis (not disprove it, or prove the null hypothesis), that's sufficient in itself. Also, I wonder why nobody said things like you're saying ("there's too little data to tell") in response to all the absolutist claims that AI caused rsync to get worse?

> The fact last few commits were attributed to claude doesn't mean previous ones didn't use it.

At this point, you're just positing Russell's Teapot: you'll keep assuming more and more of the code was "secretly" Claude when there's no evidence for it and no reason to think so, just because you've started with the assumption that Claude makes things worse and you want to find a way to prove it.

vintagedave16d ago

Why not? Claude marks its commit messages. That there were none, and then there were, seems a signal.

Especially since if the earlier commits were so clearly AI authored yet without the Claude marker, surely you or anyone would be able to spot them. You could say, X commit does not have the Claude commit marker yet was AI written. But for all the speculation on this thread, I haven’t seen anyone actually doing that. What may be possible is that the rsync maintainers used AI to assist yet reviewed and edited themselves, as many devs do, and if so then the stats in this article are still notable: there are no poor quality outliers that can reliably be attributed to AI and if one specific release (3.4.0) was, the subsequent releases which presumably also had as much AI as this speculative hidden AI release only show improvement and thus act as a pro-AI argument.

The blog has many more datapoints than two. It compares many releases. You’re looking at 2-vs, not 2.

Polarity17d ago· 2 in thread

so the answer is: no. actually less bugs. thanks

gjvc16d ago

"fewer"

davrosthedalek16d ago

First rsync and now less? What comes next, cat?

amluto16d ago· 2 in thread

Reposting my previous comment because the post I commented on earlier was flagged to death:

This is kind of a sad situation. Tridge is an excellent programmer and a very respected member of the community, and I totally get it. rsync, like most old C projects, has a lot of accumulated cruft, and things that would be nice to fix, and bugs. And those bugs come in at least three classes: semantic bugs, improper interactions with the OS, and memory safety bugs. And the author and long-time maintainer has the same problem as every other maintainer and team: not enough time to deal with everything.

And now LLMs come along, and they are so, so seductive. They will fix your bugs if you ask them to. They will even find your bugs. And they're right a remarkably large fraction of the time. It's magic! You can write an agent loop or magic harness or swarm and let them do this on their own if you want.

And so you start getting through your backlog, and it's fun, and you feel good, and you let your guard down. And you start having problems:

- Your favorite LLM does not have the context that lives in your head. I use rsync because Tridge wrote a fine piece of software, and he knows how to write serious software, and I'm willing to accept that it's in C and therefore almost certainly has a safety bug or three. If I wanted to use claude-ersatz-rsync, I'd use that instead, but I really don't, TYVM.

- Remember how LLMs are right a remarkable fraction of the time? The fraction is remarkable, but it's nowhere close to 100%. (Yet? Who knows. Right now, it's DEFINITELY nowhere near 100%.)

- The training process for the current crop of LLMs does not adequately reinforce long-term maintainability of the outputs. And, for all the LLMs seem magic, they seem to love a workload in which they write code with poorly named functions and no docs and sort of assume that they can parse their own code down the road and figure out WTF is going on, and they are AT BEST only a tiny bit right.

Because every project has interfaces where one module touches another, and every LLM has very limited context (larger than humans' in straight up verbatim working memory but MUCH MUCH WORSE than humans' (for now, anyway) in actual broad picture retention), and this workload doesn't work. If it did, we could give up on structured programming and just have the LLMs vomit up uncommented asm.

And so, where humans have conventions and decently named functions and ideas that you shouldn't churn your code just for funsies (at least not in a production context), LLMs do this:

https://github.com/RsyncProject/rsync/commit/30656c5e358b1c6...

Most of that is blindly changing calls to functions like do_foo(args) (which makes sense) to do_foo_at(the same args), which makes no sense. Sorry, but the world of POSIXish-targeting programmers (including, presumably, Claude) knows what _at means, and it means "at" the specified directory fd. Which is not specified in the call sites. It makes no sense at all.

Buried in all that mess [0] are the implementations, which are sloppy. Seriously:

- There's a function called do_utimensat_at. Is Claude stuttering?

- There's a lovely comment in syscall.c:1660-1673 that's quite bad. It's handling strings that contain "/../" and such. If there's some actual contract that the function makes to its callers (and there surely is -- this is critical security-sensitive code), then SAY WHAT THE CONTRACT IS. Don't bury a partial explanation in a comment in the middle.

- There's a repeated pattern: In do_foobar_at(path), there is, in effect: if (!path) do_foobar(path); Nice NULL pointer handling. Is NULL a valid argument or not? Why handle it by forwarding it to the less secure variant?

- Those nice, supposedly secure "at" variants check for paths that start with '/' and forward to the raw insecure syscall. And they don't check for .. in the middle. So what, exactly, is the special code for .. promising to do? (See above.)

I don't think more details are needed. But my take is that this whole thing is a mistake. I personally work on the sort of code where messes like this are entirely unacceptable. And using an LLM while maintaining the kind of oversight that prevents it is mentally taxing and not exactly fun. If you want to fix all the gunk in a C program like rsync by LLM magic, go rewrite it in Rust or something -- you're already exposing yourself to a massive rewrite and all the risks that entails, and you're pretty much guaranteeing a high level of sloppiness, so at least use a language that is more resistant to slop.

[0] Which GitHub doesn't even render by default because their diff viewer is so bad.

[There were follow-ups. See https://news.ycombinator.com/item?id=48352182]
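
For readers unfamiliar with the `_at` convention discussed above, here is a minimal sketch using Python's `dir_fd` parameter, which wraps the POSIX `openat` family. It assumes a POSIX platform where `os.open` supports `dir_fd`; the helper names `read_at` and `demo` are made up for illustration, and none of this is rsync's code. It shows that the directory fd is the whole point of an `_at` call, and that a bare directory fd on its own does not confine absolute paths or `..` components:

```python
import os
import tempfile


def read_at(dir_fd: int, path: str) -> bytes:
    """Read a small file, resolving a relative `path` against dir_fd (openat-style)."""
    fd = os.open(path, os.O_RDONLY, dir_fd=dir_fd)
    try:
        return os.read(fd, 4096)
    finally:
        os.close(fd)


def demo() -> dict:
    """Show which paths a bare directory fd does and does not confine."""
    with tempfile.TemporaryDirectory() as root:
        inside = os.path.join(root, "inside")
        outside = os.path.join(root, "outside")
        os.mkdir(inside)
        os.mkdir(outside)
        with open(os.path.join(inside, "ok.txt"), "wb") as f:
            f.write(b"inside")
        secret = os.path.join(outside, "secret.txt")
        with open(secret, "wb") as f:
            f.write(b"outside")

        dir_fd = os.open(inside, os.O_RDONLY)
        try:
            return {
                # Relative name: resolved against the directory behind dir_fd.
                "relative": read_at(dir_fd, "ok.txt"),
                # Absolute path: the kernel ignores dir_fd entirely.
                "absolute": read_at(dir_fd, secret),
                # "..": walks back out of the directory dir_fd refers to.
                "dotdot": read_at(dir_fd, "../outside/secret.txt"),
            }
        finally:
            os.close(dir_fd)
```

All three reads succeed, so any confinement promise has to come from explicit checks or extra kernel support around the call, which is why the contract of such a wrapper needs to be spelled out.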

j16sdiz16d ago

Tip: on HN, you need a blank line (i.e. hit ENTER twice) to start a new paragraph. Everything jams into an incomprehensible wall of text if you use a single newline.

amluto16d ago

Ugh. The source comment, which this was literally a copy and paste of, had newlines. I wish HN could roundtrip from itself via the clipboard correctly.

tiahura16d ago· 2 in thread

Write with your own voice and then polish with ai.

dgellow16d ago

Or just do not polish? Write with your own voice, accept it as it is, humans communicating to humans

cube0016d ago

Please don't, even "polish" can make it sound completely AI written.

yobid2016d ago· 2 in thread

needs a tldr; im not reading all that. maybe claude can summarize it for me.

logicprogOP16d ago

And anti-AI people accuse people who use AI of being intellectually lazy. First of all, it's long because it's expanded to respond to all the criticisms. It seems that either something can be short, and dismissed as incomplete, or it can be complete, and dismissed as being long. Nice Kafka trap. Additionally, there's literally an Executive Summary section right there, for your TLDR.

noAnswer16d ago

Ask your Clanker what a joke is.

iainctduncan16d ago· 2 in thread

What strikes me about the post is that it goes to great lengths to talk about proper statistical methods, but then is written in the most clearly biased language ("what stupid AI haters get wrong", etc.). If you want people to take your study seriously, why wreck it by coming across with such a strong prior bias? I stopped reading...

int_19h16d ago

To be fair, the tone of the article is practically chill compared to the comments it is written in response to.

logicprogOP16d ago

Either the statistical methods and metrics hold up, or they don't. Also, if you don't want to read my opinion on things, then just grab the GitHub repo and run the end-to-end replication and look at the output data yourself.

nairboon17d ago· 2 in thread

Is this an analysis made by/with Claude?

quentindanjou17d ago

It very obviously is. "The Outlier Nobody Noticed" -_-"

overgard16d ago

FWIW, I asked ChatGPT to review the article just for my amusement. Its conclusion was:

"My honest assessment is that this is a competent calculation performed on a badly confounded measurement, followed by conclusions substantially stronger than the calculation warrants. It is useful as a rebuttal to “the Claude releases are obviously unprecedented disasters,” but not as evidence that Claude was harmless."

the_real_cher17d ago· 2 in thread

Is there a non vibe coded fork of rsync?

throwaway735617d ago

Yes: https://news.ycombinator.com/item?id=48390931

So far it reintroduced several security issues and replaced the README.md.

MYEUHD16d ago

There is openrsync, which is the OpenBSD re-implementation of rsync.

It's not a fork, but it's 8 years old, and is already shipped by default in OpenBSD and macOS.

gadrev17d ago· 2 in thread

Ok.

  $ apt-cache policy rsync | grep Installed
    Installed: 3.4.1+ds1-7ubuntu0.2
  $ sudo apt-mark hold rsync     
    rsync set on hold.

imurray17d ago

That version has security fixes from the same day as the latest rsync release: https://ubuntu.com/security/notices/USN-8283-1

As usual, Ubuntu backported fixes and didn't upgrade to a new version. Whether or not they also backported regressions in edge cases that afflict the latest rsync, I don't know. Pinning the Ubuntu package may prevent further regressions, but it also prevents you from getting any future backported security fixes.

logicprogOP17d ago

Did you face any actual bugs or regressions? Or are you doing this just because of the bandwagon that's going around right now? Because until you can actually present an argument for why this release is worse than any of the others, which is precisely the subject of my post, then this is not an argument against my post at all. This is just a self-referential appeal to authority.

MantisShrimp9016d ago· 1 in thread

I think this writer kinda took the bait, which is fine; someone had to do this so we couldn't debate endlessly.

But the reality is that if you were already set enough to call rsync slop because of a single post, you aren't going to be more down now. Even in these responses I see everyone nitpicking and moving goalposts as if one more commit being actually claude-aided will tip the scales from stable project to "vibe coded slop".

Software has always been fuzzy, we have never come up with an objective way to handle software quality, and this Uber hatred of llm contributions lets the humans who make egregious bugs and mistakes off the hook.

Taking a step back, we need to have more empathy and thoughtfulness toward one another in this space. It's new and people are experimenting, and there will be nothing good coming from personal insults and DDoSing a good project just because someone got ragebaited on threads, x, mastodon or whatever else.

How do we determine bugs and increase quality? It's almost like we have been grappling with this question for decades and I still hear people fight on the best way forward. Simple design, test driven development, user surveys, all of the above have been used as a proxy for software quality and they all failed to capture everything. Back in the day we used that ambiguity to give each other grace, now we use that ambiguity to tear down other creators. Whatever, if open source software really is dying it's because of this toxic shit just as much as the llms

thin_carapace16d ago

'this toxic shit' would not be occurring if we didn't invent a machine that can be used either as a firehose or a scalpel. I do acknowledge that behaving hurtfully towards somebody giving something away for free is unwarranted behaviour. perhaps a universally agreed quality control method does not exist - this does not suggest that ai slop is anything but low quality code. ai can indeed be used well, however you yourself mentioned letting humans off the hook for making egregious mistakes. pushing out ai slop IS an egregious mistake. when a release contains more commits than the previous N releases, slop likelihood increases, therefore further evidence is required to prove non sloppiness.

tptacek16d ago· 1 in thread

This is a neat post and I'm glad it got written and this is a little bit off-topic but:

Hey, 'logicprog, your writing is fine!

Use LLMs to critique your writing, check its structure, vet your choice of topic sentences, check flow from graf to graf and section to section, look for passive voice and overused words. LLMs are fantastic for that. But don't use a single word an LLM suggests in your actual writing. If it suggests something really fucking good, too bad, those words are disqualified. It's an easy red line to adhere to, easier than it sounds, and it'll keep your writing human.

(You ended up somewhere around here anyways, but that was after you posted something with LLM-written language because you weren't confident enough in your own writing. The things you do "worse" than an LLM are what make you you; be protective of them!)

logicprogOP16d ago

Thank you!

AEVL16d ago· 1 in thread

How does the analysis look if we only count the >=90 severity cases—that is, if we downgrade the severity of all <90 cases to 0?
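
As a sketch of what that re-scoring would mean, assuming each bug carries a 0-100 severity score and releases are compared by summed severity. The release labels and scores below are hypothetical placeholders, not values from the article's dataset, and `thresholded_totals` is a made-up helper name:

```python
from typing import Dict, List


def thresholded_totals(
    severities_by_release: Dict[str, List[int]], cutoff: int = 90
) -> Dict[str, int]:
    """Sum severities per release after downgrading every score below cutoff to 0."""
    return {
        release: sum(score if score >= cutoff else 0 for score in scores)
        for release, scores in severities_by_release.items()
    }


# Hypothetical input: two releases with made-up severity scores.
example = thresholded_totals({"release-a": [95, 40, 90], "release-b": [89, 10]})
```

With these made-up numbers, `release-a` keeps only its two scores at or above 90 and `release-b` drops to zero, so the comparison ends up resting on very few bugs per release.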

logicprogOP16d ago

Feel free to run it and find out. I don't think it would produce very much useful information though

esailija16d ago· 1 in thread

What on earth is this. Literally the only thing that matters is are there more bugs after AI written code is allowed into the codebase at all. We all know the answer to that lol. But it's always nice to see "data" can be used to make any conclusion you need.

logicprogOP16d ago

The data literally shows there aren't, there have been worse releases before. In what way did I manipulate the data?

steno13216d ago· 1 in thread

This is just narrow thinking. Say Claude did increase the bugs in rsync by a negligible factor.

So what? You've saved a significant amount of time for a decent number of humans, and if those humans are working on other projects, the overall net output for the world is net positive compared to without LLMs.

You have to broaden your perspective. It's not just about how rsync was affected.

boxed16d ago

Let me translate this comment:

> ok, so I was wrong and badly, but I will double down and say I was right anyway

jrflowers16d ago· 1 in thread

Tl;dr:

Yes, it did. Here is some math showing that you shouldn’t care about that.

logicprogOP16d ago

In what way did it create more bugs? It literally doesn't show up in the data. What are you talking about?

e4016d ago

While I'm grateful for all Andrew has done to create and maintain rsync, I rely heavily on it for backing up files between machines on my home network, so I've spent the time to figure out how to pin the Homebrew version of rsync to 3.4.1 because the bugs in the subsequent two versions really scare me (as does the original report that triggered all this).

Here is the process I used to do it, which was way more complex than I thought it would be:

https://gist.github.com/e40/caa67c1b8d439a528695f996d0519d8e

logicprogOP17d ago

Okay, I really have to point out to everyone: the numbers and report cards are TEMPLATED IN BY A SCRIPT. Hallucinations are a moot point. https://github.com/alexispurslane/rsync-analysis/blob/main/s...

igregoryca16d ago

Claude in general probably increases observed bugs in rsync, because it can churn out vulnerability reports that necessitate tons of changes to software that people are accustomed to working flawlessly in non-pathological use cases.

I don't have empirical evidence for this claim, but best I can tell, security patches are the principal source of observed bugs in software of a certain vintage, because they cause churn. (Just think of Windows updates that break drivers.)

zzo38computer7d ago

I agree that the bug report is not very good. I also agree that having commits written by Claude is not necessarily what caused the bug, although it might be (it is also possible that some of them introduced bugs and others didn't); whether or not it is in this case, I don't know (some people think it is, but some think not). (Software without code written by generative AI will still have bugs too.)

However, the claim that "the original post was [...] no bug report" seems wrong; it does have a bug report, although not a very good one. It says that incremental backups using multiple --compare-dest arguments do not work, so it is a bug report. But, it should have been written differently, including by putting the text directly instead of a screenshot, giving a proper title, better details about the bug being reported, etc.

Their claims that they introduced deliberate bugs, are unlikely to be accurate, and not worth making those claims nor the violence that they involve.

I do have reasons for not wanting LLMs to commit code, so I agree with their opinion about that, but that does not justify making a bad bug report and the other stuff that they did. If it is FOSS, someone who disagrees with the project can fork it and make their own version, as has been done with other FOSS projects as well.

I think it is good that they are making statistical analysis. However, they used a language model to classify bug reports. They mentioned some things that might be missed, and they could be missed whether or not you are using a language model to classify bug reports, although there are some other possibilities e.g. whether or not a single report should count as multiple bugs in some cases, and mistakes in marking reports as duplicate.

gravypod16d ago

This is a really cool post but I think one metric we may want to also look at is does using agentic coding tools in one domain impact your coding abilities in another domain? A lot of people I know have been talking about getting rusty on the fundamentals recently. This is not something I am particularly feeling as I do a mix of running agents in parallel and writing some code manually where it makes sense. But if people who have been prompt-only at work come home and work on rsync and are more "rusty" maybe that could also lead to more bugs?

This would be even harder to measure.

WesolyKubeczek16d ago

The discussions around this have devolved to excrement anyway. I feel tempted to invoke the meme with the goose asking a guy what his jacket is made of, except it asks “where is your reproducer case!?” instead.

Instead we have a shitstorm over a presumably legit issue, for which the only source is some mastodon post.

One command that used to work in 3.4.1 and stopped working in 3.4.3. Just one! We could have already bisected the living shit out of this and gone home, but no.
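
For anyone who has not bisected before, this is the search that `git bisect` automates, sketched under the assumption that the history is linear, the first commit is known good, and the last is known bad. The `is_bad` callback stands in for "run the failing command against a build of this commit" and is hypothetical, as are the commit labels:

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered history for the first commit where is_bad is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history of eight commits where the regression lands in "c5".
history = [f"c{i}" for i in range(8)]
culprit = first_bad(history, lambda commit: int(commit[1:]) >= 5)
```

Each test halves the remaining range, so even a few hundred commits between two releases need fewer than ten builds, given one reproducible command.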

rovr13817d ago

I'm just curious about testing.

Is this a configuration that's not common and thus not tested?

If people think they can do better, I want to see their forks and them keeping up with it.

https://github.com/RsyncProject/rsync/graphs/contributors?fr...

parliament3216d ago

Thank you for (re)writing this in your own voice. Despite how much effort might be put into methodology, data collection, etc., reading slop is unbearable, full stop. It's not intentional, but I have almost a nauseated reaction when the "AI tone" comes through, regardless of how good the data or how accurate the writing is.

Your verbosity and sentence structure are not a problem. I hope that publishing this gives you a bit more confidence in your writing, because it's legitimately good.

rswail16d ago

Personally, I'm going to believe tridge, someone that has contributed more to software than 99.9% (recurring) of the software development community over the last 3+ decades, over a bunch of brigaders jumping on the anti-AI backlash.

There was one regression bug apparently (related to multiple destinations and the way people do backups), but all the attention/anger has been about a test suite that makes the rsync development better, more rigorous and copes with the onslaught of both good and bad AI generated PRs as well as hardening something that has two decades of C code in it.

People need to grow up and appreciate what others in the community (especially people like tridge) have provided.

bwfan12315d ago

Software is only as healthy as the mental models of its "human" maintainers. This was implicit in the writings of Peter Naur (programming as theory building) and Fred Brooks (The Mythical Man-Month) ages ago. AI as a tool can assist just as IDEs and linters have assisted. But eventually the mental models of the human maintainers are the gate and bottleneck. Applying this to rsync, its maintainer would need to foster and grow other human contributors to eventually become maintainers so that the human mental models are carried forward.

mikaeluman16d ago

Not going to critique this survey. Must have taken a lot of time and required a lot of patience. Great work!

I think it will be up to some group in academia to make a real full blown study across several repositories.

There must be tons to learn on how LLMs have changed software development and perhaps the cleanest separation will simply be going by what repositories declare e.g. "No LLM involved" vs those that proudly do the opposite or are neutral.

Bugs is not the only variable of interest here. I am guessing someone is already doing this as we discuss it here...

guilhas14d ago

Rsync is highly trusted software, included in many distros, and used to move important data in high quantity.

If several or critical lines of code get changed quickly and keep breaking things, with or without LLMs, there will be backlash.

Rsync should rightly lose reputation if the project allows the release of breaking changes to follow the latest hype trend.

throw716d ago

Trust is slowly gained and easily lost. The amount of apologia I hear from top-tier developers signals an inflection point downward.

moktonar15d ago

I think we should start having AI versions, like beta or alpha versions, and then consolidate them into human-made versions with time. That way one is free to stay safe or on the bleeding edge as one likes, and we all get a win-win best-of-all-worlds situation (hopefully).

htk16d ago

Flagged. Article is as AI heavy as the commits that people are complaining about.

ladax7270715d ago

So a project is using a GPL licence, but instead of forking you harass the authors and you somehow think that you are the smart one and that you are doing anyone a favour?

ltbarcly316d ago

I'm noticing more and more AI writing everywhere, from youtube to this article: From the subtitle: "Nothing complicated, answers only one question: " clearly LLM generated.

Havoc16d ago

What’s the deal with anti ai people being so rude

foxes16d ago

I wonder if all the commits which involve adding tons more tests are the basis for a "rewrite in Rust" Anthropic marketing event

drankinatty13d ago

I've got no love for AI, don't use it, but also after writing code for more than 40 years, keeping things in perspective helps. Whether it's you pecking away or some coding assistant helping, there will always be the potential for a regression or two to fly under the radar. (not like I've never done that before... nope.)

The issue that coding tools like Claude present is the sheer size and scope of the changes and commits they generate, which would take mere mortals months of careful coding to do.

That's an issue everyone using those tools will have to confront. I don't know Andrew personally, from a "let's go have a beer" standpoint, but I've known him from the samba list and his work with rsync for a very long time.

My take on the issue is less about the regressions and Claude screw-ups and more the lesson to all about the reliability of the coding tools and the diligence required to validate what they spit out.

It's an unfortunate black-eye, no doubt, but it's not a unique one. The takeaway is if something like this can slip by somebody like Andrew, then we all need to redouble the validation effort, lest we too are destined to share an unfortunate black-eye or two.

Never forget, "to err is human, but to really foul things up requires a computer."

AI just applies that adage at industrial-scale.

manlymuppet16d ago

Unrelated, but this post has a level of rigor you rarely see nowadays. I think it deserves to be commended for that.

HN is, relatively, a very intellectual part of the internet, yet even still, it's really common to see very uneducated opinions here. Not that everyone needs to be very educated, but posts with plainly wrong assumptions and biases shouldn't go completely unchecked so rampantly.

nasretdinov16d ago

Regardless of the claims made in this analysis, I've personally observed that there are indeed more bugs (or more subtle issues, like nonsensical error messages) being shipped when using LLMs, but not _really_ because LLMs suck, but because you're spending less time thinking about the problem, and you subsequently miss more edge cases, etc.

The best approach I've tried that actually increases quality (and _may_ speed up development) is to write ~80% of the code yourself and then ask an LLM to review it thoroughly. While it's doing its thing you're also thinking about the code and reviewing it yourself in parallel. You then merge the findings and fix stuff worth fixing. At this point the authorship of the code is still mostly yours, you _understand_ the system and you ship fewer bugs, slightly faster than otherwise. It's a moderate improvement to the workflow, but it actually doesn't cost nearly as much either, and definitely doesn't produce rage at the machine from the slop. The only downside is that it requires lots of discipline, and it's a relatively rare commodity among software engineers these days.

nelox16d ago

The peak of cascading effects from errant dependencies has yet to come

block_dagger16d ago

Do people enjoy interrogative headlines? Find out at 11.

themafia16d ago

> If anyone complains about my verbosity or sentence structure — as they usually do, which is the reason I originally let the AI write the prose, among other reasons obsoleted by templating — they can go fuck themselves.

You can write for an audience or you can write for yourself. Which is fine either way but you shouldn't pass the blame for bad results on to your audience.

> and recieving almost no substantive input, discussion, or response on the actual content of the article

Well did you write it for that purpose?

> "Just wait, more bugs will surface" -- v3.4.3 has been out long enough

Wait for _more releases_. As your own data shows the bug rate is not consistent between releases. So this is probably not a worthwhile metric. Perhaps systems touched, new features included, or attempted fixes would be a better way to contextualize releases and the goals of the author.

KronisLV16d ago

Pretty cool site!

> v3.4.3 has been out long enough that its rate (5.00) is already comparable to historical releases. The "wait and see" argument is an appeal to an unknowable future that shifts the burden of proof away from the critics. If more bugs surface, they will enter the distribution like every other release. There is no reason to expect a regime break.

I mean, as someone who uses LLMs, it might be a good idea to consider how one might limit the amount of bugs that will appear in the future at least a little bit: parallel iterative code review loops would probably be the easiest and most applicable to LLMs, though I guess test coverage and other code analysis tools help too.

logicprogOP16d ago

Another update: did an automated severity analysis on each bug report (~2000 of them!) using an LLM at temp=0 with a very strict rubric (and I checked to make sure that it rated things in a consistent, stable way using it). The rubric, LLM used, and some example ratings are included in the methodology section. For now, the information was just stored per-bug in the DuckDB and used to filter out non-bug bugs, to get a clearer signal. I'm going to try to use it to see if the post-Claude bugs were more severe in any way next.

nazgul1716d ago

In a scenario like this, where we can only know if the code has bugs, not if it doesn't, isn't survival analysis a more appropriate statistical technique? I.e. a technique where time is a first-class citizen.
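A minimal sketch of what a survival-analysis framing could look like, using a hand-rolled Kaplan-Meier estimator. Everything here is an assumption for illustration: the `kaplan_meier` helper, the choice of "days until a bug is reported" as the event time, and the made-up observations. Units with no bug reported so far are treated as right-censored rather than as proof of no bug.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time an event occurred.

    durations: time each unit was followed (e.g. days since release).
    observed:  True if the event (a bug report) happened at that time,
               False if the unit was still bug-free when we stopped looking
               (right-censored).
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # every unit leaves the risk set at its duration

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t]:
            survival *= 1.0 - events[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: four units followed for 1, 2, 2 and 3 days.
# One of the two units at day 2 was censored (no bug seen yet).
curve = kaplan_meier([1, 2, 2, 3], [True, True, False, True])
for day, prob in curve:
    print(f"day {day}: P(no bug yet) = {prob:.2f}")
```

The censored unit still counts toward the risk set up to day 2, which is how time becomes part of the estimate instead of being ignored.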

By the way, I did find this a bit hard to read but, as instructed by OP, I'll go fuck myself.

For what it's worth, I find AI written prose easy to read, and am annoyed by all the constant HN comments which just point out the author was AI, without anything else substantive to add.

overgard16d ago

The TLDR seems to be: needs more data.

WhereIsTheTruth16d ago

LLMs don't create bugs, people do

nilslindemann16d ago

Plot twist: This blog post was written using Claude too.

1 more reply

mwkaufma16d ago

Smokescreen of highly-contingent analysis and appeals to authority over a premotivated-conclusion.

1 more reply

1a527dd516d ago

I'm amazed that this is still being discussed.

It's open source, no one is forcing you to use it.

If you don't trust the newer versions; use the old versions.

If you no longer like the maintainer because of reasons, fork it/start your own.

It's not that hard.

Storm in a teacup.

mmonaghan16d ago

I think there's evolution at play here - if you dislike AI enough to opt out of using any ai-generated code, you will likely suffer. I think there's definitely a conversation to be had about whether to disclose AI use or not but that's a separate issue if you assume that everyone is using it in some respect.


thorum16d ago· 19 in thread

zzyzxd16d ago

I never care about AI usage disclosure, because I don't believe that human produced code is necessarily better than AI produced code, unless it's someone I personally know.

People need to be responsible for code they commit and push anyways. This has never changed. Whether the code is written by hand, by their cat walking over keyboard, or by AI, is not my concern.

calvinmorrison16d ago

delusional16d ago

> People need to be responsible for code they commit and push anyways.

Well the GPL (which rsync is licensed under) says: "This program comes with ABSOLUTELY NO WARRANTY" so actually nobody is responsible for anything.

2 more replies

matheusmoreira16d ago

> You’re just going to make people disable Claude attribution on their commits to avoid drama.

Aurornis16d ago

The value of the Claude attribution is that you can tell at a glance who used AI.

I don't care about the advertising angle. We all know Claude by now. I want some indicator that AI was used.

3 more replies

julianeon16d ago

1 more reply

trwired16d ago

eli16d ago

Yeah I think it's a bad thing. It's context about how open source code was written that is lost.

And I guess maybe there's no such thing as bad press, but at least in this case it doesn't seem like effective marketing for Anthropic.

eschaton16d ago

“Don’t get mad at people for doing something unethical or immoral, or they’ll do something unethical or immoral!”

Disabling attribution of LLM-generated code is fraud, because you’re saying you wrote the code.

Of course that fits right in with the use of an LLM to generate code in the first place, since what it’s actually doing is regurgitating its inputs stripped of any license and copyright notice.

UebVar16d ago

In academia this is misattribution; outside of academia this does not exist.

Whether this is unethical or immoral is a totally different question. I really don't think so, and I don't think you argue that position well.

1 more reply

jhack16d ago

"Disabling attribution of LLM-generated code is fraud, because you’re saying you wrote the code."

Should there be attribution for Google or Stack Overflow copy/paste? Who should we bully about this?

2 more replies

Leynos16d ago

Outside of situations where it is required by contract, attributing AI usage is a courtesy, nothing more.

1 more reply

infamouscow16d ago

It's only fraud if a person signed their name stating such.

Their name being attached to the commit is itself irrelevant, as there is no way to submit a patch otherwise. You could use a fake name, but you're just moving this fraud problem around.

You're going to have a hard time convincing anyone that using a tool constitutes fraud. Frankly, it's silly, if not genuinely stupid.

1 more reply

Unit32716d ago

This argument gets trotted out every time but it doesn't convince me of anything. Yes, calling things out creates an incentive for people to hide them, but so what?

mohamedkoubaa16d ago

I'd be willing to bet that an undisclosed LLM disclosure will follow a developer around for the rest of their career

eschaton16d ago

That kind of fraud absolutely should. (I suspect you mean “undisclosed LLM use.”)

1 more reply

Daishiman16d ago

I'm willing to bet that in two years that's going to be completely irrelevant because the amount of code written by hand will drop to less than 10%.

overgard16d ago

hnav16d ago

Depends on what the Claude attribution actually means. A lot of people will just get the thing building and then ship. To me that attribution is generally a red flag.

1 more reply

aesthesia16d ago· 13 in thread

I don't have a dog in this fight, but a few points that look a little suspicious:

- Relatedly, more recent releases have had less time to have bugs filed against them, so there may be a bit of a bias toward evaluating recent releases as less buggy.
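One way to reduce that bias, sketched under assumptions: count only bugs filed within the same fixed window after each release, and skip releases that have not yet been out for the full window. The `bugs_within_window` helper, the dates and the 90-day window below are made up for illustration.

```python
from datetime import date, timedelta


def bugs_within_window(release_date, bug_dates, window_days, today):
    """Count bugs filed within window_days of release.

    Returns None if the release has not been out for the full window yet,
    since its count would not be comparable to older releases.
    """
    cutoff = release_date + timedelta(days=window_days)
    if today < cutoff:
        return None
    return sum(release_date <= d <= cutoff for d in bug_dates)


today = date(2025, 6, 1)  # hypothetical "now"
old_release = date(2024, 1, 1)
new_release = date(2025, 5, 1)

old_bugs = [date(2024, 1, 10), date(2024, 2, 20), date(2024, 9, 5)]
new_bugs = [date(2025, 5, 3), date(2025, 5, 20)]

old_count = bugs_within_window(old_release, old_bugs, 90, today)
new_count = bugs_within_window(new_release, new_bugs, 90, today)
print(old_count)  # the September bug falls outside the window
print(new_count)  # None: the window has not elapsed yet
```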

theteapot16d ago

Agree. From the article:

iandinwoodie16d ago

Also from the article:

> "Claude clearly made things worse" &emdash; the main claim

This article was clearly generated by AI, yet I found no mention/attribution of that by author.

3 more replies

OptionOfT16d ago

You can use LLMs in multiple ways, from very hands on to make local changes to completely hands-off.

So maybe the author of the code started off small and it grew over time?

hparadiz16d ago

I have been experimenting with both aforementioned styles with interesting results.

2 more replies

logicprogOP16d ago

aesthesia16d ago

jonquark16d ago

Isn't the metric that you've used "bugs per commit ~ per new line of code" going to miss the issue?

All code is technical debt.

If rsync releases used to have 500 lines changed and 5 bugs in and AI-powered rsync releases have 50000 lines and 500 bugs, it's the same bugs/line but much worse experience for the user?

edit: actually your table shows there weren't unusually large numbers of commits in this release, so perhaps my initial skepticism shows a bias I have?

1 more reply

hariseldom16d ago

PunchyHamster16d ago

Let's start with the most outright alarming error - the Claude statistics are drawn from a whole 2 data points

logicprogOP16d ago

2 more replies

runarberg16d ago

The interpretation of the p-value is also alarming. One of the first things they teach you in statistics class is: “an absence of evidence is not evidence of absence”.

This analysis showed that there is indeed an absence of evidence, but it concludes there is evidence of absence.
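To make the distinction concrete, here is a sketch of an exact one-sided permutation test on made-up bug rates. The `permutation_p_value` helper and all numbers are hypothetical and are not the article's data or its exact test; the point is only what a large p-value does and does not say when one group has two members.

```python
from itertools import combinations


def permutation_p_value(treated, control):
    """One-sided exact permutation test on the group sum.

    Returns (p, smallest_possible_p), where p is the fraction of ways to
    pick len(treated) values from the pooled data whose sum is at least as
    large as the observed treated sum.
    """
    pooled = list(treated) + list(control)
    observed = sum(treated)
    subsets = list(combinations(pooled, len(treated)))
    extreme = sum(1 for s in subsets if sum(s) >= observed)
    return extreme / len(subsets), 1 / len(subsets)


# Made-up bug rates: 8 historical releases and 2 AI-assisted ones.
historical = [2, 3, 4, 5, 6, 7, 8, 9]
ai_assisted = [5, 6]

p, smallest_possible_p = permutation_p_value(ai_assisted, historical)
print(f"p = {p:.3f}, smallest attainable p = {smallest_possible_p:.3f}")
```

A large p here only says the two values are compatible with the historical spread. With two observations in one group, a real but moderate increase could produce the same result, so the test alone cannot establish that nothing changed.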

2 more replies

its-summertime16d ago

kelnos16d ago

You can apply that to the outrage too: the people pissed off about this are going off 2 measly data points.

jarym16d ago· 13 in thread

I've been coding for over 2 decades. I love it, I've always loved it and I likely always will.

I don't get it. I really don't. What I do know for sure is more and more code will be AI generated with or without the detractors.

jiggawatts16d ago

Having said that, AI is very useful for other activities like PR review, security vulnerability analysis, typo hunting, reverse engineering, etc.

I’m probably going to have to increase my subscription to the next tier but at the same time I still can’t use any of the code it generates.

albedoa16d ago

OP knows this but finds himself in the strange position of having to defend India slop in order to defend AI slop, totally unnecessarily and unprompted. It's baffling to you and me.

int_19h16d ago

Joel_Mckay16d ago

Good code is a living document that shows intent, and allows ease of maintainability.

Most people feel more productive with chat bots, but often end up wasting more time chasing self-inflicted issues. Same clown-car of Dev-ops proponents no doubt billing by the hour. =3

nomel16d ago

I've always noticed, within any subject involving tools, there are people who like the tools, and some people who like to use the tools to do something else.

I'm genuinely curious to hear the perspective of someone anti-AI, who works in software. Perhaps the impending doom/skill shift of our profession?

CapsAdmin16d ago

But developers also say good practices should be followed when talking to each other, and while some may do, reality is often very different.

It requires discipline, which varies a lot between developers, between projects, current mood, and so on.

In the beginning you might be careful doing small changes, but after a while you might get more tempted to accept the output for what it is, because ultimately that's much easier.

I started out carefully myself and slipped more into vibe coding, but I don't feel particularly proud of it for some reason.

1 more reply

yw341016d ago

I am anti-vibe coding if that meets your criteria?

Reviewing vibe-coded PRs and features has been utterly exhausting over the past few months.

Also when I'm _using_ the agent, at least 50 percent of my time is spent telling it to stop with its approach so it doesn't go down a useless rabbit hole and waste tokens.

2 more replies

lelanthran16d ago

> So, I have an absolute blast with AI, because it helps do the more boring bits.

So... you're vibing? Not looking at the code at all?

Joel_Mckay16d ago

Personally, it would still bother me if some lazy bro hit a code-generator and people end up dead.

For context search, I find LLM quite useful... still wrong 20% of the time... but it has some utility.

Here is a thought experiment: If "AI" will eventually generate your work, then what actual value do you bring to the table? =3

tom_16d ago

I just really hate talking to the computer in human language.

albedoa16d ago

> It reminds me a lot of when offshore outsources started getting more software development work from the mid-90s with all the derogatory remarks made towards 'Indian developers'.

What was the impetus of the derogatory remarks?

kelnos16d ago

Some of it was indeed driven by sub-par work from the outsourcing firms, as the style of work was new and people on both sides hadn't developed the right skill set and processes to do the work well.

Some of it was certainly a form of jingoistic or xenophobic protectionism.

Joel_Mckay16d ago

LLM are good for context search, and template output.

However, you also get the lowest common salient answer guaranteed, uncopyrightable work (differs from public domain), and potential legal peril from copyright bleed-through.

We are in the golden Napster age of isomorphic plagiarism. =3

GodelNumbering16d ago· 12 in thread

Was just looking at commits and came across a commit and its revert

original commit: https://github.com/RsyncProject/rsync/commit/d046525de39315d...

```
- if (!ptr)
- ptr = malloc(num * size);
- else if (ptr == do_calloc)
+ if (!ptr || ptr == do_calloc)
   ptr = calloc(num, size);
```

reverted in https://github.com/RsyncProject/rsync/commit/7db73ad9a1b8721...

If you read the description of the revert even half carefully, it's easy to tell that even that was written by an LLM.

I can understand the sentiment of whoever posted the original thread.

wolletd16d ago

That's exactly what I'd expect when someone is excited about AI usage and becomes... well, sloppy.

logicprogOP16d ago

Tridge already explains this:

https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

2 more replies

whateveracct16d ago

mythical man month only gets more prescient as time passes

lokar16d ago

I would expect a 10x change rate, even carried out by clones of the existing maintainers to result in more bugs.

gravypod16d ago

> Also the amount of commits is suspicious. In the last two months, rsync had about as much commits as in the last two years before that.

I wonder if the data looks worse or better when doing per-commit instead of per-10-commits.

echelon16d ago

Seems like someone could use Claude to port rsync to Rust and the whole enterprise would be safer from things like this.

Start with unsafe then gradually convert into idiomatic Rust.

1 more reply

CaliforniaKarl16d ago

> Written with claude.

No.

The reversion commit references https://github.com/RsyncProject/rsync/issues/959. In that GitHub issue is this comment:

scottlamb16d ago

> This is a good example of what slips through LLM attention. It forces all allocations to be calloc as if it is a strict upgrade.

I blame AI for these regressions mostly in the sense that it caused a flurry of vulnerability reports. Those led to a flurry of quick fixes. Sometimes quick fixes cause other problems.

delusional16d ago

You don't really have to guess. The guy told us the AI didn't suggest this specific change:

https://github.com/RsyncProject/rsync/issues/959#issuecommen...

2 more replies

tom_16d ago

AI multiplied by Linux overcommit. What times we live in!

(My own view: 10.8 GB is nothing these days. Your sprintf buffers are probably larger than that. (And if they aren't: they should be. That, or you should start using snprintf...))

baq16d ago

sprintf() should be a longer way to write abort(), change my mind

1 more reply

alfiedotwtf16d ago

AI is fine, and in fact fun to use... committing AI written code without understanding Every. Single. Line. Of. Changes is on the committer. You can't LGTM vibe code ffs

RustyRussell16d ago· 9 in thread

For those commenting, I suggest you read the post linked by the rsync author:

https://medium.com/@tridge60/rsync-and-outrage-d9849599e5a0

(Disclosure: while I haven't talked with him in years, Tridge was my colleague and mentor for many years. I feel it is worth considering his view before joining a crusade)

jorvi16d ago

> I thought it would be a good idea to do the core structure for the new test suite in public on master first though given all the rage that has generated maybe that was a bad idea.

RustyRussell16d ago

It seems that wasn't the Claude part, though I haven't seen a full analysis of exactly what broke. I also only saw one report: are there multiple, or do you just perceive that?

Rsync has many options: I can totally believe that fixing a bug in one place broke someone's usage, to be fair.

jpalomaki15d ago

matheusmoreira16d ago

This should be the top comment.

I think it's pretty sad that he even had to write it. Quite a lot of judgement from people who aren't paying his bills.

Laurel123416d ago

2 more replies

dnnddidiej16d ago

The title at least sounds less like judgement and more analysis and more about AI assistance (and claude in particular) than rsync. Maybe I am too used to postmortems!

1 more reply

guilhas14d ago

> Now if any of the people posting the rage stuff want to actually review any of the code I’ve published and make constructive criticisms then that would be great!

When you quickly churn more lines of code in a few days than you changed in months, and then release them as a normal, not sure you're expecting "constructive criticism"

Also if I suspect the project is just slopping high amount of code without proper thought, I probably won't invest my time into reading those changes

advael12d ago

nullc16d ago

I think that's an extremely well done response on his part.

1 more reply

dang16d ago· 8 in thread

[stub for offtopicness]

[see https://news.ycombinator.com/item?id=48416020 for how all this happened in the first place]

logicprogOP17d ago

Some notes on this:

- I used GLM 5.1 to help with the coding and math for this.

3 more replies

ex-aws-dude16d ago

So the original unfounded claim has 400+ comments because it's perfect HN ragebait

The author provides evidence to the contrary and the HNers won't even engage with it instead just talking about the writing of the article in classic HN bikeshedding fashion.

How about after that we talk about the formatting of the website and the colors?

This site is really going downhill

Where is the accountability for your own opinions?

Are you guys only upvoting things that confirm your existing gripes?

1 more reply

roywiggins17d ago

> A simple distributional analysis of every rsync release with bug data. No model. No assumptions. Just placement.

If you want me to read your analysis, you are going to have to make it not read like Claude wrote it. What does "placement" even mean here?

3 more replies

dang16d ago

This submission was heavily flagged, presumably because the article sounded like genai. But the article now says the following:

> After posting this on Hacker News and recieving almost no substantive input, discussion, or response on the actual content of the article, I decided to rewrite all of the prose in my own voice.

I've therefore turned off the flags and hopefully people can actually now discuss the claims/findings being reported.

2 more replies

mschuster9117d ago

This article reeks of LLM "assistance" at the very least.

Please, why can't people write stuff by hand themselves any more? It's a good analysis but how can I trust it without reviewing everything myself?!

1 more reply

tappio17d ago

ps. really... that sideways scroll? plz fix it.

3 more replies

sfink17d ago

Wow.

This article's language is not en-US. It's not en-BR. It's en-SLOP.

Yes, that was my clumsy attempt at AI parody. Here's another: this article doesn't just have AI tells. It is AI tells.

As for the substance of the analysis, it seems pretty good to me but I see some flaws that weaken it a bit.

1 more reply

duk3luk317d ago

This article is unfortunately unreadable because all of the prose is unfiltered LLM slop.

ch_fr16d ago· 7 in thread

This article is a rant disguised as data analysis.

pie_flavor15d ago

Why is the guy being rigorous worthy of criticism, but the guys being idiots aren't? Did you post any similar calm-down comments in either of the HN threads on the original attacks?

ch_fr15d ago

I am more inclined to be critical of AI boosters, so what? Am I supposed to crumble under the weight of immense cognitive dissonance because I have... a stance in the discourse?

Engaging in LLM discourse is already a waste of my time, I'm not going to waste more of it just to avoid fallacious accusations of double standards because I didn't "do the same for the other side".

simianwords15d ago

Strange thing to say. The post does a good job of showing that there is no evidence Claude had anything to do with the regressions.

Your problem is that this was shown? You don't value epistemics -- you care about the ideology more than truth. Even if you don't like AI you should still do it in the right way.

Your comment comes across as more un-self-aware and more destructive. Let's keep this place truth first and ideology second.

runarberg15d ago

This still leaves the anecdotal evidence. And anecdotal evidence is still evidence, and in the absence of better evidence, it is perfectly rational to react based on the evidence you do have.

ch_fr15d ago

Yeah yeah the usual "look, these anti-ai people are so EMOTIONAL and HYSTERICAL while we're very logical and fact-based", I've lurked for a while so I read that one plenty of times.

> I'm only trying to show there's no evidence for the anti-AI hypothesis

---

Is "feelings" a curse word or something? What's so wrong with understanding the emotional component of the AI discourse?

Talking about "emotions" is not destructive when the topic at hand is literally people being driven by emotion under a github thread.

1 more reply

bwfan12315d ago

> "look, here's how [other camp] behaves"

Motivated reasoning in both camps.

The article is missing the point that once camps have made up their minds - no amount of analysis is going to change that.

davrosthedalek15d ago

Stay out of camps, people!

faitswulff17d ago· 7 in thread

> The analysis uses a single metric: bugs per 10 commits (bugs/10c).

germanjoey16d ago

IMO "bugs per commit" is even worse than that, because, in addition to what you say, it also hides the extraordinary spike of commit activity of a project that had previously been stable. [0]

It is the exact metric you'd choose if you wanted to make the current situation of rsync look like not a big deal.

[0] https://github.com/RsyncProject/rsync/graphs/commit-activity
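To illustrate the concern with made-up numbers: a per-commit rate can stay flat while the absolute number of bugs users run into grows with commit volume. The `bugs_per_10_commits` helper assumes the metric is simply bugs divided by commits, times ten, and both releases below are hypothetical rather than taken from the article's data.

```python
def bugs_per_10_commits(bugs, commits):
    """Bugs normalized per 10 commits (assumed definition of bugs/10c)."""
    return 10 * bugs / commits


# Hypothetical releases: same rate, very different totals.
quiet_release = {"commits": 20, "bugs": 10}
busy_release = {"commits": 200, "bugs": 100}

quiet_rate = bugs_per_10_commits(quiet_release["bugs"], quiet_release["commits"])
busy_rate = bugs_per_10_commits(busy_release["bugs"], busy_release["commits"])

print(f"quiet: {quiet_release['bugs']} bugs total, {quiet_rate:.2f} bugs/10c")
print(f"busy:  {busy_release['bugs']} bugs total, {busy_rate:.2f} bugs/10c")
```

Reporting the absolute count next to the rate would show both views at once.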

logicprogOP16d ago

Yes, but we know why there was an "extraordinary spike," and it has nothing to do with rsync being "vibe coded." The maintainer has directly addressed this.

2 more replies

ex-aws-dude16d ago

Why don't you prove the bugs increased then?

Why is it that some unfounded claim is made and the onus is suddenly on the project maintainer to prove it beyond all doubt?

It should be on the person making the claim to prove it

logicprogOP16d ago

bsza16d ago

No Claude, it still makes zero sense as a metric.

I suggest you practice some humility and update your conclusion instead of updating the mental gymnastics you used to arrive at the same conclusion.

1 more reply

skeledrew17d ago

atmavatar16d ago

The specific problems mentioned were all reasonably severe. The original post itself described a show-stopping bug:

    So my systems recently updated to rsync 3.4.3, and as soon as that happened my backup system - which does incremental backups using multiple --compare-dest= arguments - started to fail on anything but a full backup.

Incremental backups are perhaps the primary use of rsync, and they were broken for this person. That's pretty severe.

The second reply is similar:

    i wondered why my 3d printers were running like sh*t and at 100% cpu; turns out log2ram uses rsync.

This one I took with a grain of salt, since it read more like a dogpile than an actual bug report. However, if it's genuine, it's also reasonably severe.

1 more reply

xmddmx16d ago· 6 in thread

There's a meta-level of irony here that's important to note.

TFA is defending the use of AI, and it very clearly (to me) used AI to analyze the data and present the results.

In short, the study doesn't have sufficient statistical power, and is making "no difference" claims that aren't justified.

simianwords15d ago

classified16d ago

newsoftheday15d ago

AI is nothing like religion. People behave similarly to AI when debating their favorite sports team, or for Java coders, Checked vs Runtime exceptions.

Religion is about faith and what people feel and sense as much as believe.

Joel_Mckay16d ago

It gets pretty dark if you pull that thread of reason. =3

https://en.wikipedia.org/wiki/The_True_Believer

logicprogOP16d ago

MichaelDickens16d ago

FWIW I understood your point just fine. It seemed to me that you made a clear enough distinction between "evidence that Claude didn't increase bugs" and "no good evidence either way".

2 more replies

wookmaster17d ago· 5 in thread

Claude is just a tool ? The developers who merged that code and didn't properly test increased the bugs.

everdrive17d ago

"Did cars increase traveling deaths?"

"Cars are just a tool. The drivers who piloted the vehicles and weren't careful enough [are responsible for the deaths.]"

roywiggins17d ago

If something's a bad tool that misleads people into doing bad work, it would be good to know that.

ebiederm17d ago

Please read the article.

The unsolicited security reports are the issue.

Angostura17d ago

This tool is claimed to be able to find and fix bugs.

runarberg16d ago

Feels like something a bad (and potentially dangerous) tool would say.

lbrito16d ago· 4 in thread

Wait, how is any of this relevant if there were only 2 Claude commits? My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

logicprogOP16d ago

kelnos16d ago

It wasn't 2 Claude commits. It's 2 releases where the (many) commits were largely co-authored by Claude.

> My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

matheusmoreira16d ago

> My statistics courses are far behind me, but don't you need at least 30 data points to conclude anything?

There is no fixed number. Sample size depends on the size of the set you're sampling, desired margin of error and confidence interval.

If your total set has a million items, you need ~16600 samples to draw conclusions with 99% ±1% certainty.
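The ~16600 figure is in line with Cochran's sample-size formula for a proportion. A sketch, assuming the worst-case proportion p = 0.5 and a normal approximation; the `sample_size` helper is illustrative and not from any particular library.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin, population=None, p=0.5):
    """Cochran's sample-size formula for estimating a proportion.

    confidence: e.g. 0.99 for a 99% confidence level.
    margin:     margin of error as a fraction, e.g. 0.01 for +/-1%.
    population: if given, apply the finite population correction.
    p:          assumed proportion; 0.5 is the worst case.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z * z * p * (1 - p) / (margin * margin)
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


unbounded = sample_size(0.99, 0.01)
corrected = sample_size(0.99, 0.01, population=1_000_000)
print(unbounded)  # no finite population correction
print(corrected)  # slightly smaller once the population size is accounted for
```

This covers estimating a proportion by random sampling, which is a different setup from comparing two releases against a historical distribution.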

wlonkly16d ago

It's not uncommon to have small amounts of data come out of experiments. These are appropriate tests for the size of the data. These tests failed to disprove the null hypothesis.

scsh17d ago· 3 in thread

That's not meant to be wholly dismissive either. But in general, I don't think quantitative analysis alone is enough to fully answer this type of question.

skeledrew17d ago

MostlyStable16d ago

ex-aws-dude16d ago

The burden of proof is on the one making the claim?

cobertos16d ago· 3 in thread

This post just gives me more questions than answers and I'm unable to form a decision:

This article feels like half of an analysis presented as a highly complex finished product due to all the advanced stats they're running.

logicprogOP16d ago

> Why was v3.4.1 the most buggy, right before the Claude commits? Why did "nobody notice"? It's way to strange to just say welp, it must be human error.

Why wouldn't it be except question begging priors assuming it couldn't be?

cobertos16d ago

> Why wouldn't it be except question begging priors assuming it couldn't be?

> My original metrics which didn't filter out feature requests...

---

Could you please post the duckdb file that has the raw bug -> severity + version mapping to the GitHub repo? I have a desire to dig into this myself

1 more reply

Laurel123416d ago

> Tbh idk how that _wasn't_ a red flag in the author's analysis...

Because he didn't analyze shit, just asked a clanker to rationalize his "clankers are great" conclusion.

geraneum17d ago· 3 in thread

> But the critics' accusation is also blunt: "Claude is making things worse." A blunt instrument is the fairest response.

So the criticism was bad, and that somehow makes it ok to use a bad metric?

logicprogOP17d ago

abirch16d ago

AI + Interest != Expertise

I come to hn because I get very nuanced, informed information and glorious puns.

epolanski16d ago

What would be a better one?

dvt16d ago· 3 in thread

matheusmoreira16d ago

Absolutely agree. Quite a lot of judgement from people who benefited from this guy's software for over 20 years, probably without ever helping him pay his bills even once.

Panino16d ago

> It's always the most insufferable people that make the biggest hullabaloo about a project they have nothing to do with and have never contributed to.

> I run a smallish project with ~1k stars and I've stopped maintaining it last year because people feel like they're absolutely owed features or bug-fixes or whatever.

unphased16d ago

haha, that analogy says more about whoever wrote it than it ever could to get the intended point across!

TZubiri16d ago· 3 in thread

I haven't used this thing for like 10 years, when my modus operandi was googling my question and installing whatever stackoverflow suggested.

Can someone explain why one would ever use rsync (pre vibecode version) instead of cp and dd?

Can't we just 'apt remove rsync' and save ourselves the time even spent on evaluating this dependency?

Thanks

Joel_Mckay16d ago

If you deal with large numbers of files, the ability to dynamically skip compressing media and zipped files for transfer can be extremely handy.

While stuff like sshfs is great for a few small files (and win11), it will be an order of magnitude slower than an rsync task.

Most smart folks automate backup/recovery scripts, and only sometimes edit them with a new OS install. =3

Arcuru16d ago

https://wikipedia.org/wiki/Rsync

int_19h16d ago

Because cp will copy everything, while rsync will copy only the things that actually need copying, and also delete the things that should be gone?
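A rough sketch of that difference, not rsync's actual algorithm: by default rsync's quick check skips files whose size and modification time already match, and removing extra files on the destination is opt-in via `--delete`. The `plan_sync` helper below is hypothetical and only handles a single flat directory.

```python
import os
import shutil
import tempfile


def plan_sync(src, dst):
    """Return (to_copy, to_delete) file names for a flat directory.

    A file is copied if it is missing in dst or its size or modification
    time differs; files that exist only in dst are marked for deletion.
    """
    src_files = set(os.listdir(src))
    dst_files = set(os.listdir(dst))

    to_copy = []
    for name in sorted(src_files):
        s = os.stat(os.path.join(src, name))
        if name in dst_files:
            d = os.stat(os.path.join(dst, name))
            if s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime):
                continue  # looks unchanged, skip it
        to_copy.append(name)
    return to_copy, sorted(dst_files - src_files)


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name, text in [("same.txt", "a"), ("changed.txt", "new!"), ("added.txt", "b")]:
        with open(os.path.join(src, name), "w") as f:
            f.write(text)
    shutil.copy2(os.path.join(src, "same.txt"), dst)  # copy2 keeps the mtime
    for name, text in [("changed.txt", "old"), ("stale.txt", "x")]:
        with open(os.path.join(dst, name), "w") as f:
            f.write(text)

    to_copy, to_delete = plan_sync(src, dst)
    print("copy:", to_copy)
    print("delete:", to_delete)
```

A plain `cp -r` would rewrite all three source files and leave `stale.txt` in place.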

pushcx16d ago· 3 in thread

    What followed was extraordinary: 329 comments and counting, ranging from thoughtful concern to outright harassment.
    The thread did not stop at words. One user posted My Little Pony drawings of themselves strangling the "project janitor that pushed vibecoded commits":
    It spread to Hacker News and Lobsters, generating hundreds more comments.

This is false, it did not appear on Lobsters. Here is the function in the codebase that prohibits this kind of brigading: https://github.com/lobsters/lobsters/blob/main/app/models/st...

Please correct your article.

tptacek16d ago

It is neat that Lobsters has this feature (and HN should too), and I'm glad you took a beat to explain it. I think you didn't need the last sentence, though.

logicprogOP16d ago

I have done so! that was a misremembering on my part. first mention of Lobsters is now here:

> On Lobste.rs, in response to the Medium essay Tridge himself posted in response, finally some users like boramalper begin to actually ask for evidence one way or another:

pushcx12d ago

Thanks, I appreciate you sorting out the timeline on such a heated issue.

aarjaneiro16d ago· 2 in thread

> "Claude clearly made things worse" &emdash; the main claim

Even this report is full of claude-introduced bugs

runarberg16d ago

For those that don’t know the html entity is &mdash; not &emdash; although I think in modern codebases people usually just type — directly.

This mistake does exist in the wild though: https://github.com/search?q=%26emdash%3B&type=code

logicprogOP16d ago

That one was on me. I always mess that up.

vlovich12316d ago· 2 in thread

saagarjha16d ago

Because Fil-C is not a serious project

int_19h16d ago

If you make claims like that, you need to expand on them or at least provide some references.

1 more reply

PunchyHamster16d ago· 2 in thread

The fact last few commits were attributed to claude doesn't mean previous ones didn't use it.

Also if you write a paper where you get statistical conclusions out of whole 2 datapoints you'd be laughed out of the room

logicprogOP16d ago

> Also if you write a paper where you get statistical conclusions out of whole 2 datapoints you'd be laughed out of the room

> The fact last few commits were attributed to claude doesn't mean previous ones didn't use it.

vintagedave16d ago

Why not? Claude marks its commit messages. That there were none, and then there were, seems a signal.

The blog has many more datapoints than two. It compares many releases. You’re looking at 2-vs, not 2.

Polarity17d ago· 2 in thread

so the answer is: no. actually less bugs. thanks

gjvc16d ago

"fewer"

davrosthedalek16d ago

First rsync and now less? What comes next, cat?

1 more reply

amluto16d ago· 2 in thread

Reposting my previous comment because the post I commented on earlier was flagged to death:

[0] Which GitHub doesn't even render by default because their diff viewer is so bad.

[There were follow-ups. See https://news.ycombinator.com/item?id=48352182]

j16sdiz16d ago

Tips: In HN, You need blank line (i.e. Hit ENTER twice) to start a new paragraph. -- Everything jams into an incomprehensible wall of text if you use one new line.

amluto16d ago

Ugh. The source comment, which this was literally a copy and paste of, had newlines. I wish HN could roundtrip from itself via the clipboard correctly.

tiahura16d ago· 2 in thread

Write with your own voice and then polish with ai.

dgellow16d ago

Or just do not polish? Write with your own voice, accept it as it is, humans communicating to humans

cube0016d ago

Please don't, even "polish" can make it sound completely AI written.

yobid2016d ago· 2 in thread

needs a tldr; im not reading all that. maybe claude can summarize it for me.

logicprogOP16d ago

noAnswer16d ago

Ask your Clanker what a joke is.

iainctduncan16d ago· 2 in thread

int_19h16d ago

To be fair, the tone of the article is practically chill compared to the comments it is written in response to.

logicprogOP16d ago

nairboon17d ago· 2 in thread

Is this an analysis made by/with Claude?

quentindanjou17d ago

It very obviously is. "The Outlier Nobody Noticed" -_-"

overgard16d ago

FWIW, I asked ChatGPT to review the article just for my amusement. Its conclusion was:

the_real_cher17d ago· 2 in thread

Is there a non vibe coded fork of rsync?

throwaway735617d ago

Yes: https://news.ycombinator.com/item?id=48390931

So far it reintroduced several security issues and replaced the README.md.

MYEUHD16d ago

There is openrsync, which is the OpenBSD re-implementation of rsync.

It's not a fork, but it's 8 years old, and is already shipped by default in OpenBSD and macOS.

gadrev17d ago· 2 in thread

Ok.

  $ apt-cache policy rsync | grep Installed
    Installed: 3.4.1+ds1-7ubuntu0.2
  $ sudo apt-mark hold rsync     
    rsync set on hold.

imurray17d ago

That version has security fixes from the same day as the latest rsync release: https://ubuntu.com/security/notices/USN-8283-1

logicprogOP17d ago

MantisShrimp9016d ago· 1 in thread

I think this writer kinda took the bait, which is fine. Someone had to do this so we wouldn't debate endlessly.

thin_carapace16d ago

tptacek16d ago· 1 in thread

This is a neat post and I'm glad it got written and this is a little bit off-topic but:

Hey, 'logicprog, your writing is fine!

logicprogOP16d ago

Thank you!

AEVL16d ago· 1 in thread

How does the analysis look if we only count the >=90 severity cases—that is, if we downgrade the severity of all <90 cases to 0?
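For anyone who wants to try that cutoff, here is a minimal sketch of the thresholding step. The release/severity pairs are made-up placeholders, not the article's data, and `thresholded_totals` is a hypothetical helper name; the article's actual scoring pipeline may differ.

```python
# Hypothetical (release, severity) pairs. These are NOT the article's data;
# swap in the real per-bug severities to rerun the comparison.
bugs = [
    ("3.4.0", 95),
    ("3.4.0", 40),
    ("3.4.1", 90),
    ("3.4.3", 72),
    ("3.4.3", 91),
]

THRESHOLD = 90  # only cases with severity >= 90 keep their score


def thresholded_totals(rows, threshold=THRESHOLD):
    """Sum severity per release, treating every score below threshold as 0."""
    totals = {}
    for release, severity in rows:
        kept = severity if severity >= threshold else 0
        totals[release] = totals.get(release, 0) + kept
    return totals


totals = thresholded_totals(bugs)
print(totals)
```

Releases whose bugs all fall below the cutoff still appear in the result with a total of 0, so they stay visible in any per-release comparison.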

logicprogOP16d ago

Feel free to run it and find out. I don't think it would produce very much useful information though

esailija16d ago· 1 in thread

logicprogOP16d ago

The data literally shows there aren't, there have been worse releases before. In what way did I manipulate the data?

steno13216d ago· 1 in thread

This is just narrow thinking. Say Claude did increase the bugs in rsync by a negligible factor.

You have to broaden your perspective. It's not just about how rsync was affected.

boxed16d ago

Let me translate this comment:

> ok, so I was wrong and badly, but I will double down and say I was right anyway

jrflowers16d ago· 1 in thread

Tl;dr:

Yes, it did. Here is some math showing that you shouldn’t care about that.

logicprogOP16d ago

In what way did it create more bugs? It literally doesn't show up in the data. What are you talking about?

e4016d ago

Here is the process I used to do it, which was way more complex than I thought it would be:

https://gist.github.com/e40/caa67c1b8d439a528695f996d0519d8e

logicprogOP17d ago

igregoryca16d ago

zzo38computer7d ago

Their claims that deliberate bugs were introduced are unlikely to be accurate. Those claims are not worth making, and neither is the violence they involve.

gravypod16d ago

This would be even harder to measure.

WesolyKubeczek16d ago

Instead we have a shitstorm over a presumably legit issue, for which the only source is some Mastodon post.

One command that used to work in 3.4.1 and stopped working in 3.4.3. Just one! We could have already bisected the living shit out of this and gone home, but no.

rovr13817d ago

I'm just curious about testing.

Is this a configuration that's not common and thus not tested?

If people think they can do better, I want to see their forks and them keeping up with it.

https://github.com/RsyncProject/rsync/graphs/contributors?fr...

parliament3216d ago

Your verbosity and sentence structure are not a problem. I hope that publishing this gives you a bit more confidence in your writing, because it's legitimately good.

rswail16d ago

People need to grow up and appreciate what others in the community (especially people like tridge) have provided.

bwfan12315d ago

mikaeluman16d ago

Not going to critique this survey. Must have taken a lot of time and required a lot of patience. Great work!

I think it will be up to some group in academia to make a real full blown study across several repositories.

Bugs is not the only variable of interest here. I am guessing someone is already doing this as we discuss it here...

guilhas14d ago

Rsync is a highly trusted software, included in many distros. To move important, and high quantity

If several or critical lines of code get changed quickly and keep breaking things, with or without LLMs, there will be backlash

Rsync should rightly lose reputation if the project allows the release of breaking changes to follow the latest hype trend

throw716d ago

Trust is slowly gained and easily lost. The amount of apologia I hear from top-tier developers signals an inflection point downward.

moktonar15d ago

htk16d ago

Flagged. Article is as AI heavy as the commits that people are complaining about.

ladax7270715d ago

So a project is using a GPL licence, but instead of forking you harass the authors and you somehow think that you are the smart one and that you are doing anyone a favour?

ltbarcly316d ago

I'm noticing more and more AI writing everywhere, from youtube to this article: From the subtitle: "Nothing complicated, answers only one question: " clearly LLM generated.

Havoc16d ago

What’s the deal with anti-AI people being so rude?

foxes16d ago

I wonder if all the commits which involve adding tons more tests are the basis for a "rewrite in Rust" Anthropic marketing event

drankinatty13d ago

The issue that coding tools like Claude present is the sheer size and scope of the changes and commits they generate, which would take mere mortals months of careful coding to produce.

My take on the issue is less about the regressions and Claude screw-ups and more the lesson to all about the reliability of the coding tools and the diligence required to validate what they spit out.

Never forget, "to err is human, but to really foul things up requires a computer."

AI just applies that adage at industrial-scale.

manlymuppet16d ago

Unrelated, but this post has a level of rigor you rarely see nowadays. I think it deserves to be commended for that.

nasretdinov16d ago

nelox16d ago

The peak of cascading effects from errant dependencies has yet to come

block_dagger16d ago

Do people enjoy interrogative headlines? Find out at 11.

themafia16d ago

You can write for an audience or you can write for yourself. Which is fine either way but you shouldn't pass the blame for bad results on to your audience.

> and receiving almost no substantive input, discussion, or response on the actual content of the article

Well did you write it for that purpose?

> "Just wait, more bugs will surface" -- v3.4.3 has been out long enough

KronisLV16d ago

Pretty cool site!

logicprogOP16d ago

nazgul1716d ago

By the way, I did find this a bit hard to read but, as instructed by OP, I'll go fuck myself.

For what it's worth, I find AI written prose easy to read, and am annoyed by all the constant HN comments which just point out the author was AI, without anything else substantive to add.

overgard16d ago

The TLDR seems to be: needs more data.

WhereIsTheTruth16d ago

LLMs don't create bugs, people do

nilslindemann16d ago

Plot twist: This blog post was written using Claude too.

mwkaufma16d ago

Smokescreen of highly-contingent analysis and appeals to authority over a premotivated-conclusion.

1a527dd516d ago

I'm amazed that this is still being discussed.