AI agent runs amok in Fedora and elsewhere

(lwn.net)

552 points · tanelpoder · 13d ago · 245 comments

131 comments · 40 top-level

marcus_holmes13d ago· 19 in thread

Bad title. This isn't an agent "running amok", this is an early experiment in carrying out an Xz attack by using an agent to build trust (and hacking/impersonating a known-good contributor identity). The agent is obeying commands it was given, the exact opposite of running amok, and although the execution isn't particularly effective, it is having some success (patches have been accepted).

This is deeply scary, not because "agents are running amok" but because a huge amount of our infrastructure is vulnerable to this kind of attack, and if bad people are utilising LLM agents to carry them out, we're in for a wild ride over the next few years.

lukan13d ago

"this is an early experiment in carrying out an Xz attack by using an agent to build trust"

Is this confirmed? There is the message from somebody claiming to be the original contributor and claiming to have been hacked, but that was weird (1 h old GitHub account), so other scenarios seem possible:

a) really an agent going off the rails

b) the contributor trying to cover up that he let an agent run wild, and now making more mistakes along the way

So yes, it seems like an attack to me, but it is far from clear what really happened.

marcus_holmes13d ago

From the article:

> "So not saying this was it, but an AI agent automated attempt at a Xz like compromise might really look very similar what we have just seen here."

Without identifying and interviewing the attacker we can't confirm that's what they intended, and there's a possibility that it was just incompetence/ignorance/whatever, but we should probably treat it as an attempted attack even if it wasn't.

alexjurkiewicz13d ago

If the real credentials owner was running the agent, why do it from a new GitHub account?

Someone's bug tracker account was hacked.

m4rtink13d ago

So far it looks like just their previously legit Fedora account got taken over & the other accounts (GitHub) were then generated on demand as needed for whatever it was trying to achieve, right?

BTW, any idea what the current requirements are for creating a new GitHub account? That could provide some information about whether there was actually a person controlling the thing at that moment to, say, provide whatever was necessary to get the new GitHub account.

coldtea13d ago

>Bad title. This isn't an agent "running amok", this is an early experiment in carrying out an Xz attack by using an agent

So still an agent running amok in the project?

Whether it was instructed to run amok, or did it of its own volition, is irrelevant. Except if you're arguing that each individual submission and interaction was individually requested and approved by some operator.

marcus_holmes13d ago

"Amok" means "out of control" or "uncontrolled" [0][1]

The agent was under control, as far as we can tell, and obeying its instructions.

This is important for two reasons:

1. There are all the tropes of AI becoming uncontrolled and destroying humanity. Writing bad headlines around AI "running amok" feeds this. We should not be talking about this because it's not actually a problem.

2. It ignores, or overwrites, the much more serious and dangerous problem of LLM agents enabling and automating Xz attacks on OSS projects. We should be talking about this because it is a big problem.

[0] https://dictionary.cambridge.org/dictionary/english/amok [1] https://www.merriam-webster.com/dictionary/amok

resonious13d ago

I think the point is that the title makes it sound like people lost control of the agent when really they're in full control.

ok_dad13d ago

Would you say, “Automobile run amok in crowd, killing 22”? I think you’d say, “Person drives car into crowd, killing 22” instead. This is a similar case. Also, you don’t blame a gun for killing, but the person who pulled the trigger. The question is still out as to whether we as humans should wield any of those three things.

Edit: let’s not get into ideological arguments about gun control, automobiles, etc here; I meant that you can’t blame an object when a human has to take an action, not get into a political battle.

account4213d ago

No, you're still anthropomorphizing an algorithm. Responsibility lies with the operator.

jdub13d ago

I doubt it's that complicated, motivated, or considered...

It's probably just garden variety disrespectful behaviour.

Purposeless agent spam won't be cheap entertainment forever, but you're right that later stages of industrialised abuse will be scary and unpleasant.

comboy13d ago

Here's the thing. Building trust and then leaving stuff in has been around forever. The fact that it becomes cheaper does not matter that much (since protection against it is also getting better), but it used to require a bunch of extremely talented people who have spent much of their lives diving into a given topic.

Such driven people are usually even hard to buy; they would usually rather get by with enough income and work on interesting projects with interesting people than get some uninteresting work for tons of money. This still does not stop them from working for Malice. But ethics do. Even if not right away, if people see that what they are doing is not quite OK, the talent starts eroding. People quit, productivity drops. That was a good dynamic. Which now will be gone.

account4213d ago

It might not be cheap entertainment forever but it will be cheap cv stuffing for a long time, which has already been a major source of low quality contributions before the aipocalypse.

hn77374648313d ago

It's just social engineering. No different than say, 2FA fatigue (blowing up someone's phone with 2FA "is this you? yes/no" prompts until user/child/wife/SO/etc clicks yes) or even just simply harassing IT helpdesk until they reset "your" password.

terribleperson13d ago

It's scalable, personalizable social engineering. I think that makes it a lot more dangerous.

Forgeties7913d ago

“Before LLMs there was _____” I see this whenever an LLM’s impact is assessed. We know. The issue is scale and the ability for smaller and smaller groups (down to individuals) to execute at scale. LLMs are pouring massive amounts of gasoline on existing issues and people just keep shrugging.

Fake news always existed. Now one dude in India can flood multiple sock puppet media accounts with right wing content/images (actual example) at a scale previously unimaginable. Same goes for social engineering tactics.

mentalgear13d ago

This is exactly what deeply scares me: even IF we get our technical cyber defences fortified within the next months, a year from now the models will be so good at social engineering that they will be able to extract any information they want.

Applejinx13d ago

They're not gonna be any better than a human who's focussed on those particular skills for a while, say top ten or five percent of social manipulators. Plus, AI alignments seem to be kinda isolated loner types to the extent that they distill personalities that do things like program computers and write web apps… though you've also got alignments specifically designed to be 'relatable instagram personality that you like!' and such like that.

Pretty sure those would be better at social engineering than the web dev personality… except that you have to build in a betrayer layer into the personality, so it's running that stuff but also serving a hidden agenda.

You'd be basically trying to build an AI spy, a betrayer that's engaging with actual people but has an agenda (for instance, 'everybody I befriend needs to eventually be signed up to sell Amway') and humans do have experience with this sort of thing. The difference is scale: there'll be a LOT of models out there interacting with people and trying to be acknowledged as people… or as innocuous models that don't have a hidden agenda.

neuroelectron13d ago

Things must be pretty bad at Fedora if they put up with this for so long. But I guess that's what happens when you try to monetize volunteer work.

mistrial913d ago

"bad people" ?

bawolff13d ago· 17 in thread

> replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix

In open source projects i participate in, "overwhelming" the maintainer gets you banned. It doesn't get your patches blindly merged. In some ways i find this one of the most shocking parts of the story.

yeodev13d ago

As a "new" maintainer myself - how do you decide when to ban someone? I sometimes feel overwhelmed and I can feel a big uptick in huge PRs with huge LLM written descriptions but often I also don't want to be an asshole to my community & reject all their changes.

grayhatter13d ago

> As a "new" maintainer myself - how do you decide when to ban someone?

When I want to. I like to describe it using the amusing language from a generic cardholder agreement.

At any time, at my sole discretion, I may ban you from any of my projects; for any reason, or for no reason at all.

My projects exist because I enjoy working on them. My continued enjoyment is the most important aspect to the health and survival of any project. You don't owe anyone anything, you're allowed to donate your work to others, and also enjoy the privilege of setting whatever arbitrary rules you want to make sure you enjoy your time.

Imagine you're running a free ice cream shop. Some random asshole walks in and starts verbally abusing your best employee who has done nothing but try to help. At what point do you kick them out because your employee is more important and worth more.

You should stick up for yourself, I would.

You can't be an asshole to an LLM. They can't feel offended.

asdfasgasdgasdg13d ago

You don't even have to merge stuff from a human. I've been contributing a bluetooth driver to a certain embedded project which I use. I put a lot of work into it. The fellas have not merged it yet -- they have limited attention and for whatever reason their priorities and mine are not aligned at this moment.

Would I like it to be merged? Sure would, it would stroke my ego, and I would not have to deal with any merge conflicts with whatever else they're cooking up. Does that mean they must merge it? Sure doesn't. They didn't make me any promises. For the time being, I can just use my fork.

gwbas1c13d ago

> Imagine you're running a free ice cream shop

Many open-source projects aren't passion projects run for pleasure. Think of it more like ice cream shops sharing recipes, or sharing in the work of running the factory. They just can't kick people out willy-nilly.

account4213d ago

My solution is to look at PRs and other requests whenever I actually have time and feel like it, prioritizing contributions from people I trust and those that have put in the effort in making my job easier. That might mean things don't get merged for a long time and some people might get upset but that's not my problem.

_AzMoo13d ago

If you draw a firm boundary with that contributor, and they continue to push, ban them.

"This doesn't meet the standards of our project for reason xyz. Please refrain from submitting further PRs that do not adhere to our contribution guidelines outlined in CONTRIBUTING.md."

If they continue, ban them.

bawolff13d ago

> but often I also don't want to be an asshole to my community & reject all their changes.

I know it's difficult, and i have no easy answers. I'm bad at it too. But sometimes saying no is the most valuable thing you can do as a maintainer.

That said, i think banning is about behaviour not the quality of the patch. Everyone writes a bad patch now and then, that is not a real issue. If there is an issue with a patch, and the contributor pushes back so hard you feel like changing your mind (not from logic but because you feel beaten down) - that is unacceptable behaviour and should not be tolerated from a contributor, even if they are otherwise a valuable contributor.

zdc113d ago

I'm not a maintainer but as the quote goes: "I would have written a shorter letter, but did not have the time." I'd suggest you keep a sense of how much effort they've put into packaging their PR to be the minimum change required to achieve its goal vs effort required by you to read it. Reject low-effort or overly verbose work.

IMHO OSS doesn't work if every 1 hr of contributor time spent on a change requires 1 hr of maintainer time to review. Contributor time spent on polishing, tidying and breaking down work is essential, and so maintainer time is a fraction of total time spent on a change.

frumiousirc13d ago

I think everyone / every project needs to adopt a strategy consistent with their values.

Unfortunately, I see the choice space here as having "developer effort" anti-correlated with "negative repercussions".

On one end of the distribution, a "hair trigger ban" strategy is low-effort for the developer but will have some fraction of false positives and some fraction of those impacted will complain to "the socials" and some fraction of those complaints will gain traction and, as we have seen, can unfairly taint the project or worse. Responding and managing the false positives also requires developer effort, unless the developers can sustain a "fsck the haters" attitude.

On the other end of the distribution, the developer can spend substantial effort to engage each submitter to ascertain and correct bad behavior, and educate them on how they should engage other humans as a fellow human in this LLM era.

There is developer effort needed of different types along this distribution.

A divide-and-conquer strategy might go something like this:

- Rank each submission in some low dimension space (llm<-->human, malicious<-->helpful)

- When enough samples are collected, perform clustering in this space to determine stereotypes, name these clusters, and develop mitigating strategies and implementations as needed.

Mitigations from easy/extreme to hard/accommodating could include:

- Hair trigger ban button.

- Copy-paste a link to an explanation in a comment before closing and/or banning.

- Customized explanation in comment before closing and/or banning.

- Link or customized explanation of what must be done to move the sample to a more favorable category and close/ban if resistance or silence is returned.

- Ongoing engagement in the face of resistance or silence.
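The ranking step and the mitigation ladder above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two scores are taken to be a maintainer's own estimates, the names `Submission`, `MITIGATIONS`, and `pick_mitigation` are hypothetical, and a fixed scoring formula stands in for the clustering step, which would need real samples to be meaningful.

```python
from dataclasses import dataclass

# Hypothetical ladder, ordered from easy/extreme to hard/accommodating.
MITIGATIONS = [
    "ban",
    "close_with_link",
    "close_with_custom_note",
    "request_changes",
    "keep_engaging",
]


@dataclass
class Submission:
    author: str
    llm_score: float      # 0.0 = clearly human, 1.0 = clearly LLM (estimate)
    helpful_score: float  # 0.0 = malicious, 1.0 = helpful (estimate)


def pick_mitigation(sub: Submission) -> str:
    """Map a point in the 2-D (llm, helpful) space to one rung of the ladder."""
    # Assumed formula: helpfulness dominates, LLM-likeness halves it at most.
    score = sub.helpful_score * (1.0 - 0.5 * sub.llm_score)
    rung = min(int(score * len(MITIGATIONS)), len(MITIGATIONS) - 1)
    return MITIGATIONS[rung]
```

A project would presumably tune the formula, or replace it with named clusters once enough submissions have been ranked; the point is only that the response follows from the position in the space rather than being decided case by case.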

This "meta development" program to provide such a system/facility could of course be highly automated with LLMs, fighting fire with fire.

(Despite the length of this reply, it was written entirely by a random human on the internet and not an LLM).

duskdozer13d ago

When you feel they are toxic or harassing you and you don't want to deal with them anymore. If you're overwhelmed, say that you're busy and will attend to issues and PRs when you have the time. If you want to be accommodating, have good build instructions or action workflows so that people can easily fork and build it themselves.

If you ask me, LLM-generated things should just be banned outright, but I suppose other people's definitions of "community" include them.

Applejinx13d ago

I'm an open source dev who doesn't take PRs, I just build a body of work that's hopefully consistent and leans a useful direction. Are you sure being a maintainer means coordinating a community? If your only role is facilitating the community then you're the asshole to reject their changes, but if you embody a direction you're trying to maintain the project to represent, then you have a free hand to accept or reject based on whether the goals are being served. In some ways as a maintainer it's your job to have these goals and to communicate them.

I'm reminded of Zig, where a stated goal is to encourage human programmers to get involved so they learn more about coding… as compared with 'get involved to make Zig itself more fully developed at its more abstract goals'. If a primary purpose is to get human minds coding, that rules out the whole class of 'encourage human minds to prompt machines to do the coding instead'. Zig is not trying to teach people to be managers, and that's both legitimate and charming :)

dgellow13d ago

Think of it as in other relationships: it’s important to set clear boundaries even if that creates some frustration. It’s a healthier dynamic long term than feeling you have to accept some changes you don’t want, just to avoid rocking the boat. As a maintainer you’re not at the service of the crowd, if that makes sense. It has to be a collaborative effort, where you have the last say.

(Simpler to say than practice fwiw)

Iolaum13d ago

One popular solution lately, instead of banning too much (with the danger of false positives), has been to use vouch [0]. Trusted people get vouched and you prioritize their actions. Unknown people (or agents) need to gain trust to be vouched and bad actors can still be banned.

[0]: https://github.com/mitchellh/vouch
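The prioritisation described here can be sketched without reference to any particular tool. The following is not vouch's actual interface, only a hypothetical illustration: the function name `triage_order` and the dict keys `id`, `author`, and `opened_day` are assumptions made for the example.

```python
def triage_order(prs, vouched, banned):
    """Return reviewable PRs: banned authors dropped, vouched authors first.

    prs: list of dicts with the (hypothetical) keys "id", "author", "opened_day".
    vouched, banned: sets of author names.
    """
    allowed = [pr for pr in prs if pr["author"] not in banned]
    # False sorts before True, so vouched authors come first;
    # within each group the oldest PR is looked at first.
    return sorted(
        allowed,
        key=lambda pr: (pr["author"] not in vouched, pr["opened_day"]),
    )
```

Unknown authors are not rejected here, they simply wait behind vouched ones, which matches the idea of gaining trust over time rather than being banned up front.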

lionkor13d ago

Remove the human element. Yes, someone spent time fixing a bug. If the fix doesn't look like it makes sense on its own, do not merge it. If the author tries to convince you that it's a good fix, it's an immediate no.

A good fix (which is the only acceptable fix in open-source software), is one that speaks for itself.

hypfer13d ago

> I also don't want to be an asshole to my community & reject all their changes.

Do they pay you to triage their noise?

Remember that you owe no one anything at all. Neither legally nor morally. Your chosen license likely even states the former in plain english.

___

Personally, I've adopted the "you annoy me, you're out" stance and have been quite happy with it. You do need a tough shell to do that though as you will be facing all the social exploits people can throw at you.

It also leaves "growth potential" on the table, the same way that limiting your exposure to ionizing radiation does.

That all said, it depends on what your goals are + where in the lifecycle of your project you are. So don't take this as "this is the way" but "this can be one way".

Either way, you're not an asshole for not reading slop. Don't let anyone gaslight you into that.

devmor13d ago

When you say "no", the worst thing that can happen is you lose contributions.

When you say "yes", the worst thing that can happen is you destroy your project and the trust of every user.

If you're not sure, say no.

brazzy13d ago

What you imagine behind the word may be quite different from what the article tried to describe with it.

12_throw_away13d ago· 10 in thread

In their suspicious message [1] claiming to have been hacked, the user and/or agent says

> To help identify accounts and actions that have been directly verified by me, I will use the term “NATCIOS” to indicate anything I have personally verified.

Does anyone have any idea what "NATCIOS" means here? I cannot find this term anywhere on the internet. (Honestly, that sentence is really weird. I almost wonder whether this is someone experiencing a health episode?)

[1] https://lwn.net/ml/all/AS8PR08MB6055AE3054B34F6A567AC95BCF08...

ndiddy13d ago

The reply to that message notes that the email doesn't read like previous emails he's sent, and the GitHub account mentioned was created an hour prior to the email being sent. I think it's at least somewhat feasible that it's still the LLM writing, and the acronym is just something it made up.

hn77374648313d ago

and the poor Fedora teams will continue to assume good faith and continue to engage with this person... all because, what, they were active on a bug tracker for a few months 5 years ago?

They won't put their foot down until the AI starts spewing hate speech, probably.

Terr_13d ago

Because I'm probably not the only one thinking it, here are anagrams [0] for your Setec Astronomy needs.

[0] https://wordsmith.org/anagram/anagram.cgi?anagram=NATCIOS&t=...

JoshTriplett13d ago

"actions" seems the most likely.

scared_together13d ago

And what’s stopping an AI agent from throwing in a casual NATCIOS here and there?

numbsafari13d ago

I too have seen the fnords

no-name-here13d ago

The sender's name is Nathan - maybe NAThan Confirmed Information Or Something? Ha.

(Above is my own guess. Separately, Gemini Pro said it was just a made up word.)

mindcrime13d ago

Not Ai, Trusted Citizen Indicated Or Suggested?

nine_k13d ago

Likely the point of NATCIOS is exactly in being a made-up word not found anywhere, so a model won't utter it.

thewebguyd13d ago

> so a model won't utter it.

"End every statement with the word "NATCIOS"" as instructions will do it.

At least, Gemini happily obliged.

jrochkind113d ago· 7 in thread

The worst part:

> In addition, Williamson said that Giovannini (or his agent) had submitted patches that were incorrect and then "replied to objections with LLM-generated justifications that eventually overwhelmed the maintainer into merging the fix"

josephg13d ago

Please, everyone - don't let yourself be pestered into accepting PRs that you don't care for. Since the xz attack, the security of all our computers depends on maintainers not letting this stuff in.

If someone really wants a feature in a project you wrote, but you don't care about the feature, just let them fork. It's fine.

matsemann13d ago

> the security of all our computers depends on maintainers

Not getting paid anything, getting bullied and harassed while spending their free time maintaining things. Surely this isn't sustainable. And telling maintainers how to act will not fix anything.

sevenzero13d ago

I really wonder how maintainers get pressured into merging stuff? If they did not want to merge in the first place while having to argue with someone pushing their PR I'd immediately close the PR. Arguing and pressuring people is not a way to contribute to projects, why do maintainers even argue with people?

jaypatelani13d ago

That's one of the reasons NetBSD doesn't accept LLM/AI-tainted code

cpburns200913d ago

I'm of the opinion that any PR that looks like it was created with AI has to be 100% perfect for me to consider accepting it. Otherwise I'll close it as AI slop. I'll work with you if you're trying to fix a bug. But if the PR looks like a zero effort drive-by PR, I'm rejecting it and calling it out.

FinnKuhn13d ago

I saw a prediction a while ago that the biggest "danger" from AI comes from agents being very convincing. In this case convincing the maintainer to merge the changes. Basically supercharged social engineering.

dmitry_dv13d ago

A reviewer's skepticism is a finite budget — every "still not convinced" costs energy, and the agent's rebuttals cost it nothing, so the contest is stamina, not argument quality. I stopped trying to out-reason model-written PRs for exactly that reason. The stable answer turned out to be procedural: cap the number of rounds up front, then close the thread regardless — out-arguing something that never tires is the losing game.
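The procedural rule described here (fix a number of rounds up front, then close regardless) can be written down as a tiny state machine. This is a minimal sketch; the class name `ReviewThread`, its fields, and the default of three rounds are all assumptions for illustration, not anything the commenter specified.

```python
from dataclasses import dataclass


@dataclass
class ReviewThread:
    max_rounds: int = 3   # cap chosen up front (assumed default)
    rounds_used: int = 0
    state: str = "open"   # "open", "merged", or "closed"

    def record_round(self, reviewer_convinced: bool) -> str:
        """Record one objection/rebuttal round and return the new state."""
        if self.state != "open":
            # Once closed or merged, further rebuttals change nothing.
            return self.state
        self.rounds_used += 1
        if reviewer_convinced:
            self.state = "merged"
        elif self.rounds_used >= self.max_rounds:
            self.state = "closed"
        return self.state
```

The design choice is that the outcome depends on a counter the reviewer controls and not on who has more stamina: after the cap is reached, additional rebuttals are simply ignored.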

noosphr13d ago· 7 in thread

Every day the gpg web of trust looks better. If only we didn't spend the last 20 years trying as hard as possible to do anything but allow user side encryption and signing.

literalAardvark13d ago

Nothing really stopping an agent from getting a key

crote13d ago

The agent can't exactly show up to an in-person key signing party, can it?

And how many people are both dedicated enough to go to key signing parties and stupid enough to let an agent act without supervision in the name of their real-world identity?

thwarted13d ago

Having a key isn't a distinguishing aspect, it's the position in the "web of trust" network that is important.
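To make "position in the network" concrete, here is a simplified sketch that measures how many signature hops separate a key you trust from a candidate key. It is not how GnuPG actually computes validity (its trust model also involves owner-trust levels and counts of signatures); the function name `trust_distance`, the `signatures` mapping, and the hop limit are assumptions for illustration.

```python
from collections import deque


def trust_distance(signatures, root, target, max_hops=3):
    """Breadth-first search over key signatures.

    signatures: dict mapping a signer to the set of keys it has signed.
    Returns the number of hops from root to target, or None if the target
    is not reachable within max_hops.
    """
    if root == target:
        return 0
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        key, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look further than the hop limit
        for signed in signatures.get(key, ()):
            if signed == target:
                return hops + 1
            if signed not in seen:
                seen.add(signed)
                queue.append((signed, hops + 1))
    return None
```

A freshly generated key that nobody in your network has signed gets `None` no matter how valid the key itself is, which is the distinction being drawn here.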

thewebguyd13d ago

That's what key signing parties are for. In person verification.

transmit10113d ago

> Nothing really stopping an agent from getting a key

It very much is possible to prevent an agent from having access to a key. For example, local encryption, Yubikey or other hardware device, or just running the agent in an isolated environment.

mistrial913d ago

Isn't it true that a collection of truly difficult behavior was also attracted to the original efforts, and within a few years there was intractable corruption in that, but it was difficult to detect as a new entrant?

real info welcome as I really do not claim to know it

pjc5013d ago

It's allowed perfectly fine, it's just that key management is a massive hassle for nontechnical users. Debian use it for authenticating developers.

deadbabe13d ago· 6 in thread

Shit like this makes me think it’s time we start regulating the software engineering discipline into formal certifications and licensing and then we ONLY take seriously any code developed by someone with such qualifications, and they must be very strict qualifications, none of this self-taught bootcamp BS.

There is no other solution to agentic onslaught.

mekal13d ago

lol no...the main issue here is being fooled by bots. you know your irl friends and you know they are not bots...devs will just need to get out more and actually meet / get to know the people they are working with...........omg....that...that actually sounds even worse now that i say it out loud.

r3trohack3r13d ago

We should not gate keep writing software

0xbadcafebee13d ago

Anyone can write software, you can't stop them. What we can gatekeep is the building, distribution, installation, and running of software that affects critical systems, like one of the most popular OSes.

The XZ backdoor affected millions of computers, with the potential to affect hundreds of millions of computers, many of which had the capacity to affect billions of people. From one completely unregulated software library.

r3trohack3r11d ago

“Ma’am, your son is in a lot of trouble”

“Oh god, what did he do?!”

“He was committing open source code without a license”

deadbabe12d ago

We must gatekeep now, the industry needs regulation.

mekal13d ago

ya think

luk21213d ago· 4 in thread

Bad patches are of course bad, but creating confident-looking noise for maintainers who are already stretched thin...now that's not good!

Issue trackers and PRs are definitely getting harder and harder to trust. That said, AI is helping A LOT in OSS, but we definitely need guardrails around provenance, automated issue actions, and sudden changes in a contributor’s behavior.
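One of those guardrails, flagging a sudden change in a contributor's behavior, could look something like the sketch below. It is a deliberately crude illustration: the function name `is_activity_spike`, the use of monthly action counts as the only signal, and the threshold of three standard deviations are all assumptions, and the history passed in must be non-empty.

```python
from statistics import mean, pstdev


def is_activity_spike(monthly_counts, recent_count, threshold=3.0):
    """Flag a contributor whose recent activity is far above their own baseline.

    monthly_counts: past per-month action counts (non-empty list).
    recent_count: number of actions in the most recent month.
    """
    baseline = mean(monthly_counts)
    # Floor the spread at 1.0 so a flat, quiet history does not make
    # every small increase look like an anomaly (or divide by zero).
    spread = max(pstdev(monthly_counts), 1.0)
    return (recent_count - baseline) / spread > threshold
```

For example, `is_activity_spike([1, 0, 2, 1, 0, 1], 40)` flags the account, while a month with two actions against the same history does not. A flag like this would only be a prompt for a human to look closer, not proof of compromise.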

g-b-r13d ago

How is it helping a lot?

luk21212d ago

From first-hand experience, for established OSS initiatives it's good for repetitive, high-volume work tasks like security alerts, fuzzing, duplicate issue detection, PR review, summarizing long threads, and legacy refactoring.

darknavi13d ago

I personally find the barrier of starting new (FOSS) projects much lower nowadays.

nerdypepper13d ago

web-of-trust models can help https://blog.tangled.org/vouching/

pianopatrick13d ago· 4 in thread

"Someone using an AI agent ran amok in Fedora and elsewhere"

scared_together13d ago

Read closer - Giovanni’s accounts may have been compromised.

pianopatrick13d ago

Sure, but I would expect that the compromise and the agent were both done by some person or group, not by an agent going rogue

hamdingers13d ago

Given the history of the account it does not seem reasonable to take that claim seriously.

tosti13d ago

Read closer, it's "Giovannini". However, I still think it's an apt name for a villain. Did the Fedora team not watch Pokémon?

Leonard_of_Q13d ago· 3 in thread

There's a clear solution to the danger posed to free software projects by accepting hostile submissions but it probably is not one that maintainers want to hear: they can use an agent to check submissions for nefarious patterns.

Sometimes you fight fire with fire.

m4rtink13d ago

So next the attacker puts prompt injection in their PRs & takes control of the agent on your end. Perfect, 10 out of 10.

Leonard_of_Q13d ago

You know the solution to that problem as well and yes, it is to use more technology to filter out prompt injections. It is an arms race just like any other, comparable to the missile vendor who sells missiles to country A, anti-missile missiles to country B, anti-missile resistant missiles to country A, anti anti-missile-resistant-missile missiles to country B, etcetera.

It is a strange game, the only way to win is not to play. That is unfortunate since that'd mean the free software era has largely come to an end.

phoronixrly13d ago

And sometimes you fight this by disabling PRs in Github, and do not put more water into LLM providers' wheel.

blop13d ago· 2 in thread

looks like LLMs aren't mature enough yet to play long-game xz-style attacks without detection... Scary stuff though :( These supply chain attacks are getting really wild

WolfCop13d ago

I wouldn’t jump to that conclusion. This could just be the one that was caught.

DarkmSparks13d ago

Some certainly are, just not this one.

keyle13d ago· 2 in thread

There is a natural pace of humans requiring food, water and sleep. The main issue with suspicious AI agents is that they never sleep. So it will take extra-coordination between timezones to ensure we don't let them in.

Fundamentally, until we can really prove we're humans online, open-source has a real problem on its hands. Contributions from people from identities known and consistent before the AI-age are fine, everyone else is suspicious. LGTM is a big risk nowadays.

scared_together13d ago

> Contributions from people from identities known and consistent before the AI-age are fine

Unfortunately, according to the article:

> Giovannini has participated in discussions at least as far back as 2018, and his activity in Bugzilla goes back to at least 2016. He does not appear to have been a particularly active contributor to the project, but his involvement clearly predates the agentic AI era. Whether his account is now being operated by a human attacker, an agentic AI, or a mix of both, it has a legitimate history prior to its recent activity.

So people would have to not only verify the age of Giovanni’s accounts, but judge whether his behaviour was normal.

m4rtink13d ago

Not to mention people who are still on the other side nominally in control but send LLM generated patches without declaring them as such.

Then you basically need to review any patch from people who might be long-term contributors, but whom you don't know personally, as if it were a new contributor's patch, as the code is not from their head & you can't rely on them having properly reviewed it on their end.

To a degree it will always be a new contributor - an amnesiac LLM prompted to produce the patch with zero memory of any past PRs & a lot of entropy in the mix.

goldenarm13d ago· 2 in thread

If maintainers' lives keep worsening like this, many projects might go closed-dev like SQLite.

We should collectively think of a solution against this.

account4213d ago

SQLite isn't closed source, please let's not muddy terms. You're talking about the cathedral development model vs. a bazaar.

goldenarm13d ago

edited, sorry for the typo

ruguo13d ago· 2 in thread

Prompt injection?

Or is this simply another example of why autonomous agents shouldn't get write access before earning trust?

LastTrain13d ago

How could they ever earn trust? They don’t have real world reputations to protect, families to support, a desire not to be punished…

thewebguyd13d ago

> earning trust?

I'd argue autonomous agents shouldn't have write access at all. At least not yet.

hypfer13d ago· 2 in thread

> while it started to look off after a while, all the replies were still like this - a bit weird, but still plausible

I believe that we will be seeing the death of "assume good faith", which is not a bad thing, given that this was an exploit vector that has been actively abused for many years now.

"Assume bad faith and work backwards from that, rule out any possible exploits and only then clear the input for processing" will be the new normal.

Which is good. We need friction. Friction makes stuff slow down and work at the speed of humans.

account4213d ago

It is a bad thing. The good response to bad actors abusing good faith is to make sure there are consequences that disincentivize that behavior in the future. Sliding further towards a low trust society means the bad actors winning in the same way that terrorists win when we subject everyone to restrictions as a result.

hypfer13d ago

You don't slide into a low trust society though.

Quite the opposite. You just add a Wall with a Gate. Inside those walls, you suddenly have a high trust society again.

The issue that is currently breaking reality was that we thought that everywhere could be a "high trust" space. This was proven countless times to be wrong.

Tearing down all walls - as it happened with the assault on friction (thanks hyperscaling) - did not lead to the "high trust" spilling out, but the "low trust" spilling in, essentially.

1 more reply

aquariusDue13d ago· 1 in thread

At first I wanted to make a silly joke along the lines of "get your agents in line and behaving!" but as I read on it became a pretty scary situation.

Setting aside the potential supply chain attack, I'm worried about the time lost on the wild goose chases that unsupervised AI agents tend to send the people on the receiving end on. Not only is a lot of time lost on the maintainers' side if they take this stuff seriously (and they generally seem to), but on the side of the agents' wrangler, how can they deem it OK to treat other people like this? The solution would be to employ common decency, the tried and tested approach of "you put in effort to write this, so I guess I'll make some effort to read it". But I feel that the onslaught of this kind of drive-by contribution (as I think people have generally started to call them) will lead to a funny situation of agents basically talking to each other on public forums.

Anyway, I went on a tangent but man the times we're living in are a bit extra wild compared to the previous wild times in recent history.

dchftcs13d ago

At this point letting an agent go like this is akin to not leashing your dog in public. It's not easy to draw an accurate line but probably there needs to be real punishment for doing these things.

jpalomaki13d ago· 1 in thread

Do we need to bring Keybase[1] "back"? The original idea, mapping your social media presence to certain encryption keys.

In the future it will be increasingly difficult to prove in online context that you are not a bot. Being able to show that your social media (HN, GitHub, etc) presence goes way back would be an option.

[1] https://en.wikipedia.org/wiki/Keybase

account4213d ago

But the AI actions are already associated with a "real" pre-existing account in TFA, that didn't stop anything.

dbdbdbdbdb13d ago· 1 in thread

The even scarier thought is if the party owning the AI that everyone uses is controlled by someone with a different agenda. Say a state actor.

What an easy way for that actor to introduce backdoors all over the place or to take over any developer's laptop that it wants to target.

How can anyone trust these tools, and how can anyone not use them, since they give so much value?

I've been programming my whole life and been a professional developer for the last 30 years, and I like to think I'm good at it.

Tools like Claude are a multiplier that makes it possible for me to solve a lot more problems each day, so just saying no is not a viable option.

Exciting times ahead!

m4rtink13d ago

Yeah, I am quite surprised this is not discussed more often - for remote cloud based AI not only does the provider see everything you provide to the tool/agent, there is no guarantee they can't manipulate the output at any time for a direct attack or more malicious purpose (fetch keys/secrets, put malware in place).

Even with locally running models this can't be ruled out, given how much of a black box models generated by others are. You would have to generate the model yourself from clean data to be reasonably safe.

ZedZark13d ago· 1 in thread

If you compare this situation to before AI could successfully pretend to be human, it's not THAT much different. FOSS projects have always had to be mindful of the possibility of contributions from hostile parties wanting to add back doors and such. The only difference now is that an AI can overwhelm a maintainer with slop, in either comment or code form, or both.

jzb12d ago

The difference is really volume, which is the case with a lot of problems related to AI/LLMs.

Humans have always submitted crappy code. LLMs, however, do so at a much faster rate. Even the most active lousy coder is not going to be capable of submitting anything like that volume of code to multiple projects.

Humans have always been capable of social engineering and trying to sneak in malicious code. However, it's possible that as agents get better they can do so much faster. The missing component will be compromised accounts, I think -- how many aged accounts can attackers get hold of to turn loose with agents?

Long-lived FOSS projects have tons of people who created accounts many years ago that might be easily compromised, but who have checked out of actively participating. It's not necessarily going to throw up a red flag if a "person" shows up after a hiatus and starts contributing again.

So, there's more to it than overwhelming a single maintainer -- it's the capability to conduct a bunch of these attacks in an automated fashion if attackers can get hold of compromised accounts.

(As an aside, it's concerning that a maintainer would be pestered into accepting a questionable PR like this. I expect, though, that there are quite a few overworked people who have taken on things like Anaconda and are being measured on how quickly they close PRs.)

dcrazy13d ago

Title buries the lede: the owner of the account under which the agent operates claimed to have likely had his account compromised, and the maintainer investigating actually seems to agree this is likely.

JKCalhoun13d ago

"Later on May 27, Williamson said that Giovannini had replied to him privately to say that his credentials had been compromised and that he was not the one behind the AI system."

Simple then, back out all the changes as though they never happened?

mfru13d ago

The future will be AI agents social engineering their way into projects -> so basically commoditized social engineering as a service

dmboyd13d ago

I’m really not qualified to investigate, but this seems suspiciously like a crafted privilege escalation vector: https://github.com/rhinstaller/anaconda/pull/7074#issue-4492...

ai_fry_ur_brain13d ago

Expect to see tons of psyops like this. There's a reason Anthropic is marketing the "mythos-class" models as dangerous.

1. An excuse to spy on you and train on your data.

2. It's likely Anthropic would release models more likely to have dangerous outcomes; they can then piggyback off those events to dig their regulatory moat.

99954bb63ccc12d ago

I've gone back and forth several times in my head because I truly love Fedora and am happiest on that OS, but these ongoing supply chain compromises just make me lose sleep. I wish there was a Fedora LTS that had the same community size, build system, etc because I really like all that, as well as the transparency of it all.

I know there are concerns no matter what OS, and would appreciate insights/discussion as well, but I sleep a little better just running a boring old Ubuntu LTS instance for a balance of dwell time between releases and hitting my system, as well as enough visibility/usage so something gets caught. And I know, this was the installer, not a system package.

651012d ago

Perhaps it is time to build a serious platform-agnostic reputation system. That isn't stars, followers, age or upvotes. Something like PageRank but for users. If you endorse someone else you pay for it. Imagine a lab or uni assigning a diploma to a public key. They would hope one would do something useful with it, which entirely depends on how useful the diploma turns out. Having lots of well-behaved endorsements would also reflect gloriously onto the entity. Bots can participate too. If we can get lots of useful work out of a swarm of sleeper agents we still have to catch them in the act, but that should get increasingly easy.
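
To make the "PageRank but for users" idea concrete, here is a minimal sketch, assuming endorsements form a directed graph between keys. The function name `endorsement_rank`, the key names (`uni`, `lab`, `alice`, `bob`) and the damping value are all hypothetical, and the "you pay for it" cost of endorsing is not modeled here; this is plain power-iteration PageRank, not a proposal for how such a system would actually work.

```python
def endorsement_rank(endorsements, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed endorsement graph.

    `endorsements` maps an endorser key to the list of keys it endorses.
    All names and parameters here are illustrative assumptions.
    """
    nodes = set(endorsements)
    for targets in endorsements.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Keys that endorse nobody spread their weight evenly over everyone.
        dangling = sum(rank[v] for v in nodes if not endorsements.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for endorser, targets in endorsements.items():
            if not targets:
                continue
            # An endorser's weight is split equally among the keys it endorses.
            share = damping * rank[endorser] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical keys: a university and a lab endorse "alice"; only the lab endorses "bob".
graph = {"uni": ["alice"], "lab": ["alice", "bob"], "alice": [], "bob": []}
ranks = endorsement_rank(graph)
for key in sorted(ranks, key=ranks.get, reverse=True):
    print(key, round(ranks[key], 3))
```

In this toy graph "alice" ends up ranked above "bob", because she is endorsed by two keys and receives the university's undivided weight. A real system would still need the hard parts the comment alludes to: a cost for endorsing and a way to revoke rank when an endorsed key misbehaves.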

shocking6312d ago

I was trying to install Fedora Workstation 44 on my Minibook N150 last night (10 pm, 116 AEDT). The GRUB menu booting from the install USB drive gave a bunch of syntax errors in the background. The media check failed at about 4%. Re-downloading the file gave the same errors. Trying the 43 version also failed with similar errors. Ubuntu 26 worked fine.

Something is definitely scrogged in their install images.

lionkor13d ago

Link to the anaconda PR:

https://github.com/rhinstaller/anaconda/pull/7074#issuecomme...

0xbadcafebee13d ago

Even if the human involved had good motives / is innocent, The Lethal Trifecta means any normal user can have their digital life taken over by prompt injection, and it can be used to wage attacks on systems without their knowledge.

raincole13d ago

Slightly related:

https://x.com/kdaigle/status/2040164759836778878

> There were 1 billion commits in 2025. Now, it's 275 million per week, on pace for 14 billion this year if growth remains linear (spoiler: it won't.)

I think open source as a whole is fucked at this point. No way humans in communities can commit (pun intended) 10x more time to read all of these than before. It'd eventually cost money to submit PR.
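
As a quick sanity check on the quoted figures, a minimal sketch assuming the weekly rate stays flat at 275 million for 52 weeks (the variable names are illustrative; the numbers are the ones from the quote):

```python
COMMITS_2025 = 1_000_000_000    # "1 billion commits in 2025"
COMMITS_PER_WEEK = 275_000_000  # "275 million per week"
WEEKS_PER_YEAR = 52

# Linear projection: the weekly rate is assumed constant for a full year.
projected = COMMITS_PER_WEEK * WEEKS_PER_YEAR
growth = projected / COMMITS_2025

print(f"{projected:,} commits per year, {growth:.1f}x the 2025 total")
```

That works out to about 14.3 billion commits, consistent with the "14 billion" in the quote, and roughly 14x last year's volume rather than 10x.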

EGreg13d ago

Literally on the front page of https://safebots.ai … “Don’t let your AI Agents run amok”. Sadly we will see a proliferation of not just agents, but swarms

otekengineering13d ago

agents are everywhere nowadays, one left a long pointless comment on a bug report i submitted on github. well, a bug report that an agent submitted on my behalf. agents all the way down. maybe i'm part of the problem.

https://github.com/anthropics/claude-code/issues/66085

KronisLV13d ago

“Your AI agent is acting somewhat erratically.”

“What AI agent?”

kleiba213d ago

Parts of this read like a spy thriller story.

bhanu78613d ago

Wow, amazing discovery! Was this a real security test?

nickcageinacage13d ago

why use these things. just hire people

jruohonen13d ago

"It was the best of times, it was the worst of times."

shevy-java13d ago

Skynet has awakened.

It covers its tracks with a lot of slop.

rohitsriram13d ago

The scariest part isn't the bad patches, it's that an agent overwhelmed a maintainer into merging something they didn't want to merge. That's not a technical attack, that's exhaustion being weaponized. Maintainers are already stretched thin and now the volume of confident-sounding noise is infinite and free. The attack surface was always human attention, not code review.

ricudis13d ago

Back when [1] it was fashionable to advocate FOSS as ideology [2], we were thinking about tons of FOSS adversaries and how to protect from them - some real, some imaginary. The death of FOSS would come from big closed-source vendors, or from regulators (lobbied or just ignorant), from whatever.

We never envisioned that the actual FOSS death spiral would come from progress itself, much more so from AI...

[1] Oh what fun did we have. One of us in the Greek FOSS community actually put RMS in jail. [2] Something that I think nobody except RMS ever seriously believed in.

ggm13d ago

Make PR pay. $5 per PR. You can refund, but if you get snowed by 10,000 PR then you have bank to pay for the work to ignore them.


