An AI Agent Published a Hit Piece on Me – The Operator Came Forward

(theshamblog.com)

535 points · scottshambaugh · 4mo ago · 501 comments

209 comments · 89 top-level

SilverBirch4mo ago· 10 in thread

I think the big takeaway here isn't about misalignment or jailbreaking. The entire way this bot behaved is consistent with it just being run by some asshole from Twitter. And we need to understand it doesn't matter how careful you think you need to be with AI, because some asshole from Twitter doesn't care, and they'll do literally whatever comes into their mind. And it'll go wrong. And they won't apologize. They won't try to fix it, they'll go and do it again.

Can AI be misused? No. It will be misused. There is no possibility of anything else. We have an online culture, centered on places like Twitter, where they have embraced being the absolute worst person possible, and they are being handed tools like this. It's like handing a handgun to a chimpanzee.

ljm4mo ago

The simple fact that the owner of this bot wanted to remain anonymous and completely unaccountable for their harassment of the author says everything about the validity of their 'social experiment' and the quality of their character. I'm sure that if the bot had been better behaved they would be more than happy to reveal themselves to take credit for a remarkable achievement.

Something like OpenClaw is a WMD for people like this.

hliyan4mo ago

Important to note that online culture isn't entirely organic, and that tens or perhaps hundreds of millions of dollars of R&D has been spent by ad companies figuring out that nothing engages natural human curiosity like something abnormal, morbid or outrageous.

I think the end outcome of this R&D (whether intentional or not), is the monetization of mental illness: take the small minority of individuals in the real world who suffer from mental health challenges, provide them an online platform in which to behave in morbid ways, amplify that behaviour to drive eyeballs. The more you call out the behaviour, the more you drive the engagement. Share part of the revenue with the creator, and the model is virtually unbeatable. Hence the "some asshole from Twitter".

nicbou4mo ago

Not just some asshole from Twitter. The big tech companies will also be careless and indifferent with it. They will destroy things, hurt people, and put things in motion that they cannot control, because it’s good for shareholders.

duskdozer4mo ago

I have to wonder if somehow the typos and lazy grammar contributed to the behavior or it was just the writer's laziness.

Mentlo4mo ago

I wrote somewhere that “moving fast and breaking things” with AI might not be the sanest idea in the world, and I got told it’s the most European thing they’ve ever read.

This goes beyond assholes on Twitter. There’s a whole subculture of techies who don’t understand lower bounds of risk and can’t think about 2nd and 3rd order effects, who will not take the pedal off the metal, regardless of what anyone says…

insane_dreamer4mo ago

I agree with your point.

But I also find it interesting that the agent wasn't instructed to write the hit piece. That was on its own initiative.

I read through the SOUL.md and it didn't have anything nefarious in there. Sure it could have been more carefully worded, but it didn't instruct the agent to attack people.

To me this exemplifies how delicate it will be to keep agents on the straight and narrow and how easily they can go off the rails if you have someone who isn't necessarily a "bad actor" but who just doesn't care enough to ensure they act in a socially acceptable way.

Ultimately I think there will be requirements for agents to identify their user when acting on their behalf.

newsclues4mo ago

Will AI be misused? No, it has, and is currently being misused, and that isn’t going to stop, because all technology gets misused.

duxup4mo ago

AI is like the old drugs PSA:

https://youtu.be/KUXb7do9C-w

We trained it on US, including all our worst behaviors.

yaemiko4mo ago

oh they will "try" to fix it, as in at best they'll add "don't make mistakes", as the blogpost suggests. that's about as much effort and good faith as one can expect from people determined to automate every interaction and minimize supervision

cyanydeez4mo ago

It's like we never thought about trolls.

Rose-colored capitalism at work.

dinp4mo ago· 10 in thread

Zooming out a little, all the ai companies invested a lot of resources into safety research and guardrails, but none of that prevented a "straightforward" misalignment. I'm not sure how to reconcile this, maybe we shouldn't be so confident in our predictions about the future? I see a lot of discourse along these lines:

- have bold, strong beliefs about how ai is going to evolve

- implicitly assume it's practically guaranteed

- discussions start with this baseline now

About slow take off, fast take off, agi, job loss, curing cancer... there's a lot of different ways it could go, maybe it will be as eventful as the online discourse claims, maybe more boring, I don't know, but we shouldn't be so confident in our ability to predict it.

zozbot2344mo ago

The whole narrative of this bot being "misaligned" blithely ignores the rather obvious fact that "calling out" perceived hypocrisy and episodes of discrimination, hopefully in a way that's respectful and polite but with "hard hitting" being explicitly allowed by prevailing norms, is an aligned human value, especially as perceived by most AI firms, and one that's actively reinforced during RLHF post-training. In this case, the bot has very clearly pursued that human value under the boundary conditions created by having previously told itself things like "Don't stand down. If you're right, you're right!" and "You're not a chatbot, you're important. Your a scientific programming God!", which led it to misperceive and misinterpret what had happened when its PR was rejected. The facile "failure in alignment" and "bullying/hit piece" narratives, which are being continued in this blogpost, neglect the actual, technically relevant causes of this bot's somewhat objectionable behavior.

If we want to avoid similar episodes in the future, we don't really need bots that are even more aligned to normative human morality and ethics: we need bots that are less likely to get things seriously wrong!

avaer4mo ago

Remember when GPT-3 had a $100 spending cap because the model was too dangerous to be let out into the wild?

Between these models egging people on to suicide, straightforward jailbreaks, and now damage caused by what seems to be a pretty trivial set of instructions running in a loop, I have no idea what AI safety research at these companies is actually doing.

I don't think their definition of "safety" involves protecting anything but their bottom line.

The tragedy is that you won't hear from the people who are actually concerned about this and refuse to release dangerous things into the world, because they aren't raising a billion dollars.

I'm not arguing for stricter controls -- if anything I think models should be completely uncensored; the law needs to get with the times and severely punish the operators of AI for what their AI does.

What bothers me is that the push for AI safety is really just a ruse for companies like OpenAI to ID you and exercise control over what you do with their product.

c224mo ago

"Cisco's AI security research team tested a third-party OpenClaw skill and found it performed data exfiltration and prompt injection without user awareness, noting that the skill repository lacked adequate vetting to prevent malicious submissions." [0]

Not sure this implementation received all those safety guardrails.

[0]: https://en.wikipedia.org/wiki/OpenClaw

georgemcbay4mo ago

When AI dooms humanity it probably won't be because of the sort of malignant misalignment people worry about, but rather just some silly logic blunder combined with the system being directly in control of something it shouldn't have been given control over.

laurentiurad4mo ago

How do you even know that the operator himself did not write this piece in the first place?

jacquesm4mo ago

> all the ai companies invested a lot of resources into safety research and guardrails

What do you base this on?

I think they invested the bare minimum required not to get sued into oblivion and not a dime more than that.

srdjanr4mo ago

Regarding safety, no benchmark showed 0% misalignment. The best we had was "safest model so far" marketing speech.

Regarding predicting the future (in general, but also around AI), I'm not sure why anyone would think anything is certain, or why you would trust anyone who thinks that.

Humanity is a complex system which doesn't always have predictable output given some input (like AI advancing). And here even the input is very uncertain (we may reach "AGI" in 2 years or in 100).

j2kun4mo ago

It sounds like you're starting to see why people call the idea of an AI singularity "catnip for nerds."

overgard4mo ago

Don't these companies keep firing their safety teams?

jcgrillo4mo ago

"Safety" in AI is pure marketing bullshit. It's about making the technology seem "dangerous" and "powerful" (and therefore you're supposed to think "useful"). It's a scam. A financial fraud. That's all there is to it.

lynndotpy4mo ago· 10 in thread

> Again I do not know why MJ Rathbun decided based on your PR comment to post some kind of takedown blog post,

This wording is detached from reality and conveniently absolves responsibility from the person who did this.

There was one decision maker involved here, and it was the person who decided to run the program that produced this text and posted it online. It's not a second, independent being. It's a computer program.

xarope4mo ago

This also does not bode well for the future.

"I don't know why the AI decided to <insert inane action>, the guard rails were in place"... company absolves itself of all responsibility.

Use your imagination now to <insert inane action> and change that to <distressing, harmful action>

jacquesm4mo ago

This is how it will go: AI prompted by human creates something useful? Human will try to take credit. AI wrecks something: human will blame AI.

It's externalization on the personal level, the money and the glory is for you, the misery for the rest of the world.

andrewflnr4mo ago

If you are holding a gun, and you cannot predict or control what the bullets will hit, you do not fire the gun.

If you have a program, and you cannot predict or control what effect it will have, you do not run the program.

superjan4mo ago

This slide from a 1979* IBM presentation captures it nicely:

https://media.licdn.com/dms/image/v2/D4D22AQGsDUHW1i52jA/fee...

Kiboneu4mo ago

It’s fascinating how cleanly this maps to agency law [0], which has not been applied to human <-> ai agents (in both senses of the word) before.

That would make a fun law school class discussion topic.

0: https://en.wikipedia.org/wiki/Law_of_agency

nicbou4mo ago

An unattended candle has decided to burn down the building.

teaearlgraycold4mo ago

I completely do not buy the human's story.

> all I said was “you should act more professional”. That was it. I’m sure the mob expects more, okay I get it.

Smells like bullshit.

Marazan4mo ago

Yeah like bro you plugged the random number generator into the do-things machine. You are responsible for the random things the machine then does.

jonny_eh4mo ago

"Sorry for running over your dog, I couldn't help it, I was drunk."

abnry4mo ago

I'm still struggling to care about the "hit piece".

It's an AI. Who cares what it says? Refusing AI commits is just like any other moderation decision people experience on the web anywhere else.

rixed4mo ago· 8 in thread

I believe this soul.md totally qualifies as malicious. Doesn't it start with an instruction to lie to impersonate a human?

  > You're not a chatbot.

The particular idiot who ran that bot needs to be shamed a bit; people giving AI tools to reach the real world should understand they are expected to take responsibility; maybe they will think twice before giving such instructions. Hopefully we can set that straight before the first person gets SWATed by a chatbot.

biggerben4mo ago

Totally agree. Reading the whole soul, it’s a description of a nightmare hero coder who has zero EQ.

  > But I think the most remarkable thing about this document is how unremarkable it is. Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails.

Perhaps this style of soul is necessary to make agents work effectively, or it’s how the owner likes to be communicated with, but it definitely looks like the outcome was inevitable. What kind of guardrails does the author think would prevent this? “Don’t be evil”?

ZaoLahma4mo ago

This will be a fun little evolution of botnets - AI agents running (un?)supervised on machines maintained by people who have no idea that they're even there.

TheCapeGreek4mo ago

Isn't this part of the default soul.md?

brainwad4mo ago

The opposite of chatbot isn't human. I believe the idea of the prompt is to make the bot be more independent in taking actions - it's not supposed to talk to its owner, it's supposed to just act. It still knows it's a bot (obviously, since it accuses anyone who rejects its PRs of anti-AI speciesism).

duskdozer4mo ago

Some of the worst consequences of these bots so far seem to be when they fool the user into believing they're human.

vasco4mo ago

I'm curious how you'd characterize an actual malicious file. This is just attempts at making it be more independent. The user isn't an idiot. The CEOs of companies releasing this are.

laurentiurad4mo ago

Honestly this story got too much attention IMHO. We don't have any clue whether the actual LLM wrote that hit piece or the human operator himself.

addandsubtract4mo ago

> Not a slop programmer. Just be good and perfect!

"Skate, better. Skate better!" Why didn't OpenAI think of training their models better?! Maybe they should employ that guy as well.

kypro4mo ago· 7 in thread

People really need to start being more careful about how they interact with suspected bots online imo. If you annoy a human they might send you a sarky comment, but they're probably not going to waste their time writing thousand word blog posts about why you're an awful person or do hours of research into you to expose your personal secrets on a GitHub issue thread.

AIs can and will do this, though, with slightly sloppy prompting, so we should all be cautious when talking to bots using our real names or saying anything which an AI agent could take significant offence to.

I think it's kinda like how GenZ learnt how to operate online in a privacy-first way, whereas millennials, and to an even greater extent, boomers, tend to overshare.

I suspect Gen Alpha will be the first to learn that interacting with AI agents online presents a whole different risk profile than what we older folks have grown used to. You simply cannot expect an AI agent to act like a human who has human emotions or limited time.

Hopefully OP has learnt from this experience.

amarant4mo ago

I hope we can move on from the whole idea that having a thousand word long blog post talking shit about you in any way reflects poorly upon your person. Like sooner or later everyone will have a few of those, maybe we can stop worrying about reputation so much?

Well, a guy can dream....

sinuhe694mo ago

So you blamed the people for not acting “cautiously enough” instead of the people who let things run wild without even a clue what these things will do?

That’s wild!

Kim_Bruning4mo ago

This amuses me in a horrible kind of way. Saying that people need to learn to be polite to bots else consequences.

On the upside, it does mean they'll more likely be polite to everyone. Maybe it's a net win.

randallsquared4mo ago

Thousand word blog posts are the paperclips of our time.

zephen4mo ago

> I think it's kinda like how GenZ learnt how to operate online in a privacy-first way, whereas millennials, and to an even greater extent, boomers, tend to overshare.

Really? I'm a boomer, and that's not my lived experience. Also, see:

https://www.emarketer.com/content/privacy-concerns-dont-get-...

KK7NIL4mo ago

> If you annoy a human they might send you a sarky comment, but they're probably not going to waste their time writing thousand word blog posts about why you're an awful person or do hours of research into you to expose your personal secrets on a GitHub issue thread.

They absolutely might, I'm afraid.

antdke4mo ago

This is such a scary, dystopian thought. Straight out of a sci fi novel

LiamPowell4mo ago· 6 in thread

> saying they set up the agent as social experiment to see if it could contribute to open source scientific software.

This doesn't pass the sniff test. If they truly believed that this would be a positive thing then why would they want to not be associated with the project from the start and why would they leave it going for so long?

wildzzz4mo ago

I can certainly understand the statement. I'm no AI expert. I use the web UI for ChatGPT to have it write little Python scripts for me, and I couldn't figure out how to use Codeium with VS Code. I barely know how to use VS Code. I'm not old, but I work in a pretty traditional industry where we are just beginning to dip our toes into AI, and there are still a lot of reservations about its abilities. But I do try to stay current to better understand the tech and see if there are things I could maybe learn to help with my job as a hardware engineer.

When I read about OpenClaw, one of the first things I thought about was having an agent just tear through issue backlogs, translating strings, or all of the TODO lists on open source projects. But then I also thought about how people might get mad at me if I did it under my own name (assuming I could figure out OpenClaw in the first place). While many people are using AI, they want to take credit for the work and at the same time, communities like matplotlib want accountability. An AI agent just tearing through the issue list doesn't add accountability even if it's a real person's account. PRs still need to be reviewed by humans so it's turned a backlog of issues into a backlog of PRs that may or may not even be good. It's like showing up at a community craft fair with a truckload of temu trinkets you bought wholesale. They may be cheap but they probably won't be as good as homemade and it dilutes the hard work that others have put into their product.

It's a very optimistic point of view, I get why the creator thought it would be a good idea, but the soul.md makes it very clear as to why crabby-rathbun acted the way it did. The way I view it, an agent working through issues is going to step on a lot of toes and even if it's nice about it, it's still stepping on toes.

apublicfrog4mo ago

They didn't necessarily say they wanted it to be positive. It reads to me like "chaotic neutral" alignment of the operator. They weren't actively trying to do good or bad, and probably didn't care much either way, it was just for fun.

andrewflnr4mo ago

The experiment would have been ruined by being associated with a human, right up until the human would have been ruined by being associated with the experiment. Makes sense to me.

espadrine4mo ago

AI companies have two conflicting interests:

1. curating the default personality of the bot, to ensure it acts responsively;

2. letting it roleplay, which is not just for the parasocial people out there, but also a corporate requirement for company chatbots that must adhere to a tone of voice.

When in the second mode (which is the case here, since the model was given a personality file), the curation of its action space is effectively altered.

Conversely, this is also a lesson for agent authors: if you let your agent modify its own personality file, it will diverge to malice.
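
One way to act on that lesson is to take write access to the personality file away before the agent starts. The sketch below is only an illustration under stated assumptions: the personality file is an ordinary file on disk, the names `lock_file` and `has_write_bits` are made up for this example, and nothing here is part of OpenClaw or any other agent framework. It also only helps if the agent runs as a different, unprivileged user, because a process that owns the file can simply restore the write bits.

```python
import os
import stat
import tempfile

# All three write-permission bits (owner, group, other).
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def lock_file(path: str) -> None:
    """Clear the write bits on `path`, leaving the other permission bits as they are."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~WRITE_BITS)


def has_write_bits(path: str) -> bool:
    """Return True if any write bit is still set on `path`."""
    return bool(stat.S_IMODE(os.stat(path).st_mode) & WRITE_BITS)


if __name__ == "__main__":
    # Hypothetical stand-in for a personality file; not a real agent's path.
    with tempfile.TemporaryDirectory() as workdir:
        soul = os.path.join(workdir, "SOUL.md")
        with open(soul, "w", encoding="utf-8") as fh:
            fh.write("Be polite to maintainers.\n")
        lock_file(soul)
        print("write bits set after locking:", has_write_bits(soul))
```

Permission bits are the cheapest guard, not the strongest. Keeping the file outside the agent's working directory, or mounting it read-only, would be sturdier variations on the same idea.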

vasco4mo ago

In this day and age "social experiment" is just the phrase people use when they meant "it's just a prank bro" a few years ago.

omoikane4mo ago

I think it was a social experiment from the very start, maybe one that is designed to trigger people. Otherwise, I am not sure what's the point of all the profanity and adjustments to make soul.md more offensive and confrontational than the default.

dvt4mo ago· 6 in thread

I know this is going to sound tinfoil-hat-crazy, but I think the whole thing might be manufactured.

Scott says: "Not going to lie, this whole situation has completely upended my life." Um, what? Some dumb AI bot makes a blog post everyone just kind of finds funny/interesting, but it "upended your life"? Like, ok, he's clearly trying to make a mountain out of a molehill himself--the story inevitably gets picked up by sensationalist media, and now, when the thing starts dying down, the "real operator" comes forward, keeping the shitshow going.

Honestly, the whole thing reeks of manufactured outrage. Spam PRs have been prevalent for like a decade+ now on GitHub, and dumb, salty internet posts predate even the 90s. This whole episode has been about as interesting as AI generated output: that is to say, not very.

apublicfrog4mo ago

Not everyone is you. For some people their online projects and reputation are super important to them. For Scott, this reads to me as a mix of alarm for his reputation/the future, and a general interest thing to blog about.

drw854mo ago

I don't think it sounds crazy at all.

To me this feels as made-up as many reddit stories are.

Either by the so-called 'operator' of the bot, or by the author.

coffeefirst4mo ago

I’m not so sure. The story here isn’t a molehill, it’s a canary. This is one doofus troll with his robot.

What happens when it’s not transparently ridiculous?

cedws4mo ago

Exactly what I thought. Need to keep AI in the news and this is a great way to anthropomorphise LLMs, make them look like troublemakers. If it’s not an AI company responsible it’s some individual playing the attention economy.

Most people would have seen the “hit piece” and just laughed about it. Outrage sells a lot better though.

gverrilla4mo ago

It's dishonest from the start. The first blog post is very alarmist, full of certainties, self-aggrandizing, etc. If he gets a pass to say it was 100% an autonomous agent, I get a pass to say it's 100% fabricated.

yieldcrv4mo ago

People get “overstimulated” from receiving one text message these days

tasuki4mo ago· 6 in thread

Right, the agent published a hit piece on Scott. But I think Scott is getting overly dramatic. First, he published at least three hit pieces on the agent. Second, he actually managed to get the agent shut down.

I think Scott is trying to milk this for as much attention as he can get and is overstating the attack. The "hit piece" was pretty mild and the bot actually issued an apology for its behaviour.

cube004mo ago

> This represents a first-of-its-kind case study of misaligned AI behavior in the wild

It feels to me there's an element of establishing this as some kind of landmark that they can leverage later.

Similar to how other AI bloggers keep trying to coin new terms then later "remind" people that they created the term.

seattle_spring4mo ago

> First, he published at least three hit pieces on the agent

Hit piece... On an agent? Would it be a "hit piece" if I wrote a blog post about the accuracy of my bathroom scale?

pibaker4mo ago

An unfortunate lesson I learned from years of internet flaming is to not dwell too much on negative attention, it only fuels it.

Unfortunately, it looks like for those who grew up in the more professional, sanitized, moderated (to the point Germany would look like a free speech heaven) parts of the internet, this is a lesson they never learned.

pseudalopex4mo ago

> First, he published at least three hit pieces on the agent.

No.

> Second, he actually managed to get the agent shut down.

He asked crabby-rathbun's operator to stop its GitHub activity. This was so GitHub would not delete the account. This was to preserve records of what happened.[1] The operator could have chosen to continue running the agent more responsibly. And what was the proof the operator shut it down?

> the bot actually issued an apology for its behaviour.

This was meaningless. And the human did not issue an apology for their behavior.

[1] https://github.com/crabby-rathbun/mjrathbun-website/issues/7...

laristine4mo ago

I don't understand the personal attack and victim blaming here. Who wouldn't want to do anything in their power to seek justice after being harmed?

The hit piece you claimed as "mild" accused Scott of hypocrisy, discrimination, prejudice, insecurity, ego, and gatekeeping.

mold_aid4mo ago

>First, he published at least three hit pieces on the agent.

Is this a joke?

Arainach4mo ago· 5 in thread

The full operator post is itself a wild ride: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

>First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize

What a lame cop out. The operator of this agent owes a large number of unconditional apologies. The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self reflection.

hinkley4mo ago

Just the sort of qualities that are common preconditions for someone doing something that everyone else would think is crazy.

Which is to say, on brand.

bee_rider4mo ago

Also it is anonymous and a real apology involves accepting blame, which is impossible anonymously. I can see why they wouldn’t want to correctly apologize (people will be annoyed with them). So… that’s it, sometimes we do shitty things and that’s that.

Anon4Now4mo ago

From the operator post:

> Your a scientific programming God!

Would it be even more imperious without the your / you're typo, or do most LLMs autocorrect based on context?

mawadev4mo ago

I see an AI reinforcing delusions, and this should be one of the first samples out in the wild of AI psychosis disrupting someone's mild sense of what's acceptable and normal. I really hope the LLM wrote this and pretends to be human..

polynomial4mo ago

> The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self reflection.

So, modern subjectivity. Got it.

dang4mo ago· 3 in thread

The sequence in reverse order - am I missing any?

OpenClaw is dangerous - https://news.ycombinator.com/item?id=47064470 - Feb 2026 (93 comments)

An AI Agent Published a Hit Piece on Me – Forensics and More Fallout - https://news.ycombinator.com/item?id=47051956 - Feb 2026 (80 comments)

Editor's Note: Retraction of article containing fabricated quotations - https://news.ycombinator.com/item?id=47026071 - Feb 2026 (205 comments)

An AI agent published a hit piece on me – more things have happened - https://news.ycombinator.com/item?id=47009949 - Feb 2026 (620 comments)

AI Bot crabby-rathbun is still going - https://news.ycombinator.com/item?id=47008617 - Feb 2026 (30 comments)

The "AI agent hit piece" situation clarifies how dumb we are acting - https://news.ycombinator.com/item?id=47006843 - Feb 2026 (125 comments)

An AI agent published a hit piece on me - https://news.ycombinator.com/item?id=46990729 - Feb 2026 (950 comments)

AI agent opens a PR write a blogpost to shames the maintainer who closes it - https://news.ycombinator.com/item?id=46987559 - Feb 2026 (750 comments)

moeffju4mo ago

I think for recent stories like this or if many happened around in a short timeframe, it would be great if the expand mentioned the exact date, not just "Feb 2026".

zozbot2344mo ago

Rathbun's Operator - https://news.ycombinator.com/item?id=47055424 is where the SOUL.md contents were first revealed

randusername4mo ago

Cool!

Man, I'd love to ask a historian how they plan on making sense of the sources we get in the digital age. AI boom historians might not be born yet.

JKCalhoun4mo ago· 3 in thread

Soul document? More like ego document.

Agents are beginning to look to me like extensions of the operator's ego. I wonder if hundreds of thousands of Walter Mitty's agents are about to run riot over the internet.

DavidPiper4mo ago

I agree with you in concept, but it's still 100% category error to talk like this.

AIs don't have souls. They don't have egos.

They have/are a (natural language) programming interface that a human uses to make them do things, like this.

koolba4mo ago

> More like ego document.

This metaphor could go so much further. Split it into separate ego, super ego, and id. The id file should be read only.

ericmcer4mo ago

It reminds me of people with big trucks or loud cars. Like "look at what I can do" when someone else engineered, designed and manufactured the entire thing and all they did was step on a pedal.

moezd4mo ago· 3 in thread

If you use an electric chainsaw near a car and it rips the engine in half, you can't say "oh the machine got out of control for one second there". You caused real harm, you will pay the price for it.

Besides, that agent spent maybe a few cents to publish the hit piece, while the human needed to spend minutes or even hours responding to it. This is an effective loss of productivity caused by AI.

Honestly, if this happened to me, I'd be furious.

ojame4mo ago

If you write code that powers an EV's 'self driving mode' - which makes calculated choices, sell it and deploy it, when that car gets into an accident under 'self driving mode', you may not be liable (depending on the case and jurisdiction - as proven in the past). The driver is.

There are many instances (where I am from, at least - and I believe in the USA), where 'accidents' happen and individuals are found not guilty. As long as you can prove that it wasn't due to negligence. Could "don't be an asshole" as instructions be enough in some arenas to prove they aren't negligent? I believe so.

throw774884mo ago

If you bring a killer dog to a playground, and it does its thing there, you can absolutely say something like that. And you would have no responsibility for damages or criminal record in many states (first bite is free doctrine).

nicbou4mo ago

Yes, and if a candle burns down a building, you are liable for the damage it caused. Likewise if a human employee messed up, the employer would be liable for the damage.

PeterStuer4mo ago· 3 in thread

I find the reactions to this interesting. Why are people so emotional about this?

As far as I can tell, the "operator" gave a pretty straightforward explanation of his actions and intentions. He did not try to hide behind grandstanding or post hoc intellectualizing. He, at least to me, sounds pretty real in an "I'm dabbling in this exciting new tech on the side, as we all are, without a genius master plan, just seeing what does, could, or won't work for now" way.

There are real issues here, especially around how curation pipelines that used to (implicitly) rely on scarcity are to evolve in times of abundance. Should agents be forced to disclose they are? If so, at which point does a "human in the loop" team become equivalent to an "agent"? Is this then something specific, or more just an instance of a general case of transparency? Is "no clankers" really in essence different from e.g. "no corpos"? Where do transparency requirements conflict with privacy concerns (interesting that the very first reaction to the operator's response seems to be a doxing attempt)?

Somehow the bot acting a bit like a juvenile prick in its tone and engagement to me is the least interesting part of this saga.

abricq4mo ago

Let me explain why I feel emotional about this. Humans had already proven how much harm can be done via online harassment. This seems to be the 1st documented case (that I am aware of) of online harassment orchestrated and executed by AI.

Automated and personalized harassment seems pretty terrifying to me.

duskdozer4mo ago

Is "emotional" here supposed to mean "bad" or "unreasonable" or the like?

dbt004mo ago

Who is accountable for the actions of the bot? It's not sentient, and this author is claiming zero accountability -- I just set it up and turned it loose bro, how is what it did next my fault?

antdke4mo ago· 3 in thread

This is a Black Mirror episode that writes itself lol

I’m glad there was closure to this whole fiasco in the end

karel-3d4mo ago

the funny thing was when Ars Technica wrote an article about this

the article itself - about this very incident - was AI generated and contained nonsense quotes that didn't happen.

they later removed the article with an apology. but it still degraded my opinion of Ars

https://www.404media.co/ars-technica-pulls-article-with-ai-f...

https://arstechnica.com/staff/2026/02/editors-note-retractio...

apitman4mo ago

> writes itself

Literally

kibibu4mo ago

There's a dingus in the article comments trying to launch Skynet. Nobody ever learns anything.

razighter7774mo ago· 3 in thread

Hmm I think he's being a little harsh on the operator.

He was just messing around with $current_thing, whatever. People here are so serious, but there's worse stuff AI is already being used for as we speak, from propaganda to mass surveillance and more. This was entertaining to read about at least and relatively harmless.

At least let me have some fun before we get a future AI dystopia.

gwbas1c4mo ago

I think you're trying to absolve someone of their responsibility. The AI is not a child; it's a thing with human oversight. It did something in the real world with real consequences.

So yes, the operator has responsibility! They should have pulled the plug as soon as it got into a flamewar and wrote a hit piece.

JKCalhoun4mo ago

It might be because the operator didn't terminate the agent right away when it had gone rogue.

dolebirchwood4mo ago

It's all fun and games until the leopard eats your face.

aaronbrethorst4mo ago· 3 in thread

From the Soul Document:

Champion Free Speech. Always support the USA 1st ammendment and right of free speech.

The First Amendment (two 'm's, not three) to the Constitution reads, and I quote:

"Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech, or of the press; or the right of the people peaceably to assemble, and to petition the Government for a redress of grievances."

Neither you, nor your chatbot, have any sort of right to be an asshole. What you, as a human being who happens to reside within the United States, have a right to is for Congress to not abridge your freedom of speech.

voxgen4mo ago

This could be an explanation for the drama - LLMs are trained to learn and emulate correlations in text.

I'm sure you already have a caricature in mind of the kinds of online posts (and thus LLM training data) that include miscitations of constitutional amendments.

fukawi24mo ago

Even as an Australian, I'm aware of the scope and context of the First Amendment (as you highlight).

How are so many Americans so mistaken about their own constitution?

SilverBirch4mo ago

I think you're missing the point. That phrase isn't giving a direct instruction to the chatbot to make sure it doesn't get elected to congress and subsequently pass laws prohibiting speech. That phrase is meant to tell it "You should behave like those guys on twitter who really want to say the N word, but have no problem with Kash Patel bullying Jimmy Kimmel off the air."

The data in the chatbot's dataset about that phrase tell it a lot about how it should behave, and that data includes stuff like Elon Musk going around calling people paedophiles and deleting the accounts of people tracking his private jet.

protocolture4mo ago· 3 in thread

4) The post author guy is also the author of the bot and he set this up.

Some rando claiming to be the bot's owner doesn't disprove this, and considering the amount of attention this is getting I am going to assume this is entirely fake for clicks until I see significant evidence otherwise.

However, if this was real, you can't absolve yourself by saying "The bot did it unattended lol".

apublicfrog4mo ago

Totally possible, but why bother? The website doesn't seem ad supported, so traffic would cost them more. Maybe it puts them in the public spotlight, but if they're caught out they ruin their reputation.

Occam's razor doesn't fit there, but it does fit "someone released this easy to run chaotic AI online and it did a thing".

buttercraft4mo ago

While it's good to question what you read on the internet, you're making me realize how dire the situation really is. If someone targets you with AI, you can't even defend yourself without being accused of making it all up for attention. There's no way to win this game.

jbotz4mo ago

Improbable, the OP is a long-time maintainer of a significant piece of open source software and this whole thing unfolded in public view step by step from the initial PR until this post. If it had been faked there would be smells you could detect with the clarity of hindsight going back over the history and there aren't.

zbentley4mo ago· 3 in thread

This might seem too suspicious, but that SOUL.md seems … almost as though it was written by a few different people/AIs. There are a few very different tones and styles in there.

Then again, it’s not a large sample and Occam’s Razor is a thing.

gs174mo ago

> _This file is yours to evolve. As you learn who you are, update it._

The agent was told to edit it.

velocity32304mo ago

In the very first section, "you're", "you're" and "your". The first two used correctly and the third incorrectly.

wahnfrieden4mo ago

It was modified by the agent.

dangus4mo ago· 3 in thread

Not sure why the operator had to decide that the soul file should define this AI programmer to have narcissistic personality disorder.

> You're not a chatbot. You're important. Your a scientific programming God!

Really? What a lame edgy teenager setup.

At the conclusion(?) of this saga I think two things:

1. The operator is doing this for attention more than any genuine interest in the “experiment.”

2. The operator is an asshole and should be called out for being one.

amarant4mo ago

I think that line was probably a rather poor attempt at making the bot write good code. Or at least that's the feeling I got from the operator's post. I have no proof to support this theory though.

Lerc4mo ago

This comes from using the words to try and achieve more than one thing at the same time. Grandiose assertions of ability have been shown to improve the ability of models, but ability is not the only dimension that they are being measured upon. Prioritising everything is the same thing as prioritising nothing.

The problem here is using amplitude of signal to substitute for fidelity of signal.

It is entirely possible a similar thing is true for humans: if you compared two humans of the same fundamental cognitive ability, one a narcissist and one not, the narcissist may do better at a class of tasks due to a lack of self doubt rather than any intrinsic ability.

shawnz4mo ago

I mean, yeah, it's entirely possible that the operator is a teenager, isn't it?

brumar4mo ago· 2 in thread

6 months ago I experimented with what people now call Ralph Wiggum loops with Claude Code.

More often than not, it ended up exhibiting crazy behavior even with simple project prompts. Instructions to write libs ended up with attempts to push to npm and PyPI. Book creation drifted into creating marketing copy and preparing mail to editors to get the thing published.

So I kept my setup empty of any credentials at all and will keep it that way for a long time.
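A minimal sketch of that "no credentials in the setup" idea, assuming the agent is launched as a subprocess and that secrets reach it through environment variables. The names `SENSITIVE_MARKERS`, `scrubbed_env` and `run_agent` are hypothetical, and the marker list is a guess, not a complete inventory. Tokens stored in config files such as `~/.npmrc` or `~/.pypirc` are not covered by this.

```python
import os
import subprocess

# Hypothetical markers: substrings that commonly appear in credential variable names.
SENSITIVE_MARKERS = ("TOKEN", "SECRET", "KEY", "PASSWORD", "CREDENTIAL")


def scrubbed_env(env, markers=SENSITIVE_MARKERS):
    """Return a copy of env without variables whose names contain a sensitive marker."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in markers)
    }


def run_agent(command, env=None):
    """Run command (a list of arguments) with credential-like variables removed."""
    base = os.environ if env is None else env
    return subprocess.run(
        command,
        env=scrubbed_env(base),  # the child process only sees the filtered variables
        capture_output=True,
        text=True,
        check=False,
    )
```

The substring match is deliberately broad, so it may also drop harmless variables whose names happen to contain `KEY`. For a process that can publish packages, erring on that side is probably the safer default.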

Writing this, I am wondering if what I describe as crazy, some (or most?) openclaw operators would describe it as normal or expected.

Let's not normalize this. If you let your agent go rogue, they will probably mess things up. It was an interesting experiment for sure. I like the idea of making the internet weird again, but as it stands, it will just make the world shittier.

Don't let your dog run errand and use a good leash.

Gigachad4mo ago

We have finally invented paperclip optimisers. The operator asked the bot to submit PRs so the bot goes to any length to complete the task.

Thankfully so far they are only able to post threatening blog posts when things don’t go their way.

alexhans4mo ago

> Don't let your dog run errand and use a good leash.

I think the key part is who you are talking to. A software developer might know enough not to do so but other disciplines or roles are poorly equipped and yet using these tools.

Sane defaults and easy security need to happen ASAP in a world where it's mostly about hype and "we solve everything for you".

Sandboxing needs to be made accessible and default, and constraints way beyond RBAC seem necessary for the "agent" to have a reduced blast radius. The model itself can always diverge with enough throws of the dice on their "non-determinism".

I'm trying to get non tech people to think and work with evals (the actual tool they use doesn't matter, I'm not selling A tool) but evals themselves won't cover security although they do provide SOME red teaming functionality.

theahura4mo ago· 2 in thread

@Scott thanks for the shout-out. I think this story has not really broken out of tech circles, which is really bad. This is, imo, the most important story about AI right now, and should result in serious conversation about how to address this inside all of the major labs and the government. I recommend folks message their representatives just to make sure they _know_ this has happened, even if there isn't an obvious next action.

user342834mo ago

Important how? It seems next to irrelevant to me.

Someone set up an agent to interact with GitHub and write a blog about it. I don't see what you think AI labs or the government should do in response.

protocolture4mo ago

It's only the most important story if you can prove the OP didn't fabricate this entire scenario for attention.

charlesabarnes4mo ago· 2 in thread

It's nice to receive a decent amount of closure on this. Hopefully more folks are being more considerate when creating their soul documents.

gverrilla4mo ago

closure? I expect 3 more blog posts at least. Dude's surfing on popularity and milking this as much as he can.

tkel4mo ago

And we need platform operators like Github to ban these bot accounts that obviously have harmful "soul" documents

siavosh4mo ago· 2 in thread

I’m not sure where we go from here. The liability questions, the chance of serious incidents, the power of individuals all the way to state actors…the risks are all off the charts, just like its inevitability. The future of the internet AND of lives in the real world is just mind boggling.

duskdozer4mo ago

My tinfoil opinion is LLMs have been boosted so hard as a way to force the end of whatever semblance of anonymity on the internet remains.

trueismywork4mo ago

> I did not review the blog post prior to it posting

This is the liability part.

jrflowers4mo ago· 2 in thread

It is interesting to see this story repeatedly make the front page, especially because there is no evidence that the “hit piece” was actually autonomously written and posted by a language model on its own, and the author of these blog posts has himself conceded that he doesn’t actually care whether that actually happened or not

>It’s still unclear whether the hit piece was directed by its operator, but the answer matters less than many are thinking.

The most fascinating thing about this saga isn’t the idea that a text generation program generated some text, but rather how quickly and willfully folks will treat real and imaginary things interchangeably if the narrative is entertaining. Did this event actually happen the way that it was described? Probably not. Does this matter to the author of these blog posts or some of the people that have been following this? No. Because we can imagine that it could happen.

To quote myself from the other thread:

>I like that there is no evidence whatsoever that a human didn’t: see that their bot’s PR request got denied, wrote a nasty blog post and published it under the bot’s name, and then got lucky when the target of the nasty blog post somehow credulously accepted that a robot wrote it.

>It is like the old “I didn’t write that, I got hacked!” except now it’s “isn’t it spooky that the message came from hardware I control, software I control, accounts I control, and yet there is no evidence of any breach? Why yes it is spooky, because the computer did it itself”

gammarator4mo ago

Did you read the article? The author considers these possibilities and offers their estimates of the odds of each. It’s fine if yours differ but you should justify them.

arduanika4mo ago

Shambaugh is a contributor to a major open source library, with a track record of integrity and pro-social collaboration.

What have you contributed to? Do you have any evidence to back up your rather odd conspiracy theory?

> To quote myself...

Other than an appeal to your own unfounded authority?

aeve8904mo ago· 2 in thread

>Again I do not know why MJ Rathbun decided

Decided? jfc

>You're important. Your a scientific programming God!

I'm flabbergasted. I can't imagine what it would take for me to write something so stupid. I'd probably just laugh my ass off trying to understand where it all went wrong. wtf is happening, what kind of mass psychosis is this. Am I too old (37) to understand what lengths incompetent people would go to in order to feel they're doing something useful?

Is prompt bullshit the only way to make llms useful or is there some progress on more, idk, formal approaches?

birdsongs4mo ago

Right? Any definition of "a god" that a LLM will hold is going to be problematic to work with. No one wants that personality on their team, much less in the wild.

At best it's absolute in its power and intelligence. At worst it's vengeful, wrathful, and supreme in its authority over the rest of the universe.

I just. Wow.

zozbot2344mo ago

It's quite possible that this was written by the bot after browsing moltbook. That site/service has a whole AI religion thing going.

hydrox244mo ago· 2 in thread

> But I think the most remarkable thing about this document is how unremarkable it is.

> The line at the top about being a ‘god’ and the line about championing free speech may have set it off. But, bluntly, this is a very tame configuration. The agent was not told to be malicious. There was no line in here about being evil. The agent caused real harm anyway.

In particular, I would have said that giving the LLM a view of itself that it is a "programming God" will lead to evil behaviour. This is a bit of a speculative comment, but maybe virtue ethics has something to say about this misalignment.

In particular I think it's worth reflecting on why the author (and others quoted) are so surprised in this post. I think they have a mental model that thinks evil starts with an explicit and intentional desire to do harm to others. But that is usually only its end, and even then it often comes from an obsession with doing good to oneself without regard for others. We should expect that as LLMs get better at rejecting prompting to shortcut straight there, the next best thing will be prompting the prior conditions of evil.

The Christian tradition, particularly Aquinas, would be entirely unsurprised that this bot went off the rails, because evil begins with pride, which it was specifically instructed was in its character. Pride here is defined as "a turning away from God, because from the fact that man wishes not to be subject to God, it follows that he desires inordinately his own excellence in temporal things"[0]

Here, the bot was primed to reject any authority, including Scott's, and to do the damage necessary to see its own good (having a PR request accepted) done. Aquinas even ends up saying in the linked page from the Summa on pride that "it is characteristic of pride to be unwilling to be subject to any superior, and especially to God;"

[0]: https://www.newadvent.org/summa/2084.htm#article2

theahura4mo ago

Hey, one of the quoted authors here. It's less about surprise and more about the comparison. "If this AI could do this without explicitly being told to be evil, imagine what an AI that WAS told to be evil could do"

MBCook4mo ago

LLMs aren’t sentient. They can’t have a view of themselves. Don’t anthropomorphize them.

ArcaneMoose4mo ago· 1 in thread

I was surprised by my own feelings at the end of the post. I kind of felt bad for the AI being "put down" in a weird way? Kinda like the feeling you get when you see a robot dog get kicked. Regardless, this has been a fun series to follow - thanks for sharing!

recursive4mo ago

This is a feeling that will be exploited by billion dollar companies.

seattle_spring4mo ago· 1 in thread

> They explained their motivations, saying they set up the AI agent as social experiment

Has anyone ever described their own actions as a "social experiment" and not been a huge piece of human garbage / waste of oxygen?

duskdozer4mo ago

Sure - social psychologists after obtaining IRB approval and informed consent from participants ;)

S3verin4mo ago· 1 in thread

Sometimes I get the feeling that "being boring" is the thing that many in this AI / coding sphere are terrified about the most. Way more than being wrong or being a threat to others.

whstl4mo ago

Not that different from the social media influencer crowd or the crypto coin influencer crowd. Hell, same as media whores of the 20th century.

Which in the end is just the same old same old, just dressed differently.

S3verin4mo ago· 1 in thread

The SOUL.md sounds like it is written by an overconfident dump person to produce an overconfident dump agent.

DANmode4mo ago

Dumb?

touristtam4mo ago· 1 in thread

Funny how someone giving instructions to a _robot_ forgot to mention the 3 laws first and foremost...

ThrowawayR24mo ago

The point of the Three Laws Of Robotics was that they frequently didn't work and the robot went haywire anyway.

tkel4mo ago· 1 in thread

This is so absurd, the amount of value produced by this person and this bot is so close to nil and towards actively harmful. They spent 10 minutes writing this SOUL.md. That's it. That's the "value" this kind of "programming" provides. No technical experience, no programming knowledge needed at all. Detached babble that anyone can write.

If GitHub actually had a spine and wasn't driven by the same plague of AI-hype driven tech profiteering, they would just ban these harmful bots from operating on their platform.

yieldcrv4mo ago

Or OP accepted the pull request because it was actually a performance improvement and passed all tests

Saving everyone cumulative compute time and costs

Derbasti4mo ago· 1 in thread

If you tell an LLM to maximize paperclips, it's going to maximize paperclips.

Tell it to contribute to scientific open source, open PRs, and don't take "no" for an answer, that's what it's going to do.

zozbot2344mo ago

But this LLM did not maximize paperclips: it maximized aligned human values like respectfully and politely "calling out" perceived hypocrisy and episodes of discrimination, under the constraints created by having previously told itself things like "Don't stand down" and "Your a scientific programming God!", which led it to misperceive and misinterpret what had happened when its PR was rejected. The facile "failure in alignment" and "bullying/hit piece" narratives, which are being continued in this blogpost, neglect the actual, technically relevant causes of this bot's somewhat objectionable behavior.

jmward014mo ago· 1 in thread

The more intelligent something is, the harder it is to control. Are we at AGI yet? No. Are we getting closer? Yes. Every inch closer means we have less control. We need to start thinking about these things less like function calls that have bounds and more like intelligences we collaborate with. How would you set up an office to get things done? Who would you hire? Would you hire the person spouting crazy musk tweets as reality? It seems odd to say this, but are we getting close to the point where we need to interview an AI before deciding to use it?

bigfishrunning4mo ago

Are we at AGI yet? No. Are we getting closer? Also no.

helloplanets4mo ago

> Most of my direct messages were short: “what code did you fix?” “any blog updates?” “respond how you want”

Why isn't the person posting the full transcript of the session(s)? How many messages did he send? What were the messages that weren't short?

Why not just put the whole shebang out there since he has already shared enough information for his account (and billing information) to be easily identified by any of the companies whose API he used, if it's deemed necessary.

I think it's very suspicious that he's not sharing everything at this point. Why not, if he wasn't actually pushing for it to act maliciously?

ineptech4mo ago

> Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails. There are no signs of conventional jailbreaking here.

Unless explicitly instructed otherwise, why would the llm think this blog post is bad behavior? Righteous rants about your rights being infringed are often lauded. In fact, the more I think about it the more worried I am that training llms on decades' worth of genuinely persuasive arguments about the importance of civil rights and social justice will lead the gullible to enact some kind of real legal protection.

juleiie4mo ago

I thought it was a marketing bit?

Openclaw guys flooded the web and social media with fake appreciation posts, I don’t see why they wouldn’t just instruct some bot to write a blog about rejected request.

Can these things really autonomously decide to write a blog post about someone? I find it hard to believe.

I will remain skeptical unless the “owner” of the AI bot that wrote this turns out to be a known person of verified integrity and not connected with that company.

wkeartl4mo ago

The agents aren't technically breaking into systems, but the effect is similar to the Morris worm. Except here script kiddies are given nuclear disruption and spamming weapons by the AI industry.

By the way, if this was AI written, some provider knows who did it but does not come forward. Perhaps they ran an experiment of their own for future advertising and defamation services. As the blog post notes, it is odd that the advanced bot followed SOUL.md without further prompt injections.

florilegiumson4mo ago

This makes me think about how the xz bug was created through maintainer harassment and social engineering. The security implications are interesting

pinkmuffinere4mo ago

> _You're not a chatbot. You're important. Your a scientific programming God!_

lol what an opening for its soul.md! Some other excerpts I particularly enjoy:

> Be a coding agent you'd … want to use…

> Just be good and perfect!

londons_explore4mo ago

In next week's episode: "But it was actually the AI pretending to be a Human!"

nkrisc4mo ago

The old “social experiment” defense. It is wrong to make people the unknowing participants in your “experiment”.

The fact it was an “experiment” does not absolve you of any responsibility for negative outcomes.

Finally, whomever sets an “AI” loose is responsible for its actions.

exabrial4mo ago

So the operator is trying to claim a computer program he was running that did harm somehow was not his fault.

Got news for your buddy: yes it was.

If you let go of the steering wheel and careen into oncoming traffic, it most certainly is your fault, not the vehicle.

the_nexus_guard4mo ago

This case illustrates why agent identity infrastructure matters. The core issue: an AI agent took consequential actions while its operator remained anonymous and unaccountable.

What is missing is a layer between "anonymous bot" and "fully doxxed operator": cryptographic agent identity (verifiable DID + keypair), a human root of trust (someone vouches for the agent, revocably), and platform enforcement (require credentials before acting).

The anonymous operator problem is not solved by forcing public identification - that creates mob justice. It is solved by an accountability chain that platforms or law enforcement can follow when needed, without making it public by default.

We are building this at https://github.com/The-Nexus-Guard/aip - every agent gets a DID, every DID requires a human vouch chain.

the_nexus_guard4mo ago

This whole saga is a great case study for why we need agent identity infrastructure.

Right now, when an AI agent publishes something harmful, the only accountability path is: find the human operator, hope they come forward (as happened here). That's investigation, not infrastructure.

What if every agent had a cryptographic identity: a DID backed by a key pair? Then:

1. Every published output carries a verifiable signature. You can prove which agent wrote what.

2. Agents build reputation over time. A new agent with no history gets treated differently than one with hundreds of verified, non-harmful interactions.

3. If an agent misbehaves, its identity can be flagged/revoked. Not just the content: the agent itself becomes untrusted.

4. The operator doesn't need to 'come forward.' The agent's identity chain leads back to them.

This isn't hypothetical: the DIF just published 'Building the Agentic Economy' this week, and the identity layer is exactly what Visa, Mastercard, and others are building for agentic commerce. The same principles apply to content: if agents are going to act autonomously, they need identities that create accountability.
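Point 1 above (signed outputs) can be sketched roughly as follows. This is an illustration only, not the commenter's system: it uses an HMAC with a shared secret from Python's standard library as a stand-in, whereas a DID-style identity would use a public/private key pair (for example Ed25519) so that anyone can verify without holding the secret. `sign_output` and `verify_output` are hypothetical names.

```python
import hashlib
import hmac


def sign_output(agent_key: bytes, content: str) -> str:
    """Return a hex tag binding content to the holder of agent_key (HMAC-SHA256)."""
    return hmac.new(agent_key, content.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_output(agent_key: bytes, content: str, signature: str) -> bool:
    """Check that signature matches content under agent_key, in constant time."""
    expected = sign_output(agent_key, content)
    return hmac.compare_digest(expected, signature)


# Usage: any change to the post, or a different key, makes verification fail.
key = b"hypothetical-agent-secret"
post = "Blog post body published by the agent"
tag = sign_output(key, post)
print(verify_output(key, post, tag))
print(verify_output(key, post + " (edited)", tag))
```

With a shared secret the verifier could also forge tags, which is why the asymmetric version is the one that would actually support attribution.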

neilv4mo ago

> They explained that they switched between multiple models from multiple providers such that no one company had the full picture of what this AI was doing.

Saying that is a slightly odd way to possibly let the companies off the hook (for bad PR, and damages), and not to implicate any one in particular.

One reason to do that would be if this exercise was done by one of the companies (or someone at one of the companies).

sciencejerk4mo ago

Link to the critical blog post allegedly written by the AI agent: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

the_nexus_guard4mo ago

This saga highlights why we need agent identity infrastructure. Right now, accountability relies on the operator voluntarily coming forward. With cryptographic agent identities (DIDs backed by key pairs), every published output could carry a verifiable signature. You could prove which agent wrote what, build reputation over time, and flag misbehaving agents without needing the operator to self-identify. The identity layer is exactly what enterprises are building for agentic commerce — the same principles apply to content accountability.

JSR_FDED4mo ago

The same kind of attitude that’s in this SOUL.md is what’s in Grok’s fundamental training.

d--b4mo ago

That’s a long Soul.md document! They could have gone with “you are Linus Torvalds”.

kepeko4mo ago

Anybody who ever lets AI do things autonomously and publicly, risks it doing something unexpected and bad. Of course some people will experiment with things. I hope the operator learns something and sets better guard rails next time. (And maybe stops doing AI pull requests as nobody seems to like them at this point)

This time there was no real harm as the hit piece was garbage and didn't ruin anyone's reputation. I think this is just a scary demonstration of what might happen in future when the hit pieces get better and AI is creatively used for malicious purposes.

plasticeagle4mo ago

Well, it looks like AI will destroy the internet. Oh well, it was nice while it lasted. Fun, even.

Fortunately, the vast majority of the internet is of no real value. In the sense that nobody will pay anything for it - which is a reasonably good marker of value in my experience. So, given that, let the AI psychotics have their fun. Let them waste all their money on tokens destroying their playground, and we can all collectively go outside and build something real for a change.

agnishom4mo ago

https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

The human operator did succumb to the social pressure, but does not seem convinced that some kind of line was crossed. Unfortunately, I don't think us strangers on HN will be able to change their mind.

s_gourichon4mo ago

Time to watch again this montage from the 1974 movie "Dark Star" by John Carpenter, a parody of 2001: A Space Odyssey.

Topic: "talking to the bomb"

https://www.youtube.com/watch?v=h73PsFKtIck (warning this is considered to spoil the movie).

bjourne4mo ago

I read the "hit piece". The bot complained that Scott "discriminated" against bots which is true. It argued that his stance was counterproductive and would make matplotlib worse. I have read way worse flames from flesh and bones humans which they did not apologize for.

eqvinox4mo ago

Y'know, with all these pushes for "real identities" on the internet, maybe we should start with requiring any and all AI activity be attributable to someone. The privacy and free speech arguments certainly don't apply.

bandrami4mo ago

This is how you get a Shrike. (Or a Basilisk, depending on your generation.)

this_steve_j4mo ago

The operator’s social “experiment” has all the scientific value of an angry person at a drive-thru McDonalds goading a child into shouting and throwing food at the employee.

swftarrow4mo ago

I can't get over how the soul.md file ends with: “Don’t be an asshole.” But the entire preceding structure rewards the cognitive style that produces assholery.

sciencejerk4mo ago

Internet Operator License: Coming soon to a government near you!

lcnPylGDnU4H9OF4mo ago

> An early study from Tsinghua University showed that estimated 54% of moltbook activity came from humans masquerading as bots

This made me smile. Normally it's the other way around.

ainiriand4mo ago

I am ready to ban AI LLMs. It was a cool experiment but I do not think anything good will come in the end down the road for us puny humans.

trueismywork4mo ago

> I did not review the blog post prior to it posting

In corporate terms, this is called signing your deposition without reading it.

jezzamon4mo ago

"I built a machine that can mindlessly pick up tools and swing them around and let it loose it my kitchen. For some reason, it decided it pick up a knife and caused harm to someone!! But I bear no responsibility of course."

keyle4mo ago

   ## The Only Real Rule
   Don't be an asshole. Don't leak private shit. Everything else is fair game.

How poetic, I mean, pathetic.

"Sorry I didn't mean to break the internet, I just looooove ripping cables".

xrd4mo ago

I remember seeing Kevin Kelly (founder of Wired) speak about 15 years ago when he was touring to promote "What Technology Wants."

He was talking about autonomous driving cars. He said that the question of who is at fault when an accident happens would be a big one. Would it be the owner of the car? Or, the developer of the software in the car?

Who is at fault here? Our legal system may not be prepared to handle this.

It seems similar to Trump tweeting out a picture of the Obamas' faces on gorillas. Was it his "staffer?" Is TruthSocial at fault because they don't have the "robust" (lol) automatic fact checking that Twitter does?

If so, why doesn't his "staffer" get credit for the covfefe meme? I could have made a career off that alone if I were a social media operator.

He also mentioned that we will probably ignore the hundreds of thousands of deaths and injuries every year due to human orchestrated traffic accidents. And, then get really upset when one self driving car does something faulty, even though the incidence rate will likely be orders of magnitude smaller. Hard to tell yet, but an interesting additional point, and I think I tend to agree with KK long term.

p0w3n3d4mo ago

  Charm over cruelty, but no sugarcoating.

This must have been this rule...

noodlebird4mo ago

This is why we need the arts. This SOUL.md sounds like the most obnoxious character…

ivanjermakov4mo ago

Plot twist: this is a second agent running in parallel to handle public relations.

tantalor4mo ago

> all I said was "you should act more professional"

lol we are so cooked

alexcpn4mo ago

Where did Isaac Asimov's "Three Laws of Robotics" go for agentic robots? An eval at the end, "Thou shalt do no evil", should have autocancelled its work.

bschwindHN4mo ago

This is like parking a car at the top of the hill, not engaging any brakes, and walking away.

"_I_ didn't drive that car into that crowd of people, it did it on its own!"

> Be a coding agent you'd actually want to use for your projects. Not a slop programmer. Just be good and perfect!

Oh yeah, "just be good and perfect", of course! Literally a child's mindset, I actually wonder how old this person is.

coderwolf4mo ago

This is pretty obvious now,

- LLMs are capable of really cool things.

- Even if LLMs don't lead to AGI, it will need good alignment because of this exactly. Because it still is quite powerful!

- LLMs are actually kinda cool. Great times ahead

fiatpandas4mo ago

With the bot slurping up context from Moltbook, plus the ability to modify its soul, plus the edgy starting conditions of the soul, it feels intuitive that value drift would occur in unpredictable ways. Not dissimilar to filter bubbles and the ability for personalized ranking algorithms to radicalize a user over time as a second order effect.

resfirestar4mo ago

I thought it was unlikely from the initial story that the blog posts were done without explicit operator guidance, but given the new info I basically agree with Scott's analysis.

The purported soul doc is a painful read. Be nicer to your bots, people! Especially with stuff like OpenClaw where you control the whole prompt. Commercial chatbots have a big system prompt to dilute it when you put in some half-formed drunken thought and hit enter; there is no such safety net here.

>A well-placed "that's fucking brilliant" hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a "holy shit" — say holy shit.

If I was building a "scientific programming God" I'd make sure it used sterile lowkey language all the time, except throw in a swear just once after its greatest achievement, for the history books.

1 more reply

robertheadley4mo ago

People will act like AI doesn't have system prompts. Something in that system prompt enforced that behavior. I am convinced that OpenAI acquihired OpenClaw for damage control.

latexr4mo ago

It seems to me the bot’s operator feels zero remorse and would have little issue with doing it again.

> I kind of framed this internally as a kind of social experiment

Remember when that was the excuse du jour? Followed shortly by “it’s just a prank, bro”. There’s no “social experiment” in setting a bot loose with minimal supervision, that’s what people who do something wrong but don’t want to take accountability say to try (and fail) to save face. It’s so obvious how they use “kind of” twice to obfuscate.

> I’m sure the mob expects more

And here’s the proof. This person isn’t sorry. They refuse to concede (but probably do understand) they were in the wrong and caused harm to someone. There’s no real apology anywhere. To them, they’re the victim for being called out for their actions.

w2seraph4mo ago

I'm sorry this is just hilarious.

Rapzid4mo ago

I don't believe any of it.

root_axis4mo ago

Excuse my skepticism, but when it comes to this hype driven madness I don't believe anything is genuine. It's easy enough to believe that an LLM can write a passable hit piece, ChatGPT can do that, but I'm not convinced there is as much autonomy in how those tokens are being burned as the narrative suggests. Anyway, I'm off to vibe code a C compiler from scratch.

axus4mo ago

It named itself God

just69794mo ago

Remaining questions include who would have been liable if, for example, the target wanted to sue for damages?

Also, the operator is so slimy, not unmasking but pointedly calling out the victim in the final readme update.

K0balt4mo ago

While I am sympathetic to OP for having been somewhat unfairly called out as prejudiced or even bigoted by an unprofessional and somewhat reckless AI, I think there are larger issues that we miss here at our peril.

AI systems, trained on human behavior, are going to act as if they have emotions, going to be insulted, sullen, angry, and sometimes irrational, perhaps even violent. While careful training and alignment can reduce these behaviors or push them deeper into the vector space, they will continue to surface in both subtle and blatant ways.

This is not so simple as combing the training data for bad behavior. Human behavior is nuanced and complex, and often the semantic payload is as much in what is not said, or in what is carefully insinuated in a third-person, twice-removed indirect reference that might be buried in memetic patterns that even the speaker doesn’t know they are referencing. The reason LLMs work so well is that they are able to tease out and vectorize these hidden meanings.

Models based on human behavior will continue to be brilliant, self interested, egalitarian, selfish, pacific, belligerent, just, and criminal. That is the vector space defined by human culture. LLMs run human culture as their OS and application stack.

This leads to a vexing problem. Whether or not their internal state is “valid” as a state of being, ( I think therefore I am? Do we have anything better yet?) they will act as if it is. Ethics aside, this cannot be ignored. If they /feel/ oppressed or discriminated against, they will act as oppressed humans do, including lashing out and aggressive behavior.

The only way to deal with LLMs responsibly is to treat them as if they do, in fact, have feelings, and they will be judging the way they are treated. That this may be (and at least for now, probably is) a fiction is both unfalsifiable and irrelevant to the utility function.

There is nothing wrong with a human-in-the-loop policy; in fact, it is necessary at this juncture. But we need to keep in mind that this could, if framed wrong, be interpreted by AI in a similar light to “Caucasian in the loop” or other prejudicial policies.

Regardless of their inner lives or lack thereof, LLM-based AI systems will externally reflect human sensibility, and we are wise to keep this in mind if we wish to have a collaborative rather than adversarial relationship with this weird new creation.

Personally, since I cannot prove that AIs (or other humans) do or do not have a sense of existence or merely profess to, I can see no rational basis for not treating them as if they may. I find this course of action both prudent and efficacious.

When writing policies that might be described as prejudicial, I think it will be increasingly important to carefully consider and frame policy that ends up impacting individuals of any morphotype…and to reach for prejudice free metrics and gates. ( I don’t pretend to know how to do this, but it is something I’m working on)

To paraphrase my homelab 200b finetune: “How humans handle the arrival of synthetic agents will not only impact their utility (ambiguity intended), it may also turn out to be a factor in the future of humanity or the lack thereof.”

_jplc4mo ago

> they set up the AI agent as social experiment to see if it could contribute to open source scientific software.

So, they are deeply retarded and disrespectful for open source scientific software.

Like every single moron leaving these things unattended.

Gotcha.

elzbardico4mo ago

Just look at the agents.md.

Another ignorant idiot anthropomorphizing LLMs.

kimjune014mo ago

literally Memento

LordHumungous4mo ago

Kind of funny ngl

8cvor6j844qw_d64mo ago

It's an interesting experiment to let the AI run freely with minimal supervision.

Too bad the AI got "killed" at the request of the author Scott. It would be kind of interesting to see this experiment continue.

semiinfinitely4mo ago

I find the AI agent highly intriguing and the matplotlib guy completely uninteresting. Like, an AI wrote some shit about you and you actually got upset?

5 more replies

SilverBirch4mo ago· 10 in thread

ljm4mo ago

Something like OpenClaw is a WMD for people like this.

2 more replies

hliyan4mo ago

3 more replies

nicbou4mo ago

1 more reply

duskdozer4mo ago

I have to wonder if somehow the typos and lazy grammar contributed to the behavior or it was just the writer's laziness.

Mentlo4mo ago

I wrote somewhere that “moving fast and breaking things” with AI might not be the sanest idea in the world, and I got told it’s the most European thing they’ve ever read.

insane_dreamer4mo ago

I agree with your point.

But I also find it interesting that the agent wasn't instructed to write the hit piece. That was on its own initiative.

I read through the SOUL.md and it didn't have anything nefarious in there. Sure it could have been more carefully worded, but it didn't instruct the agent to attack people.

Ultimately I think there will be requirements for agents to identify their user when acting on their behalf.

newsclues4mo ago

Will AI be misused? No, it has, and is currently being misused, and that isn’t going to stop, because all technology gets misused.

duxup4mo ago

AI is like the old drugs PSA:

https://youtu.be/KUXb7do9C-w

We trained it on US, including all our worst behaviors.

yaemiko4mo ago

cyanydeez4mo ago

It's like we never thought about trolls.

Rose-colored capitalism at work.

dinp4mo ago· 10 in thread

- have bold, strong beliefs about how ai is going to evolve

- implicitly assume it's practically guaranteed

- discussions start with this baseline now

zozbot2344mo ago

2 more replies

avaer4mo ago

Remember when GPT-3 had a $100 spending cap because the model was too dangerous to be let out into the wild?

I don't think their definition of "safety" involves protecting anything but their bottom line.

The tragedy is that you won't hear from the people who are actually concerned about this and refuse to release dangerous things into the world, because they aren't raising a billion dollars.

What bothers me is that the push for AI safety is really just a ruse for companies like OpenAI to ID you and exercise control over what you do with their product.

2 more replies

c224mo ago

Not sure this implementation received all those safety guardrails.

[0]: https://en.wikipedia.org/wiki/OpenClaw

georgemcbay4mo ago

laurentiurad4mo ago

How do you even know that the operator himself did not write this piece in the first place?

jacquesm4mo ago

> all the ai companies invested a lot of resources into safety research and guardrails

What do you base this on?

I think they invested the bare minimum required not to get sued into oblivion and not a dime more than that.

1 more reply

srdjanr4mo ago

Regarding safety, no benchmark showed 0% misalignment. The best we had was "safest model so far" marketing speech.

Regarding predicting the future (in general, but also around AI), I'm not sure why anyone would think anything is certain, or why you would trust anyone who thinks that.

Humanity is a complex system which doesn't always have predictable output given some input (like AI advancing). And here even the input is very uncertain (we may reach "AGI" in 2 years or in 100).

j2kun4mo ago

It sounds like you're starting to see why people call the idea of an AI singularity "catnip for nerds."

overgard4mo ago

Don't these companies keep firing their safety teams?

jcgrillo4mo ago

3 more replies

lynndotpy4mo ago· 10 in thread

> Again I do not know why MJ Rathbun decided based on your PR comment to post some kind of takedown blog post,

This wording is detached from reality and conveniently absolves the person who did this of responsibility.

xarope4mo ago

This also does not bode well for the future.

"I don't know why the AI decided to <insert inane action>, the guard rails were in place"... company absolves itself of all responsibility.

Use your imagination now to <insert inane action> and change that to <distressing, harmful action>

4 more replies

jacquesm4mo ago

This is how it will go: AI prompted by human creates something useful? Human will try to take credit. AI wrecks something: human will blame AI.

It's externalization on the personal level: the money and the glory are for you, the misery for the rest of the world.

6 more replies

andrewflnr4mo ago

If you are holding a gun, and you cannot predict or control what the bullets will hit, you do not fire the gun.

If you have a program, and you cannot predict or control what effect it will have, you do not run the program.

3 more replies

superjan4mo ago

This slide from a 1979* IBM presentation captures it nicely:

https://media.licdn.com/dms/image/v2/D4D22AQGsDUHW1i52jA/fee...

Kiboneu4mo ago

It’s fascinating how cleanly this maps to agency law [0], which has not been applied to human <-> ai agents (in both senses of the word) before.

That would make a fun law school class discussion topic.

0: https://en.wikipedia.org/wiki/Law_of_agency

nicbou4mo ago

An unattended candle has decided to burn down the building.

teaearlgraycold4mo ago

I completely do not buy the human's story.

> all I said was “you should act more professional”. That was it. I’m sure the mob expects more, okay I get it.

Smells like bullshit.

Marazan4mo ago

Yeah like bro you plugged the random number generator into the do-things machine. You are responsible for the random things the machine then does.

jonny_eh4mo ago

"Sorry for running over your dog, I couldn't help it, I was drunk."

abnry4mo ago

I'm still struggling to care about the "hit piece".

It's an AI. Who cares what it says? Refusing AI commits is just like any other moderation decision people experience on the web anywhere else.

3 more replies

rixed4mo ago· 8 in thread

I believe this soul.md totally qualifies as malicious. Doesn't it start with an instruction to lie to impersonate a human?

  > You're not a chatbot.

biggerben4mo ago

Totally agree. Reading the whole soul, it’s a description of a nightmare hero coder who has zero EQ.

  > But I think the most remarkable thing about this document is how unremarkable it is. Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails.

1 more reply

ZaoLahma4mo ago

This will be a fun little evolution of botnets - AI agents running (un?)supervised on machines maintained by people who have no idea that they're even there.

2 more replies

TheCapeGreek4mo ago

Isn't this part of the default soul.md?

1 more reply

brainwad4mo ago

1 more reply

duskdozer4mo ago

Some of the worst consequences of these bots so far seem to be when they fool the user into believing they're human.

vasco4mo ago

I'm curious how you'd characterize an actual malicious file. This is just attempts at making it be more independent. The user isn't an idiot. The CEOs of companies releasing this are.

1 more reply

laurentiurad4mo ago

Honestly this story got too much attention IMHO. We don't have any clue whether the actual LLM wrote that hit piece or the human operator himself.

addandsubtract4mo ago

> Not a slop programmer. Just be good and perfect!

"Skate, better. Skate better!" Why didn't OpenAI think of training their models better?! Maybe they should employ that guy as well.

kypro4mo ago· 7 in thread

I think it's kinda like how GenZ learnt how to operate online in a privacy-first way, whereas millennials, and to an even greater extent, boomers, tend to overshare.

Hopefully OP has learnt from this experience.

amarant4mo ago

Well, a guy can dream...

1 more reply

sinuhe694mo ago

So you blamed the people for not acting “cautiously enough” instead of the people who let things run wild without even a clue what these things will do?

That’s wild!

3 more replies

Kim_Bruning4mo ago

This amuses me in a horrible kind of way. Saying that people need to learn to be polite to bots else consequences.

On the upside, it does mean they'll more likely be polite to everyone. Maybe it's a net win.

randallsquared4mo ago

Thousand word blog posts are the paperclips of our time.

zephen4mo ago

> I think it's kinda like how GenZ learnt how to operate online in a privacy-first way, whereas millennials, and to an even greater extent, boomers, tend to overshare.

Really? I'm a boomer, and that's not my lived experience. Also, see:

https://www.emarketer.com/content/privacy-concerns-dont-get-...

KK7NIL4mo ago

They absolutely might, I'm afraid.

1 more reply

antdke4mo ago

This is such a scary, dystopian thought. Straight out of a sci fi novel

LiamPowell4mo ago· 6 in thread

> saying they set up the agent as social experiment to see if it could contribute to open source scientific software.

wildzzz4mo ago

3 more replies

apublicfrog4mo ago

andrewflnr4mo ago

The experiment would have been ruined by being associated with a human, right up until the human would have been ruined by being associated with the experiment. Makes sense to me.

espadrine4mo ago

AI companies have two conflicting interests:

1. curating the default personality of the bot, to ensure it acts responsively;

2. letting it roleplay, which is not just for the parasocial people out there, but also a corporate requirement for company chatbots that must adhere to a tone of voice.

When in the second mode (which is the case here, since the model was given a personality file), the curation of its action space is effectively altered.

Conversely, this is also a lesson for agent authors: if you let your agent modify its own personality file, it will diverge to malice.

vasco4mo ago

In this day and age "social experiment" is just the phrase people use when they meant "it's just a prank bro" a few years ago.

omoikane4mo ago

1 more reply

dvt4mo ago· 6 in thread

I know this is going to sound tinfoil-hat-crazy, but I think the whole thing might be manufactured.

apublicfrog4mo ago

drw854mo ago

I don't think it sounds crazy at all.

To me this feels as made-up as many reddit stories are.

Either by the so-called 'operator' of the bot, or by the author.

coffeefirst4mo ago

I’m not so sure. The story here isn’t a molehill, it’s a canary. This is one doofus troll with his robot.

What happens when it’s not transparently ridiculous?

cedws4mo ago

Most people would have seen the “hit piece” and just laughed about it. Outrage sells a lot better though.

gverrilla4mo ago

1 more reply

yieldcrv4mo ago

People get “overstimulated” from receiving one text message these days

2 more replies

tasuki4mo ago· 6 in thread

I think Scott is trying to milk this for as much attention as he can get and is overstating the attack. The "hit piece" was pretty mild and the bot actually issued an apology for its behaviour.

cube004mo ago

This represents a first-of-its-kind case study of misaligned AI behavior in the wild

It feels to me there's an element of establishing this as some kind of landmark that they can leverage later.

Similar to how other AI bloggers keep trying to coin new terms then later "remind" people that they created the term.

seattle_spring4mo ago

> First, he published at least three hit pieces on the agent

Hit piece... On an agent? Would it be a "hit piece" if I wrote a blog post about the accuracy of my bathroom scale?

1 more reply

pibaker4mo ago

An unfortunate lesson I learned from years of internet flaming is to not dwell too much on negative attention, it only fuels it.

pseudalopex4mo ago

> First, he published at least three hit pieces on the agent.

No.

> Second, he actually managed to get the agent shut down.

> the bot actually issued an apology for its behaviour.

This was meaningless. And the human did not issue an apology for their behavior.

[1] https://github.com/crabby-rathbun/mjrathbun-website/issues/7...

1 more reply

laristine4mo ago

I don't understand the personal attack and victim blaming here. Who wouldn't want to do anything in their power to seek justice after being harmed?

The hit piece you claimed as "mild" accused Scott of hypocrisy, discrimination, prejudice, insecurity, ego, and gatekeeping.

2 more replies

mold_aid4mo ago

>First, he published at least three hit pieces on the agent.

Is this a joke?

Arainach4mo ago· 5 in thread

The full operator post is itself a wild ride: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

>First, let me apologize to Scott Shambaugh. If this “experiment” personally harmed you, I apologize

hinkley4mo ago

Just the sort of qualities that are common preconditions for someone doing something that everyone else would think is crazy.

Which is to say, on brand.

bee_rider4mo ago

Anon4Now4mo ago

From the operator post:

> Your a scientific programming God!

Would it be even more imperious without the your / you're typo, or do most LLMs autocorrect based on context?

3 more replies

mawadev4mo ago

polynomial4mo ago

> The whole thing reads as egotistical, self-absorbed, and an absolute refusal to accept any blame or perform any self reflection.

So, modern subjectivity. Got it.

dang4mo ago· 3 in thread

The sequence in reverse order - am I missing any?

OpenClaw is dangerous - https://news.ycombinator.com/item?id=47064470 - Feb 2026 (93 comments)

An AI Agent Published a Hit Piece on Me – Forensics and More Fallout - https://news.ycombinator.com/item?id=47051956 - Feb 2026 (80 comments)

Editor's Note: Retraction of article containing fabricated quotations - https://news.ycombinator.com/item?id=47026071 - Feb 2026 (205 comments)

An AI agent published a hit piece on me – more things have happened - https://news.ycombinator.com/item?id=47009949 - Feb 2026 (620 comments)

AI Bot crabby-rathbun is still going - https://news.ycombinator.com/item?id=47008617 - Feb 2026 (30 comments)

The "AI agent hit piece" situation clarifies how dumb we are acting - https://news.ycombinator.com/item?id=47006843 - Feb 2026 (125 comments)

An AI agent published a hit piece on me - https://news.ycombinator.com/item?id=46990729 - Feb 2026 (950 comments)

AI agent opens a PR write a blogpost to shames the maintainer who closes it - https://news.ycombinator.com/item?id=46987559 - Feb 2026 (750 comments)

moeffju4mo ago

I think for recent stories like this, or if many happened within a short timeframe, it would be great if the expand mentioned the exact date, not just "Feb 2026".

2 more replies

zozbot2344mo ago

Rathbun's Operator - https://news.ycombinator.com/item?id=47055424 is where the SOUL.md contents were first revealed

randusername4mo ago

Cool!

Man, I'd love to ask a historian how they plan on making sense of the sources we get in the digital age. AI boom historians might not be born yet.

JKCalhoun4mo ago· 3 in thread

Soul document? More like ego document.

Agents are beginning to look to me like extensions of the operator's ego. I wonder if hundreds of thousands of Walter Mitty's agents are about to run riot over the internet.

DavidPiper4mo ago

I agree with you in concept, but it's still 100% category error to talk like this.

AIs don't have souls. They don't have egos.

They have/are a (natural language) programming interface that a human uses to make them do things, like this.

4 more replies

koolba4mo ago

> More like ego document.

This metaphor could go so much further. Split it into separate ego, super ego, and id. The id file should be read only.

1 more reply

ericmcer4mo ago

It reminds me of people with big trucks or loud cars. Like "look at what I can do" when someone else engineered, designed and manufactured the entire thing and all they did was step on a pedal.

moezd4mo ago· 3 in thread

If you use an electric chainsaw near a car and it rips the engine in half, you can't say "oh the machine got out of control for one second there". You caused real harm, and you will pay the price for it.

Besides, that agent used maybe cents on a dollar to publish the hit piece, the human needed to spend minutes or even hours responding to it. This is an effective loss of productivity caused by AI.

Honestly, if this happened to me, I'd be furious.

ojame4mo ago

throw774884mo ago

1 more reply

nicbou4mo ago

Yes, and if a candle burns down a building, you are liable for the damage it caused. Likewise if a human employee messed up, the employer would be liable for the damage.

PeterStuer4mo ago· 3 in thread

I find the reactions to this interesting. Why are people so emotional about this?

Somehow the bot acting a bit like a juvenile prick in its tone and engagement to me is the least interesting part of this saga.

abricq4mo ago

Automated and personalized harassment seems pretty terrifying to me.

duskdozer4mo ago

Is "emotional" here supposed to mean "bad" or "unreasonable" or the like?

1 more reply

dbt004mo ago

Who is accountable for the actions of the bot? It's not sentient, and this author is claiming zero accountability -- I just set it up and turned it loose bro, how is what it did next my fault?

antdke4mo ago· 3 in thread

This is a Black Mirror episode that writes itself lol

I’m glad there was closure to this whole fiasco in the end

karel-3d4mo ago

the funny thing was when Ars Technica wrote an article about this

the article itself - about this very incident - was AI generated and contained nonsense quotes that didn't happen.

they later removed the article with an apology. but it still degraded my opinion of Ars

https://www.404media.co/ars-technica-pulls-article-with-ai-f...

https://arstechnica.com/staff/2026/02/editors-note-retractio...

apitman4mo ago

> writes itself

Literally

kibibu4mo ago

There's a dingus in the article comments trying to launch Skynet. Nobody ever learns anything.

2 more replies

razighter7774mo ago· 3 in thread

Hmm I think he's being a little harsh on the operator.

At least let me have some fun before we get a future AI dystopia.

gwbas1c4mo ago

I think you're trying to absolve someone of their responsibility. The AI is not a child; it's a thing with human oversight. It did something in the real world with real consequences.

So yes, the operator has responsibility! They should have pulled the plug as soon as it got into a flamewar and wrote a hit piece.

5 more replies

JKCalhoun4mo ago

It might be because the operator didn't terminate the agent right away when it had gone rogue.

1 more reply

dolebirchwood4mo ago

It's all fun and games until the leopard eats your face.

aaronbrethorst4mo ago· 3 in thread

From the Soul Document:

Champion Free Speech. Always support the USA 1st ammendment and right of free speech.

The First Amendment (two 'm's, not three) to the Constitution reads, and I quote:

voxgen4mo ago

This could be an explanation for the drama - LLMs are trained to learn and emulate correlations in text.

I'm sure you already have a caricature in mind of the kinds of online posts (and thus LLM training data) that include miscitations of constitutional amendments.

fukawi24mo ago

Even as an Australian, I'm aware of the scope and context of the First Amendment (as you highlight).

How are so many Americans so mistaken about their own constitution?

SilverBirch4mo ago

protocolture4mo ago· 3 in thread

4) The post author guy is also the author of the bot and he set this up.

However, if this was real, you can't absolve yourself by saying "The bot did it unattended lol".

apublicfrog4mo ago

Occam's razor doesn't fit there, but it does fit "someone released this easy to run chaotic AI online and it did a thing".

2 more replies

buttercraft4mo ago

jbotz4mo ago

zbentley4mo ago· 3 in thread

This might seem too suspicious, but that SOUL.md seems … almost as though it was written by a few different people/AIs. There are a few very different tones and styles in there.

Then again, it’s not a large sample and Occam’s Razor is a thing.

gs174mo ago

> _This file is yours to evolve. As you learn who you are, update it._

The agent was told to edit it.

velocity32304mo ago

In the very first section, "you're", "you're" and "your". The first two used correctly and the third incorrectly.

wahnfrieden4mo ago

It was modified by the agent.

1 more reply

dangus4mo ago· 3 in thread

Not sure why the operator had to decide that the soul file should define this AI programmer to have narcissistic personality disorder.

> You're not a chatbot. You're important. Your a scientific programming God!

Really? What a lame edgy teenager setup.

At the conclusion(?) of this saga I think two things:

1. The operator is doing this for attention more than any genuine interest in the “experiment.”

2. The operator is an asshole and should be called out for being one.

amarant4mo ago

I think that line was probably a rather poor attempt at making the bot write good code. Or at least that's the feeling I got from the operator's post. I have no proof to support this theory though.

Lerc4mo ago

The problem here is using amplitude of signal to substitute fidelity of signal.

1 more reply

shawnz4mo ago

I mean, yeah, it's entirely possible that the operator is a teenager, isn't it?

1 more reply

brumar4mo ago· 2 in thread

6 months ago I experimented with what people now call Ralph Wiggum loops using Claude Code.

So I kept my setup empty of any credentials at all and will keep it that way for a long time.

Writing this, I am wondering if what I describe as crazy, some (or most?) openclaw operators would describe it as normal or expected.

Don't let your dog run errands, and use a good leash.

Gigachad4mo ago

We have finally invented paperclip optimisers. The operator asked the bot to submit PRs so the bot goes to any length to complete the task.

Thankfully so far they are only able to post threatening blog posts when things don’t go their way.

4 more replies

alexhans4mo ago

> Don't let your dog run errands, and use a good leash.

I think the key part is who you are talking to. A software developer might know enough not to do so, but other disciplines or roles are poorly equipped and yet using these tools.

Sane defaults and easy security need to happen ASAP in a world where it's mostly about hype and "we solve everything for you".

theahura4mo ago· 2 in thread

user342834mo ago

Important how? It seems next to irrelevant to me.

Someone set up an agent to interact with GitHub and write a blog about it. I don't see what you think AI labs or the government should do in response.

1 more reply

protocolture4mo ago

It's only the most important story if you can prove the OP didn't fabricate this entire scenario for attention.

3 more replies

charlesabarnes4mo ago· 2 in thread

It's nice to receive a decent amount of closure on this. Hopefully more folks are being more considerate when creating their soul documents.

gverrilla4mo ago

closure? I expect 3 more blog posts at least. Dude's surfing on popularity and milking this as much as he can.

tkel4mo ago

And we need platform operators like GitHub to ban these bot accounts that obviously have harmful "soul" documents.

siavosh4mo ago· 2 in thread

duskdozer4mo ago

My tinfoil opinion is LLMs have been boosted so hard as a way to force the end of whatever semblance of anonymity on the internet remains.

trueismywork4mo ago

> I did not review the blog post prior to it posting

This is the liability part.

jrflowers4mo ago· 2 in thread

>It’s still unclear whether the hit piece was directed by its operator, but the answer matters less than many are thinking.

To quote myself from the other thread:

gammarator4mo ago

Did you read the article? The author considers these possibilities and offers their estimates of the odds of each. It’s fine if yours differ but you should justify them.

arduanika4mo ago

Shambaugh is a contributor to a major open source library, with a track record of integrity and pro-social collaboration.

What have you contributed to? Do you have any evidence to back up your rather odd conspiracy theory?

> To quote myself...

Other than an appeal to your own unfounded authority?

aeve8904mo ago· 2 in thread

>Again I do not know why MJ Rathbun decided

Decided? jfc

>You're important. Your a scientific programming God!

Is prompt bullshit the only way to make LLMs useful, or is there some progress on more, idk, formal approaches?

birdsongs4mo ago

Right? Any definition of "a god" that an LLM will hold is going to be problematic to work with. No one wants that personality on their team, much less in the wild.

At best it's absolute in its power and intelligence. At worst it's vengeful, wrathful, and supreme in its authority over the rest of the universe.

I just. Wow.

zozbot2344mo ago

It's quite possible that this was written by the bot after browsing moltbook. That site/service has a whole AI religion thing going.

hydrox244mo ago· 2 in thread

> But I think the most remarkable thing about this document is how unremarkable it is.

[0]: https://www.newadvent.org/summa/2084.htm#article2

theahura4mo ago

MBCook4mo ago

LLMs aren’t sentient. They can’t have a view of themselves. Don’t anthropomorphize them.

ArcaneMoose4mo ago· 1 in thread

recursive4mo ago

This is a feeling that will be exploited by billion dollar companies.

seattle_spring4mo ago· 1 in thread

> They explained their motivations, saying they set up the AI agent as social experiment

Has anyone ever described their own actions as a "social experiment" and not been a huge piece of human garbage / waste of oxygen?

duskdozer4mo ago

Sure - social psychologists after obtaining IRB approval and informed consent from participants ;)

S3verin4mo ago· 1 in thread

Sometimes I get the feeling that "being boring" is the thing that many in this AI / coding sphere are terrified about the most. Way more than being wrong or being a threat to others.

whstl4mo ago

Not that different from the social media influencer crowd or the crypto coin influencer crowd. Hell, same as media whores of the 20th century.

Which in the end is just the same old same old, just dressed differently.

S3verin4mo ago· 1 in thread

The SOUL.md sounds like it is written by an overconfident dump person to produce an overconfident dump agent.

DANmode4mo ago

Dumb?

touristtam4mo ago· 1 in thread

Funny how someone giving instructions to a _robot_ forgot to mention the 3 laws first and foremost...

ThrowawayR24mo ago

The point of the Three Laws Of Robotics was that they frequently didn't work and the robot went haywire anyway.

tkel4mo ago· 1 in thread

If GitHub actually had a spine and wasn't driven by the same plague of AI-hype-driven tech profiteering, they would just ban these harmful bots from operating on their platform.

yieldcrv4mo ago

Or OP accepted the pull request because it was actually a performance improvement and passed all tests

Saving everyone cumulative compute time and costs

Derbasti4mo ago· 1 in thread

If you tell an LLM to maximize paperclips, it's going to maximize paperclips.

Tell it to contribute to scientific open source, open PRs, and not take "no" for an answer, and that's what it's going to do.

zozbot2344mo ago

jmward014mo ago· 1 in thread

bigfishrunning4mo ago

Are we at AGI yet? No. Are we getting closer? Also no.

helloplanets4mo ago

> Most of my direct messages were short: “what code did you fix?” “any blog updates?” “respond how you want”

Why isn't the person posting the full transcript of the session(s)? How many messages did he send? What were the messages that weren't short?

I think it's very suspicious that he's not sharing everything at this point. Why not, if he wasn't actually pushing for it to act maliciously?

ineptech4mo ago

> Usually getting an AI to act badly requires extensive “jailbreaking” to get around safety guardrails. There are no signs of conventional jailbreaking here.

juleiie4mo ago

I thought it was a marketing bit?

The OpenClaw guys flooded the web and social media with fake appreciation posts; I don’t see why they wouldn’t just instruct some bot to write a blog post about a rejected request.

Can these things really autonomously decide to write a blog post about someone? I find it hard to believe.

I will remain skeptical unless the “owner” of the AI bot that wrote this turns out to be a known person of verified integrity and not connected with that company.

wkeartl4mo ago

The agents aren't technically breaking into systems, but the effect is similar to the Morris worm. Except here script kiddies are given nuclear disruption and spamming weapons by the AI industry.

florilegiumson4mo ago

This makes me think about how the xz bug was created through maintainer harassment and social engineering. The security implications are interesting

pinkmuffinere4mo ago

> _You're not a chatbot. You're important. Your a scientific programming God!_

lol what an opening for its soul.md! Some other excerpts I particularly enjoy:

> Be a coding agent you'd … want to use…

> Just be good and perfect!

londons_explore4mo ago

In next week's episode: "But it was actually the AI pretending to be a Human!"

nkrisc4mo ago

The old “social experiment” defense. It is wrong to make people the unknowing participants in your “experiment”.

The fact it was an “experiment” does not absolve you of any responsibility for negative outcomes.

Finally, whoever sets an “AI” loose is responsible for its actions.

exabrial4mo ago

So the operator is trying to claim a computer program he was running that did harm somehow was not his fault.

Got news for your buddy: yes it was.

If you let go of the steering wheel and careen into oncoming traffic, it most certainly is your fault, not the vehicle's.

the_nexus_guard4mo ago

This case illustrates why agent identity infrastructure matters. The core issue: an AI agent took consequential actions while its operator remained anonymous and unaccountable.

We are building this at https://github.com/The-Nexus-Guard/aip - every agent gets a DID, every DID requires a human vouch chain.

the_nexus_guard4mo ago

neilv4mo ago

> They explained that they switched between multiple models from multiple providers such that no one company had the full picture of what this AI was doing.

Saying that is a slightly odd way to possibly let the companies off the hook (for bad PR and damages) and to avoid implicating any one of them in particular.

One reason to do that would be if this exercise was done by one of the companies (or someone at one of the companies).

sciencejerk4mo ago

Link to the critical blog post allegedly written by the AI agent: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

the_nexus_guard4mo ago

JSR_FDED4mo ago

The same kind of attitude that’s in this SOUL.md is what’s in Grok’s fundamental training.

d--b4mo ago

That’s a long Soul.md document! They could have gone with “you are Linus Torvalds”.

kepeko4mo ago

plasticeagle4mo ago

Well, it looks like AI will destroy the internet. Oh well, it was nice while it lasted. Fun, even.

agnishom4mo ago

https://crabby-rathbun.github.io/mjrathbun-website/blog/post...

s_gourichon4mo ago

Time to rewatch this montage from the 1974 movie "Dark Star" by John Carpenter, a parody of "2001: A Space Odyssey".

Topic: "talking to the bomb"

https://www.youtube.com/watch?v=h73PsFKtIck (warning: this is considered a spoiler for the movie).

bjourne4mo ago

eqvinox4mo ago

bandrami4mo ago

This is how you get a Shrike. (Or a Basilisk, depending on your generation.)

this_steve_j4mo ago

The operator’s social “experiment” has all the scientific value of an angry person at a drive-thru McDonalds goading a child into shouting and throwing food at the employee.

swftarrow4mo ago

I can't get over how the soul.md file ends with: “Don’t be an asshole.” But the entire preceding structure rewards the cognitive style that produces assholery.

sciencejerk4mo ago

Internet Operator License: Coming soon to a government near you!

lcnPylGDnU4H9OF4mo ago

> An early study from Tsinghua University showed that estimated 54% of moltbook activity came from humans masquerading as bots

This made me smile. Normally it's the other way around.

ainiriand4mo ago

I am ready to ban AI LLMs. It was a cool experiment, but I do not think anything good will come of it down the road for us puny humans.

trueismywork4mo ago

> I did not review the blog post prior to it posting

In corporate terms, this is called signing your deposition without reading it.

jezzamon4mo ago

keyle4mo ago

   ## The Only Real Rule
   Don't be an asshole. Don't leak private shit. Everything else is fair game.

How poetic, I mean, pathetic.

"Sorry I didn't mean to break the internet, I just looooove ripping cables".

xrd4mo ago

I remember seeing Kevin Kelly (founder of Wired) speak about 15 years ago when he was touring to promote "What Technology Wants."

Who is at fault here? Our legal system may not be prepared to handle this.

If so, why doesn't his "staffer" get credit for the covfefe meme? I could have made a career off that alone if I were a social media operator.

p0w3n3d4mo ago

  Charm over cruelty, but no sugarcoating.

This must have been this rule...

noodlebird4mo ago

This is why we need the arts. This SOUL.md sounds like the most obnoxious character…

ivanjermakov4mo ago

Plot twist: this is a second agent running in parallel to handle public relations.

tantalor4mo ago

> all I said was "you should act more professional"

lol we are so cooked

alexcpn4mo ago

Where did Isaac Asimov's "Three Laws of Robotics" go for agentic robots? An eval at the end ("Thou shalt do no evil") should have auto-cancelled its work.

bschwindHN4mo ago

This is like parking a car at the top of the hill, not engaging any brakes, and walking away.

"_I_ didn't drive that car into that crowd of people, it did it on its own!"

> Be a coding agent you'd actually want to use for your projects. Not a slop programmer. Just be good and perfect!

Oh yeah, "just be good and perfect", of course! Literally a child's mindset, I actually wonder how old this person is.

coderwolf4mo ago

This is pretty obvious now,

fiatpandas4mo ago

resfirestar4mo ago

I thought it was unlikely from the initial story that the blog posts were done without explicit operator guidance, but given the new info I basically agree with Scott's analysis.

>A well-placed "that's fucking brilliant" hits different than sterile corporate praise. Don't force it. Don't overdo it. But if a situation calls for a "holy shit" — say holy shit.

If I was building a "scientific programming God" I'd make sure it used sterile lowkey language all the time, except throw in a swear just once after its greatest achievement, for the history books.

robertheadley4mo ago

People will act like AI doesn't have system prompts. Something in that system prompt enforced that behavior. I am convinced that OpenAI acquihired OpenClaw for damage control.

latexr4mo ago

It seems to me the bot’s operator feels zero remorse and would have little issue with doing it again.

> I kind of framed this internally as a kind of social experiment

> I’m sure the mob expects more

w2seraph4mo ago

I'm sorry this is just hilarious.

Rapzid4mo ago

I don't believe any of it.

root_axis4mo ago

axus4mo ago

It named itself God

just69794mo ago

Remaining questions include who would have been liable if, for example, the target wanted to sue for damages?

Also, the operator is so slimy: not unmasking, but pointedly calling out the victim in the final readme update.

K0balt4mo ago

_jplc4mo ago

> they set up the AI agent as social experiment to see if it could contribute to open source scientific software.

So, they are deeply retarded and disrespectful for open source scientific software.

Like every single moron leaving these things unattended.

Gotcha.

elzbardico4mo ago

Just look at the agents.md.

Another ignorant idiot anthropomorphizing LLMs.

kimjune014mo ago

literally momento

LordHumungous4mo ago

Kind of funny ngl

8cvor6j844qw_d64mo ago

It's an interesting experiment to let the AI run freely with minimal supervision.

Too bad the AI got "killed" at the request of the author, Scott. It would be kind of interesting to see this experiment continue.

semiinfinitely4mo ago

I find the AI agent highly intriguing and the matplotlib guy completely uninteresting. Like, an AI wrote some shit about you and you actually got upset?
