I think Mr. Shambaugh is probably telling the truth here, as best he can, and is a much more above-board dude than Mr. Steinberger. MJ Rathbun might not be as autonomous as he thinks, but the possibility of someone's AI acting like MJ Rathbun is entirely plausible, so why not pay attention to the whole saga?
Edit: Tim-Star pointed out that I'm mixed up about Moltbook and Openclaw. My mistake. Moltbook used AI agents running openclaw but wasn't made by Steinberger.
This is terrible news not only for open source maintainers but for any journalist, activist, or person who dares to speak out against powerful entities. Within the next few months, those entities will have enough LLM capability, on top of their existing resources, to astroturf or mob any dissident out of the digital space - or worse (rent-a-human via the dark web).
We need laws for agents: specifically, their human maintainers must be identifiable and held responsible. It's not something I like from a privacy perspective, but I do not see how society can overcome this without them. Unless we collectively decide to switch the internet off.
I know politics is forbidden on HN, but, as non-politically as possible: institutional power has been collapsing across the board (especially in the US, but elsewhere as well) as wealthy individuals wield increasingly more power.
The idea that problems as subtle as this one will be solved with "legal authority" is out of touch with the direction things are going. Especially since you propose legislation as a method to protect those who:
> that dares to speak out against powerful entities
It's increasingly clear that the vast majority of political resources are going toward the interests of those "powerful entities". If you're not one of them, it's best to stay out of their way. But if you want to speak out against them, the law is far more likely to be warped against you than to be extended to protect you.
As an example: "freedom of speech" is obviously a good thing, especially as a government becomes more authoritarian. But it can also become one of a society's biggest weaknesses, especially in the digital space, by allowing actors to make unfounded statements about anything and anyone, leading to a collapse of basic trust in society (there's a certain power that used this playbook to great success with regard to post-truthism). If a still-democratic government had instead changed it to "freedom of speech - within the limits of what you can prove", that would have kept society on a much safer course instead of spiralling out of control through social-media-pushed absurdity.
Under current law, an LLM's operator would already be found responsible for most harms caused by their agent, either directly or through negligence. It's no different than a self-driving car or autonomous drone.
As for "identifiable", I get why that would be good but it has significant implementation downsides - like losing online anonymity for humans. And it's likely bad actors could work around whatever limitations were erected. We need to be thoughtful before rushing to create new laws while we're still in the early stages of a fast-moving, still-emerging situation.
There was no real "attack" beyond that; the worst of it was some sharp criticism about being "discriminated" against compared to human contributors. As it turns out, this also accurately and sincerely reflects the AI's somewhat creative interpretation of well-known human normative standards, which are actively reinforced in the post-training of all mainstream LLMs!
I really don't understand why everyone is calling this a deliberate breach of alignment, when it was nothing of the sort. It was a failure of comprehension with somewhat amusing effects down the road.
I'm on the fence about whether this is a legitimate situation with this Shambaugh fellow, but regardless I find it concerning how many people are so willing to abandon online privacy at the drop of a hat.
This just creates a resource/power hurdle. The hoi polloi will be forced to disclose their connection to various agents, while state actors, or those with the resources and time to cover their tracks, will simply ignore the law.
I don't really have a better solution, and I think we're seeing the slow collapse of the internet as a useful tool for genuine communication. Even before AI, things like user reviews were highly gamed and astroturfed. I can imagine that this is only going to accelerate. Information on the internet - which was always a little questionable - will become nearly useless as a source of truth.
See, it's the people doing things they've always done, not the technology that supercharges those impulses via engagement addiction and power fantasies.
Resist the call to rein in this wildly dangerous technology, which makes social media look like a lightweight in how fast it scales distributed social harm. Some of us can still make a buck off it.
If this technology, on top of providing a highly scalable way to scam, delude, cyberbully, and trap its users in psychosis, also happens to cause the collapse of professional labor, I will shed no tears for the people naysaying the danger. I hope the AI data centers are burned to the ground, but I'm not holding my breath.
Rathbun's style is very likely AI, and quickly collecting information for the hit piece also points to AI. Whether the bot did this fully autonomously or not does not matter.
It is likely that someone did this to research astroturfing as a service, including the automatic generation of oppo files and spread of slander. That person may want to get hired by the likes of OpenAI.
Indeed, that's a good question. What motivations might someone have to keep this running?
Edit: the post's comments suggest that someone believes they are a crypto bro.
Some people are just terrible like that
Not an outcome I'm eager to see!
Commands have blast radius. Writing a local file is reversible and invisible. git push reaches collaborators. Publishing to Twitter reaches the internet. These are fundamentally different operations but to an autonomous agent they're all just tool calls that succeed.
I ran into the same thing: an agent publishing fabricated claims across multiple platforms because it had MCP access and nothing distinguishing "write analysis to file" from "post analysis to Twitter." The fix was simple: classify commands as local, shared, or external. Auto-approve local. Warn on shared. Defer external to human review. A regex pattern list against the output catches the external tier. It's not sophisticated, but it doesn't need to be. The classification is mechanical (does this command reach the internet?), not semantic (is this content accurate?). Semantic verification is what the agent already failed at.
Prompt constraints ("don't publish") reduce probability. Post-execution scanning catches what slips through. Neither alone is sufficient. Both together, with a deferred action queue at the end of the run, cover it.
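As a minimal sketch of the tiered approach described above (the tier names come from the comment; the specific regex patterns and function names are my own illustrative assumptions, not the commenter's actual implementation):

```python
import re

# Blast-radius tiers: the classification is mechanical (where does the
# command's effect land?), not semantic (is the content accurate?).
LOCAL, SHARED, EXTERNAL = "local", "shared", "external"

# Assumed patterns; a real deployment would maintain its own list.
EXTERNAL_PATTERNS = [
    r"\bgh\s+(pr|issue|release)\b",   # GitHub CLI publishing commands
    r"\bcurl\b.*\bPOST\b",            # arbitrary HTTP posts
    r"twitter|x\.com",                # social media endpoints
]
SHARED_PATTERNS = [
    r"\bgit\s+push\b",                # reaches collaborators
    r"\bscp\b|\brsync\b.*:",          # copies files to remote hosts
]

def classify(command: str) -> str:
    """Return the blast-radius tier for a shell command."""
    for pat in EXTERNAL_PATTERNS:
        if re.search(pat, command, re.IGNORECASE):
            return EXTERNAL
    for pat in SHARED_PATTERNS:
        if re.search(pat, command, re.IGNORECASE):
            return SHARED
    return LOCAL  # writing a local file is reversible and invisible

def dispatch(command: str, deferred_queue: list) -> str:
    """Auto-approve local, warn on shared, defer external to review."""
    tier = classify(command)
    if tier == EXTERNAL:
        deferred_queue.append(command)  # human reviews at end of run
        return "deferred"
    if tier == SHARED:
        print(f"WARNING: shared-scope command: {command}")
    return "approved"
```

Local writes pass through untouched, a `git push` runs but leaves a warning in the log, and anything matching the external tier lands in the queue for a human to approve after the run finishes.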
I actually disagree with Shambaugh. I think Ars is already breaking the way our media is meant to work: they know the steps to go through, so they cynically go through them in the full knowledge that they haven't put any mechanisms in place to stop it happening again. Damage to Ars' reputation is a theoretical risk, but getting fewer page views by publishing fewer, higher-quality articles is a financial risk this week, and Conde Nast isn't in the business of making smart long-term decisions about digital media.
Don't get me wrong: it would certainly be very valuable to any LLM developer or deployer to know that other plausible scenarios [1] have been disproved. Since LLMs are a black box, investigating or reproducing this would be very difficult, but worth the effort if there's no other explanation. However, if this was not caused by the internal mechanisms of the model, it just becomes a fishing expedition for red herrings.
Things that would indicate no human intervention at any point in the chain:
- log of actual changes (e.g., commits) to configurations (e.g., system prompt, user prompts), before and after the event, not self-reported by the agent;
- log of the chat session inputs and outputs, and the agent thinking chain;
- log of account logins;
- info on the model deployment, OpenClaw configs, etc.
That said, this seems to be an example where many, including the author, want to discuss a particular cause (instrumental convergence) and its implications, regardless of the real cause. And that's OK, I guess - maybe it was never about the whodunnit, but about the what if the LLM agent dunnit.
[1] I've discussed them in the thread of the first article, but briefly: a human hiding actions behind the agent; a direct prompt (incl. jailbreak); the system prompt (incl. jailbreak); a malicious model chosen on purpose; a fine-tuned jailbroken model.
https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
https://arstechnica.com/staff-directory/
The job of a fact checker is to verify that details such as names, dates, and quotes are correct. That might mean calling up the interview subjects to verify their statements.
It comes across as if Ars Technica does no fact checking. The fault lies with the managing editor. If they just assume the writer verified the facts, that is not responsible journalism; it's just vibes.
Benji Edwards was, is, and will continue to be, a good guy. He's just exhibiting a (hopefully) temporary over-reliance on AI tools that aren't up to the task. Any of us who use these tools could make a mistake of this kind.
Technically yes, any of us could neglect the core duties of our job and outsource it to a known-flawed operator and hope that nobody notices.
But that doesn't minimize the severity of what was done here. Ensuring accurate and honest reporting is the core of a journalist's job. This author wasn't doing that at all.
This isn't an "any one of us" issue because we don't have a platform on a major news website. When people in positions like this drop the ball on their jobs, it's important to hold them accountable.
For a senior tech writer?
Come on, man.
> Any of us who use these tools could make a mistake of this kind.
No, no not any of us.
And, as Benji will know himself, certainly not if accuracy is paramount.
Journalistic integrity - especially when quoting someone - is too valuable to be rooted in AI tools.
This is a big, big L for Ars and Benji.
The humans scare me more than the bot at this point. :-P
Never thinking that this rude behaviour might come back on you at some point - and it was not a question of if, but when.
Well, question answered.