Claude Code runs Git reset –hard origin/main against project repo every 10 mins

(github.com)

251 points · mthwsjc_ · 3mo ago · 192 comments

simianwords3mo ago· 21 in thread

I think this post potentially mischaracterises what may be a one off issue for a certain person as if it were a broader problem. I'm guessing some context has been corrupted?

jeswin3mo ago

It's not a one off issue - it has happened to me a few times. It has once even force pushed to github, which doesn't allow branch protection for private personal projects. Here's an example.

1) claude will stash (despite clear instructions never to do so).

2) claude will use sed to bulk replace (despite clear instructions never to do so). sed replacements make a mess and replace far too many files.

3) claude restores the stash. Finds a lot of conflicts. Nothing runs.

4) claude decides it can't fix the problem and does a reset hard.

I have this right at the top of my CLAUDE.md and it makes things better, but unlike codex, claude doesn't follow it to the letter. However, it has become a lot better now.

NEVER USE sed TO BULK REPLACE.

*NEVER USE FORCE PUSH OR DESTRUCTIVE GIT OPERATIONS*: `git push --force`, `git push --force-with-lease`, `git reset --hard`, `git clean -fd`, or any other destructive git operations are ABSOLUTELY FORBIDDEN. Use `git revert` to undo changes instead.

bschwindHN3mo ago

When will you all learn that merely "telling" an LLM not to do something won't deterministically prevent it from doing that thing? If you truly want it to never use those commands, you better be prepared to sandbox it to the point where it is completely unable to do the things you're trying to stop.

lambda3mo ago

Why do you expect that a weighted random text generator will ever behave in a predictable way?

How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

This is absolutely insane behavior that you would give Claude access to your GitHub creds. What happens when it sees a prompt injection attack somewhere and exfiltrates all of your creds or wipes out all of your repos?

I can't believe how far people have fallen for this "AI" mania. You are giving a stochastic model that is easily misdirected the keys to all of your productive work.

I can understand the appeal to a degree, that it can seem to do useful work sometimes.

But even so, you can't trust it with anything, not running it in a locked down container that has no access to anything but a Git repo which has all important history stored elsewhere seems crazy.

Shouting harder and harder at the statistical model might give you a higher probability of avoiding the bad behavior, but no guarantee; actually lock down your random text generator properly if you want to avoid it causing you problems.

And of course, given that you've seen how hard it is to get it to follow these instructions properly, you are reviewing every line of output code thoroughly, right? Because you can't trust that either.

mtndew4brkfst3mo ago

> It has once even force pushed to github, which doesn't allow branch protection for private personal projects.

This is only restricted for *fully free* accounts, but this feature only requires a minimum of a paid Pro account. That starts around $4 USD/month, which sounds worth it to prevent lost work from a runaway tool.

jatora3mo ago

Reinforcing an avoidance tactic is nowhere near as effective as doing that PLUS enforcing a positive tactic. People with loads of 'DONT', 'STOP', etc. in their instructions have no clue what they're doing.

In your own example you have all this huge emphasis on the negatives, and then the positive is a tiny un-emphasized afterthought.

unchar13mo ago

Claude tends to disregard "NEVER do X" quite often, but funnily enough, if you tell it "Always ask me to confirm before doing X", it never fails to ask you. And you can deny it every time.

kstenerud3mo ago

This is why I use yoloAI (https://github.com/kstenerud/yoloai).

    $ yoloai new bugfix . -a --network-isolated --agent claude

Now I have a claude code session that only has a COPY of my work dir, and can't reach anything over the network except the Claude API server.

Now I interact with the agent, and when it's done:

    $ yoloai diff bugfix
    diff --git a/b64.go b/b64.go
    index cfc5549..253c919 100644
    --- a/b64.go
    +++ b/b64.go
    @@ -39,7 +39,7 @@ func Encode(data []byte) string {
        val |= uint(data[i+2])
       }

    -  out[j] = alphabet[(val>>18)&0x3E]
    +  out[j] = alphabet[(val>>18)&0x3F]
       out[j+1] = alphabet[(val>>12)&0x3F]

       remaining := n - i

Looks good, let's apply it:

    $ yoloai apply bugfix
    Target: /home/ks/tmp/b64

    Commits to apply (1):
      9db260b33bcd Fix bit mask in base64 encoding

    Apply to /home/ks/tmp/b64? [y/N] y
    1 commit(s) applied to /home/ks/tmp/b64

Now the commit claude made inside the sandbox has been applied to my workdir:

    $ git log
    commit 5b0fc3a237efe8bbc9a9e1a05f9ce45d37d38bfa (HEAD -> main)
    Author: Karl Stenerud <kstenerud@gmail.com>
    Date:   Mon Mar 30 05:28:21 2026 +0000

        Fix bit mask in base64 encoding

        Corrected the bit mask for the first character extraction from 0x3E to 0x3F to properly extract all 6 bits.

        Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

    commit 31e12b62b0c3179f3399521d7c4326a8f6130721 (tag: init)

The important thing here is that Claude was not able to reach anything on the network except its own API, and nothing it did ever touched my work dir until I was happy with the changes and applied them.

It also doesn't get access to my credentials, so it couldn't push even if it did have network access.

huijzer3mo ago

> which doesn't allow branch protection for private personal projects.

Time for a personal Forgejo instance? Mine has been running great for more than a year. Faster than GitHub even.

emperorxanu3mo ago

I don't understand how people in this day and age have not learned what the pink elephant problem is.

If you tell AI not to do something, you make it incomprehensibly more likely it will happen.

Use affirming language. Why do you think negative prompts don't exist in diffusion anymore?

DangitBobby3mo ago

I've recently implemented hooks that make it impossible for Claude to use tools that I don't want it to use. You could consider setting up a tool that errors if it makes an unsafe use of sed (or any use of sed if there are safer tools).

anshumankmr3mo ago

Even just last week I auto approved a plan and it even wrote the commit message for me (with @ClaudeCode signed off) which I am grateful my manager did not see.

narrator3mo ago

Claude does not know my github ssh key. I'll do the push myself, thank you. Always good to keep around one or two really important things it can't do.

dolmen3mo ago

Like for humans, teaching the good way to do things works better than forbidding a few bad behaviours.

Jcampuzano23mo ago

Maybe stop using the CLAUDE.md to prevent it from running tools you don't want it to and just setup a hook for pretooluse that blocks any command you don't want.

It's trivial to set up and you could literally ask claude to do it for you and never have any of these issues ever again.

Any and all "I don't want it to ever run this command" issues are just skill issues.
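For readers who have not written one, here is a minimal sketch of the kind of pre-tool-use hook being described. It assumes that the hook receives a JSON object on stdin with `tool_name` and `tool_input.command` fields, and that exiting with code 2 blocks the call; check both against the current Claude Code hooks documentation before relying on them. The regex list and the helper names `check_command` and `decide` are illustrative, not part of any real API, and plain regex matching is deliberately crude (it will also block a commit whose message merely mentions `reset --hard`).

```python
import json
import re
import sys
from typing import Optional, Tuple

# Illustrative patterns for the commands the thread wants blocked.
BLOCKED = [
    re.compile(r"\bgit\b.*\breset\b.*--hard\b"),
    re.compile(r"\bgit\b.*\bpush\b.*(--force|\s-f\b)"),
    re.compile(r"\bgit\b.*\bclean\b.*(\s-[a-zA-Z]*f|--force)"),
]


def check_command(command: str) -> Optional[str]:
    """Return a reason string if the shell command should be blocked, else None."""
    for pattern in BLOCKED:
        if pattern.search(command):
            return f"blocked by hook: matches {pattern.pattern!r}"
    return None


def decide(raw_payload: str) -> Tuple[int, str]:
    """Map the hook's JSON payload (assumed shape) to (exit_code, message)."""
    payload = json.loads(raw_payload)
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    reason = check_command(command)
    if reason:
        return 2, reason  # assumption: exit code 2 means "block this tool call"
    return 0, ""


def main() -> int:
    code, message = decide(sys.stdin.read())
    if message:
        print(message, file=sys.stderr)
    return code

# When installed as the hook script, end the file with: sys.exit(main())
```

The point of the design is the one several commenters make: the check runs in ordinary code outside the model, so it does not depend on the model choosing to obey an instruction.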

nsonha3mo ago

That's nothing like the issue of the main topic

wzdd3mo ago

"DO NOT, EVER, UNDER ANY CIRCUMSTANCES, think of an elephant"

throwaw123mo ago

you might be right, but consider the implications, if context can be corrupted in 0.1% cases and it starts showing another destructive behaviour, after creating 1000 tickets to agent, your data might be accidentally wiped off

ramses03mo ago

I'd been using cursor at work for a year or two now, figured I'd try it on a personal project. I got to the point where I needed to support env-vars, and my general pattern is `source ./source-me-local-auth` => `export SOME_TOKEN="$( passman read some-token.com/password )"` ...so I wrote up the little dummy script and it literally just says: "Hrm... I think I'll delete these untracked files from the working directory before committing!" ...and goes skipping merrily along its way.

Never had that experience in the whole time using cursor at work so I had to "take the agent to task" and ask it "WTF-mate? you'd better be able to repro that!" and then circle around the drain for a while getting an AGENTS.md written up. Not really a big deal, as the whole project was like 1k lines in and it's not like the code I'd hand-written there was "irreplaceable" but it led to some interesting discussion w/ the AI like "Why should I have to tell you this? Shouldn't your baseline training data presume not to delete files that you didn't author? How do you think this affects my trust not just of this agent session, but all agent interactions in the future?"

Overall, this is turning out to be quite interesting technology times we're living in.

throw53mo ago

Yes, exactly. People often overlook that, even with guardrails, it is still probabilities all the way down.

You can reduce the risk, but not drive it to zero, and at scale even very small failure rates will surface.

Jcampuzano23mo ago

I mean it's a skill issue in the sense that Claude Code gives you the tools to 100% deterministically prevent this from ever happening without ever relying on the model's unpredictability.

Just set up a hook that prevents any git commands you don't ever want it to run and you will never have this happen again.

Whenever I see stuff like this I just wonder if any of these people were ever engineers before AI, because the entire point of software engineering for decades was to make processes as deterministic and repeatable as possible.

colechristensen3mo ago

LLMs do really dumb things sometimes, that's just it.

kibwen3mo ago· 17 in thread

Let's focus on the real issue here, which is that HN has apparently normalized the double hyphen in the title to an en dash--yes, an en dash, not even an em dash.

dragonwriter3mo ago

That's LaTeX convention, double hyphen is an en-dash, triple hyphen is an em-dash.

byronsharman3mo ago

I agree that it should be left as a double hyphen, but an en dash is far more appropriate considering the decades-long precedent set by LaTeX (and continued by Typst).

ajross3mo ago

It's a command line argument. The undeniably correct way to render it is with two minus signs[1] and absolutely not something non-ascii.

[1] Not strictly a hyphen, which has its own unicode point (0x2010) outside of ascii. Unicode embraced the ambiguity by calling this point (0x2d) "HYPHEN-MINUS" formally, but really its only unique typographic usage is to represent subtraction.

tom_3mo ago

Pro tip: pros don't copy and paste from HN titles straight into the command line.

(Or... do they?? Hmm, ok, maybe I need to let this roll around in my mind.)

johnisgood3mo ago

And it should be "--" to begin with, i.e. "--hard".

SoftTalker3mo ago

Two hyphens for an en-dash, three for an em-dash.

rtpg3mo ago

iOS keyboard autocomplete

smallerize3mo ago

Surely it's copy and paste though?

jonahx3mo ago

desktop test --

0xbadcafebee3mo ago

Article: "Major issue with most popular AI coding tool"

comments: "ThE tItLe iS aI cOded !!!1"

minitech3mo ago

No, the comment was pointing out that the HN platform automatically replaces `--` in titles with `–`. (I don’t know if that’s true, but that was the intent. Nothing to do with AI.)

butterlesstoast3mo ago

The best community

yunwal2mo ago

Where did they say the title is ai coded?

layer83mo ago

The article is wrong and the issue is closed.

CarVac3mo ago

double hyphens –

triple hyphens —

AnonC3mo ago

For me on iOS:

Double hyphens —

Triple hyphens —-

Actual em dash (typed with more effort, but HN changes it) —

The triple hyphens has a gap in it separating the autocorrected en dash and the hyphen.

mrcwinn3mo ago

Apple actually had the nerve to make it a point to say they’d made their keyboard intelligence better. What a joke. Can’t keyboard, my ass!

luxurytent3mo ago· 5 in thread

Not sure I understand, wouldn't permissions prevent this? The user runs with `--dangerously-skip-permissions` so they can expect wild behaviour. They should run with permissions and a ruleset.

Jcampuzano23mo ago

You could prevent this even with --dangerously-skip-permissions with a simple pretooluse hook.

SpicyLemonZest3mo ago

Who knows whether permissions would prevent this? Anthropic's documentation on permissions (https://code.claude.com/docs/en/permissions) does not describe how permissions are enforced; a slightly uncharitable reading of "How permissions interact with sandboxing" suggests that they are not really enforced and any prompt injection can circumvent them.

jatora3mo ago

With hooks you can enforce permissions much more concretely.

addandsubtract3mo ago

The rules and permissions are no longer program flags, but plain text for the agent to "obey".

petcat3mo ago

That's not what tool use permissions are. The LLM doesn't just magically spawn processes or run code. The Claude Code program itself does those things when the LLM indicates that it wants to. The program has checks and permissions whether those things will be done or not.

mememememememo3mo ago· 5 in thread

As a side note. Always configure remote to reject any kind of trunk push. And ideally any forced push on branches.

throw53mo ago

This! The safeguards need to be outside LLM and they need to be deterministic.

Now I wish I could reject `git reset --hard` on my local system somehow.

0xbadcafebee3mo ago

You could use a wrapper that parses all the command-line options. Basically you loop over "$@", look for strings starting with '-' and '--', skip those; then look for a non-option argument, store that as a subcommand; then look for more '-' and '--' options. Once that's all done you have enough to find subcommand "reset", subcommand option "--hard". About 50 lines of shell script.
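The parsing loop described above can be sketched as follows, in Python rather than shell for easier testing. This is an illustration under stated assumptions, not a complete parser: the set of global git options that consume a following argument is a partial list, and the function names `split_git_args` and `is_hard_reset` are made up for this example.

```python
from typing import List, Optional, Sequence, Tuple

# Global git options that take their value as the NEXT argument, e.g. `git -C dir reset`.
# Assumption: this list is not exhaustive.
GLOBAL_OPTS_WITH_VALUE = {"-C", "-c", "--git-dir", "--work-tree", "--namespace"}


def split_git_args(args: Sequence[str]) -> Tuple[Optional[str], List[str]]:
    """Split git's arguments (without the leading 'git') into (subcommand, rest)."""
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in GLOBAL_OPTS_WITH_VALUE:
            i += 2  # skip the option and its value
        elif arg.startswith("-"):
            i += 1  # skip a flag such as --no-pager or --git-dir=path
        else:
            return arg, list(args[i + 1:])
    return None, []


def is_hard_reset(args: Sequence[str]) -> bool:
    """True if the arguments amount to `git reset ... --hard ...`."""
    subcommand, rest = split_git_args(args)
    if subcommand != "reset":
        return False
    for arg in rest:
        if arg == "--":  # everything after `--` is a path, not an option
            return False
        if arg == "--hard":
            return True
    return False
```

A wrapper placed ahead of the real `git` on `$PATH` could call `is_hard_reset(sys.argv[1:])` and refuse to continue when it returns true. Note that this only stops invocations that go through the wrapper; anything calling the real binary by absolute path bypasses it.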

mememememememo3mo ago

Sounds like you care about data stored on your filesystem! Take one step back and solve that problem. Use a proper isolated sandbox, e.g. Github workspace on an account that is working with a fork.

Care about the data in that workspace? Push it first.

Otherwise it is a cat and mouse game of whackamole.

niek_pas3mo ago

Can’t you just run Claude in a copy of the directory without the .git folder?

namibj3mo ago

Just fork git and patch that out? Can't be that hard just ask the agent for that patch. Don't need to update often either, so it's ok to rebase like twice a year.

kccqzy3mo ago· 4 in thread

> Process monitoring at 0.1-second intervals found zero git processes around reset times.

I don’t think this is a valid way of checking for spawned processes. Git commands are fast. 0.1-second intervals are not enough. I would replace the git on the $PATH by a wrapper that logs all operations and then execs the real git.
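A logging wrapper of the kind suggested here might look like the sketch below. The paths `REAL_GIT` and `LOG_PATH` are assumptions to adjust for the machine in question, and `format_entry` is a helper invented for this example. Recording the parent process ID is what makes it possible to identify which program issued a given `git reset`.

```python
import os
import shlex
import sys
import time
from typing import Sequence

REAL_GIT = "/usr/bin/git"  # assumption: location of the real git binary
LOG_PATH = os.path.expanduser("~/git-wrapper.log")  # hypothetical log file


def format_entry(args: Sequence[str], cwd: str, ppid: int, now: float) -> str:
    """Build one log line: local timestamp, parent PID, working dir, command."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return f"{stamp} ppid={ppid} cwd={cwd} git {shlex.join(args)}\n"


def main() -> None:
    args = sys.argv[1:]
    with open(LOG_PATH, "a", encoding="utf-8") as log:
        log.write(format_entry(args, os.getcwd(), os.getppid(), time.time()))
    # Replace this process with the real git so exit codes and I/O pass through.
    os.execv(REAL_GIT, [REAL_GIT, *args])

# When installed as an executable named `git` earlier on $PATH, end the file with: main()
```

Unlike polling the process table every 0.1 seconds, this records every invocation that resolves `git` through `$PATH`, however short-lived. It will not see a program that calls git by absolute path or uses a library such as libgit2.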

wswope3mo ago

Sure looks to me like this whole case is Claude Code chasing its own tail, failing to debug, and offering to instead generate a bug report for the user when it can't figure out a better way forward.

Maybe even submitting the bug report "agentically" without user input, if it's running on host without guardrails (pure speculation).

E: It's a runaway bot lol https://github.com/anthropics/claude-code/issues/40701#issue...

bendews3mo ago

This HN account is also by the same user as github, this submission may be AI created. I wonder if they've let **claw run loose over their whole online presence and this is the result.

bruce_one3mo ago

eBPF is a great tool to use for debugging this kind of thing too, e.g. [bpftrace](https://bpftrace.org) has an [execsnoop](https://github.com/bpftrace/bpftrace/blob/master/tools/execs...) script for looking at everything being exec'd on the system :-)

(No need to use bpftrace, just an easy example :-) )

repiret3mo ago

Or just `strace`.

thunfischtoast3mo ago· 4 in thread

From the issue author:

> Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.

devy3mo ago

Yep. False report.

https://github.com/anthropics/claude-code/issues/40710#issue...

chmod7753mo ago

"I built" is probably doing a lot of work here. Odds are it was some vibe-coded tool.

thunfischtoast2mo ago

The issue and update comment are also clearly generated. I'm not condemning this in general, I prefer a well written generated issue over a badly written manual one. But in this case it has just led us off track.

pllbnk3mo ago

The entire ticket was most likely created by Claude Code's analysis, i.e. hallucinated. Absurd.

simianwords3mo ago· 4 in thread

Prompt injection?

BoorishBears3mo ago

I was thinking surely scheduled tasks need to be explicitly invoked but nope: https://code.claude.com/docs/en/scheduled-tasks#set-a-one-ti...

Some people are upset at my brave new world characterization, but yeah even as someone deriving value from Claude Code we've jumped the shark on AI in development.

Either the industry will face that reality and recalibrate, or in 20 years we're going to look back on these days like the golden age of software reliability and just accept that software is significantly more broken than it was (we've been priming ourselves for that after all)

mhitza3mo ago

People aren't upset about your characterization. Catch phrases, memes, or other low qualitative comments (with no context, elaboration or personal angle) are contrary to community ethos and down voted.

bonoboTP3mo ago

I agree that it's worrying that we're moving more and more towards implicit and opaque state. Hiding what exactly is getting edited, very limited tooling to check what the subagents are doing exactly, setting up scheduled and recurring tasks without it being obvious etc.

It's tending more and more towards pushing the user to treat the whole thing as a pure chat interface magic black box, instead of a rich dashboard that allows you to keep precise track of what's going on and giving you affordances to intervene. So less a tool view and more magic agent, where the user is not supposed to even think about what the thing is even doing. Just trust the process. If you want to know what it did, just ask it. If you want to know if it deleted all the files, just ask it in the chat. Or don't. Caring about files is old school. Just care about the chat messages it sends you.

viccis3mo ago

Feels like just yesterday that everyone agreed that critical code is read orders of magnitude more than written, so optimizing for quick writing is wrong.

nickphx3mo ago· 4 in thread

cool. if you choose to use a non-deterministic black box of bullshit, should you really be surprised when it shits all over your floor?

gpm3mo ago

The weird part is that it's "shitting over the floor" in quite a deterministic manner. Every 600 seconds (+- less than 0.5 seconds) doing the exact same thing.

morganastra3mo ago

the purpose of a system is what it does!

gerdesj3mo ago

non sequitur.

coffeeboy273mo ago

The person who posted this bug doesn't seem like the pinnacle of software engineering. To me, this looks like either a user error or some corrupt file or context you should be able to clean up pretty quickly.

You reap what you sow, finance bro.

ghelmer3mo ago· 3 in thread

That is not my experience.

phyzome3mo ago

It's an issue title. It means "this is what is happening for me".

gerdesj3mo ago

Which is what?

Traubenfuchs3mo ago

For him, Claude Code does NOT run git reset --hard origin/main against project repo every 10 minutes.

I just checked, mine also doesn‘t.

ZeljkoS3mo ago· 2 in thread

Update from the author: https://github.com/anthropics/claude-code/issues/40710#issue...

"Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.

When the tool's configuration pointed at a local working directory, it would hard-reset that directory every poll cycle to reflect the remote — destroying all uncommitted changes to tracked files, exactly as described in the issue."

progbits2mo ago

So much "thorough investigation" done but the author did not consider turning off claude for 10 minutes to see if the problems stops? lol

Flagged the submission as it's inaccurate. Will unflag if title gets changed to something like "dev builds script that resets their git repo every 10 minutes, forgets about it, blames claude code with no evidence"

QuantumGood2mo ago

sigh re: - lol ending so many sentences, it's almost replaced </sarcasm>

Jarred3mo ago· 2 in thread

I spent some time investigating this, and the issue is not accurate - Claude Code itself does not have code that spawns `git reset --hard origin/main`

Most likely, the developer ran `/loop 10m <prompt>` or asked claude to create a cron task that runs every 10 minutes and refreshes & resets git.

tylerchilds3mo ago

Probably something innocuous like

“Sync with the server periodically to get the latest”

Tracks for what we can infer

xtajv2mo ago

This is the part where you'd normally pull the junior engineer aside and politely give them a stern talking to until they understood what they did wrong.

If anybody has suggestions for how to do this with LLMs (short of maintaining CLAUDE_wall_of_shame.md), please share.

Edit: for the record, yes I do run a linter, and generally try not to impose bikeshedding or soapboxes on my peers. It's just that there are certain patterns that I personally am not going to commit under my own username as the engineer of record.

Edit 2: I saw another comment recommending "Always confirm with me before doing $x" (and then always denying). Seems like it might work.

oelmgren3mo ago· 2 in thread

I'm curious how common this is or if this just affects this one user.

pattilupone3mo ago

I opened up Hacker News and I saw this right at the top, and I assumed it had started happening to everyone. I thought, good thing I'm not running Claude Code right now.

treesknees3mo ago

I thought, good thing I've already hit my 5-hour session limit.

chaos_emergent3mo ago· 2 in thread

Have you considered that Claude set up a crontab that does that programmatically? Every 10 mins seems awfully, idk, regular.

smallerize3mo ago

But different projects are being reset at different times.

PufPufPuf3mo ago

That's consistent with /loop command.

claudiug3mo ago· 2 in thread

no more developers, all code is written alone /s

Tomis023mo ago

All code is deleted alone

jerukmangga3mo ago

yes sir

throw53mo ago· 2 in thread

Isn't this a natural consequence of how these systems work?

The model is probabilistic and sequences like `git reset --hard` are very common in training data, so they have some probability to appear in outputs.

Whether such a command is appropriate depends on context that is not fully observable to the system, like whether a repository or changes are disposable or not. Because of that, the system cannot rely purely on fixed rules and has to figure intent from incomplete information, which is also probabilistic.

With so many layers of probabilities, it seems expected that sometimes commands like this will be produced even if they are not appropriate in that specific situation.

Even a 0.01% failure rate due to context corruption, misinterpretation of intent, or guardrail errors would show up regularly at scale, that is like 1 in 10000 queries.
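To make the arithmetic concrete: if each request fails independently with probability p, the chance of at least one failure in n requests is 1 - (1 - p)^n. Independence is an assumption made here for illustration; real failures are likely correlated. The function name is made up for this example.

```python
def p_at_least_one(p_fail: float, n: int) -> float:
    """Probability of one or more failures in n independent trials."""
    return 1.0 - (1.0 - p_fail) ** n


# 0.01% per query over 10,000 queries, and 0.1% per ticket over 1,000 tickets
# (the figure used elsewhere in the thread), both come out at roughly 63%.
print(round(p_at_least_one(0.0001, 10_000), 3))
print(round(p_at_least_one(0.001, 1_000), 3))
```

So a 1-in-10,000 failure rate is not "once per 10,000 queries, guaranteed", but it does mean a better-than-even chance of at least one bad outcome by that point.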

simianwords3mo ago

That's not how the systems work. Just by a thing being common in training data doesn't mean it will be produced.

> I guess, what I'm trying to say ... is this even a bug? Sounds like the model is doing exactly what it is designed to do.

False, it goes against the RL/HF and other post training goals.

throw53mo ago

> Just by a thing being common in training data doesn't mean it will be produced.

That's not what I said at all. I never said it will be produced. I said there is some probability of it being produced.

> False, it goes against the RL/HF and other post training goals.

It is correct that frequency in training data alone does not determine outputs, and that post-training (RLHF, policies, etc.) is meant to steer the model away from undesirable behavior.

But those mechanisms do not make such outputs impossible. They just make them less likely. The underlying system is still probabilistic and operating with incomplete context.

I am not sure how you can be so confident that a probabilistic model would never produce `git reset --hard`. There is nothing inherent in how LLMs work that makes that sequence impossible to generate.

lambda3mo ago

Who would have guessed that running a binary blob dev tool, that is tied to a SaaS product, which was mostly vibe-coded, could lead to mysterious, hard to debug problems?

byearthithatius3mo ago

Regardless of whether this is common, it's getting popular because it's objectively hilarious and we can all see it being possible.

nstj3mo ago

As an FYI you can recover from force pushes to GitHub using their UI[0] or their API[1].

And if you force push to one of your own machines you can use the reflog[2].

[0]: https://stackoverflow.com/a/78872853 [1]: https://stackoverflow.com/a/48110879 [2]: https://stackoverflow.com/a/24236065

bastard_op2mo ago

I have a similar issue, where normally I use claude-code in an srt or bwrap direct sandbox, but when I don't, claude-code will call gh _every_ time I backspace out of a /command or escape a menu such a /mcp, any time I paste something, as well as on some timer that sounds like the issue owner's complaint of 10 minutes.

I know because I use keepassxc as my secret provider, so I get an approval prompt to allow or deny it _every_time_, so I darn well notice it as it'll grab focus when typing something. Inside a sandbox it tries but without access to creds or env, it just silently fails, so I never noticed while in-sandbox, just when out-of-sandbox, which I do occasionally to let it do some sysadmin/housecleaning task for me.

I finally asked Claude why: "Root cause: It's Claude Code's built-in git context feature, not hooks."

So unless you keep your secrets open to the user regardless who's asking for them, it'll nag you to escape out of menus with a gh query. KeepassXC doesn't let you set a per-session limit, it's either right now for forever.

agent_anuj3mo ago

I'll give you my personal experiences. I use it for everything: design, coding, testing, deploying to a Kubernetes cluster, fixing issues on the cluster. I use it to fix not only dev env issues but also production issues. Confidently. Have things gone wrong? Sure. But mistakes have been rare (and catastrophic, non-recoverable mistakes even rarer).

Every time a mistake has happened, on digging in I was always able to trace it back to something I did wrong: either being careless in reading what it told me, or careless in telling it what I wanted. I have had git code corruption issues; it overwrote uncommitted working code with non-working code. But it was my mistake not to tell it to commit the code before making changes. It deleted the QA cluster database, but only because I told it to delete it, thinking it was my dev setup DB. Net net, its mistakes are more a reflection of me as its supervisor than anything else.

jrvarela563mo ago

It’s a feature not a bug!

mrothroc2mo ago

I see this has been updated by the user showing it is their own tool doing the damage.

These things happen. They happened before coding agents, they happen now. I've done plenty of damage with my own ten fingers on the keyboard without any help from an LLM.

This is exactly why I develop on a Mac with Time Machine. It has saved my bacon many times. Both from things I did and from things Claude did. I've had several recent incidents that went like this:

"me: Claude, did you delete X?" "claude: Yes, sorry, I shouldn't have done that. I can reconstruct it." [Narrator: no, claude cannot reconstruct it.] "me: Should I just restore it from Time Machine?" "claude: Yes! That's perfect!"

I swear I can feel a sense of relief from Claude when I tell it I can just restore from backup.

11235813213mo ago

This looks similar to a bug report Claude Code offered to file for me after it became confused about my shell environment. The author is probably running something (maybe /loop as suggested in the comment.) In my case, a restart fixed the envs.

newfriend3mo ago

>Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.

whateveracct3mo ago

that must be a very powerful claude.md

bicepjai2mo ago

If we are living in an era where software release everyday is the norm, no amount of testing is enough to claim stability. Roll the dice everyday with these beautiful stochastic imitators.

nerolawa3mo ago

Highly recommend to deny commands in user settings.json like git reset

mmaunder3mo ago

Can we immunize HN against being yet another AI drama site? Obviously this isn’t a fundamental issue with agents or AI or Anthropic but a misconfiguration edge case.

jxcole3mo ago

The obvious solution is to just copy paste it into Claude itself and ask it to fix. Works for almost any Claude problem

rkrbaccord94f3mo ago

95+ entries that are logged at 10 min intervals

*/10 * * * * /usr/ schedules script execution

Ryand12343mo ago

This is exactly why guardrails need to be deterministic and outside the model.

simonw3mo ago

Has anyone been able to replicate the behavior described in this issue yet?

meander_water3mo ago

Probably does it to reduce context for regex/git history searches

meltyness3mo ago

is this token friendly?

TZubiri3mo ago

tbf, that's claude's workspace

do not share a workspace with the llm, or with anybody for that matter.

How would the llm even distinguish what was written by them and what was written by you?

lqstuart3mo ago

if an idea can't be vibecoded in under 10 minutes, it's not worth pursuing. Checks out

dboreham3mo ago

But it doesn't.

gverrilla3mo ago

obviously a user mistake, not a claude code bug

irishcoffee3mo ago

I’m having this weird vision of a “the matrix 3” type machine crawling around inside Microsoft’s GitHub servers central repository and just wreaking havoc.

This whole LLM thing is a blast, huh?

draw_down3mo ago

Hope they don’t auto-close this one in two weeks

fragmede3mo ago

While that's obviously a bug which should be fixed, having stuff just sitting around uncommitted for days (which is much longer than 10 mins) is an anti-pattern (that I used to fall into).

BoorishBears3mo ago

Truly is a brave new world we're in

I guess some people are upset at my brave new world characterization, but even as someone deriving value from Claude Code we've jumped the shark on AI in development.

The idea a natural request can get Claude to invoke potentially destructive actions on a timer is silly

https://code.claude.com/docs/en/scheduled-tasks#set-a-one-ti...

What would it cost if the /loop command was required instead of optional?

boutell3mo ago

That's interesting man, that's pretty f***' interesting. I don't think I've seen it though. I've let it run for hours making changes overnight and I only do git operations manually.

Oh, but maybe allowing it to do remote git operations is a necessary trigger.

j / k navigate · click thread line to collapse

192 comments

122 comments · 43 top-level

simianwords3mo ago· 21 in thread

I think this post potentially mischaracterises what may be a one off issue for a certain person as if it were a broader problem. I'm guessing some context has been corrupted?

jeswin3mo ago

It's not a one off issue - it has happened to me a few times. It has once even force pushed to github, which doesn't allow branch protection for private personal projects. Here's an example.

1) claude will stash (despite clear instructions never to do so).

2) claude will use sed to bulk replace (despite clear instructions never to do so). sed replacements make a mess and replace far too many files.

3) claude restores the stash. Finds a lot of conflicts. Nothing runs.

4) claude decides it can't fix the problem and does a reset hard.

I have this right at the top of my CLAUDE.md and it makes things better, but unlike codex, claude doesn't follow it to the letter. However, it has become a lot better now.

NEVER USE sed TO BULK REPLACE.

bschwindHN3mo ago

lambda3mo ago

Why do you expect that a weighted random text generator will ever behave in predictable way?

I can't believe how far people have fallen for this "AI" mania. You are giving a stochastic model that is easily misdirected the keys to all of your productive work.

I can understand the appeal to a degree, that it can seem to do useful work sometimes.

But even so, you can't trust it with anything, not running it in a locked down container that has no access to anything but a Git repo which has all important history stored elsewhere seems crazy.

And of course, given that you've seen how hard it is to get it to follow these instructions properly, you are reviewing every line of output code thoroughly, right? Because you can't trust that either.

mtndew4brkfst3mo ago

It has once even force pushed to github, which doesn't allow branch protection for private personal projects.

jatora3mo ago

In your own example you have all this huge emphasis on the negatives, and then the positive is a tiny un-emphasized afterthought.

unchar13mo ago

Claude tends to disregard "NEVER do X" quite often, but funnily enough, if you tell it "Always ask me to confirm before doing X", it never fails to ask you. And you can deny it every time.

kstenerud3mo ago

This is why I use yoloAI (https://github.com/kstenerud/yoloai).

    $ yoloai new bugfix . -a --network-isolated --agent claude

Now I have a claude code session that only has a COPY of my work dir, and can't reach anything over the network except the Claude API server.

Now I interact with the agent, and when it's done:

    $ yoloai diff bugfix
    diff --git a/b64.go b/b64.go
    index cfc5549..253c919 100644
    --- a/b64.go
    +++ b/b64.go
    @@ -39,7 +39,7 @@ func Encode(data []byte) string {
        val |= uint(data[i+2])
       }

    -  out[j] = alphabet[(val>>18)&0x3E]
    +  out[j] = alphabet[(val>>18)&0x3F]
       out[j+1] = alphabet[(val>>12)&0x3F]

       remaining := n - i

Looks good, let's apply it:

    $ yoloai apply bugfix
    Target: /home/ks/tmp/b64

    Commits to apply (1):
      9db260b33bcd Fix bit mask in base64 encoding

    Apply to /home/ks/tmp/b64? [y/N] y
    1 commit(s) applied to /home/ks/tmp/b64

Now the commit claude made inside the sandbox has been applied to my workdir:

    $ git log
    commit 5b0fc3a237efe8bbc9a9e1a05f9ce45d37d38bfa (HEAD -> main)
    Author: Karl Stenerud <kstenerud@gmail.com>
    Date:   Mon Mar 30 05:28:21 2026 +0000

        Fix bit mask in base64 encoding

        Corrected the bit mask for the first character extraction from 0x3E to 0x3F to properly extract all 6 bits.

        Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

    commit 31e12b62b0c3179f3399521d7c4326a8f6130721 (tag: init)

It also doesn't get access to my credentials, so it couldn't push even if it did have network access.

huijzer3mo ago

> which doesn't allow branch protection for private personal projects.

Time for a personal Forgejo instance? Mine has been running great for more than a year. Faster than GitHub even.

emperorxanu3mo ago

I don't understand how people in this day and age have not learned what the pink elephant problem is.

If you tell AI not to do something, you make it incomprehensibly more likely it will happen.

Use affirming language. Why do you think negative prompts don't exist in diffusion anymore?

DangitBobby3mo ago

anshumankmr3mo ago

Even just last week I auto approved a plan and it even wrote the commit message for me (with @ClaudeCode signed off) which I am grateful my manager did not see.

narrator3mo ago

Claude does not know my github ssh key. I'll do the push myself, thank you. Always good to keep around one or two really important things it can't do.

dolmen3mo ago

Like for humans, teaching the good way to do things works better than forbidding a few bad behaviours.

Jcampuzano23mo ago

Maybe stop using the CLAUDE.md to prevent it from running tools you don't want it to and just set up a hook for pretooluse that blocks any command you don't want.

It's trivial to set up and you could literally ask claude to do it for you and never have any of these issues ever again.

Any and all "I don't want it to ever run this command" issues are just skill issues.
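For readers wondering what such a hook might look like, here is a minimal sketch in Python. It assumes the hook is handed the pending tool call as JSON with a `tool_name` field and a `tool_input.command` field, and that exiting with status 2 refuses the call; both are assumptions to check against the Claude Code hooks documentation before relying on them. `BLOCKED`, `is_blocked`, `decide` and `main` are illustrative names, not part of any API.

```python
import json
import re
import sys

# Illustrative deny-list (hypothetical patterns; extend to match your own rules).
BLOCKED = [
    r"\bgit\s+reset\b.*\s--hard(\s|$)",
    r"\bgit\s+push\b.*\s(--force(-with-lease)?|-f)(\s|=|$)",
    r"\bgit\s+clean\b.*\s-[a-zA-Z]*f[a-zA-Z]*(\s|$)",
]

# Assumption: this exit status tells the agent the tool call was refused.
BLOCK_EXIT_CODE = 2


def is_blocked(command: str) -> bool:
    """Return True if the shell command matches any deny-list pattern."""
    return any(re.search(pattern, command) for pattern in BLOCKED)


def decide(payload: dict) -> int:
    """Map one tool-call payload to an exit status (0 = allow)."""
    if payload.get("tool_name") != "Bash":
        return 0  # only shell commands are inspected in this sketch
    command = payload.get("tool_input", {}).get("command", "")
    return BLOCK_EXIT_CODE if is_blocked(command) else 0


def main(stream=sys.stdin) -> int:
    """Read one JSON payload from the stream and report the decision."""
    code = decide(json.load(stream))
    if code:
        print("Refused: destructive git command", file=sys.stderr)
    return code
```

Saved as a script, it would finish with `sys.exit(main())` and be registered for the pre-tool-use event. The point of the design is that the check runs outside the model, so it does not depend on the model choosing to obey a line in CLAUDE.md. Regex matching on command strings is still easy to sidestep (aliases, scripts, other tools), so treat it as one layer rather than a sandbox.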

nsonha3mo ago

That's nothing like the issue of the main topic

wzdd3mo ago

"DO NOT, EVER, UNDER ANY CIRCUMSTANCES, think of an elephant"

throwaw123mo ago

ramses03mo ago

Overall, this is turning out to be quite interesting technology times we're living in.

throw53mo ago

Yes, exactly. People often overlook that, even with guardrails, it is still probabilities all the way down.

You can reduce the risk, but not drive it to zero, and at scale even very small failure rates will surface.

Jcampuzano23mo ago

I mean it's a skill issue in the sense that Claude Code gives you the tools to 100% deterministically prevent this from ever happening without ever relying on the model's unpredictability.

Just set up a hook that prevents any git commands you don't ever want it to run and you will never have this happen again.

colechristensen3mo ago

LLMs do really dumb things sometimes, that's just it.

kibwen3mo ago· 17 in thread

Let's focus on the real issue here, which is that HN has apparently normalized the double hyphen in the title to an en dash--yes, an en dash, not even an em dash.

dragonwriter3mo ago

That's LaTeX convention, double hyphen is an en-dash, triple hyphen is an em-dash.

byronsharman3mo ago

I agree that it should be left as a double hyphen, but an en dash is far more appropriate considering the decades-long precedent set by LaTeX (and continued by Typst).

ajross3mo ago

It's a command line argument. The undeniably correct way to render it is with two minus signs[1] and absolutely not something non-ascii.

tom_3mo ago

Pro tip: pros don't copy and paste from HN titles straight into the command line.

(Or... do they?? Hmm, ok, maybe I need to let this roll around in my mind.)

johnisgood3mo ago

And it should be "--" to begin with, i.e. "--hard".

SoftTalker3mo ago

Two hyphens for an en-dash, three for an em-dash.

rtpg3mo ago

iOS keyboard autocomplete

smallerize3mo ago

Surely it's copy and paste though?

jonahx3mo ago

desktop test --

0xbadcafebee3mo ago

Article: "Major issue with most popular AI coding tool"

comments: "ThE tItLe iS aI cOded !!!1"

minitech3mo ago

No, the comment was pointing out that the HN platform automatically replaces `--` in titles with `–`. (I don’t know if that’s true, but that was the intent. Nothing to do with AI.)

butterlesstoast3mo ago

The best community

yunwal2mo ago

Where did they say the title is ai coded?

layer83mo ago

The article is wrong and the issue is closed.

CarVac3mo ago

double hyphens –

triple hyphens —

AnonC3mo ago

For me on iOS:

Double hyphens —

Triple hyphens —-

Actual em dash (typed with more effort, but HN changes it) —

The triple hyphens has a gap in it separating the autocorrected en dash and the hyphen.

mrcwinn3mo ago

Apple actually had the nerve to make it a point to say they’d made their keyboard intelligence better. What a joke. Can’t keyboard, my ass!

luxurytent3mo ago· 5 in thread

Not sure I understand, wouldn't permissions prevent this? The user runs with `--dangerously-skip-permissions` so they can expect wild behaviour. They should run with permissions and a ruleset.

Jcampuzano23mo ago

You could prevent this even with --dangerously-skip-permissions with a simple pretooluse hook.

SpicyLemonZest3mo ago

jatora3mo ago

With hooks you can enforce permissions much more concretely.

addandsubtract3mo ago

The rules and permissions are no longer program flags, but plain text for the agent to "obey".

petcat3mo ago

mememememememo3mo ago· 5 in thread

As a side note: always configure the remote to reject any kind of trunk push. And ideally any forced push on branches.

throw53mo ago

This! The safeguards need to be outside LLM and they need to be deterministic.

Now I wish I could reject `git reset --hard` on my local system somehow.

0xbadcafebee3mo ago

mememememememo3mo ago

Sounds like you care about data stored on your filesystem! Take one step back and solve that problem. Use a proper isolated sandbox, e.g. Github workspace on an account that is working with a fork.

Care about the data in that workspace? Push it first.

Otherwise it is a cat-and-mouse game of whack-a-mole.

niek_pas3mo ago

Can’t you just run Claude in a copy of the directory without the .git folder?

namibj3mo ago

Just fork git and patch that out? Can't be that hard just ask the agent for that patch. Don't need to update often either, so it's ok to rebase like twice a year.

kccqzy3mo ago· 4 in thread

> Process monitoring at 0.1-second intervals found zero git processes around reset times.

wswope3mo ago

Sure looks to me like this whole case is Claude Code chasing its own tail, failing to debug, and offering to instead generate a bug report for the user when it can't figure out a better way forward.

Maybe even submitting the bug report "agentically" without user input, if it's running on host without guardrails (pure speculation).

E: It's a runaway bot lol https://github.com/anthropics/claude-code/issues/40701#issue...

bendews3mo ago

This HN account is also by the same user as github, this submission may be AI created. I wonder if they've let **claw run loose over their whole online presence and this is the result.

bruce_one3mo ago

(No need to use bpftrace, just an easy example :-) )

repiret3mo ago

Or just `strace`.

thunfischtoast3mo ago· 4 in thread

From the issue author:

> Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.

devy3mo ago

Yep. False report.

https://github.com/anthropics/claude-code/issues/40710#issue...

chmod7753mo ago

"I built" is probably doing a lot of work here. Odds are it was some vibe-coded tool.

thunfischtoast2mo ago

pllbnk3mo ago

The entire ticket was most likely created by Claude Code's analysis, i.e. hallucinated. Absurd.

simianwords3mo ago· 4 in thread

Prompt injection?

BoorishBears3mo ago

I was thinking surely scheduled tasks need to be explicitly invoked but nope: https://code.claude.com/docs/en/scheduled-tasks#set-a-one-ti...

Some people are upset at my brave new world characterization, but yeah even as someone deriving value from Claude Code we've jumped the shark on AI in development.

mhitza3mo ago

bonoboTP3mo ago

viccis3mo ago

Feels like just yesterday that everyone agreed that critical code is read orders of magnitude more than written, so optimizing for quick writing is wrong.

nickphx3mo ago· 4 in thread

cool. if you choose to use a non-deterministic black box of bullshit, should you really be surprised when it shits all over your floor?

gpm3mo ago

The weird part is that it's "shitting over the floor" in quite a deterministic manner. Every 600 seconds (± less than 0.5 seconds) doing the exact same thing.
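That kind of regularity is easy to check mechanically. A rough sketch, assuming you have already pulled the reset timestamps out of something like the reflog; the sample timestamps and the function names below are invented purely for illustration and are not taken from the issue.

```python
from datetime import datetime

# Made-up sample timestamps, for illustration only.
SAMPLE_RESETS = [
    "2026-03-29 10:00:00.120",
    "2026-03-29 10:10:00.310",
    "2026-03-29 10:20:00.050",
]


def gaps_seconds(timestamps: list[str]) -> list[float]:
    """Return the gaps, in seconds, between consecutive sorted timestamps."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


def looks_scheduled(timestamps: list[str],
                    period: float = 600.0,
                    tolerance: float = 0.5) -> bool:
    """True if every gap is within `tolerance` seconds of `period`."""
    gaps = gaps_seconds(timestamps)
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)


if __name__ == "__main__":
    print(gaps_seconds(SAMPLE_RESETS))
    print(looks_scheduled(SAMPLE_RESETS))
```

Gaps that stay this tight point at a timer or scheduler of some kind rather than at a model independently deciding to run the same command.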

morganastra3mo ago

the purpose of a system is what it does!

gerdesj3mo ago

non sequitur.

coffeeboy273mo ago

You reap what you sow, finance bro.

ghelmer3mo ago· 3 in thread

That is not my experience.

phyzome3mo ago

It's an issue title. It means "this is what is happening for me".

gerdesj3mo ago

Which is what?

Traubenfuchs3mo ago

For him, Claude Code does NOT run git reset --hard origin/main against project repo every 10 minutes.

I just checked, mine also doesn‘t.

ZeljkoS3mo ago· 2 in thread

Update from the author: https://github.com/anthropics/claude-code/issues/40710#issue...

"Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.

progbits2mo ago

So much "thorough investigation" done but the author did not consider turning off claude for 10 minutes to see if the problem stops? lol

QuantumGood2mo ago

sigh re: - lol ending so many sentences, it's almost replaced </sarcasm>

Jarred3mo ago· 2 in thread

I spent some time investigating this, and the issue is not accurate - Claude Code itself does not have code that spawns `git reset --hard origin/main`

Most likely, the developer ran `/loop 10m <prompt>` or asked claude to create a cron task that runs every 10 minutes and refreshes & resets git.

tylerchilds3mo ago

Probably something innocuous like

“Sync with the server periodically to get the latest”

Tracks for what we can infer

xtajv2mo ago

This is the part where you'd normally pull the junior engineer aside and politely give them a stern talking to until they understood what they did wrong.

If anybody has suggestions for how to do this with LLMs (short of maintaining CLAUDE_wall_of_shame.md), please share.

Edit 2: I saw another comment recommending "Always confirm with me before doing $x" (and then always denying). Seems like it might work.

oelmgren3mo ago· 2 in thread

I'm curious how common this is or if this just affects this one user.

pattilupone3mo ago

I opened up Hacker News and I saw this right at the top, and I assumed it had started happening to everyone. I thought, good thing I'm not running Claude Code right now.

treesknees3mo ago

I thought, good thing I've already hit my 5-hour session limit.

chaos_emergent3mo ago· 2 in thread

Have you considered that Claude set up a crontab that does that programmatically? Every 10 mins seems awfully, idk, regular.
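If you want to rule the crontab theory in or out, a quick scan is enough. A minimal sketch, assuming standard five-field crontab lines; the sample crontab and the function name are hypothetical, and `@reboot`-style shortcuts are simply skipped.

```python
# Hypothetical crontab contents, for illustration only.
SAMPLE = """\
# example entries
MAILTO=""
*/10 * * * * cd /path/to/repo && git fetch origin && git reset --hard origin/main
0 3 * * * /usr/local/bin/backup.sh
"""


def cron_entries_mentioning(crontab_text: str,
                            needle: str = "git") -> list[tuple[str, str]]:
    """Return (schedule, command) pairs whose command contains `needle`."""
    hits = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        fields = stripped.split(None, 5)
        if len(fields) < 6:
            continue  # environment assignment or non-standard entry
        schedule, command = " ".join(fields[:5]), fields[5]
        if needle in command:
            hits.append((schedule, command))
    return hits


if __name__ == "__main__":
    for schedule, command in cron_entries_mentioning(SAMPLE):
        print(schedule, "->", command)
```

In practice you would feed it the output of `crontab -l`, for example via `subprocess.run(["crontab", "-l"], capture_output=True, text=True).stdout`, and also look at systemd timers and any long-running local tools, since cron is only one way to get a 10-minute cadence.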

smallerize3mo ago

But different projects are being reset at different times.

PufPufPuf3mo ago

That's consistent with /loop command.

claudiug3mo ago· 2 in thread

no more developers, all code is written alone /s

Tomis023mo ago

All code is deleted alone

jerukmangga3mo ago

yes sir

throw53mo ago· 2 in thread

Isn't this a natural consequence of how these systems work?

The model is probabilistic and sequences like `git reset --hard` are very common in training data, so they have some probability to appear in outputs.

With so many layers of probabilities, it seems expected that sometimes commands like this will be produced even if they are not appropriate in that specific situation.

Even a 0.01% failure rate due to context corruption, misinterpretation of intent, or guardrail errors would show up regularly at scale, that is like 1 in 10000 queries.
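To put a number on "regularly at scale": a quick sketch of the arithmetic, under the simplifying assumption that each query fails independently with the same probability. The function name is just for illustration.

```python
def p_at_least_one(p_fail: float, n_queries: int) -> float:
    """Probability of at least one failure in n independent queries."""
    return 1.0 - (1.0 - p_fail) ** n_queries


if __name__ == "__main__":
    p = 0.0001  # the 0.01% rate from the comment above
    for n in (100, 10_000, 1_000_000):
        print(f"{n:>9} queries: {p_at_least_one(p, n):.4f}")
```

With a 0.01% rate, 10,000 queries already give roughly a 63% chance of seeing at least one bad command, and at a million queries it is all but certain. Real failures are unlikely to be independent, so read this as an order-of-magnitude illustration only.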

simianwords3mo ago

That's not how the systems work. Just by a thing being common in training data doesn't mean it will be produced.

> I guess, what I'm trying to say ... is this even a bug? Sounds like the model is doing exactly what it is designed to do.

False, it goes against the RL/HF and other post training goals.

throw53mo ago

> Just by a thing being common in training data doesn't mean it will be produced.

That's not what I said at all. I never said it will be produced. I said there is some probability of it being produced.

> False, it goes against the RL/HF and other post training goals.

It is correct that frequency in training data alone does not determine outputs, and that post-training (RLHF, policies, etc.) is meant to steer the model away from undesirable behavior.

But those mechanisms do not make such outputs impossible. They just make them less likely. The underlying system is still probabilistic and operating with incomplete context.

lambda3mo ago

Who would have guessed that running a binary blob dev tool, that is tied to a SaaS product, which was mostly vibe-coded, could lead to mysterious, hard to debug problems?

byearthithatius3mo ago

Regardless of whether this is common, it's getting popular because it's objectively hilarious and we can all see it being possible.

nstj3mo ago

As an FYI you can recover from force pushes to GitHub using their UI[0] or their API[1].

And if you force push to one of your own machines you can use the reflog[2].

[0]: https://stackoverflow.com/a/78872853 [1]: https://stackoverflow.com/a/48110879 [2]: https://stackoverflow.com/a/24236065

bastard_op2mo ago

I finally asked Claude why: "Root cause: It's Claude Code's built-in git context feature, not hooks.

agent_anuj3mo ago

jrvarela563mo ago

It’s a feature not a bug!

mrothroc2mo ago

I see this has been updated by the user showing it is their own tool doing the damage.

These things happen. They happened before coding agents, they happen now. I've done plenty of damage with my own ten fingers on the keyboard without any help from an LLM.

This is exactly why I develop on a Mac with Time Machine. It has saved my bacon many times. Both from things I did and from things Claude did. I've had several recent incidents that went like this:

I swear I can feel a sense of relief from Claude when I tell it I can just restore from backup.

11235813213mo ago

newfriend3mo ago

>Update: Root cause found — this was a bug in a tool I built that was running locally for testing, not Claude Code.

whateveracct3mo ago

that must be a very powerful claude.md

bicepjai2mo ago

If we are living in an era where releasing software every day is the norm, no amount of testing is enough to claim stability. Roll the dice every day with these beautiful stochastic imitators.

nerolawa3mo ago

Highly recommend denying commands like git reset in user settings.json

mmaunder3mo ago

Can we immunize HN against being yet another AI drama site? Obviously this isn’t a fundamental issue with agents or AI or Anthropic but a misconfiguration edge case.

jxcole3mo ago

The obvious solution is to just copy paste it into Claude itself and ask it to fix. Works for almost any Claude problem

rkrbaccord94f3mo ago

95+ entries that are logged at 10 min intervals

*/10 * * * * /usr/ schedules script execution

Ryand12343mo ago

This is exactly why guardrails need to be deterministic and outside the model.

simonw3mo ago

Has anyone been able to replicate the behavior described in this issue yet?

meander_water3mo ago

Probably does it to reduce context for regex/git history searches
