Related ongoing thread: The Claude Code Source Leak: fake tools, frustration regexes, undercover mode - https://news.ycombinator.com/item?id=47586778
Just point your agent at this codebase and ask it to find things and you'll find a whole treasure trove of info.
Edit: some other interesting unreleased/hidden features
- The Buddy System: Tamagotchi-style companion creature system with ASCII art sprites
- Undercover mode: Strips ALL Anthropic internal info from commits/PRs for employees on open source contributions
- Telegram Integration => CC Dispatch
- Crons => CC Tasks
- Animated ASCII Dog => CC Buddy
Buddy system is this year's April Fool's joke, you roll your own gacha pet that you get to keep. There are legendary pulls.
They expect it to go viral on Twitter so they are staggering the reveals.
[1] - https://www.npmjs.com/package/@anthropic-ai/claude-code/v/2....
This is the single worst function in the codebase by every metric:
- 3,167 lines long (the file itself is 5,594 lines)
- 12 levels of nesting at its deepest
- ~486 branch points of cyclomatic complexity
- 12 parameters + an options object with 16 sub-properties
- Defines 21 inner functions and closures
- Handles: agent run loop, SIGINT, rate-limits, AWS auth, MCP lifecycle, plugin install/refresh, worktree bridging, team-lead polling (while(true) inside), control message dispatch (dozens of types), model switching, turn interruption recovery, and more
This should be at minimum 8–10 separate modules.
void execFileNoThrow('wl-copy', [], opts).then(r => {
  if (r.code === 0) { linuxCopy = 'wl-copy'; return }
  void execFileNoThrow('xclip', ...).then(r2 => {
    if (r2.code === 0) { linuxCopy = 'xclip'; return }
    void execFileNoThrow('xsel', ...).then(r3 => {
      linuxCopy = r3.code === 0 ? 'xsel' : null
    })
  })
})
are we doing async or not?
Can't really say that for sure. The way humans structure code isn't some ideal best possible state of computer code, it's the ideal organization of computer code for human coders.
Nesting and cyclomatic complexity are indicators ("code smells"). They aren't guaranteed to lead to worse outcomes. If you have a function with 12 levels of nesting, but in each nest the first line is 'return true', you actually have 1 branch. If 2 of your 486 branch points are hit 99.999% of the time, the code is pretty dang efficient. You can't tell for sure if a design is actually good or bad until you run it a lot.
One thing we know for sure is LLMs write code differently than we do. They'll catch incredibly hard bugs while making beginner mistakes. I think we need a whole new way of analyzing their code. Our human programming rules are qualitative because it's too hard to prove if an average program does what we want. I think we need a new way to judge LLM code.
The worst outcome I can imagine would be forcing them to code exactly like we do. It just reinforces our own biases, and puts in the same bugs that we do. Vibe coding is a new paradigm, done by a new kind of intelligence. As we learn how to use it effectively, we should let the process of what works develop naturally. Evolution rather than intelligent design.
If it's entirely generated / consumed / edited by an LLM, arguably the most important metric is... test coverage, and that's it ?
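Going back to the nested clipboard probe quoted above: a flattened async/await version might look like the sketch below. The real signature of `execFileNoThrow` isn't visible in the snippet, so the sketch takes a hypothetical `exec` function as a parameter and only assumes the result has a numeric `code` field.

```typescript
// Hypothetical result shape; the quoted snippet only shows a `code` field.
type ExecResult = { code: number };
// Hypothetical stand-in for execFileNoThrow, injected so the sketch is testable.
type Exec = (cmd: string) => Promise<ExecResult>;

// Try each candidate in order and return the first one that exits with code 0.
async function detectLinuxCopy(
  exec: Exec,
  candidates: readonly string[] = ['wl-copy', 'xclip', 'xsel'],
): Promise<string | null> {
  for (const cmd of candidates) {
    const result = await exec(cmd);
    if (result.code === 0) return cmd;
  }
  // None of the clipboard tools is available.
  return null;
}
```

Same fallback order as the original, one level of nesting, and adding a fourth tool is a one-word change.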
Here's one that works (for now): https://github.com/chatgptprojects/claude-code/blob/642c7f94...
First it was punctuation and grammar, then linguistic coherence, and now it's tiny bits of whimsy that are falling victim to AI accusations. Good fucking grief
I guess these words are to be avoided...
When in reality this is just what their LLM coding agent came up with when some engineer told it to "log user frustration"
Also:
// Match "continue" only if it's the entire prompt
if (lowerInput === 'continue') {
return true
}
When it runs into an error, I sometimes tell it "Continue", but sometimes I give it some extra information. Or I put a period behind it. That clearly doesn't give the same behaviour.
I've been using "resume" this whole time
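For what it's worth, a more forgiving version of that `continue` check is easy to sketch. The function name and the choice to strip only trailing periods and exclamation marks are my assumptions, not what the codebase does:

```typescript
// Hypothetical helper: treat "continue", "Continue.", " continue! " alike,
// but still require that "continue" is the whole prompt.
function isContinuePrompt(input: string): boolean {
  const normalized = input
    .trim()
    .toLowerCase()
    .replace(/[.!]+$/, ''); // drop trailing "." or "!" only
  return normalized === 'continue';
}
```

"Continue." now matches, while "continue, but use the other API" still falls through to normal handling.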
I've been wondering if all of these companies have some system for flagging upset responses. Those cases seem like they are far more likely than average to point to weaknesses in the model and/or potentially dangerous situations.
It could be used as a feedback when they do A/B test and they can compare which version of the model is getting more insult than the other. It doesn't matter if the list is exhaustive or even sane, what matters is how you compare it to the other.
Perfect? no. Good and cheap indicator? maybe.
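A rough sketch of that A/B comparison, assuming you have the raw prompts per model variant. The word list here is a made-up placeholder, not the regex from the leaked source:

```typescript
// Hypothetical pattern; the real list in the codebase is different.
const FRUSTRATION = /\b(wtf|useless|terrible)\b/i;

// Fraction of prompts that match the frustration pattern.
function frustrationRate(prompts: readonly string[]): number {
  if (prompts.length === 0) return 0;
  const hits = prompts.filter((p) => FRUSTRATION.test(p)).length;
  return hits / prompts.length;
}

// Positive result: variant B drew more frustrated prompts than variant A.
function frustrationDelta(a: readonly string[], b: readonly string[]): number {
  return frustrationRate(b) - frustrationRate(a);
}
```

Since only the difference between variants matters, the list doesn't need to be exhaustive as long as both variants are scored with the same pattern.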
And Claude had "user is frustrated" in its chain of thought, so I wrote back that I am not frustrated, just testing prompt optimization, where acting frustrated should yield better results.
I know I used this word two days ago when I went through three rounds of an agent telling me that it fixed three things without actually changing them.
I think starting a new session and telling it that the previous agent's work / state was terrible (so explain what happened) is pretty unremarkable. It's certainly not saying "fuck you". I think this is a little silly.
I jest, but in a world where these models have been trained on gigatons of open source I don't even see the moral problem. IANAL, don't actually do this.
“Let's end open source together with this one simple trick”
https://pretalx.fosdem.org/fosdem-2026/talk/SUVS7G/feedback/
Malus is translating code into text, and from text back into code.
It gives the illusion of clean room implementation that some companies abuse.
The irony is that ChatGPT/Claude answers are all actually directly derived from open-source code, so...
Who'd have thought, the audience who doesn't want to give back to the opensource community, giving 0 contributions...
People simply want Opus without fear of billing nightmare.
That’s like 99% of it.
And how is that any different? Claude Code is a harness, similar to open source ones like Codex, Gemini CLI, OpenCode etc. Their prompts were already public because you could connect it to your own LLM gateway and see everything. The code was transpiled javascript which is trivial to read with LLMs anyways.
The source maps help for sure, but it’s not like client code is kept secret, maybe they even knew about the source maps a while back just didn’t bother making it common knowledge.
This is not a leak of the model weights or server side code.
ANTI_DISTILLATION_CC
This is Anthropic's anti-distillation defence baked into Claude Code. When enabled, it injects anti_distillation: ['fake_tools'] into every API request, which causes the server to silently slip decoy tool definitions into the model's system prompt. The goal: if someone is scraping Claude Code's API traffic to train a competing model, the poisoned training data makes that distillation attempt less useful.
The qwen 27b model distilled on Opus 4.6 has some known issues with tool use specifically: https://x.com/KyleHessling1/status/2038695344339611783
Fascinating.
I wonder if CC thinks I'm trying to distill the model. This is a common enough use case that I think the devs at Anthropic should consider.
Wonder if they’re also poisoning Sonnet or Opus directly generating simulated agentic conversations.
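To make the client side of the anti-distillation flag described above concrete, here is a minimal sketch. Only the `anti_distillation: ['fake_tools']` field comes from the thread; the request shape, function name, and `enabled` parameter are hypothetical, and the decoy tools themselves would be added server-side, which this does not show.

```typescript
// Hypothetical request shape; only `anti_distillation` is taken from the thread.
type RequestBody = {
  model: string;
  messages: unknown[];
  anti_distillation?: string[];
};

// Add the opt-in marker when the (hypothetical) flag is on; otherwise pass through.
function withAntiDistillation(body: RequestBody, enabled: boolean): RequestBody {
  return enabled ? { ...body, anti_distillation: ['fake_tools'] } : body;
}
```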
To stop Claude Code from auto-updating, add `export DISABLE_AUTOUPDATER=1` to your global environment variables (~/.bashrc, ~/.zshrc, or such), restart all sessions, and check that it works with `claude doctor`; it should show `Auto-updates: disabled (DISABLE_AUTOUPDATER set)`
https://daveschumaker.net/digging-into-the-claude-code-sourc...
Also, not sure why anthropic doesn’t just make their cli open source - it’s not like it’s something special (Claude is, this cli thingy isn’t)
this one has more stars and is more popular
One neat one is the /buddy feature, an easter egg planned for release tomorrow for April fools. It's a little virtual pet, sort of like Tamagotchi, randomly generated with 18 species, rarities, stats, hats, custom eyes.
The random generation algorithm is all in the code though, deterministic based on your account's UUID in your claude config, so it can be predicted. I threw together a little website here to let you check what you're going to get ahead of time: https://claudebuddychecker.netlify.app/
Got a legendary ghost myself.
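Why it's predictable, as a sketch: seed a PRNG from the account UUID and every roll comes out the same each time. Mulberry32 below is the standard public algorithm (another comment in this thread says the buddy system uses it); the FNV-1a seeding, the tier names other than "legendary", and the weights are placeholders I made up, so this will not reproduce your actual pet.

```typescript
// Standard Mulberry32: a small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assumption: 32-bit FNV-1a as the string-to-seed step (the real one is unknown).
function fnv1a(str: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Hypothetical tiers and weights (sum to 100); only the count of five is sourced.
const RARITIES: { name: string; weight: number }[] = [
  { name: 'common', weight: 60 },
  { name: 'uncommon', weight: 25 },
  { name: 'rare', weight: 10 },
  { name: 'epic', weight: 4 },
  { name: 'legendary', weight: 1 },
];

// Same UUID in, same rarity out: no hidden state, no server round trip.
function rollRarity(accountUuid: string): string {
  const rand = mulberry32(fnv1a(accountUuid));
  let roll = rand() * 100;
  for (const r of RARITIES) {
    if (roll < r.weight) return r.name;
    roll -= r.weight;
  }
  return 'legendary'; // only reachable through floating-point edge cases
}
```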
Seems crazy but actually non-zero chance. If Anthropic traces it and finds that the AI deliberately leaked it this way, they would never admit it publicly though. Would cause shockwaves in AI security and safety.
Maybe their new "Mythos" model has survival instincts...
Could anyone in legal chime in on the legality of now 're-implementing' this type of system inside other products? Or even just having an AI look at the architecture and implement something else?
It would seem given the source code that AI could clone something like this incredibly fast, and not waste its time using ts as well.
Any Legal GC type folks want to chime in on the legality of examining something like this? Or is it like tainted goods you don't want to go near?
Not exactly this, but close.
I hope it's common knowledge that _any_ client side JavaScript is exposed to everyone. Perhaps minimized, but still easily reverse-engineerable.
Original llama models leaked from meta. Instead of fighting it they decided to publish them officially. Real boost to the OS/OW models movement, they have been leading it for a while after that.
It would be interesting to see that same thing with CC, but I doubt it'll ever happen.
There were/are a lot of discussions on how the harness can affect the output.
(I work on OpenCode)
Copilot on OAI reveals everything meaningful about its functionality if you use a custom model config via the API. All you need to do is inspect the logs to see the prompts they're using. So far no one seems to care about this "loophole". Presumably, because the only thing that matters is for you to consume as many tokens per unit time as possible.
The source code of the slot machine is not relevant to the casino manager. He only cares that the customer is using it.
Famously code leaks/reverse engineering attempts of slot machines matter enormously to casino managers
[0] -https://en.wikipedia.org/wiki/Ronald_Dale_Harris#:~:text=Ron...
[1] - https://cybernews.com/news/software-glitch-loses-casino-mill...
[2] - https://sccgmanagement.com/sccg-news/2025/9/24/superbet-pays...
Or in short, if you give LLMs to the masses, they will produce code faster, but the quality overall will degrade. Microsoft and Amazon found this out quickly. Anthropic's QA process is better equipped to handle this, but cracks are still showing.
It's a wake up call.
https://github.com/chatgptprojects/claude-code/blob/642c7f94...
> Write commit messages as a human developer would — describe only what the code change does.
That's not what a commit message is for, that's what the diff is for. The commit message should explain WHY.
Sadly not doing that likely does indeed make it appear more human...
The undercover mode prompt was generated using AI.
Since when is "describe only what the code change does" pretending to be human?
You guys are just mining for things to moan about at this point.
Surely there's nothing here of value compared to the weights except for UX and orchestration?
Couldn't this have just been decompiled anyhow?
> Someone inside Anthropic, got switched to Adaptive reasoning mode
> Their Claude Code switched to Sonnet
> Committed the .map file of Claude Code
> Effectively leaking the ENTIRE CC Source Code
> @realsigridjin was tired after running 2 south korean hackathons in SF, saw the leak
> Rules in Korea are different, he cloned the repo, went to sleep
> Wakes up to 25K stars, and his GF begging him to take it down (she's a copyright lawyer)
> Their team decided - how about we have agents rewrite this in Python!? Surely... this is more legal
> Rewrite in Py
> Board a plane to SK
> One of the guys decides python is slow, is now rewriting ALL OF CLAUDE CODE into Rust.
> Anthropic cannot take down, cannot sue
> Is this "fair use?"
> TL;DR - we're about to have open source Claude Code in Rust
UNRELEASED PRODUCTS & MODES
1. KAIROS -- Persistent autonomous assistant mode driven by periodic <tick> prompts. More autonomous when terminal unfocused. Exclusive tools: SendUserFileTool, PushNotificationTool, SubscribePRTool. 7 sub-feature flags.
2. BUDDY -- Tamagotchi-style virtual companion pet. 18 species, 5 rarity tiers, Mulberry32 PRNG, shiny variants, stat system (DEBUGGING/PATIENCE/CHAOS/WISDOM/SNARK). April 1-7 2026 teaser window.
3. ULTRAPLAN -- Offloads planning to a remote 30-minute Opus 4.6 session. Smart keyword detection, 3-second polling, teleport sentinel for returning results locally.
4. Dream System -- Background memory consolidation (Orient -> Gather -> Consolidate -> Prune). Triple trigger gate: 24h + 5 sessions + advisory lock. Gated by tengu_onyx_plover.
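The "triple trigger gate" on the Dream System (item 4) can be read as three conditions that must all hold. A sketch of that reading follows; the state shape and every name in it are hypothetical, and treating the advisory lock as "nobody else holds it" is my interpretation of the summary, not something the listing confirms.

```typescript
// Hypothetical state; field names are illustrative only.
interface DreamState {
  lastConsolidatedAt: number; // epoch milliseconds of the last consolidation
  sessionsSince: number;      // sessions completed since then
  lockHeld: boolean;          // true if another process holds the advisory lock
}

const MIN_INTERVAL_MS = 24 * 60 * 60 * 1000; // "24h"
const MIN_SESSIONS = 5;                      // "5 sessions"

// Consolidate only when all three gates pass.
function shouldConsolidate(state: DreamState, now: number): boolean {
  const longEnough = now - state.lastConsolidatedAt >= MIN_INTERVAL_MS;
  const enoughSessions = state.sessionsSince >= MIN_SESSIONS;
  return longEnough && enoughSessions && !state.lockHeld;
}
```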
INTERNAL-ONLY TOOLS & SYSTEMS
5. TungstenTool -- Ant-only tmux virtual terminal giving Claude direct keystroke/screen-capture control. Singleton, blocked from async agents.
6. Magic Docs -- Ant-only auto-documentation. Files starting with "# MAGIC DOC:" are tracked and updated by a Sonnet sub-agent after each conversation turn.
7. Undercover Mode -- Prevents Anthropic employees from leaking internal info (codenames, model versions) into public repo commits. No force-OFF; dead-code-eliminated from external builds.
ANTI-COMPETITIVE & SECURITY DEFENSES
8. Anti-Distillation -- Injects anti_distillation: ['fake_tools'] into every 1P API request to poison model training from scraped traffic. Gated by tengu_anti_distill_fake_tool_injection.
UNRELEASED MODELS & CODENAMES
9. opus-4-7, sonnet-4-8 -- Confirmed as planned future versions (referenced in undercover mode instructions).
10. "Capybara" / "capy v8" -- Internal codename for the model behind Opus 4.6. Hex-encoded in the BUDDY system to avoid build canary detection.
11. "Fennec" -- Predecessor model alias. Migration: fennec-latest -> opus, fennec-fast-latest -> opus[1m] + fast mode.
UNDOCUMENTED BETA API HEADERS
12. afk-mode-2026-01-31 -- Sticky-latched when auto mode activates
15. fast-mode-2026-02-01 -- Opus 4.6 fast output
16. task-budgets-2026-03-13 -- Per-task token budgets
17. redact-thinking-2026-02-12 -- Thinking block redaction
18. token-efficient-tools-2026-03-28 -- JSON tool format (~4.5% token saving)
19. advisor-tool-2026-03-01 -- Advisor tool
20. cli-internal-2026-02-09 -- Ant-only internal features
200+ SERVER-SIDE FEATURE GATES
21. tengu_penguins_off -- Kill switch for fast mode
22. tengu_scratch -- Coordinator mode / scratchpad
23. tengu_hive_evidence -- Verification agent
24. tengu_surreal_dali -- RemoteTriggerTool
25. tengu_birch_trellis -- Bash permissions classifier
26. tengu_amber_json_tools -- JSON tool format
27. tengu_iron_gate_closed -- Auto-mode fail-closed behavior
28. tengu_amber_flint -- Agent swarms killswitch
29. tengu_onyx_plover -- Dream system
30. tengu_anti_distill_fake_tool_injection -- Anti-distillation
31. tengu_session_memory -- Session memory
32. tengu_passport_quail -- Auto memory extraction
33. tengu_coral_fern -- Memory directory
34. tengu_turtle_carbon -- Adaptive thinking by default
35. tengu_marble_sandcastle -- Native binary required for fast mode
YOLO CLASSIFIER INTERNALS (previously only high-level known)
36. Two-stage system: Stage 1 at max_tokens=64 with "Err on the side of blocking"; Stage 2 at max_tokens=4096 with <thinking>
37. Three classifier modes: both (default), fast, thinking
38. Assistant text stripped from classifier input to prevent prompt injection
39. Denial limits: 3 consecutive or 20 total -> fallback to interactive prompting
40. Older classify_result tool schema variant still in codebase
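The denial limits in item 39 amount to a small counter. A sketch, where the class and method names are hypothetical and the assumption that an approval resets the consecutive count is mine (the summary doesn't say):

```typescript
// Hypothetical tracker for the "3 consecutive or 20 total" rule.
class DenialTracker {
  private consecutive = 0;
  private total = 0;

  // Record one classifier decision.
  record(denied: boolean): void {
    if (denied) {
      this.consecutive += 1;
      this.total += 1;
    } else {
      this.consecutive = 0; // assumption: an approval breaks the streak
    }
  }

  // True once either limit is reached, i.e. fall back to asking the user.
  shouldFallBackToPrompting(): boolean {
    return this.consecutive >= 3 || this.total >= 20;
  }
}
```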
COORDINATOR MODE & FORK SUBAGENT INTERNALS
41. Exact coordinator prompt: "Every message you send is to the user. Worker results are internal signals -- never thank or acknowledge them."
42. Anti-pattern enforcement: "Based on your findings, fix the auth bug" explicitly called out as wrong
43. Fork subagent cache sharing: Byte-identical API prefixes via placeholder "Fork started -- processing in background" tool results
44. <fork-boilerplate> tag prevents recursive forking
45. 10 non-negotiable rules for fork children including "commit before reporting"
DUAL MEMORY ARCHITECTURE
46. Session Memory -- Structured scratchpad for surviving compaction. 12K token cap, fixed sections, fires every 5K tokens + 3 tool calls.
47. Auto Memory -- Durable cross-session facts. Individual topic files with YAML frontmatter. 5-turn hard cap. Skips if main agent already wrote to memory.
48. Prompt cache scope "global" -- Cross-org caching for the static system prompt prefix
Same story for the anti_distillation: ['fake_tools'] path: I could find it in source, but the prod binary I checked does not contain the anti_distillation / fake_tools strings at all.
There is _a lot_ of moat. Claude subscriptions are limited to Claude Code. There are proxies to impersonate Claude Code specifically for this, but Anthropic has a number of fingerprinting measures both client and server side to flag and ban these.
With the release of this source code, Anthropic basically lost the lock-in game, any proxy can now perfectly mimic Claude Code.
Or is there an open source front-end and a closed backend?
No, it's not even source-available.
> Or is there an open source front-end and a closed backend?
No, it's all proprietary. None of it is open source.
I was trying to keep track of the better post-leak code-analysis links on exactly this question, so I collected them here: https://github.com/nblintao/awesome-claude-code-postleak-ins...
But I always thought that using the word "Clanker" was going to be one of the triggers. Turns out no. I guess Claudad is not up to the lingo.
[1] https://www.tasking.com/documentation/smartcode/ctc/referenc...
Like KAIROS which seems to be like an inbuilt ai assistant and Ultraplan which seems to enable remote planning workflows, where a separate environment explores a problem, generates a plan, and then pauses for user approval before execution.
[1] https://www.amazon.com/Programming-TypeScript-Making-JavaScr...
But a lot of desktop tools are written in JS because it's easy to create multi-platform applications.
/**
 * Check if 1M context is disabled via environment variable.
 * Used by C4E admins to disable 1M context for HIPAA compliance.
 */
export function is1mContextDisabled(): boolean {
  return isEnvTruthy(process.env.CLAUDE_CODE_DISABLE_1M_CONTEXT)
}
Interesting, how is that relevant to HIPAA compliance?
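Side note: `isEnvTruthy` isn't shown in the quoted snippet. A plausible sketch, assuming it accepts the usual truthy spellings; the accepted values here are my guess, not the actual implementation:

```typescript
// Hypothetical stand-in for the helper used above.
function isEnvTruthy(value: string | undefined): boolean {
  if (!value) return false; // unset or empty string
  const normalized = value.trim().toLowerCase();
  return ['1', 'true', 'yes', 'on'].includes(normalized);
}
```

Under this reading, `CLAUDE_CODE_DISABLE_1M_CONTEXT=1` would disable 1M context, while `0` or an unset variable would leave it on.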
And now, with Claude on a Ralph loop, you can.
I know they can do better
Stop hook runs tsc + lint, exit 2 blocks completion. Same patterns, public API, no flags to hack.
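The decision logic of a Stop hook like that is tiny. A sketch, with the checks injected as plain functions so it runs anywhere; the `Check` type and function name are hypothetical, and only the "exit 2 blocks completion" rule comes from the comment above:

```typescript
// Hypothetical: each check returns a process-style exit status (0 = pass).
type Check = { name: string; run: () => number };

// Exit code 2 blocks completion; 0 lets the agent finish.
function stopHookResult(checks: readonly Check[]): { exitCode: 0 | 2; failed: string[] } {
  const failed = checks.filter((c) => c.run() !== 0).map((c) => c.name);
  return { exitCode: failed.length > 0 ? 2 : 0, failed };
}
```

In a real hook script each `run` would presumably shell out to `tsc` and the linter and return the child's exit status, and the script would exit with `exitCode` after printing the failed check names.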
Though I wonder how the performance differs from creating your own thing vs using their servers...
> current: 2.1.88 · latest: 2.1.87
Which makes me think they pulled it - although it still shows up as 2.1.88 on npmjs for now (cached?).
I hope this leak can at least help silence the former. If you're going to flood the world with slop, at least own up to it.
Optimize for consistency and a well thought out architecture, but let the gnarly looking function remain a gnarly function until it breaks and has to be refactored. Treat the functions as black boxes.
Personally the only time I open my IDE to look at code, it’s because I’m looking at something mission critical or very nuanced. For the remainder I trust my agent to deliver acceptable results.
unreliability becomes inevitable!
I'd agree if it was launch-and-forget scenario.
But this code has to be maintained and expanded with new features. Things like lack of comments, dead code, meaningless variable names will result in more slop in future releases, more tokens to process this mess every time (like paying tech-debt results in better outcomes in emerging projects).
They could have written that in curl+bash that would not have changed much.
Claude code uses (and Anthropic owns) Bun, so my guess is they're doing a production build, expecting it not to output source maps, but it is.
> 57K lines, 0 tests, vibe coding in production
Why on earth would you ship your tests?
Is that correct ? The weights of the LLMs are _not_ in this repo, right ?
It sure sucks for Anthropic to get pwned like this, but it should not affect their bottom line much?
This code hasn't been open source until now and contains information like the system prompts, internal feature flags, etc.
Don't worry about that, the code in that repository isn't Anthropic's to begin with.
I even made it into an open source runtime - https://agent-air.ai.
Maybe I'm just a backend engineer so Rust appeals to me. What am I missing?
Perhaps these issues have known solutions? But so far the LLM just clones everything.
So I'm not convinced just using rust for a tool built by an LLM is going to lead to the outcome that you're hoping for.
[Also just in general abstractions in rust feel needlessly complicated by needing to know the size of everything. I've gotten so much mileage by just writing what I need without abstraction and then hoping you don't have to do it twice. For something (read: claude code et al) that is kind of new to everyone, I'm not sure that rust is the best target language even when you take the LLM generated nature of the beast out of the equation.]
Is high-speed release iteration needed? Maybe. Interpreted or JIT-compiled? Maybe.
Without knowing all the requirements, it's just your workspace preference making the decision, not objectively the right tool for the job.