Opus 4.6 can be quite sassy at times. The other day I asked it if it was "buttering me up" and it candidly responded "Hey you asked me to help you write a report with that conclusion, not appraise it."
I started using it last week and it’s been great. It uses git worktrees, and an experimental feature (spotlight) lets you quickly check changes from different agents.
I hope the Claude app will add similar features soon.
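For anyone who hasn't used worktrees this way, here is a rough sketch of the one-worktree-per-agent idea. It is not necessarily how this tool does it: the sibling `<repo>-wt/` directory layout, the `agent/<name>` branch naming, and both helper functions are made up for illustration. Only the `git worktree add -b` and `git diff --stat` invocations are standard git.

```python
from pathlib import Path


def worktree_add_command(repo: Path, agent: str) -> list[str]:
    """Build the git command that gives `agent` its own checkout and branch."""
    # Hypothetical layout: worktrees live next to the repo, e.g. ../myrepo-wt/frontend
    worktree_dir = repo.parent / f"{repo.name}-wt" / agent
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{agent}",  # create a new branch for this agent
        str(worktree_dir),
    ]


def diff_command(repo: Path, agent: str, base: str = "main") -> list[str]:
    """Build the command that summarises what an agent's branch changed since it left `base`."""
    return ["git", "-C", str(repo), "diff", "--stat", f"{base}...agent/{agent}"]


if __name__ == "__main__":
    repo = Path.cwd()
    # Only prints the commands; pass them to subprocess.run to actually execute them.
    for agent in ("frontend", "backend"):
        print(" ".join(worktree_add_command(repo, agent)))
        print(" ".join(diff_command(repo, agent)))
```

Because each agent works in its own directory on its own branch, checking what one of them did is just a diff against the base branch, with no stashing or branch switching in the main checkout.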
If I don't want to sit behind something like LiteLLM or OpenRouter, I can just use the Claude Agent SDK: https://platform.claude.com/docs/en/agent-sdk/overview
However, you're not really supposed to use it with your Claude Max subscription. Instead you're meant to use an API key and pay per token, which doesn't seem nearly as affordable as the Max plan. Nobody would probably mind if I ran it on homelab servers, but if I put it on work servers for a bit, I'd technically be in breach of the rules:
> Unless previously approved, Anthropic does not allow third party developers to offer claude.ai login or rate limits for their products, including agents built on the Claude Agent SDK. Please use the API key authentication methods described in this document instead.
If you look at how similar integrations already work, they also reference using the API directly: https://code.claude.com/docs/en/gitlab-ci-cd#how-it-works
A simpler version is already in Claude Code and they have their own cloud thing; I'd just personally prefer more freedom to build my own: https://www.youtube.com/watch?v=zrcCS9oHjtI (though there is the possibility of using the regular Claude Code non-interactively: https://code.claude.com/docs/en/headless).
That just feels a tad more hacky than copying an API key, as you would when using the API directly. There is stuff like https://github.com/anthropics/claude-code/issues/21765 and also "claude setup-token" (which you probably don't want to use all that much, given the lifetime?).
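To make the non-interactive route concrete, here is a minimal sketch of wrapping the headless mode from a script with API-key auth. The helper names and the example prompt are hypothetical. I'm assuming the `-p` and `--output-format` flags described in the headless docs linked above, and that the key is supplied through the `ANTHROPIC_API_KEY` environment variable.

```python
import os
import shutil
import subprocess


def build_headless_command(prompt: str, output_format: str = "json") -> list[str]:
    """Build a non-interactive Claude Code invocation (`claude -p ...`)."""
    return ["claude", "-p", prompt, "--output-format", output_format]


def run_headless(prompt: str) -> str:
    """Run Claude Code once and return its raw stdout."""
    # Pay-per-token auth: expect an API key in the environment
    # rather than relying on a claude.ai subscription login.
    if not os.environ.get("ANTHROPIC_API_KEY"):
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    if shutil.which("claude") is None:
        raise RuntimeError("the `claude` CLI is not on PATH")
    result = subprocess.run(
        build_headless_command(prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Only shows the command that would be run; call run_headless() to execute it.
    print(build_headless_command("Summarize the changes in the last commit"))
```

Failing early when the key is missing keeps a server deployment from quietly falling back to whatever login happens to be cached on the machine, which is the situation the quoted rule is about.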
https://docs.google.com/spreadsheets/u/0/d/e/2PACX-1vQDvsy5D...
Claude Plays Pokemon is currently stuck in Victory Road, doing the Sokoban puzzles, which are both the last puzzles in the game and by far the most difficult for AIs. Opus 4.5 made it there but was completely hopeless. 4.6 made it there and is showing some signs of maaaaaybe eventually being able to brute-force its way through the puzzles, but personally I think it will get stuck or undo its progress, and that Claude 4.7 or 5 will be the one to actually beat the game.