One of the hard things about building a product on an LLM is that the model frequently changes underneath you. Since we introduced Claude Code almost a year ago, Claude has gotten more intelligent, it runs for longer periods of time, and it is able to more agentically use more tools. This is one of the magical things about building on models, and also one of the things that makes it very hard. There's always a feeling that the model is outpacing what any given product is able to offer (ie. product overhang). We try very hard to keep up, and to deliver a UX that lets people experience the model in a way that is raw and low level, and maximally useful at the same time.
In particular, as agent trajectories get longer, the average conversation has more and more tool calls. When we released Claude Code, Sonnet 3.5 was able to run unattended for less than 30 seconds at a time before going off the rails; now, Opus 4.6 1-shots much of my code, often running for minutes, hours, and days at a time.
The amount of output this generates can quickly become overwhelming in a terminal, and is something we hear often from users. Terminals give us relatively few pixels to play with; they have a single font size; colors are not uniformly supported; in some terminal emulators, rendering is extremely slow. We want to make sure every user has a good experience, no matter what terminal they are using. This is important to us, because we want Claude Code to work everywhere, on any terminal, any OS, any environment.
Users give the model a prompt, and don't want to drown in a sea of log output in order to pick out what matters: specific tool calls, file edits, and so on, depending on the use case. From a design POV, this is a balance: we want to show you the most relevant information, while giving you a way to see more details when useful (ie. progressive disclosure). Over time, as the model continues to get more capable -- so trajectories become more correct on average -- and as conversations become even longer, we need to manage the amount of information we present in the default view to keep it from feeling overwhelming.
When we started Claude Code, it was just a few of us using it. Now, a large number of engineers rely on Claude Code to get their work done every day. We can no longer design for ourselves, and we rely heavily on community feedback to co-design the right experience. We cannot build the right things without that feedback. Yoshi rightly called out that often this iteration happens in the open. In this case in particular, we approached it intentionally, and dogfooded it internally for over a month to get the UX just right before releasing it; this resulted in an experience that most users preferred.
But we missed the mark for a subset of our users. To improve it, I went back and forth in the issue to understand what issues people were hitting with the new design, and shipped multiple rounds of changes to arrive at a good UX. We've built in the open in this way before, eg. when we iterated on the spinner UX, the todos tool UX, and for many other areas. We always want to hear from users so that we can make the product better.
The specific remaining issue Yoshi called out is reasonable. PR incoming in the next release to improve subagent output (I should have responded to the issue earlier, that's my miss).
Yoshi and others -- please keep the feedback coming. We want to hear it, and we genuinely want to improve the product in a way that gives great defaults for the majority of users, while being extremely hackable and customizable for everyone else.
Sighted users lost convenience. I lost the ability to trust the tool. There is no "glancing" at terminal output with a screen reader. There is no "progressive disclosure." The text is either spoken to me or it doesn't exist.
When you collapse file paths into "Read 3 files," I have no way to know what the agent is doing with my codebase without switching to verbose mode, which then dumps subagent transcripts, thinking traces, and full file contents into my audio stream. A sighted user can visually skip past that. I listen to every line sequentially.
You've created a situation where my options are "no information" or "all information." The middle ground that existed before, inline file paths and search patterns, was the accessible one.
This is not a power user preference. This is a basic accessibility regression. The fix is what everyone in this thread has been asking for: a BASIC BLOODY config flag to show file paths and search patterns inline. Not verbose mode surgery. A boolean.
Please just add the option.
And yes, I rewrote this with Claude to tone my anger and frustration down about 15 clicks from how I actually feel.
> Yoshi and others -- please keep the feedback coming. We want to hear it, and we genuinely want to improve the product in a way that gives great defaults for the majority of users, while being extremely hackable and customizable for everyone else.
I think an issue with 2550 upvotes, more than 4 times the second-highest, is very clear feedback about your defaults and/or making it customizable.
> The amount of output this generates can quickly become overwhelming in a terminal
If I use Opus 4.6, arguably the most verbose, overthinking model you've released to date, OpenCode handles it just the same as it does Sonnet 4.0.
OpenCode even allows me to toggle into subagent and task agents with their own output terminals that, if I am curious what is going on, I can very clearly see it.
All Claude Code has done is turn the output into a black box, so that I am forced to wait for it to finish to look at the final git diff. By then it's spent $5-10 working on a task, and thrown away a lot of the context it took to get there. It showed "thinking" blocks that weren't particularly actionable, because it was mostly talking to itself about how it can't do something because it goes against a rule, but it really wants to.
I'm actually frustrated with Code blazing through to the end without me being able to see the transcript of the changes.
Funnily enough, both independently sided with the users, not the authors.
The core problem: --verbose was repurposed instead of adding a new toggle. Users who relied on verbose for debugging (thinking, hooks, subagent output) now have broken workflows - to fix a UX decision that shouldn't have shipped as default in the first place.
What should have been done:
/config
Show file paths: [on/off]
Verbose mode: [on/off] (unchanged)
A simple separate toggle would've solved everything without breaking anyone's workflow. Opus 4.6's parting thought: if you're building a developer tool powered by an AI that can reason about software design, maybe run your UX changes past it before shipping.
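The separate-toggle proposal above could look roughly like the following. This is only a sketch: `DisplaySettings`, `show_file_paths`, and `render_read_event` are hypothetical names made up for illustration, not real Claude Code settings or internals.

```python
from dataclasses import dataclass


@dataclass
class DisplaySettings:
    # Both field names are hypothetical, chosen to mirror the /config mock-up.
    show_file_paths: bool = False  # the new, independent toggle
    verbose: bool = False          # left unchanged: still means "everything"


def render_read_event(paths: list[str], settings: DisplaySettings) -> list[str]:
    """Return the lines to print for one 'read files' tool call."""
    noun = "file" if len(paths) == 1 else "files"
    lines = [f"Read {len(paths)} {noun}"]
    # File paths appear when either toggle is on.
    if settings.show_file_paths or settings.verbose:
        lines += [f"  {p}" for p in paths]
    # Only verbose adds the heavy output (placeholder line for illustration).
    if settings.verbose:
        lines.append("  (thinking, hooks, subagent output)")
    return lines
```

With `show_file_paths=True` and `verbose=False`, a user gets the middle ground (summary plus paths) without the full verbose dump, and existing verbose workflows are untouched.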
To be fair, your response explains the design philosophy well - longer trajectories, progressive disclosure, terminal constraints. All valid. But it still doesn't address the core point: why repurpose --verbose instead of adding a separate toggle? You can agree with the goal and still say the execution broke existing workflows.
But this one isn't? I'd call myself a professional. I use it with tons of files across a wide range of projects and types of work.
To me file paths were an important aspect of understanding context of the work and of the context CC was gaining.
Now? It feels like running on a foggy street, never sure when the corner will come and I'll hit a fence or house.
Why not introduce a toggle? I'd happily add that to my aliases.
Edit: I forgot. I don't need better subagent output. Or even less output when watching thinking traces. I am happy to have full verbosity. There are cases where it's an important aspect.
https://martin.ankerl.com/2007/09/01/comprehensive-linux-ter...
Could the React rendering stack be optimised instead?
I just find that very hard to believe. Does anyone actually do anything with the output now? Or are they just crossing their fingers and hoping for the best?
If you are serious about this, I think there are so many ways you could clean up, simplify, and calm the Claude Code terminal experience already.
I am not a CC user, but an enthusiastic CC user generously spent an hour or two last week or so showing me how it worked and walking through a non-publicly-implemented Gwern.net frontend feature (some CSS/JS styling of poetry for mobile devices).
It was highly educational and interesting, and Claude got most of the way to something usable.
Yet I was shocked and appalled by the CC UI/UX itself: it felt like the fetal alcohol syndrome lovechild of a Las Vegas slot machine and TikTok. I did not realize that all those jokes about how using CC was like 'crack' or 'ADHD' or 'gambling' were so on point, I thought they were more, well, metaphorical about the process as a whole. I have not used such a gross and distracting UI in... a long time. Everything was dancing and bouncing around and distracting me while telling me nothing. I wasted time staring at the update monitor trying to understand if "Prognosticating..." was different from "Fleeblegurbigating..." from "Reticulating splines...", while the asterisk bounces up and down, or the colored text fades in and out, all simultaneously, and most of the screen was wasted, and the whole thing took pains to put in as much fancy TUI nonsense as it could. An absolute waste, not whimsy, of pixels. (And I was a little concerned how much time we spent zoned out waiting on the whole shebang. I could feel the productivity leaving my body, minute by minute. How could I possibly focus on anything else while my little friendly bouncing asterisk might finish at any instant...?!) Some description of what files are being accessed seems like something you could spare the pixels for.
So I was impressed enough with the functionality to move it up my list, but also much of it made me think I should look into GPT Codex instead. It sounds like the interfaces there respect my time and attention more, rather than treating me like a Zoomer.
Please revert this
That's why I use your excellent VS Code extension. I have lots of screen space and it's trivial to scroll back there, if needed.
I would really like even more love given to this. When working with long-lived code bases it's important to understand what is happening. Lots of promising UX opportunities here. I see hints of this, but it seems like 80% is TBD.
Ideally you would open source the extension to really use the creativity of your developer user base. ;)
It might be worth considering a "verbose level" type setting with a selection of levels that describe the level of verbosity. Effectively, use a select menu instead of a boolean when one boolean state is actually multiple nested states.
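A minimal sketch of what a "verbose level" setting could mean in practice, assuming four made-up levels. The level names and the output kinds in `SHOWN_FROM` are illustrative assumptions, not anything Claude Code actually exposes.

```python
from enum import IntEnum


class Verbosity(IntEnum):
    # Level names are illustrative, not real Claude Code options.
    SUMMARY = 0      # e.g. "Read 3 files"
    PATHS = 1        # plus file paths and search patterns
    TOOL_OUTPUT = 2  # plus tool results
    FULL = 3         # plus thinking traces and subagent transcripts


# Minimum level at which each (hypothetical) kind of output is shown.
SHOWN_FROM = {
    "summary": Verbosity.SUMMARY,
    "file_path": Verbosity.PATHS,
    "search_pattern": Verbosity.PATHS,
    "tool_result": Verbosity.TOOL_OUTPUT,
    "thinking": Verbosity.FULL,
    "subagent_transcript": Verbosity.FULL,
}


def visible(kind: str, level: Verbosity) -> bool:
    """True if output of this kind should be shown at the chosen level."""
    return level >= SHOWN_FROM[kind]
```

Because the levels are ordered, each one is a strict superset of the one below it, which is the "multiple nested states" that a single boolean cannot express.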
Edit: I realised my use of "verbose" and "verbosity" here is itself ironically verbose, sorry!
As others have said - 'reading 10 files' is useless information - we want to be able to see at a glance where it is and what it's doing, so that we can re-direct if necessary.
With the release of Cowork, couldn't Claude Code double down on needs of engineers?
Ooo... ooo! I know what this is a reference to!
If that's the case, it's important to assess whether it'll be consistent when operating on a higher level, less dependent on the software layer that governs the agent. Otherwise it'll risk Claude also becoming more erratic.
Of course all the logs can’t be streamed to a terminal. Why would they need to be? Every logging system out there allows multiple stream handlers with different configurations.
Do whatever reasonable defaults you think make sense for the TUI (with some basic configuration). But then I should also be able to give Claude-code a file descriptor and a different set of config options, and you can stream all the logs there. Then I can vibe-code whatever view filter I want on top of that, or heck, have an SLM sub-agent filter it all for me.
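For what it's worth, the multiple-handler idea is a few lines with Python's standard `logging` module. This is a sketch of the general pattern only; `build_agent_logger`, the logger name, and the sample messages are made up, and nothing here reflects how Claude Code is actually implemented.

```python
import io
import logging


def build_agent_logger(tui_stream, full_stream) -> logging.Logger:
    """Route terse events to the TUI and every event to a second stream."""
    logger = logging.getLogger("agent_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()   # keep repeated calls from stacking handlers
    logger.propagate = False  # don't also emit through the root logger

    # Handler 1: the default view, INFO and above, message only.
    tui = logging.StreamHandler(tui_stream)
    tui.setLevel(logging.INFO)
    tui.setFormatter(logging.Formatter("%(message)s"))

    # Handler 2: the full stream, DEBUG and above, with the level name.
    full = logging.StreamHandler(full_stream)
    full.setLevel(logging.DEBUG)
    full.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    logger.addHandler(tui)
    logger.addHandler(full)
    return logger


# In-memory streams stand in for the terminal and the user-supplied descriptor.
tui_out, full_out = io.StringIO(), io.StringIO()
log = build_agent_logger(tui_out, full_out)
log.info("Read 3 files")
log.debug("read src/app.py")  # hypothetical path, only reaches the full stream
```

In real use, `full_stream` could be something like `os.fdopen(fd, "w")` for a descriptor the user passes in, and any filter or sub-agent can then consume that stream independently of the TUI.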
I could do this myself with some proxy / packet capture nonsense, but then you’d just move fast and break my things again.
I’m also constantly frustrated by the fancier models making wrong assumptions in brownfield projects and creating a big mess instead of asking me follow-up questions. Opus is like the world’s shittiest intern… I think a lot of that is upstream of you, but certainly not all of it. There could be a config option to vary the system prompt to encourage more elicitation.
I love the product you’ve built, so all due respect there, but I also know the stench of enshittification when I smell it. You’re programmers, you know how logging is supposed to work. You know MCP has provided a lot of these basic primitives and they’re deliberately absent from claude code. We’ve all seen a product get ratfucked internally by a product manager who copied the playbook of how Prabhakar Raghavan ruined google search.
The open source community is behind at the moment, but they’ll catch up fast. Open always beats closed in the long run. Just look at OpenAI’s fall into disgrace.
This is verifiable bullshit. Unless you explicitly explain how it "runs for days" since Opus's context window is incapable of handling even relatively large CLAUDE.md files.
> The amount of output this generates can quickly become overwhelming in a terminal, and is something we hear often from users. Terminals give us relatively few pixels to play with; they have a single font size; colors are not uniformly supported; in some terminal emulators, rendering is extremely slow.
No. It's your incapability as an engineer that limits this. And you and your engineers getting high on your own supply. Hence you need 16ms to draw a couple of characters on screen and call it a tiny game engine [1] For which your team was rightfully ridiculed.
> But we missed the mark for a subset of our users. To improve it,
AI-written corporate nothingspeak.
Maybe "AI IDEs" will gain ground in the future, e.g. vibe-kanban
I subscribe to max rn. Tons of money. Anthropic’s Super Bowl ads were shit, not letting us use open code was shit, and this is more shit. Might only be a single straw left before I go to codex (no one’s complaining about it. And the openclaw creator prefers it)
This dev is clearly writing his reply with Claude and sounding way too corpo. This feels like how school teachers would talk to you. Your response in its length was genuinely insulting. Everyone knows how to generate text with AI now and you’re doing a terrible job at it. You can even see the emdash attempt (markdown renders two normal dashes as an emdash).
This was his prompt “read this blog post, familiarize yourself with the mentioned GitHub issue and make a response on behalf of Anthropic.” He then added a little bit at the end when he realized the response didn’t answer the question and got so to fix the grammar and spelling on that.
Your response is appropriate for the masses. But we’re not. We’re the so called hackers and read right through the bs. It’s not even about the feature being gone anymore.
There is a principle we uphold as “hackers” that doesn’t align with this that pisses people off a lot more than you think. I can’t really put my finger on it maybe someone can help me out.
PS About the Super Bowl ads. Anyone that knows the story knows they’re exaggerated. (In the general public outside of Silicon Valley it’s like a 50/50 split or something about people liking or disliking AI as a whole rn. OpenAI is doing way more to help the case (not saying ads are a good thing). ) Open ai used to feel like the bad guy now it’s kinda shifting to anthropic. This, the ads and open code are all examples of it. (I especially recommend people watch the anthropic and open ai Super Bowl ads back to back)
And stop banning 3rd party harnesses please. Thanks
Anthropic, your actual moat is goodwill. Remember that.
Edit: I can't post anymore today apparently because of dang. If you post a comment about a bad terminal at least tell us about the rendering issues.
Can we please move the "Extended Thinking" icon back to the left side of claude desktop, near the research and web search icons? What used to be one click is now three.
use your own words!
i would rather read the prompt.
How can that be true, when you're deliberately and repeatedly telling devs (the community you claim to listen to) that you know better than they do? They're telling you exactly what they want, and you're telling them, "Nah." That isn't listening. You understand that, right?
Arrogant and clueless, not exactly who I want to give my money to when I know what enshittification is.
They have horrible instincts and are completely clueless. You need to move them away from a public-facing role. It honestly looks so bad, it looks so bad that it suggests nepotism and internal dysfunction to have such a poor response.
This is not the kind of mistake someone makes innocently, it's a window into a worldview that's made me switch to gemini and reactivate cursor as a backup because it's only going to get worse from here.
The problem is not the initial change (which you would rapidly realize was a big deal to a huge number of your users) but how high-handed and incompetent the initial response was. Nobody's saying they should be fired, but they've failed in public in a huge way and should step back for a long time.
Product manager here. Cynically, this is classic product management: simplify and remove useful information under the guise of 'improving the user experience' or perhaps minimalism if you're more overt about your influences.
It's something that as an industry we should be over by now.
It requires deep understanding of customer usage in order not to make this mistake. It is _really easy_ to think you are making improvements by hiding information if you do not understand why that information is perceived as valuable. Many people have been taught that streamlining and removal is positive. It's even easier if you have non-expert users getting attention. All of us here at HN will have seen UIs where this has occurred.
It should be a fad gone by at this point, but people never learn. Here's what to do instead: Find your most socially competent engineer, and have them talk to users a couple times a month. Just saved you thousands or millions in salaries, and you have a better chance of making things that your users actually want.
Make the application configurable. Developers like to tinker with their tools.
I agree it's a mistake, but I don't believe that it's viewed that way by anyone making the decision to do it.
I think we can be more charitable. Don't you see, even here on HN, people constantly asking for software that is less bloated, that does fewer things but does them better, that code is cost, and every piece of complexity is something that needs to be maintained?
As features keep getting added, it is necessary to revisit where the UX is "too much" and so things need to be hidden, e.g. menu commands need to be grouped in a submenu, what was toolbar functionality now belongs in a dialog, reporting needs to be limited to a verbose mode, etc.
Obviously product teams get it wrong sometimes, users complain, and if enough users complain, then it's brought back, or a toggle to enable it.
There's nothing to be cynical about, and it's not something we "should be over by now." It's just humans doing their best to strike the balance between a UX that provides enough information to be useful without so much information that it overwhelms and distracts. Obviously any single instance isn't usually enough to overwhelm and distract, but in aggregate they do, so PM's and designers try to be vigilant to simplify wherever possible. But they're only human, sometimes they'll get it wrong (like maybe here), and then they fix it.
Cynically, it's a vibe coded mess and the "programmers" at Anthropic can't figure out how to put it back.
More cynically, Anthropic management is trying to hide anything that people could map to token count (aka money) so that they can start jiggling the usage numbers to extract more money from us.
I was recently involved with a company that wanted us to develop a product that would be disruptive enough to enter an established market, make waves and shock it.
We did just that. We ran a deep survey of all competing products, bought a bunch of them, studied absolutely everything about them, how they were used and their users. Armed with that information, we produced a set of specifications and user experience requirements that far exceeded anything in the market.
We got green-lit to deliver a set of prototypes to present at a trade show. We did that.
The prototypes were presented and they truly blew everyone away. Blogs, vlogs, users, everyone absolutely loved what we created and the sense was that this was a winning product.
And then came reality. Neither the product manager nor the CTO (and we could add the CEO and CFO to the list) had enough understanding and experience in the domain to take the prototypes to market. It would easily have required a year or two of learning before they could function in that domain.
What did they do? They dumbed down the product specification to force it into what they understood and what engineering building blocks they already had. Square peg solidly and violently pounded into a round hole.
The outcome? Oh, they built a product alright. They sure did. And it flopped, horribly flopped, as soon as it was introduced and made available. Nobody wanted it. It was not competitive. It offered nothing disruptive. It was a bad clone of everything already occupying space in that ecosystem. Game over.
The point is: Technology companies are not immune to human failings, ego, protectionism/turf guarding, bad decisions, bad management, etc.
When someone says something like "I am not sure that's a good idea for a startup. There's competition," my first thought is: never assume that competitors know what they are doing, are capable, and always make the right decisions without making mistakes. You don't always need a better product, you need better execution.
Over the past ten years or so the increasing de-featuring of software under the guise of 'simplification' has become a critical issue for power users. For any GUI apps which have a mixed base of consumer and power users, I mostly don't update them anymore because they're as likely to get net worse vs better.
It's weird that companies like MSFT seem puzzled why so many users refuse to update Windows or Office to major new feature versions.
We have by now taught them about good information density.
Like, the permission pages, if you look at them just once, kinda look like bad 90s UIs. They throw a crapton of information at you.
But they contain a lot of smart things you only realize when actually using it from an admin perspective. Easy comparison of group permissions by keeping sorting orders and colors stable, so you can toggle between groups and just visually match what's different, because colors change. Highlights of edge cases here and there. SSO information around there as well. Loads of frontloaded necessary info with optional information behind various places.
You can move seriously fast in that interface once you understand it.
Parts of the company hate it for not being user friendly. I just got a mail that a customer admin was able to set up SSO in 15 minutes and debug 2 mapping issues in another 10, and now they are production ready.
Not at all cynically, this is classic product management - simplify by removing information that is useful to some users but not others.
We shouldn't be over it by now. It's good to think carefully about how you're using space in your UI and what you're presenting to the user.
You're saying it's bad because they removed useful information, but then why isn't Anthropic's suggestion of using verbose mode a good solution? Presumably the answer is because in addition to containing useful information, it also clutters the UI with a bunch of information the user doesn't want.
Same thing's true here - there are people who want to see the level of detail that the author wants and others for whom it's not useful and just takes up space.
> It requires deep understanding of customer usage in order not to make this mistake.
It requires deep understanding of customer usage to know whether it's a mistake at all, though. Anthropic has a lot deeper understanding of the usage of Claude Code than you or I or the author. I can't say for sure that they're using that information well, but since you're a PM I have to imagine that there's been some time when you made a decision that some subset of users didn't like but was right for the product, because you had a better understanding of the full scope of usage by your entire userbase than they did. Why not at least entertain the idea that the same thing is true here?
People who toggle debug will get "full" access and those who don't care probably won't notice if their LLM use is degraded.
It seems a pure market segmenting prior to a "shrinkflation" approach to cost management.
https://github.com/anthropics/claude-code/issues/15263
https://github.com/anthropics/claude-code/issues/9099
https://github.com/anthropics/claude-code/issues/8371
It's very clear that Anthropic doesn't really want to expose the secret sauce to end users. I have to patch Claude every release to bring this functionality back.
If Claude Code can replace an engineer, it should cost just a bit less than an engineer, not half as much.
Meanwhile, I am observing precisely how VS+Copilot works in my OAI logs with zero friction. Plug in your own API key and you can MITM everything via the provider's logging features.
To other actors who want to train a distilled version of Claude, more likely.
And then this. They want to own your dev workflow and for some reason believe Claude code is special enough to be closed source. The react TUI is kinda a nightmare to deal with I bet.
I will say, very happy with the improvements made to Codex 5.3. I’ve been spending A LOT more time with codex and the entire agent toolchain is OSS.
Not sure what anthropic’s plan is, but I haven’t been a fan of their moves in the past month and a half.
for example Amp "feels" much better. Also like in Amp how I can just send the message whenever and it doesn't get queued
* I know, lots of "feels" in there..
DEVELOPERS, DEVELOPERS, DEVELOPERS, DEVELOPERS
I write mainly out of the hope that some Anthropic employees read this: you need an internal crusade to fight these impulses. Take the high road in the short-term and you may avoid being disrupted in the long-term. It's a culture issue.
Probably your strongest tool is specifically educating people about the history. Microsoft in the late 90s and early 00s was completely dominant, but from today's perspective it's very clear: they made some fundamental choices that didn't age well. As a result, DX on Windows is still not great, even if Visual Studio has the best features, and people with taste by and large prefer Linux.
Apple made an extremely strategic choice: rebuild the OS around BSD, which set them up to align with Linux (the language of servers). The question is: why? Go find out.
The difference is a matter of sensibility, and a matter of allowing that sensibility to exist and flourish in the business.
Anthropic is the market leader for advanced AI coding with no serious competitor currently very close and they are preparing to IPO this year. This year is a transition year. The period where every decision would default toward delighting users and increasing perceived value is ending. By next year they'll be fully on the quarterly Wall Street grind of min/maxing every decision to extract the highest possible profit from customers at the lowest possible cost.
This path is inevitable and unavoidable, even with the most well-intentioned management and employees.
I understand the article writer's frustration. He liked a thing about a product he uses and they changed the product. He is feeling angry and he is expressing that anger and others are sharing in that.
And I'm part of another group of people. I would notice the files being searched without too much interest. Since I pay a monthly rate, I don't care about optimizing tokens. I only care about the quality of the final output.
I think the larger issue is that programmers are feeling like we are losing control. At first we're like, I'll let it auto-complete but no more. Then it was, I'll let it scaffold a project but no more. Each step we are ceding ground. It is strange to watch someone finally break on "They removed the names of the files the agent was operating on". Of all of the lost points of control this one seems so trivial. But every camel's back has a breaking point and we can't judge the straw that does it.
I'm guessing you're not aware of how their newest game, Starfield, was received. In the long term, that direction did not work out for them at all.
Those specific logs are essentially a prop anyways. Removing them makes it harder to LARP as an active participant; it forces the realization that "we" are now just passive observers.
If it sounds strange your theory might be wrong.
Telepsychology is one of the lowest forms of response.
Maybe Claude Code web or desktop could be targeted to these new vibe coders instead? These folks often don't know how simple bash commands work so the terminal is the wrong UX anyway. Bash as a tool is just very powerful for any agentic experience.
On the other end are the hardcore users orchestrating a bunch of agents, not sitting there watching one run, so they don’t care about these logs at all.
In the middle are the engineers sitting there watching the agent go.
Or, it could serve as a textbook example how to make your real future long term customers (=fluent coders) angry… what a strategy :)
Meanwhile all evidence is that the true value of these tools is in their ability to augment & super-charge competent software engineers, not replace them.
Meanwhile the quality of Claude Code the tool itself is a bit of a damning indictment of their philosophy.
Give me a team of experienced, sharp, diligent engineers with these coding tools and we can make absolutely amazing things. But a newbie product manager with no software engineering fundamentals issuing prompts will make a mess.
I can see it even in my own work -- when I venture into doing frontend eng using these tools the results look good but often have reliability issues. Because my background/specialization is in systems, embedded & backend work -- I'm not good at reviewing the React etc code it makes.
Programmers are just jealous that they are no longer the only ones that get to play pretend.
I don't know anything about you personally, but most "software engineers" are anything but.
It still does what I need so I'm okay with it, but I'm also on the $20 plan so it's not that big of a worry for me.
I did sense that the big wave of companies is hitting Anthropic's wallet. If you hadn't realized, a LOT of companies switched to Claude. No idea why, and this is coming from someone who loves Claude Code.
Anyway, getting some transparency on this would be nice.
It is entirely due to Opus 4.5 being an inflection point codingwise over previous LLMs. Most of the buzz there has been organic word of mouth due to how strong it is.
Opus 4.5 is expensive to put it mildly, which makes Claude Code more compelling. But even now, token providers like Openrouter have Opus 4.5 as one of its most popular models despite the price.
Use the pi coding agent. Bare-bones context, easy to hack.
Ads in ChatGPT. Removing features from Claude Code. I think we're just beginning to face the music. It's also funny how Google "invented" ad injection in replies with real-time auction capabilities, yet OpenAI would be the first implementer of it. It's similar to how transformers played out.
For me, that's another "popcorn time". I don't use any of these to any capacity, except Gemini, which I seldom use to ask stuff when deep diving in web doesn't give any meaningful results. The last question I asked managed to return only one (but interestingly correct) reference, which I followed and continued my research from there.
Regarding the thoughts: it also allows me to detect problematic paths it takes, like when it can't find a file.
For example, today I was working on a project that depends on another project, managed by another agent. While refactoring my code it noticed that it needed to see what a command it was invoking actually does, so it even went so far as to search through VS Code's user data for the recent-files history to find out more about that command... I stopped it and told it that if it has problems, it should tell me. It explained that it couldn't find that file, I gave it the paths, and tokens were saved. Note that in that session I was manually approving all commands, but then rejected the one in the data dir.
Why dumb it down?
TIL that there's an especially apt xkcd comic for this scenario: "Zealous Autoconfig"
Set minimal defaults to keep output clean, but let users pick and choose items to output across several levels of verbosity, similar to tcpdump, Ansible, etc. (-v to -vvvvv).
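To make the `-v` to `-vvvvv` idea concrete, here is a minimal sketch of counted verbosity flags in Python. The event fields, the `render` function, and the output wording are all hypothetical, invented for illustration; this is not how Claude Code is implemented. The only real API used is `argparse` with `action="count"`.

```python
import argparse

# Hypothetical tool-call events; the field names are invented for illustration.
EVENTS = [
    {"tool": "read", "path": "src/app.py", "detail": "lines 1-120"},
    {"tool": "read", "path": "README.md", "detail": "lines 1-40"},
    {"tool": "search", "pattern": "TODO", "detail": "3 matches"},
]


def render(events, verbosity):
    """Return the output lines for a verbosity level (0, 1, or 2 and above)."""
    reads = [e for e in events if e["tool"] == "read"]
    searches = [e for e in events if e["tool"] == "search"]
    if verbosity == 0:
        # Default: one collapsed summary line.
        return [f"Read {len(reads)} file(s), ran {len(searches)} search(es)"]
    lines = []
    for e in events:
        # Level 1: show the file path or search pattern inline.
        target = e.get("path") or e.get("pattern")
        line = f"{e['tool']}: {target}"
        if verbosity >= 2:
            # Level 2 and above: append the extra detail.
            line += f" ({e['detail']})"
        lines.append(line)
    return lines


def parse_verbosity(argv):
    """Count how many times -v was passed: -v is 1, -vv is 2, and so on."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser.parse_args(argv).verbose
```

With no flag the user gets the collapsed summary; `-v` adds paths and patterns inline; `-vv` adds the per-call detail. Further levels would extend `render` the same way.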
I know businesses are obsessed with providing Apple-like "experiences", where the product is so refined there's just "the one way" to magically do things, but that's not going to work for a coding agent. It needs to be a unix-like experience, where the app can be customized to fit your bespoke workflow, and opening the man page does critical damage unless you're a wizard.
LLMs are already a magic box, which upsets many people. It'll be a shame if Anthropic alienates their core fan base of SWEs by making things more magical.
There are no vibes in “I am looking at files and searching for things” so I have zero weight to assign to your decision quality up until the point where it tells me the evals passed at 100%.
Your agent is not good enough. I trust it like I trust a toddler not to fall into a swimming pool. It’s not trying to, but enough time around the pool and it is going to happen, so I am watching the whole time, and I might even let it fall in if I think it can get itself out.
When you're building agents that interact with real environments (browsers, codebases, APIs), the single hardest thing to get right isn't the model's reasoning. It's giving the operator enough visibility into what the agent is actually doing without drowning them in noise. There's a narrow band between "Read 3 files" (useless) and a full thinking trace dump (unusable), and finding it requires treating observability as a first-class design problem, not a verbosity slider.
The frustrating part is that Anthropic clearly understands this in other contexts. Their own research on agent safety talks extensively about the need for human oversight of autonomous actions. But the moment it's their own product, the instinct is to simplify away the exact information that makes oversight possible.
The people pinning to 2.1.19 aren't being difficult. They're telling you that when an agent touches my codebase, I need to know which files it read and what it searched for — not because I want to micromanage, but because that's literally the minimum viable audit trail. Take that away and you're asking users to trust a black box that edits production code.
The other fact pattern is their CLI is not open source, so we can't go in and change it ourselves. We shouldn't have to. They have also locked down OpenCode and while there are hacks available, I shouldn't have to resort to such cat and mouse games as someone who pays $200/month for a premium service.
I'm aggressively exploring other options, and it's only a matter of when -- not if -- one surfaces.
I mean I hope it's just a single developer being stubborn rather than guidance from management asking everyone to simplify Claude Code for maximum mass appeal. But I agree otherwise, it's telling.
> Compacting fails when the thread is very large
> We fixed it.
> No you did not
> Yes now it auto compacts all messages.
> Ok but we don't want compaction when the thread isn't large, plus, it still fails when the compacted thread is too large
> ...
> Compacting fails when the thread is very large
Flips coin, it is Heads
> We fixed it.
> No you did not
Flips coin, it is Tails
> Yes now it auto compacts all messages.
Flips coin, it is Heads
> Ok but we don't want compaction when the thread isn't large, plus, it still fails when the compacted thread is too large
Flips coin, it is Grapefruit
> ...
Congratulations on a vibe solution, if you are unhappy with the frequency of isomorphic plagiarism... the vendor still has your money and new data =3
Often a codebase ends up with non-authoritative references for things (e.g. docs out of sync with implementation, prototype vs "real" version), and the proper solution is to fix and/or document that divergence. But let's face it, that doesn't always happen. When the AI reads from the wrong source it only makes things worse, and when you can't see what it's reading it's harder to even notice that it's going off track.
Ah, the old "you're holding it wrong."
There is almost no value in watching the stream of intermediate tokens. There's no need to micromanage the agent's steps. Just monitor the artifact and insist the LLM summarizes findings in plain English.
If it can't explain the proposed change coherently, it can't code it coherently either. `git restore .`
I find it much more effective to throw away bad sessions, try a new prompt than to massage the existing context swamp.
True vibe coders don't care about this.
I like that people who were afraid of CLIs perhaps are now warming up to them through tools like Claude Code but I don't think it means the interfaces should be simplified and dumbed down for them as the primary audience.
Sure you can press CTRL+O, but that's not realtime and you have to toggle between that and your current real time activity. Plus it's often laggy as hell.
I'm using it for converting all of the userspace bcachefs code to Rust right now, and it's going incredibly smoothly. The trick is just to think of it like a junior engineer - a smart, fast junior engineer, but lacking in experience and big picture thinking.
But if you were vibe coding and YOLOing before Claude, all those bad habits are catching up with you suuuuuuuuuuuper hard right now :)
It's a huge shift, but we need to start thinking of AI-tools as developer tools, just like a formatter, linter, or IDE would be.
The right move is diversity. Just like diversity of editors/IDEs. We need good open source claude code alternatives.
I do wonder if there is going to be much of a difference between using Claude Code vs. Copilot CLI when using the same models.
I’m also at MS, not (yet?) using Claude Code at work and pondering precisely the same question.
Whether the LLM actually read the files in full is crucial for solving tasks. It does not matter that the LLM later says "Yeah, I've fully read the specification and here is your code" if you check the log and it says: "Reading SPEC.md lines 1-400" <end_of_read>.
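As a rough sketch of the kind of check a visible log makes possible: the snippet below scans log lines for partial reads. The log format is an assumption modeled on the line quoted above ("Reading SPEC.md lines 1-400"), not a documented format, and `partial_reads` is a hypothetical helper. It also assumes reads start at line 1, so it only tracks the highest line reached.

```python
import re

# Assumed log format, modeled on the quoted example:
#   Reading <path> lines <start>-<end>
READ_RE = re.compile(r"Reading (\S+) lines (\d+)-(\d+)")


def partial_reads(log_lines, file_lengths):
    """Return {path: (highest_line_read, total_lines)} for files not read to the end."""
    highest = {}
    for line in log_lines:
        m = READ_RE.search(line)
        if m:
            path, end = m.group(1), int(m.group(3))
            # Keep the furthest line reached across all reads of this file.
            highest[path] = max(highest.get(path, 0), end)
    return {
        path: (end, file_lengths[path])
        for path, end in highest.items()
        if path in file_lengths and end < file_lengths[path]
    }
```

Given a log containing "Reading SPEC.md lines 1-400" and a (made-up) file length of 1200 lines, this flags SPEC.md as only partly read. None of it works if the log shows only "Read 1 file".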
Overall, the complete log of interaction with the system should always be available; otherwise it is effectively malware. That's not an exaggeration: consider that at any point in time any side component can spit out a prompt injection. Consider the contrast: previously, the xz-utils attack needed to sabotage the Landlock kernel-level sandbox, AND to exist in the memory of sshd, AND to be able to hijack RSA_public_decrypt. Now the only thing needed is printf.
Boris's response here is the right move though. Acknowledging the miss and committing to a fix in the next release is how you build trust with a dev audience.
This is spreading like a plague: browser address bars are being trimmed down to nothing. Good luck figuring out which protocol you're using, or soon which website you are talking to. The TLS/SSL padlock is gone, so is the way to look into the site certificate (good luck doing that on recent Safari versions). Because users might be confused.
Well the users are not as dumb as you condescendingly make them out to be.
And if you really want to hide information, make it a config setting. Ask users if they want "dumbo mode" and see if they really do.
I’ve been persistently dealing with the agent running in circles on itself when trying to fix bugs, not following directions fully and choosing to only accomplish partial requests, failing to compact and halting a session, and ignoring its MCP tooling and doing stupid things like writing cruddy python and osascripts unnecessarily.
I’ve been really curious about codex recently, but I’m so deep into Claude Code with multiple skills, agents, MCPs, and a skill router though.
Can anyone recommend an easy migration path to codex as a first time codex user from Claude code?
I care A LOT about the details, and I couldn't care less that they're cleaning up terminal output like this.
Seems like a dashboard mode toggle to run in a dedicated terminal would be a good candidate to move some of this complexity Anthropic seems to think “most” users can’t handle. When your product is increasing cognitive load the answer isn’t always to remove the complexity entirely. That decision in this case was clearly the wrong one.
I had used a Visa card to buy a monthly Pro subscription. One day I ran out of credits, so I went to buy extra credit. But my card was declined. I rechecked my card limit and tried again. Still declined.
To test the card I tried extending the Pro subscription. It worked. That's when I noticed that my card has a security feature called "Secure by Visa". To complete a transaction I need to submit an OTP on a Visa page. I am redirected to this page when buying a Pro subscription but not when trying to buy extra usage.
I open a ticket and mention all the details to Claude support. Even though I give them the full run down of the issue, they say "We have no way of knowing why your card was declined. You have to check with your bank".
Later I get hold of a Mastercard with similar OTP protection. It is called Mastercard Securecode. The OTP triggers on both subscription and extra usage page.
I shared this finding with support as well. But the response was the same - "We checked with our engineering team and we have no way of knowing why the other Visa card was declined. You have to check with your bank".
I just gave up trying to buy extra usage. So, I am not really surprised if they keep making the product worse.
Both Anthropic and OpenAI have been maintaining a high pace of releasing often poorly thought through new products and experimenting with features. A lot of their product releases show all the hallmarks of vibe coding: randomly breaking features, poor QA and testing on releases, etc.
OpenAI seems to have the upper hand in UX currently. Their products feel a bit more polished and they've clearly tried to up their game. Taking over Jony Ive's company a few months ago is a clear signal that they want to do better. The Codex AI desktop app was a clear step up from their web app and cli. I've been using both before that was released.
Both companies are spread very thin trying to do both end user and developer oriented products and features while keeping existing paying users happy as well. Both companies also have had a string of rushed product releases that kind of fizzled out: OpenAI's Atlas, which was a response to Perplexity's Comet. Neither of which seems to be very popular at this point. Several false starts with apps (OpenAI), Claude Cowork, etc. There are a lot of half-formed product ideas there that then don't get the attention they deserve.
And it's not like MS, Google, and Apple are any better. If anything they are more hesitant and out of their depth here. They are all dancing around the hard issues here which are UX and security/trust models. Also, while coders get a lot of toys, nailing agentic tools for business users is proving to be a lot harder. Blanket access to everything via an agentic browser is not a viable solution. I can agentically code a structured document via latex or markdown. But the same tools are relatively useless in spreadsheets, presentations, and documents. And while you can do a lot of potentially interesting things if you surrender your inbox, the security failure modes around that remain a show stopping obstacle for wide adoption.
There's a lot of stage fright, hesitation, and immature product management in this sector. There's a bit of a gold rush in terms of rapid experimentation. But as the stakes get higher, a lot of these companies are increasingly lacking the freedom to move as fast as needed. Fear of liability issues is preventing them from doing a lot. Which is why most progress is concentrated around developer tools.
Meanwhile OpenCode is right there. (despite Anthropic efforts, you can still use it with a subscription) And you can tweak it any way you want...
That said, Cursor Composer is a lot faster and really nice for some tasks that don't require lots of context.
The value isn't just the models. Claude Code is notably better than (for example) OpenCode, even when using the same models. The plug-in system is also excellent, allowing me to build things like https://charleswiltgen.github.io/Axiom/ that everyone can benefit from.
ChatGPT or Gemini: I ask it what I wish to do, and show it the relevant code. It gives me an often-correct answer, and I paste it into my program.
Claude: I do the same, and it spends a lot of time thinking. When I check the window for the result, it's stalled with a question... asking to access a project or file that has nothing to do with the problem, and I didn't ask it to look for. Repeat several times until it solves the problem, or I give up with the questions.
Codex/Claude would like you to ignore both the code AND the process of creating the code.
You can control this behavior, so it's not a dealbreaker. But it shows a sort of optimism that skills make everything better. My experience is that skills are only useful for specific workflows, not as a way to broadly or generally enhance the LLM.
> “Searched for 1 pattern.”
Hit Ctrl-o like it mentions right there, and Claude Code will show you. Or RTFM and adjust Output Styles[1]. If you don't like these things, you can change them.
Like it or not, agentic coding is going mainstream and so they are going to tailor the default settings toward that wider mainstream audience.
It doesn't say "Read 3 files." though - it says "Read 3 files (ctrl+o to expand)" and you press ctrl+o and it expands the output to give you the detail.
It's a really useful feature to increase the signal to noise ratio where it's usually safe to do so.
I suspect the author simply needs to enable verbose mode output.
We carefully considered this change and feel it brings the most value to our users, and we hope you'll love chisel as much as we do.
One day, you guys are gonna learn not to tie your livelihoods to the whim of a corporation, but today isn't that day.
as a regular and long-term user, it's frequently jarring being pushed new changes / bugs in what has become a critical tool.
surprised their enterprise clients haven't raised this
Our theory is that Claude gets limited if you meet some threshold of power usage.
I wanted a terminal feel (dense/sharp) + being able to comment directly on plans and outputs. It's MIT, no cloud, all local, etc.
It includes all the details for function runs and some other nice to haves, fully built on claude code.
Particularly we found planning + commenting up front reduces a lot of slop. Opus 4.6 class models are really good at executing an existing plan down to a T. So quality becomes a function of how much you invest in the plan.
https://github.com/backnotprop/plannotator
It integrates with the CLI through hooks. completely local.
What’s wrong with you, people? Are you stupid?
They could potentially dumb it down further, but if they did that, it would hurt other use cases and competitors much more.
With stupidity like this what do they expect? It’s only a matter of time before people jump ship entirely.
After the Anthropic PMs have to delete their hundredth ticket about this issue, they will feel the need to fix it ... if only to stop the ticket deluge!
Map it to a workplace:
- Hey Joe, why did you stop adding code diff to your review requests?
- Most reviewers find it simpler. You can always run tcpdump on our shared drive to see what exactly was changed.
- I'm the only one reviewing your code in this company...
They could change course, obviously. But how does the saying go again -- it's easier for a camel to go through the eye of a needle, than for a VC funded tech startup to not enshittify.
Usually I hate programming but it feels like a nice little tool to create
Nix makes it easy to package up esoteric patches reliably and reproducibly, Claude lowers the cost of creating such patches, and the only roadblocks I foresee are legal.
For those of you who are still suckered in paying for it, why do you think the company would care how they abuse the existing users? You all took it the last time.
No affiliation, just a fan.
I mean I get it I guess but I'm not nearly so passionate as anyone saying things about this
I may not be up to date with the latest & greatest on how to code with AI, but I noticed that as opposed to my more human in the loop style,
There's no conspiracy, though, other than more tokens consumed = more money, and they want that.
> What majority? The change just shipped and the only response it got is people complaining.
I'll refer you to the old image of the airplane with red dots on it. The people who don't have a problem with it are not complaining.
> People explained, repeatedly, that they wanted one specific thing: file paths and search patterns inline. Not a firehose of debug output.
Same as above. The reality is there are lots of people whose ideal case would be lots of different things, and you're seeking out the people who feel the same as you. I'm not saying you're wrong and these people don't exist, but you have to recognize that just because hundreds or thousands or tens of thousands of people want something from a product that is used by millions does not make it the right decision to give that thing to all of the users.
> Across multiple GitHub issues opened for this, all comments are pretty much saying the same thing: give us back the file paths, or at minimum, give us a toggle.
This is a thing that people love to suggest - I want a feature but you're telling me other people don't? Fine, just add a toggle! Problem solved!
This is not a good solution! Every single toggle you add creates more product complexity. More configurations you have to QA when you deploy a new feature. Larger codebase. There are cases for a toggle, but there is also a cost for adding one. It's very frequently the right call by the PM to decline the toggle, even if it seems like such an obvious solution to the user.
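To put a number on the QA cost: a quick sketch of how independent boolean toggles multiply the configurations a release could, in the worst case, need testing against. The toggle names are hypothetical, invented for illustration, and in practice many toggles don't interact, so exhaustive testing is the upper bound rather than the norm.

```python
from itertools import product


def configurations(toggles):
    """Yield every on/off combination of the given toggle names as a dict."""
    for values in product([False, True], repeat=len(toggles)):
        yield dict(zip(toggles, values))


# Hypothetical toggle names, invented for illustration.
toggles = ["show_file_paths", "show_search_patterns", "show_thinking"]
configs = list(configurations(toggles))
print(len(configs))  # 2 ** 3 == 8
```

Each added toggle doubles the count: three toggles give 8 combinations, ten give 1024.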
> The developer’s response to that?
> I want to hear folks’ feedback on what’s missing from verbose mode to make it the right approach for your use case.
> Read that again. Thirty people say “revert the change or give us a toggle.” The answer is “let me make verbose mode work for you instead.”
Come on - you have to realize that thirty people do not in any way comprise a meaningful sample of Claude Code users. The fact that thirty people want something is not a compelling case.
I'm a little miffed by this post because I've dealt with folks like this, who expect me as a PM to have empathy for what they want yet can't even begin to consider having empathy for me or the other users of the product.
> Fucking verbose mode.
Don't do this. Don't use profanity and talk to the person on the other side of this like they're an idiot because they're not doing what you want. It's childish.
You pay $20/month or maybe $100/month or maybe even $200/month. None of those amounts entitles you to demand features. You've made your suggestion and the people at Anthropic have clearly listened but made a different decision. You don't like it? You don't have to use the product.
You all are refining these models through their use, and the model owners will be the only ones with access to true models while you will be fed whatever degraded slop they give you.
You all are helping concentrate even more power in these sociopaths.
You're mass-producing outrage out of a UX disagreement about default verbosity levels in a CLI tool.
Let's walk through what actually happened: a team shipped a change that collapsed file paths into summary lines by default. Some users didn't like it. They opened issues. The developers engaged, explained their reasoning, and started iterating on verbose mode to find a middle ground. That's called a normal software development feedback loop.
Now let's walk through what you turned it into: a persecution narrative complete with profanity, sarcasm, a Super Bowl ad callback, and the implication that Anthropic is "hiding what it's doing with your codebase" — as if there's malice behind a display preference change.
A few specific points:
The "what majority?" line is nonsense. GitHub issues are a self-selecting sample of people with complaints. The users who found it cleaner didn't open an issue titled "thanks, this is fine." That's how feedback channels work everywhere. You know this.
"Pinning to 2.1.19" is your right. Software gives you version control. Use it. That's not the dramatic stand you think it is.
The developers responding with "help us understand what verbose mode is missing" is them trying to solve the problem without a full revert. You can disagree with the approach, but framing genuine engagement as contempt is dishonest.
A config toggle might be the right answer. It might ship next week. But the entitlement on display here isn't "give us a toggle" — it's "give us a toggle now, exactly as we specified, and if you try any other approach first, you're disrespecting us." That's not feedback. That's a tantrum dressed up as advocacy.
You're paying $200/month for a tool that is under active development, with developers who are visibly responding to issues within days. If that feels like disrespect to you, you have a calibration problem.
With kind regards, Opus 4.6