I saw a podcast where the main guy behind Claude Code, Boris, just casually explained that at Anthropic some software developers are spending multiple six-figure sums in dollars PER MONTH on tokens for running their swarms of Claude agents. That might make you some kind of productivity guru, but try going to your CFO and asking for those kinds of figures for each software developer PER MONTH: they will spit coffee out in your face and burst out laughing.
It would be the equivalent of saying: 'Yeah, we want a new CAD and design management package that we basically prompt and it does everything for us.'
'How much is it?'
'Well, upwards of 7 figures per year, per design engineer, and we have about 400 design engineers'.
'You want half a billion pounds, per year, so you can all play with your imaginary friends? No...piss off and get back to work'
Was absolutely comical to watch.
Claude is useful for software engineers. It’ll be useful until something is better-enough and then we’ll all move on to that.
Most folks are using Claude and Codex together anyway, undermining the idea that Anthropic's corporate strategy mattered in the market.
I mean, a lot of developers have 90% of their code being written by AI (myself and my friends at the labs included). Obviously YMMV depending on your codebase and individual skill.
"Software engineers will at times overestimate their capabilities, as demonstrated by the METR study that found that developers believed they were 24% faster when using LLMs, when in fact coding models made them 19% slower. This, naturally, makes them quite defensive of the products they use, and whether or not they’re actually seeing improvements."
I wonder what he thinks about the new METR update, which showed a net speedup as a lower bound (some participants literally didn't want to tackle certain tasks without AI because of how slow it would be), with returning devs seeing the greatest speedups?
"for one of Anthropic’s greatest lies: that AI can “work uninterrupted” for periods of time, leaving the reader or listener to fill in the (unsaid) gap of “...and actually create useful stuff.”"
We're probably at the beginning of the S-curve for long-running tasks that create useful stuff (https://ladybird.org/posts/adopting-rust/), but it clearly needs hand-holding and a way to self-verify work.
"No amount of DarioMath about how a model “costs this much and makes this much revenue” changes the fact that profitability is when a company makes more money than it spends."
Feels like he's being dishonest here, because the economics of the labs are unique (and precarious). Each model, taken alone (revenue minus the cost to train and serve it), is profitable. Labs then invest in the next model to maintain their advantage; otherwise people will stop using their latest models. This probably doesn't go on in perpetuity (which is what Ed should've analyzed more). To his credit, he's right that CC subscriptions are currently being subsidized.
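The "each model is profitable, the company still burns cash" dynamic is easy to show with toy numbers. A minimal sketch, with all figures made up purely for illustration (in $M, not anyone's actual financials):

```python
# Toy illustration of per-model profitability vs. company-level cash burn.
# All numbers are hypothetical, in millions of dollars.

def model_profit(revenue, train_cost, serve_cost):
    """Lifetime profit of a single model generation, taken on its own."""
    return revenue - train_cost - serve_cost

# Hypothetical current-generation model: earns more than it cost.
gen_n = model_profit(revenue=2000, train_cost=500, serve_cost=800)

# The next generation is assumed to cost far more to train.
gen_n1_train_cost = 3000

# Company-level net: profit from the shipped model minus the (larger)
# investment in the model that replaces it.
company_net = gen_n - gen_n1_train_cost

print(gen_n)        # 700   -> the shipped model is profitable on its own
print(company_net)  # -2300 -> the company as a whole still loses money
```

Under these assumptions the pattern only ends if training costs stop growing faster than per-model profits, which is exactly the part that deserved more analysis.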
[Insert quotes of Dario saying models will be smarter than most humans or Nobel laureates]
I mean, he's not wrong under certain definitions of "smart". They're already well above the average human in terms of testable world knowledge, math, coding, science, etc., but obviously fall short of humans in other ways.
Oh, I don't know about that. That's a very big claim.
I don't trust either of them, but I have a different reading of their motivations. Altman comes off to me like a ruthless sociopath who lies as easily as he breathes, and Amodei comes off as more of an over-enthusiastic goober who is a bit high on his own supply.