0 as of this writing, it's noticeable. Lots of "should I continue?" And "you should run this command if you want to see that information." Roadblocks that I hadn't seen in a year+
That means they are going to be far more constrained infrastructurally than some of the competition. I think this is some of the constraints that we are seeing.
They don't have compute because they didn't play the game and get the good rates a couple of years ago, and are now forced to work with third-rate providers. That's not a strategy.
I would take everything he says with a huge grain of salt.
[0] “We’re buying a lot. We’re buying a hell of a lot. We’re buying an amount that’s comparable to what the biggest players in the game are buying.”
“Profitability is this kind of weird thing in this field. I don’t think in this field profitability is actually a measure of spending down versus investing in the business.”
[1] “You don’t just serve the current models and never train another model, because then you don’t have any demand because you’ll fall behind.”
So he's not spending so they can be profitable, AND spending as much as the biggest players are spending, AND not really looking at profit as a measure of anything? K.
they're looking to IPO in 2028 vs 2030 for OpenAI, who have raised more than double the funds
so they're willing to play fast and loose with the terms and conditions of existing customers trying to make it happen
those pockets must be drying up really fast
But as it stands, the more likely reason is capacity crunch caused by a chips shortage and demand heavily outpacing supply. You vibe coding reason is based on as much vibes as their code probably is.
Before a Subscription was the cheapest way to gain Codex usage, but now they've essentially having API and Subscription pricing match (e.g. $200 sub = $200 in API Codex usage).
The only value of a subscription now is that you get the web version of ChatGPT "free." In terms of raw Codex usage, you could just as easily buy API usage.
edit: This is currently rolled out for Enterprise, but is coming to Pro/Plus soon. The people below saying "I haven't had this issue" haven't yet*.
Day 1: 2
Day 2: 3
Day 3: 1
Not sure how I can hit such limits so quickly with such low scores on its own chart.
Pentagon: No
OpenAI: We are okay if the line is merely a suggestion and we encourage you not to cross it!
Pentagon: Yes we pick that option
That has led to a significant number of people switching over from openai, or at least stating they were going to do so.
I have cancelled my subscription last week, I'll see them when they fix this nonesense
Is this a symptom of the same phenomenon behind the deluge of disposable JavaScript frameworks of just ten years ago? Is it peer pressure, fear of missing out? At its root, I suspect so; of course I would imagine it's rare for the C-suite to have ever mandated the usage of a specific language or framework, and LLMs represent an unprecedented lever of power to have an even bigger shot at first mover's advantage, from a business perspective. (Yes, I am aware of how "good enough" local models have become for many.)
I don't really have anything useful nor actionable to say here regarding this dialling back of capability to deal with capacity issues. Are there any indications of shops or individual contributors with contingency plans on the table for dialling back LLM usage in kind to mitigate these unknowns? I know the calculus is such that potential (and frequently realised) gains heavily outweigh the risks of going all in, but, in the grander scheme of time and circumstance, long term commitments are starting to be more apparently risky. I am purposefully trying to avoid "begging the question" here; if instead of LLMs, this were some other tool or service, reactions to these events would have been far more pragmatic, with less of a reticence to invest time on in-house solutions when dealing with flaky vendors.
It seems like every LLM thread for the past couple years is full of posts saying that the latest hot AI tool/approach has made them unbelievably more productive, followed by others saying they found that same thing underwhelming.
I don't think many of you have legitimately tried Claude Code, or maybe you're holding it wrong.
I'm getting 10x the work done. I'm operating at all layers of the stack with a speed and rapidity I've never had before.
And before anyone accuses me of being some "vibe coder", I've built five nines active-active money rails that move billions of dollars a day at 50kqps+, amongst lots of other hard hitting platform engineering work. Serious senior engineering for over a decade.
This isn't just a "cool technology". We've exited the punch card phase. And that is hard or impossible to come back from.
If you're not seeing these same successes, I legitimately think you're using it wrong.
I honestly don't like subscription services, hyperscaler concentration of power, or the fact I can't run Opus locally. But it doesn't matter - the tool exists in the shape it does, and I have to consume it in the way that it's presented. I hope for a different offering that is more democratic and open, but right now the market hasn't provided that.
It's as if you got access to fiber or broadband and were asked to go back to ISDN/dial up.
Here's a reason not in your list.
Short version: A kind of peer pressure, but from above. In some circles I'm told a developer must have AI skills on their resume now, and those probably need to be with well known subscription services, or they substantially reduce their employment prospects.
Multiple people I know who are employers have recently, without prompting, told me they no longer hire developers who don't use AI in their workflow.
One of them told me all the employers they know think "seniors" fall into two camps, those who are embracing AI and therefore nimble and adaptive, and those who are avoiding it and therefore too backward-looking, stuck-in-their-ways to be a good hire for the future. So if they don't see signs of AI usage on a senior dev's resume now, that's an automatic discard. For devs I know laid off from an R&D company where AI was not permitted for development (for IP/confidentiality reasons), that's unfair as they were certainly not backward-looking people, but the market is not fair.
Another "business leader" employer I met recently told me his devs are divided into those who are embracing AI and those who aren't, said he finds software feature development "so slow!", and said if it wasn't for employment law he'd fire all his devs who aren't choosing to use AI. I assume he was joking, but it was interesting to hear it said out loud without prompting.
I've been to several business leadership type meetups in recent months, and it seems to be simply assumed that everyone is using AI for almost everything worth talking about. I don't think they really are, so it's interesting to watch that narrative playing out.
Isn't this almost certainly against ToS, at least if you're using "plans" (as opposed to paying per-token)?
Why does it sound like you're on drugs? I know that sounds extremely rude, but I can't think of any other reasonable comparison for that language.
It's hard to take these kinds of endorsements seriously when they're written so hyperbolically, in terms of the same cliches, and focused on entirely on how it makes you feel rather than what it does.
Code is notation, just like music sheets, or food recipes. If your interaction with anyone else is with the end result only (the software), the. The code does not matter. But for collaboration, it does. When it’s badly written, that just increase everyone burden.
It’s like forcing everyone to learn a symphony with the record instead of the sheets. And often a badly recorded version.
What I don’t understand, are the people who let it go over night or with whole “agent teams” working on software. I have no idea how they trust any of it.
As an example, a long term goal at the employer I work for is exactly this: run LLMs locally. There's a big infrastructure backlog through, so it's waiting on those things, and hopefully we'll see good local models by then that can do what Claude Sonnet or GPT-5.3-Codex can do today.
There is a cost though, the context switches of topics aren't free. But if I need to visualise a something, I let an LLM create a page. If I have two tables of data that needs to be joined/mapped, I let an LLM do the first shot, often that is enough.
I cannot even hope to reach that speed. It isn't a magic tool, but it really accelerates some task.
That speed allows for in-house solutions to become viable again, software that really adapts specific business processes instead of some wonky ERP package that never really fit what you were trying to do.
I have our dbs schema checked into a Gitea repository, which our AIs can just access to quickly ingest schema definitions. If data safety is an issue, use a local model. It is extremely beneficial if you quickly can establish context and let your AI deal with real problems. And it is quite good at that.
I still use more traditional approach for finding bugs and other issues in my code, but the agentic workflow doesn't give me any net value.
Maybe in 5 years we'll have an open weights model that is in the "good enough" category that I can run on a RTX 9000 for 15k dollars or whatever.
It's why we pay stupid amounts for takeout when it's a button away, it's why we accept the issues that come with online dating rather than breaking the ice outside, it's why there's been decades scams that claim to get you abs without effort...
LLMs are the ultimate friction removal. They can remove gaps or mechanical work that regular programming can, but more importantly they can think for you.
I'm convinced this human pattern is as dangerous as addiction. But it's so much harder to fight against, because who's going to be in favor of doing things with more effort rather than less? The whole point of capitalism is supposed to be that it rewards efficiency.
Aw hell. You found my vice and my own cognitive dissonance here. If I want to truly stand by my convictions, I should probably cook more and log off. Waiting for signs that the tides are turning and that people are beginning to value a slower, more methodical approach again isn't doing anything in the current moment to stave off the genuine feelings of dread that have honestly led to some suicidal ideation.
(this is serious and not sarcasm, by the way)
We're paying for servers that sit idle at night, you don't find enough sysadmins for the current problems, the open source models aren't as strong as closed source, providing context (as in googling) means you hook everything up to the internet anyway, where do you find the power and the cooling systems and the space, what do you do with the GPUs after 3 years?
Suddenly that $500/month/user seems like a steal.
Lately though the RAM crisis is continuing and making things like this more unfeasible. But you can still use a lot of smaller models for coding and testing tasks.
Planning tasks I'd use a cloud hosted one, for now, because gemma4 isn't there yet and because the GPU prices are still quite insane.
The cool and fun part is that with ollama and vllm you can just build your own agentic environment IDE, give it the tools you like, and make the workflow however you like. And it isn't even that hard to do, it just needs a lot of tweaking and prompt fiddling.
And on top of that: Use kiwix to selfhost Wikipedia, stackoverflow and devdocs. Give the LLM a tool to use the search and read the pages, and your productivity is skyrocketing pretty quickly. No need anymore to have internet, and a cheap Intel NUC is good enough for self-hosting a lot of containers already.
Source: I am building my own offline agentic environment for Golang [1] which is pretty experimental but sometimes it's also working.
The LLM bit though, personally, is just not for me.
It would be cool to run SOTA models on my own hardware but I can't. Hence, the subscription.
That said, I’m not sure I follow your statement of less resistance to the development of internal tools when the opposite seems to be the case; companies (or more specifically developers) are perhaps too quick to think they can just vibe-code a replacement for any vendor in a weekend these days.
It’s great to buy dollars for a penny, but the guy selling em is going to want to charge a dollar eventually…
Do you feel there is enough visibility and stability around the "Prompt -> API token usage" connection to make a reliable estimate as to what using the API may end up costing?
Personally, it feels like paying for Netflix based on "data usage" without having anyway for me to know ahead of time how much data any given episode or movie will end up using, because Netflix is constantly changing the quality/compression/etc on the fly.
I agree that ex ante it’s tough, and they could benefit from some mode of estimation.
Perhaps we can give tasks sizes, like T shirts? Or a group of claudes can spend the first 1M tokens assigning point values to the prospective tasks?
Of course, I have no idea how MS is justifying the Copilot pricing. I can't imagine any world in which it is sustainable, so I'm trying to get as much as I can out of it now before they jack up prices.
Now we’re going to find out what these tools are really worth.
So I noticed the model is purposefully coming with dumb ideas or running around in circles and only when you tell it that they are trying to defraud you, they suddenly come back with a right solution.
It works out even if some customers are able to eat a lot, because people on average have a certain limit. The limits of computers are much higher.
If an hour of an excellent developer's time is worth $X, isn't that the upper bound of what the AI companies can charge? If hiring a person is better value than paying for an AI, then do that.
For some context, they added 2x Palantir or .75x Shopify or .68x Adobe annual revenue in March alone.
Fwiw there are worse delays from second tier providers like moonshot's kimik2.5 that are also popular for agentic use.
Is Microsoft (one of the largest companies in the world) really a victim of brand death?
It's actually via quantum entanglement.
The rest of the organisation, which is not software development or IT related, mainly uses GPT models. I just wish I hadn't taught risk management about claude code so they weren't wasting MY tokens.
Obviously in hindsight it would be unfair to Anthropic to judge them on an unstable day so I'l leave those complaints aside but I hit the session limit way too fast. I planned out 3 tasks and it couldn't finish the first plan completely, for that implementation task it has seen a grand total of 1 build log and hasn't even run any tests which already caused it to enter in the red territory of the context circle.
It was even asking me during planning which endpoints the new feature should use to hook into the existing system, codex would never ask this and just simply look these up during planning and whenever it encounters ambiguity it would either ask straight away or put it as an open question. I have to wonder if they're limiting this behavior due trying to keep the context as small as possible and preventing even earlier session limits.
Maybe codex's limits are not sustainable in the long run and I'm very spoiled by the limits but at this point CC(sonnet) and Codex(5.4) are simply not in the same league when comparing both 20 dollar subscriptions.
I will also clearly state that the value both these tools provide at these price points are absolutely worth it, it's just that codex's value/money ratio is much better.
CC is a better implementation and seems to be fairly economic with token usage. That is the really the only defining point and, I suspect, Anthropic are going to have a lot of trouble staying relevant with all the product issues.
They were far ahead for a brief period in November/December which is driving the hype cycle that now appears to be collapsing the company.
You have to test at least every month, things are moving quickly. Stepfun is releasing soon and seems to have an Opus-level model with more efficient architecture.
I doubt even the core engineers know how to begin debugging that spaghetti code.
prompts. tool calling quirks. evals. auth. retries. all the weird failure modes your team already paid to learn.
Luckily, ISPs tend to be quite reliable and don't have outrageous price hikes, but maybe that's because of regulation or focused competition, I'm not sure.
We'll see AI chat replace Google, we'll see companies adopting AI in high-value areas, and we'll see local models like Gemma 4 get used heavily.
AI winter will see a disappearance of the clickbait headlines about everyone losing their jobs. Literally nobody is making those statements taking into account that pricing to this point is way less than the profit maximizing level.
At my workplace we have been sticking with older versions, and now stick to the stable release channel.
There was constant drama with CC. Degradation, low reliability, harness conspiring against you, and etc – these things are not new. Its burgeoning popularity has only made it worse. Anthropic is always doing something to shoot themselves in the foot.
The harness does cool things, don't get me wrong. But it comes with a ton of papercuts that don't belong in a professional product.
1. Me not wanting that for context management reasons
2. It burning tokens on an expensive model.
Literally a conversation that I just had:
* ME: "Have sonnet background agent do X"
* Opus: "Agent failed, I'll do it myself"
* Me: "No, have a background agent do it"
* Opus: Proceeds to do it in the foreground
* Flips keyboard
This has completely broken my workflows. I'm stuck waiting for Opus to monitor a basic task and destroy my context.
I’ve been toying around at home with it and I’ve been fine with its output mostly (in a Java project ofc), but I’ve run into a few consistent problems
- The thing always trips up validating its work. It consistently tries to use powershell in a WSL environment I don’t have it installed in. It also seems to struggle with relative/absolute paths when running commands.
- Pricing makes no sense to me, but Jetbrains offering seems to have its own layer of abstraction in “credits” that just seem so opaque.
Then again, I mostly use this stuff for implementing tedious utilities/features. I’m not doing entity agent written and still do a lot of hand tweaks to code, because it’s still faster to just do it myself sometimes. Mostly all from all from the IDE still.
Unless they meant "all code that needs to be written has already been written" so their mission is to prevent any new code from being written via a kind of a bait and switch?
I think Anthropics model has conflict of interest. They seem to have nerfed the models so that it takes more iterations to get the result (and spend more money) than it used to where e.g. Opus would get something right first time.
Not worth the money now, will be canceling unless fixed soon.
Free and local.
Maybe you should consider....local models instead?