I think they're absolutely needed. I can't afford 200 USD a month for personal use of coding AI, and I don't think such prices are reasonable for most of the world economy anyway. Not to mention US firms might be giving their employees a lot more than that.
It's increasingly feeling, to me, like there's a gap building up between haves and have-nots. But then we get news of these open-weight models that are reasonably priced for inference, with reasonable capabilities. Yes, they take maybe 6-9 months to get there, but tbh that's not a bad trade-off at all.
I get a lot more out of a $200/mo subscription in a week now than I used to get in a month.
Now obviously in today’s world they’d be using a $200/mo subscription themselves. But it’s not like money is nothing; software development doesn’t scale down below $1k/mo for anyone competent, even in the poorest areas.
So not really comparable. I use Step 3.7 Flash locally; models are good enough for so many coding tasks even at the lower end! (Though I note that calling a 200B model "lower end" is kind of amusing.)
Which of course causes some unfairness on both ends. Nobody here can compete with me. I often use leftover tokens on local client projects, which, despite lower pay, still pay off because they now take hours, not days or weeks, to complete. And nobody in the local clients' talent pool can compete with me unless they charge about half the market rate.
Take away my $500 monthly grant and I’d be more or less screwed. Better open models will start to reduce this advantage. It’s not like I positioned myself here on purpose, but it’s definitely a "right place, right time" situation.
This depends a lot on how you work, and how much of the architectural thinking you do yourself.
People seem to lose sight of the fact that a flash model today is as powerful as a frontier model from a year ago. If you were happy with GPT 4.x, you should be ecstatic that equivalent power is now basically free...
Mind if I ask you for a few vibe coding tips? I failed to solve your gh puzzle in your profile, though.
An NYC dev and a dev in India have the same AI costs, so based on the tokens/salary ratio, being in NYC becomes less of a comparative disadvantage.
Now combine that with the fact that AI makes generating code a smaller share of the job, and getting and refining requirements a larger share, and you have a decent shift.
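To put rough numbers on the tokens/salary argument, here is a minimal sketch. The salary figures are purely hypothetical (the 3:1 gap just mirrors the "three devs" remark made elsewhere in the thread), and the $200/mo AI bill is the subscription price people here keep quoting. It only shows that a flat AI cost eats a bigger share of the cheaper dev's total cost and slightly narrows the cost ratio.

```python
# Hypothetical monthly figures, chosen only to illustrate the ratio argument.
NYC_SALARY = 15_000.0    # assumed, not from the thread
INDIA_SALARY = 5_000.0   # assumed, one third of the NYC figure
AI_COST = 200.0          # the $200/mo subscription discussed in the thread


def ai_cost_share(salary: float, ai_cost: float) -> float:
    """Fraction of a dev's total monthly cost that goes to AI tooling."""
    return ai_cost / (salary + ai_cost)


def cost_ratio(salary_a: float, salary_b: float, ai_cost: float) -> float:
    """Total cost of dev A relative to dev B when both pay the same AI bill."""
    return (salary_a + ai_cost) / (salary_b + ai_cost)


if __name__ == "__main__":
    print(f"AI share of cost, NYC:   {ai_cost_share(NYC_SALARY, AI_COST):.1%}")
    print(f"AI share of cost, India: {ai_cost_share(INDIA_SALARY, AI_COST):.1%}")
    print(f"Cost ratio without AI:   {cost_ratio(NYC_SALARY, INDIA_SALARY, 0.0):.2f}")
    print(f"Cost ratio with AI:      {cost_ratio(NYC_SALARY, INDIA_SALARY, AI_COST):.2f}")
```

With a bill this small the shift in the ratio is modest; it only becomes noticeable as the flat AI spend grows relative to the lower salary.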
For example, I build other AI products and I have been hyper-aware of the token spend of our users. I was going crazy seeing that some users were having $5 conversations. So that was optimized, and I found ways to use sub-agents to get it down to $1-2. Only for management to ask me why I was worrying to begin with. The users using these are consultants being paid $120 per hour. They have a daily $10-20 token expense, no problem. “But amazing job on the cost reduction.” Well, $5 for me is what I spend on food daily, while the consultant is slamming “yes” 10 times in a chat, for whatever reason, for the same cost. Would the NYC dev care as much natively? No.
You can still hire three devs in India for the price of a dev in NYC. Now you give them AI and you might only need 1-2. That makes offshoring even more appealing, not less. And the dev in India now has the tooling to outcompete local talent. Well, that’s my reality (I am not in India, though).
If that was true, they would be collaborating with each other and opening up all the results from their work.
The Chinese are genociding Uyghurs as we speak, purely for being Muslim, in numbers that dwarf any harm the US has done.
At work I'm struggling to keep my Claude bill around $500.
Also if you run the “loops” they’re now yapping about, it will burn through enormous amounts of usage as well.
Has a very race-to-the-bottom feel to it.
Though in the grand scheme of it, $200/mo probably isn’t the real price either. Also, looking at it not just in a vacuum, paying for a product that can change what you get out from under you doesn’t seem great anyway.
At least with a locally-hosted model you know what you’re getting.
The LLM in a box is something you can buy today, but it 1. doesn’t serve over USB by default, 2. costs $100k for hardware (not counting electricity) at 100 tps, and 3. can’t be bought from AliExpress.
Better to put that $100k in t-bills and just buy tokens even at api prices.
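A back-of-the-envelope version of that, as a sketch only: the 4% annual yield is an assumption I picked for illustration (not a quoted rate), and this ignores taxes, electricity, and any resale value of the hardware. It just compares the monthly interest on the $100k against a $200/mo plan.

```python
HARDWARE_PRICE = 100_000.0   # hardware price from the comment above
ASSUMED_YIELD = 0.04         # hypothetical annual T-bill yield, an assumption
SUBSCRIPTION = 200.0         # the $200/mo plan discussed in the thread


def monthly_interest(principal: float, annual_yield: float) -> float:
    """Simple (non-compounded) interest earned per month."""
    return principal * annual_yield / 12


def break_even_yield(principal: float, monthly_spend: float) -> float:
    """Annual yield at which interest alone covers the monthly spend."""
    return monthly_spend * 12 / principal


if __name__ == "__main__":
    income = monthly_interest(HARDWARE_PRICE, ASSUMED_YIELD)
    print(f"Monthly interest at the assumed yield: ${income:.2f}")
    print(f"Covers the subscription: {income >= SUBSCRIPTION}")
    print(f"Break-even yield: {break_even_yield(HARDWARE_PRICE, SUBSCRIPTION):.2%}")
```

Heavier API usage changes the picture, so plug in your own monthly spend rather than trusting the subscription figure.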
OpenAI already charges enterprise users a premium purely for that title over on-demand, no-contract usage. Retail users get a good deal. People make a lot of hay about subsidies but this is a very sane approach if you want exposure to these three different types of customers.
He's sitting on a frontier model letting it burn a hole in his wallet that could actually pay for itself.
"Meta has been using Google’s Gemini large language model for most of its moderation and customer support, but staff have recently been told to switch to Meta’s new foundational model, Muse Spark, the people said."
https://www.ft.com/content/39251a31-4a9d-4870-b86c-dc6353d67...
0. https://openrouter.ai/compare/z-ai/glm-5.2/anthropic/claude-...
People speak of a permanent underclass.
https://www.nytimes.com/2026/04/30/opinion/ai-labor-work-for...