0 points · an0malous · 2mo ago · 0 comments

What is all this AI doing? People are spending 10’s to 100’s of billions and no service or technology seems better or cheaper. Everything is more expensive and worse.

76 comments · 17 top-level

barnabee2mo ago· 26 in thread

Where I work:

- Development velocity is very noticeably much higher across the board. Quality is not obviously worse, but it's LLM assisted, not vibe coding (except for experiments and internal tools).

- Things that would have been tactically built with TypeScript are now Rust apps.

- Things that would have been small Python scripts are full web apps and dashboards.

- Vibe coding (with Claude Desktop, nobody is using Replit or any of the others) is the new Excel for non tech people.

- Every time someone has any idea it's accompanied by a multi page "Clauded" memo explaining why it's a great idea and what exactly should be done (about 20% of which is useful).

- 80% of what were web searches now go to Claude instead (for at least a significant minority of people, could easily be over 50%).

- Nobody talks about ChatGPT any more. It's Claude or (sometimes) Gemini.

- My main job isn't writing code but I try to keep Claude Code (both my personal and corpo accounts) and OpenCode (also almost always Claude, via Copilot) busy and churning away on something as close to 100% of the time as I can without getting in the way of my other priorities.

We (~20 people) are probably using 2 orders of magnitude more inference than we were at the start of the year and it's consolidated away from Cursor, ChatGPT and Claude to just be almost all Claude (plus a little Gemini as that's part of our Google Whateverspace plan and some people like it, mostly for non-engineering tasks).

No idea if any of this will make things better, exactly, but I think we'd be at a severe competitive disadvantage if we dropped it all and went back to how things were.

stasomatic2mo ago

I am a hobbyist playing around. Recently dropped CC (which gave me a sense of awe 2 months ago), but they realized GPUs need CapEx and I want to screw around with pi.dev on a budget. Then on to GH Copilot but couldn't understand their cost structure, ran out of quota half a month in, now on Codex. I don't really see any difference for little stuff. I also have Antigravity through a personal Gmail account with access to Opus et al and I don't understand if I am paying for it or not. They don't have my CC so that's a breather.

It's all romantic, but a bunch of devs are getting canned left and right, a slice of the population whose disposable income the economy depends on.

It's too late to be a contrarian pundit, but what's been done besides uncovering some 0-days? The correction will be brutal, worse than the Industrial Revolution. Just the recent news about Meta cuts, Salesforce, Snap, Block, the list is long.

Have you shipped anything commercially viable because of AI or are you/we just keeping up?

fc417fc8022mo ago

> The correction will be brutal, worse than the Industrial Revolution.

Has it occurred to you that there might not be a correction, and that the outcome would still be brutal, at least on par with the Industrial Revolution?

jameshart2mo ago

There has always been a gap between the experience of solo/small shop developers, vs. developers who work in teams in a large corporate environment. But thanks to open source, we have for the past twenty years at least mostly all been using the same tools.

But right now, the difference in developer experience between a dev on a team at a business which has corporate Copilot or Claude licenses and bosses encouraging them to maximize token usage, vs. a solo dev experimenting once every few months with a consumer-grade chat model is vast.

bluGill2mo ago

Developers being let go is about the economy. Every time we see a slowdown people are let go and we always blame the fad, but it's the economy, not whatever the fad is.

kaiokendev2mo ago

> Every time someone has any idea it's accompanied by a multi page "Clauded" memo explaining why it's a great idea and what exactly should be done (about 20% of which is useful).

we're in the same boat, and currently trying to fix that 20% problem because it's the biggest hindrance to shipping things quickly

there is a ton of learned ceremony that we have to undo gracefully because it's extremely tempting to vibe code a problem spec as opposed to just... talking to users directly and understanding what the actual problem is

mullingitover2mo ago

> - Development velocity is very noticeably much higher across the board

It's an absolute tornado of PRs these days. Everyone making the most of these tools is effectively an engineering team lead.

MrDarcy2mo ago

Everyone from the CTO/VP of engineering role down is now singularly focused on keeping agents fed with a backlog of Linear issues. This is the new normal.

am17an2mo ago

Sounds exhausting. Are your revenue numbers up?

camdenreslink2mo ago

I am also curious about the correlation between more PRs getting merged faster and actual business outcomes.

My impression has always been it's more important to build the correct thing (what the customer needs/wants) rather than more stuff faster.

eieie2mo ago

Incremental cash flow is what we should be observing - we have to net out the LLM costs associated with the activity.

That's just one set of costs but a good starting point.

xnx2mo ago

Reducing costs is also a business benefit.

barnabee2mo ago

It’s no more exhausting than the alternative. It feels good being able to build more and experiment more.

The biggest downside is the feeling that people sometimes turn their brain off and aren’t even doing basic checks on some of the slop their LLMs produce.

jeremyjh2mo ago

It sounds very similar to my shop. I have QA people and Product Managers using Claude to develop better integration and reporting tools in Python. Business users are vibe coding all kinds of tools shared as Claude Artifacts, the more ambitious ones are building single page app prototypes. We ported one prototype to Next.js and hosted on Vercel in a couple of days and then handed it back to them with a Devcontainer and Claude Code so they can iterate on it themselves; and we also developed all the security infrastructure, scaffolding, agent instructions & policy required to do this for low stakes apps in a responsible way.

It hardly seems worth it to try to iterate on design when they can just build a completely functional prototype themselves in a few hours. We're building APIs for internal users in preference to UIs, because they can build the UIs themselves and get exactly what they need for their specific use cases and then share it with whoever wants it.

We replaced an expensive, proprietary vendor product in a couple of weeks.

I have no delusions about the scale or complexity limits of these projects. They can help with large, complex systems but mostly at the margins: help with impact analysis, production support, test cases, code review. We generate a lot of code too but we're not vibe coding a new system of record and review standards have actually increased because refactoring is so much cheaper.

The fact is that ordinary businesses have a LOT of unmet demand for low stakes custom software. The ones that lean into this will not develop superpowers but I do think they will out-compete slow adopters and those companies will be forced to catch up in the next few years.

I develop presentations now by dumping a bunch of context in a folder with a template and telling Claude Cowork what I want (it does much better than the web version because of its Python and shell tools and it can iterate, render, review, repeat until it's excellent). The copy is quite good, I rewrite less than a third of it and the style and graphics are so much better than I could do myself in many hours.

No one likes reading a bunch of vibe coded slop and cultural norms about this are still evolving; but on balance it's worth it by far.

realusername2mo ago

Personally at my place, there hasn't been a noticeable velocity change since the adoption of Claude Code. I'd say it's even slightly worse as now you have junior frontend engineers making nonsense PRs in the backend.

Main blockers are still product, legal, management ... which Claude Code didn't help with.

davidcann2mo ago

Is your team measuring how much of your code is being written with claude and comparing amongst the team, like what works best in your codebase? How are you learning from each other?

I’m making a team version of my buildermark.dev open source project and trying to learn about how teams would like to use it.

barnabee2mo ago

Different teams are using it in very different ways so it can be tough to compare meaningfully.

Backends handling tens to hundreds of thousands of messages per second with extremely high correctness and resilience requirements are necessarily taking a different approach to less critical services that power various ancillary sites/pages or to front end web apps.

That said there's a lot of very open discussion around tooling, "skills", MCP, etc., harnesses, and approaches and plenty of sharing and cross-pollination of techniques.

It would be great to find ways to better quantify the actual value add from LLMs and from the various ways of using them, but our experience so far is that the landscape in terms of both model capability and tooling is shifting so fast that that's quite hard to do.

croes2mo ago

Jevons paradox comes into play.

https://en.wikipedia.org/wiki/Jevons_paradox

In the end only profit matters

ojr2mo ago

I am an early Gemini daily-driver type of engineer. It feels like Node, Firefox, React and Tailwind all over again. Claude Sonnet is 10x more expensive. Quick thought experiment: do you think 10 Gemini prompts are needed to match the quality of one Claude Code prompt? The harness around Gemini is an issue, but I built my own (in Rust).

jwpapi2mo ago

I think if you drop this all you will absolutely kill it.

komali22mo ago

I'm not sure. I have a buddy that's one of the better engineers I know personally, and he struggled to maintain an "AI Lent" for even a month. He found he just wasn't productive enough without it.

He did a writeup: https://buduroiu.com/blog/ai-lent-end/

ttul2mo ago

This sounds like my office, but we're a bit more tilted toward Codex. I personally use Claude Cowork for drudge-admin work, GPT 5.5-Pro for several big research tasks daily, and the LLMs munge on each other's slop all day as I try my best to wrap my head around what has been produced and get it into our document repository -- all the while being conscious that the enormous volume of stuff I'm producing is a bit overwhelming for everyone.

We are definitely reaching the point where you need an LLM to deal with the onslaught of LLM-generated content, even if the humans are being judicious about editing everything. We're all just cranking on an inhumanly massive amount of output and it's frankly scary.

JambalayaJimbo2mo ago

Didn’t GPT 5.5 just come out lol. Am I just reading slop on this website?

dominotw2mo ago

what have you guys built exactly?

barnabee2mo ago

Everything from complex backend logic and processing to new user-facing features, ops and infra tools, analytical apps, etc.

A lot of engineers now describe the problem, discuss the outline of the solution with the LLM, and then get it to write and test most of it ahead of their review. They tell me it usually takes the same approach they would anyway and even when it doesn’t, it’s often faster to explain what’s wrong and give the LLM another try.

Very little (and even then, only simple internal tools) gets written without a human owning the code and reviewing it thoroughly, but even with that overhead the productivity boost is impressive.

pwinnski2mo ago

I kept asking this question last year, especially after that initial METR report showing people believed themselves to be faster when they were slower. Then I decided to dive in feet-first for a few weeks so that nobody could say I hadn't tried all I could.

At work, what I see happening is that tickets that would have lingered in a backlog "forever" are getting done. Ideas that would have come up in conversation but never been turned into scoped work are getting done, too. Some things are no faster at all, and some things are slower, mostly because the clankers can't be trusted and human understanding can't be sped up, or because input is needed from the product team, etc. But the sorts of things that don't make it into release notes, and are never announced to customers, those are happening faster, and more of them are happening.

We review server logs, create tickets for every error message we see, and chase them down, either fixing the cause or mitigating and downgrading the error message, or whatever is appropriate to the issue. This was already a practice, but it used to feel like we were falling farther behind every week, as the backlog of such tickets grew longer. Mostly low-priority stuff, since obviously we prioritized errors based on user impact, but now remediation is so fast that we've eliminated almost the entire backlog. It's the sort of thing that, if we were a mobile app, would be described as "improvement and bug fixes" generically. It's a lot of quality-of-life issues for us as backend devs.
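For anyone curious what the "one ticket per distinct error message" step might look like, here is a minimal sketch. It is not the commenter's tooling: the log format (`<timestamp> <LEVEL> <message>`), the `normalize` helper, and the function names are all assumptions made for illustration. The idea is only to collapse variable parts of a message so that repeats group into a single ticket candidate.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "<timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^\S+\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def normalize(message: str) -> str:
    """Replace hex ids and numbers with placeholders so repeats group together."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    return re.sub(r"\d+", "<n>", message)


def error_signatures(lines):
    """Return (signature, count) pairs for ERROR lines, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[normalize(match.group("msg"))] += 1
    return counts.most_common()


sample = [
    "2024-01-01T00:00:00Z ERROR timeout after 30 ms for user 42",
    "2024-01-01T00:00:01Z ERROR timeout after 45 ms for user 7",
    "2024-01-01T00:00:02Z INFO started",
    "2024-01-01T00:00:03Z ERROR disk full",
]
signatures = error_signatures(sample)
```

Each signature would then become one ticket, with the count serving as a rough proxy for priority.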

At home, I'm creating projects I don't intend for anyone outside my family to see. So far things I could theoretically have done myself, even related to things I've done myself before, but at a scale I wouldn't bother. Like a price-checker that tracks a watchlist of grocery items at nine local stores and notifies me in discord of sales on items and in categories I care about. It's a little agent posting to a discord channel that I can check before heading out for groceries.
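A rough sketch of how a price-checker like this could be put together, assuming the store prices have already been fetched into a dictionary. The data shapes, function names, and the `User-Agent` string are hypothetical, and this is not the commenter's actual code. Posting uses a Discord webhook, which accepts a JSON body with a `content` field.

```python
import json
import urllib.request


def find_sales(watchlist, store_prices):
    """Return (item, store, price) for every price at or below the item's target."""
    hits = []
    for item, target in watchlist.items():
        for store, prices in store_prices.items():
            price = prices.get(item)
            if price is not None and price <= target:
                hits.append((item, store, price))
    # Sort by item name, then cheapest first.
    return sorted(hits, key=lambda hit: (hit[0], hit[2]))


def format_message(hits):
    """Build the text that would be posted to the channel."""
    if not hits:
        return "No watched items on sale today."
    lines = [f"{item}: ${price:.2f} at {store}" for item, store, price in hits]
    return "Sales on watched items:\n" + "\n".join(lines)


def post_to_discord(webhook_url, text):
    """POST the text to a Discord webhook URL supplied by the caller."""
    body = json.dumps({"content": text}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "User-Agent": "price-checker/0.1",  # hypothetical client name
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical watchlist (target prices) and fetched store prices.
watchlist = {"coffee": 8.0, "butter": 4.0}
store_prices = {
    "StoreA": {"coffee": 7.5, "butter": 4.5},
    "StoreB": {"coffee": 9.0, "butter": 3.99},
}
message = format_message(find_sales(watchlist, store_prices))
```

Run from a cron job, the last step would be a call to `post_to_discord` with your own webhook URL; fetching the prices from each store is the part that varies and is left out here.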

Or several projects related to my hobbies, automating the parts I don't enjoy so much to give me more time for the parts I do. My collection of a half-dozen python scripts and three cron jobs related to those hobbies has grown to just over 20 such scripts and 14 cron jobs. Plus some that are used by an agent as part of a skill, although still scripts I can call manually, because I'll go back to cron jobs for everything if the price of tokens rises a bit more.

I was super-skeptical, and now I'm not. I think companies laying off employees are delusional or using LLMs as an excuse, but there is zero question in my mind that these things can be a huge boon to productivity for some categories of coding.

renegade-otter2mo ago

Development velocity is faster, but the code quality hits take a while to manifest.

Some places are more diligent, but most are not. We HATE reading other people's code, and we only have so much focus capacity per day to review all the shit these clunkers spew out.

Over time, the errors induced by Looks Good To Me code reviews compound.

Jagerbizzle2mo ago· 26 in thread

I'm burning an insane number of tokens 8-12 hours a day for the dramatic improvement of some internal tooling at a big tech company. Using it heavily for an unannounced future project as well.

I presume I'm not the only one.

msy2mo ago

We suddenly have a proliferation of new internal tools and resources, nearly all of which are barely functional and largely useless with no discernible impact on the overall business trajectory but sure do seem to help come promo time.

Barely an hour goes by without a new 4-page document about something that everyone is apparently meant to read, digest and respond to, despite its 'author' having done none of those steps. It's starting to feel actively adversarial.

kranke1552mo ago

Without good management AI is just a new way to make terrible work in unprecedented quantities.

With good management you will get great work faster.

The distinguishing feature between organisations competing in the AI era is process. AI can automate a lot of the work but the human side owns process. If it’s no good everything collapses. Functional companies become hyper functional while dysfunctional companies will collapse.

Bad ideas used to be warded off by workers who in some shape or form of malicious compliance just would slow down and redirect the work while advocating for better solutions.

That can’t happen as much anymore as your manager or CEO can vibe code stuff and throw it down the pipeline for the workers to fix.

If you have bad processes your company will die, or shrivel or stagnate at best. Companies with good process will beat you.

komali22mo ago

> but sure do seem to help come promo time.

I personally noticed this. The speed at which development was happening at one gig I had was impossible to keep up with without agentic development, and serious review wasn't really possible because there wasn't really even time to learn the codebase. Had a huge stack of rules and MCPs to leverage that kinda kept things on the rails and apps were coming out but like, for why? It was like we were all just abandoning the idea of good code and caring about the user and just trying to close tickets and keep management/the client happy. I'm not sure if anyone anywhere on the line was measuring real world outcomes. Apparently the client was thrilled.

It felt like... You know that story where two economists pass each other fifty bucks back and forth and in doing so skyrocket the local GDP? Felt like that.

qingcharles2mo ago

My main use of vibecoding is creating dozens of internal tools that have sped up tasks, or made tasks possible that were previously not. These tools would have taken weeks of time to build manually and would have been hard to justify, rather than just struggling with manual processes every now and again. AI has been life-changing in creating these kinda janky tools with janky UI that do everything they're supposed to perfectly, but are ugly as hell.

Gigachad2mo ago

We had a coworker vibecode an internal tool, do a bunch of marketing to the company at how incredible it is. Then got hired somewhere else.

I just went and deleted it because it's completely broken at every edge case and half of the happy paths too.

hdndjsbbs2mo ago

My team has also adopted this - it's much easier to add another layer than to refine or simplify what exists. We have AI skills to help us debug microservices that call microservices that have circular dependencies.

This was possible before but someone would maybe notice the insane spaghetti. Now it's just "we'll fix it with another layer of noodles".

cobolcomesback2mo ago

We’re seeing the exact same where I work. Our main Slack channels have become inundated with “new tool announcements!”, multiple per day, often solving duplicate problems or problems that don’t exist. We’ve had to stop using those channels for any real conversation because most people are muting them due to the slop noise.

And what’s worse is that when someone does build a decent tool, you can’t help but be skeptical because of all the absolute slop that has come out. And everyone thinks their slop doesn’t stink, so you can’t take them at their word when they say it doesn’t. Even in this thread, how are you to know who is talking about building something useful vs something they think is useful?

A lot of people that have always wanted to be developers but didn’t have the skills are now empowered to go and build… things. But AI hasn’t equipped them with the skill of understanding if it actually makes sense to build a thing, or how to maintain it, or how to evolve it, or how to integrate it with other tools. And then they get upset when you tell them their tool isn’t the best thing since sliced bread. It’s exhausting, and I think we’ve yet to see the true consequences of the slop firehose.

trhway2mo ago

> Barely an hour goes by without a new 4-page document about something that everyone is apparently meant to read, digest and respond to, despite its 'author' having done none of those steps. It's starting to feel actively adversarial.

well, isn't that what AI can be used effectively for - to generate an [auto]response to the AI generated content?

Jagerbizzle2mo ago

I'm sorry to hear that you have people abusing their new superpowers.

I run a team and am spending my time/tokens on serious pain points.

er2d2mo ago

I'm convinced none of these people have any training in corporate finance. For if they did they'd realise they were wasting money.

I guess you gotta look busy. But the stick will come when the shareholders look at the income statement and ask... So I see an increase in operating expenses. Let me go calculate the ROIC. Hm, it's lower, what to do? Oh I know, let's fire the people who caused this (it won't be the C-Suite or management who takes the fall) lmao.

fc417fc8022mo ago

Sounds like a workplace wide DDoS.

_zoltan_2mo ago

That's not on Claude, that's on the authors.

Claude is a tool. It can be abused, or used in a sloppy way. But it can also be used rigorously.

I've been beating my team to be more papercut-free in the tooling they develop and it's been rough mostly because of the velocity.

But overall it's a huge net positive.

jeremyjh2mo ago

I'm sorry to hear you have such poor leadership.

BloondAndDoom2mo ago

AI is truly perfect for internal tooling. Security is less or no concern, bugs are more acceptable, performance / scalability rarely a concern. Quickest way to get things done, and speed up production development, MVP development etc.

jdub2mo ago

> Security is less or no concern

[waits for chickens to come home to roost]

cobolcomesback2mo ago

This comment makes me want to scream.

amluto2mo ago

I am, oddly, able to get really quite a lot of mileage out of $20/mo of OpenAI plan, and I have never encountered a usage limit. I have gotten warnings that I was close a couple times.

I wonder what I’m doing differently.

I did spend quite a bit of time, mostly manually, improving development processes such that the agent could effectively check its work. This made a difference between the agent mostly not working and mostly working. Maybe if I had instead spent gobs of money it would have worked without tooling improvements?

komali22mo ago

I wonder if you're like me? I tried out the MCPs and sub agents and rules and bells and whistles and always just came back to a plain Codex / Claude Code / Cursor Agent terminal window, where I say what I want, @ a few files, let it rip, check the diff, ask for some adjustments, then commit and start the process over after clearing context.

Haven't found a process that beats this yet and I burn very few tokens this way.

se4u2mo ago

I'd be interested to learn what kind of internal tooling you are improving.

Jagerbizzle2mo ago

We've had a lot of complaints about our review processes, time to submit, etc, and a lot of that boils down to tools no one has time to improve.

It's now trivial to fix these problems while still doing our day jobs -- shipping a product.

TimTheTinker2mo ago

Personally, a static analysis PR check to catch some types of preventable runtime production errors in application code

appplication2mo ago

I’m not them, but we have vastly improved our internal pipeline monitoring/triage/root cause/etc. with a new system whose whole purpose is to hook into all of our other systems and consolidate them under a single view, with an emphasis on shortening the time it takes to triage and refine issues.

This would previously have been too ambitious to ever scope, but we’ve been able to build essentially all of it in just two months. Since it sits on top of our other systems and acts as more of a window/pass-through control pane, the fact that it’s vibe coded poses little risk, since we still have all the existing infrastructure under it if something goes awry.

bakugo2mo ago

I guess that's one way to tout a technology as revolutionary without actually needing to provide any proof of it. Just say you're using it for "internal tooling" and "unannounced projects", that way nobody can look at them and notice they're indistinguishable from the slop that clogs up Show HN nowadays.

It's better than the "here's my code, it's a giant pile of spaghetti but only luddites care about code quality and maintainability anyway" method, at least.

Daishiman2mo ago

I'm using it to write frontend code literally 5 times faster. What would have been a shell script is now a GUI backed by an API layer that doesn't require looking up internal documentation to know that it exists.

I've been using it to write tools that drastically facilitate spinning up a local k8s cluster with an entire suite of development services that used to take two days to set up in Docker.

hellisothers2mo ago

Same and it is working really well (I say contra to most individual reporting).

andriy_koval2mo ago

I have a coworker who says something similar. He vibe coded tons of cryptic code, which indeed solves some problem, though it could be way more compact and well structured. Now it is hitting a complexity limit, since the LLM can't comprehend it anymore, and humans can't comprehend it by a large margin.

xtracto2mo ago· 3 in thread

Haven't you seen all the layoffs? I've been subscribed to r/layoffs for 5+ years, and since a couple of months ago, it's been crazy noisy.

My hypothesis is that companies don't want to offer cheaper or better services. They only want to cut costs and keep the revenue for investors.

In other news, TQQQ is pretty high!

adrithmetiqa2mo ago

Subscribers will not enable these companies to make their money back. The only way is for them to eat the economy itself

hmaxwell2mo ago

I'm wondering whether the layoffs are partly targeting people who haven't adapted to using AI tools, particularly those who are openly dismissive of AI-assisted work.

dieortin2mo ago

That’s like firing someone because he uses vim instead of VSCode. Who cares about the tools someone uses if he still does his job well?

_puk2mo ago· 3 in thread

I keep seeing this take.

And yet.. building shit is no longer the sole domain of the software engineer.

That's the sea change.

I've literally had finance and GTM stand things up for themselves in the last few weeks. A few tweaks (obviously around security and access), and they are good to go.

They've gone from wrangling spreadsheets to smooth automated workflows that allow them to work at a higher level in a matter of months.

That's what all this AI is doing. The shit we could never get the time to get around to doing.

er2d2mo ago

So... more 'busy work'.

The only thing that matters is the impact on the financials. The shareholders (the people who employ you) don't care about any of this if it does not enhance value.

uncivilized2mo ago

Mind sharing what industry you’re seeing this in? I’ve never talked to finance or GTM as an engineer. I’m not sure GTM exists in my industry.

Miner49er2mo ago

Are they doing PRs? Putting their code in git? Is AI deploying it or do they get help with that?

johanneskanybal2mo ago· 1 in thread

It's a great tool, and at 1/10 or 1/100th the cost of actual developers. In the context of YC, I guess watch out for getting re-disrupted by a smaller team faster than before. But that's really been the trend for the past 40 years, so nothing is new. Well, maybe the velocity, combined with the US losing its footing at the same time.

But yea it's not gonna make facebook 20% better tomorrow just that you need 5 people instead of 40 to build the next facebook.

jazzyjackson2mo ago

And we all know what a positive impact to quality of life building facebook made.

ravenstine2mo ago

Exactly. Software quality has become worse, online media has become even more trash than before, and life is otherwise basically the same, lack of jobs notwithstanding. The legitimately useful things regular people can use AI for would be mostly solved by locally run quantized models. This AI "revolution" may be setting several billion on fire without even 1% of that being real value added to the world.

Coding velocity doesn't matter if the net result is software that sucks massive schlong. The real world doesn't care if programmers can write code faster.

psadauskas2mo ago

I'm spending a ton of tokens because it insists on manually correcting code that fails the linter, despite the instructions in the AGENTS.md to run the linter with autocorrect.

And also because the Plan agent generates a huge plan, asks me a couple yes/no questions with an obvious answer, and then regenerates the entire plan again. Then the Build agent gets confused anyway and does something else, and I have to round-trip about 5 times with that full context each time.

rnxrx2mo ago

It's not just code generation, either - more and more people in my own org are using Claude Code for infrastructure automation, devops, etc. Obviously some amount of code in there, but an absolute ton of tokens being consumed just dealing with Kubernetes work at scale.

cottoneyejoe2mo ago

I work in a large infrastructure design consultancy. There are massive unmet needs for narrow-purpose automation in project delivery that are being developed by the users that need them for a few pennies at a time. Previously, getting any project of any size done by the in-house developers required 6-24 months of alignment-building with multiple levels of management in multiple business units. Getting anything done comes with an administrative price tag in the $10,000s. Those people spent all their time on a few outward-facing products and building dashboards for EVPs and C-level people, and the people doing real work our customers actually care about are banging rocks together and fingerpainting walls of their caves with excel and email based workflow.

Now there are pockets of people who are extremely productive, and maybe 80-90% of the rest who will never adapt. When I mean extremely, I mean people producing weeks of effort of marketing teams and hundreds of unbillable hours of senior-level professionals with only a few minutes of human involvement. Paid software extensions for (awful) design software we are required by clients to use can all be duplicated in house. Technical leadership is now aware of this, and our spend on software licenses is going to drop fast. I think every project in my own portfolio has some kind of custom automation supporting it, which was unthinkable 5 years ago.

It's going to take years for practical knowledge of how to use these systems to spread and even more for market discipline to expel those who cannot or will not learn. The Industrial Revolution took nearly a century, depending on how you're counting. LLMs have only been producing coherent output in the last 5 years. They've only been as good or better than people at some things you would have done on a computer for about a year or so. Be patient. These are massive changes.

trhway2mo ago

>What is all this AI doing? People are spending 10’s to 100’s of billions and no service or technology seems better or cheaper. Everything is more expensive and worse.

That "more expensive" is someone's revenue. Maybe AI is the kind of technology that lets you make more and more revenue by making things more expensive and worse, rather than by making them better and cheaper.

amelius2mo ago

Yes but help is on the way. I have asked my OpenClaw agent to build a new RAM factory.

notnullorvoid2mo ago

This is my take away too. I see some interesting toys here and there, but not much of substance. Meanwhile all the GitHub issues I follow for open source projects have slowed to a halt, the products I use have no significant updates. Even AI products are slow to improve their interfaces.

bgun2mo ago

You seem to be under the impression that making services better or cheaper _for the consumer_ is the goal of any corporation. The goal is to make their own operations better and cheaper for them. They are laying off employees and adding features of questionable value as a pretext to raise prices. The playbook has not changed, it has only accelerated.

rjlouv2mo ago

Real-world counter-anecdote from mid-market finance (sub-$500M revenue, 2-3k employees). On a customer's JD Edwards data, a playbook run via MCP returns AP aging by company plus the three vendors driving the variance plus the prior-month delta plus citations to the JDE F0411 rows in roughly 30 seconds. The same in Excel takes ~45 minutes. That's not 100x, it's not transformative, and it's not autonomous. It's one analyst hour reclaimed per query. Multiply by ~80 queries a month and a $50/hour fully loaded cost and the math is mundane and positive. Value is real and unsexy. The marketing has compressed all the unsexy ROI into the same sentence as the trillion-dollar promises. (Disclosure: I co-founded eyko, the platform doing this.)
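The arithmetic in the comment above can be spelled out in a few lines. This is only a sketch of the stated figures (80 queries a month, roughly 45 minutes versus roughly 30 seconds, $50/hour fully loaded); the function name is made up, and the cost of the tooling itself is not given in the thread, so it is left out.

```python
def monthly_savings(queries_per_month, minutes_before, minutes_after, hourly_cost):
    """Analyst cost saved per month from faster queries, in dollars."""
    hours_saved = queries_per_month * (minutes_before - minutes_after) / 60
    return hours_saved * hourly_cost


# Using the stated timings: ~45 minutes in Excel vs ~30 seconds via the playbook.
from_timings = monthly_savings(80, 45, 0.5, 50)

# Using the comment's rounding of "one analyst hour reclaimed per query".
from_rounding = monthly_savings(80, 60, 0, 50)

print(f"from timings:  ${from_timings:,.2f} per month")
print(f"from rounding: ${from_rounding:,.2f} per month")
```

Either way the result is a few thousand dollars a month per workflow, which fits the "mundane and positive" description.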

_zoltan_2mo ago

Claude is great. I'm never going back. There is no way back.

I'm at least 5x faster, if not more. With tooling I might be able to get to 10-15x.

pizzly2mo ago

For myself, it's a massive boost when solo developing. Perhaps this is a different use case than most. It can work across multiple programming languages and frameworks that I had zero experience in. I use my existing knowledge of programming to ensure the new code written is correct. Also it really excels at translating from one language/framework to another. I can spend time getting it working well in a platform I know then just ask it to convert to another platform. It gets it 90% right in the first prompt, then it's just a matter of fine-tuning, reviewing etc. This last 10% is where I supercharge my learning on those languages/frameworks. To learn all the new languages and frameworks would have taken me months before I would be productive. Now with a single prompt, we get 90% of the way there. That is incredible value for us.

jonlucc2mo ago

I can say in one role in my job, I'm getting a lot of use and I know my colleagues are at least trying a lot of things. One use is a first-pass review of animal care and use protocols. The Claude project was given all of the relevant policies and guidelines as well as a fairly long prompt that explains the things we look for in protocol review. It's checking some things that the software we use makes very tedious to check and raising inconsistencies between sections. Some places have a full time "protocol reader" who does this kind of first check, but we've never had that, so it's helpful.

Another project I'm seeing in the same realm is taking an approved protocol and some study results and checking that the records of what was done match what they said they could do in the approved protocol. It can also make sure that surgical records have all the things they should have. This can help meet one of the requirements from the national accreditation organization to do "post approval monitoring".

Another way I've used it is to have it collate and compare a particular kind of policy across many institutions who transparently put their policies online. Seeing the commonality between the policies and where some excel helped me rewrite our policy.

This is work that just wasn't happening before or, more accurately, it was being spread over lots of people, and any improvement in efficiency or consistency is hard to measure.
