Once the codebase has become fully agentic, i.e., only agents fundamentally understand it and can modify it, the prices will start rising. After all, these loss-making AI companies will eventually need to recoup their investments.
Sure, it will perhaps be possible to swap out the underlying AI used to develop the codebase, but will the alternatives be significantly cheaper? Of course, the invisible hand of the market will solve that problem - something OPEC has done so successfully for the oil market.
Another issue here: once the codebase is agentic and the price of human developers falls enough that it becomes significantly cheaper to hire humans again, will those humans be able to understand the agentic codebase? Or is this a one-way transition?
I'm sure the pro-AIs will explain that technology will only get cheaper and better and that fundamentally it ain't an issue. Just like oil prices and the global economy, fundamentally everything is getting better.
We will miss SaaS dearly. I think history is repeating itself, just like with DVDs and streaming - we simply bought the same movie twice.
AI feels more and more the same. Half a year ago, Claude Opus was Anthropic's most expensive model - boy, using Claude Opus 4.6 in the 500k version is like paying a dollar per minute now. My once-decent budgets now get exhausted not after weeks but after days (!).
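That "dollar a minute" feeling is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch - the token throughput and per-million-token price below are hypothetical, not Anthropic's actual numbers:

```python
def cost_per_minute(tokens_per_minute: float, usd_per_million_tokens: float) -> float:
    """Rough burn rate: tokens consumed per minute times the price per million tokens."""
    return tokens_per_minute / 1_000_000 * usd_per_million_tokens

# Hypothetical figures: an agent loop chewing through 40k tokens a minute
# at $25 per million tokens lands right at a dollar a minute.
burn = cost_per_minute(40_000, 25.0)
```

At that rate, a single eight-hour day of continuous use would run about $480 - which is how a "decent budget" goes from lasting weeks to lasting days.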
And I am not even using agents or subagents, which would only multiply the costs - for what?
So what we arrive at, more and more, is the same as always: low, medium, and luxury tiers. A boring service with different quality and payment structures.
Proof: you cannot compensate with prompt engineering anymore. A month ago you could fix any model discrepancies by being more clever and elaborate with your prompts.
Not anymore. There is a hidden factor now that accounts for exactly that. It seems that the reliance on skills and different tiers simply moves us away from prompt engineering, which is increasingly treated as jailbreaking rather than guidance.
Prompt engineering lately became so mundane that I wonder what vendors were really doing with the usage data they analyzed. It seems vendors tied certain inquiries to certain outcomes, modeled by multistep prompting that was internally reduced to certain trigger sentences - creating the illusion of having prompted your result when in fact you hadn't.
All you did was ask for the same result thousands of users had asked for before, and the LLM took a statistical approach to delivering it.
Maybe you did, but I certainly didn't.
so 60 usd / hour? a plumber earns more
if this allows you to produce features that bring you money, it's a no-brainer
The current discourse around "AI", swarms of agents producing mountains of inscrutable spaghetti, is a tell that this is the future the big players are looking for. They want to create a captive market of token tokers who have no hope of untangling the mess they made when tokens were cheap without buying even more at full price.
What exactly do we mean by this? Because it is obviously common for human coders to tackle learning how an unfamiliar and complex codebase works so that they can modify it (new hires do it all the time). I can see this meaning one of two things:
* The code and architecture being produced by agents take approaches that are abnormally complex or inscrutable to human reviewers. Is that what folks working with cutting-edge agents are seeing? In which case, such code obviously isn't being reviewed; it can't be.
* The code and architecture being produced by agents can still be understood by human reviewers, but it isn't actually being reviewed by anyone - since reviewing pull requests isn't always fun or easy, and injecting in-depth human review slows everything down a lot - and so no one understands how the code works. (I keep thinking about the AI maximalist who recently said he woke up to 75 pull requests from his agent, like that was a good thing.)
And maybe it’s a combination of the two: agent-generated pull requests are incrementally harder to grok, which makes reviewing more painful and take longer, which means more of them go without in-depth reviews.
But if your claim is true, the bottom line is that it means no one is fully reviewing code produced by agents.
I agree with you, BUT: I find it much harder to get my head around a medium-sized vibe-coded project than a medium-sized bespoke-coded project. It's not even close.
I don't know what codebases will look like if/when they become "fully agentic". Right now, LLM agents get worse, not better, as a codebase grows, and as more of it is coded (or worse, architected) by an LLM.
Humans get better over time in a project and LLMs get worse, and this seems fundamental to the LLM architecture, really. The only real way I see for codebases to become fully agentic right now is if they're small enough. That size grows as the context sizes new models can deal with grow.
If that's how this plays out - context windows get large enough that LLM-agents can work fine in perpetuity in medium or large size projects - I wonder if the resulting projects will be extremely difficult for humans to wrap their heads around. That is, if the LLM relies on looking at massive chunks of the codebase all at once, we could get to the point of fully agentic codebases without having to tackle the problem of LLMs being terrible at architecture, because they don't need it.
- Garden path approaches are definitely a thing, but I don't think this is necessarily catastrophic. A lot depends on the language and framework in question, and also the driver of the change.
- I think it's that plus the fact it's easy to just generate ever more code. Solutions scale in every dimension until they hit a limit where it's not feasible to go further. If AI tools will allow you to write a project with a million or 10 million lines of code, you can bet it will eventually happen. Who's ever gonna fix that?
Sometimes the argument lands, very often it doesn't. As you said, a common refrain is, "but prices won't go up, cost to serve is the highest it will ever be." Or, "inference is already massively profitable and will become more so in the future--I read so on a news site."
And that remark, for me, is unfortunately a discussion-ender. I just haven't ever had a productive conversation with somebody about this after they make these remarks. Somebody saying these things has placed their bets already and is about to throw the dice.
The key is to keep any changes to code small enough to fit in your own "context window." Exceed that at your own risk. Constantly exceeding your capacity for understanding the changes being made leads to either burnout or indifference to the fires you're inevitably starting.
Be proactive with these tools w.r.t. risk mitigation, not reactive. Don't yolo out unverified shit at scales beyond basic human comprehension limits. Sure, you can now randomly generate entirely (unverified) new software into being, but 95% of the time that's a really, really bad idea. It is just gambling and likely some part of our lizard brains finds it enticing, but in order to prevent the slopification of everything, we need to apply some basic fucking discipline.
As you point out, it's our responsibility as human engineers to manage the risk reward tradeoffs with the output of these new tools. Anecdotally, I can tell you, we're doing a fucking bad job of it rn.
- A Kafka topic visualization dashboard
and
- A Chrome extension the original "developer" can no longer work on because the bots will wreck something else with every new feature he tries to add or bug he tries to fix
I think we're a ways out from truly complex code bases that only agents understand.
I've seen a bunch of hype videos where people spend lord knows how much money in order to have a bunch of these things run around and I guess... use Facebook, and make reports to distribute amongst themselves, and then the human comes in and spends all their time tweaking this system. And then apparently one day it's going to produce _something_, but two years and counting and, much like bitcoin, I've yet to see much of this _something_ materialize in the form of actual, working, quality software that I want to use.
My buddy made a thing that tells him how many people are at the gym by scraping their API and pushing it into a small app package... I guess that's kind of nice.
This reminds me of the apocryphal headline from the dying days of the British Empire:
> Fog in Channel; Continent Cut Off
The latest qwen models are already very useful, and the smaller ones can be run locally on my laptop. These are obviously not as good as the latest frontier models, and that's extremely noticeable for the development workflow, but maybe in a year or two, they will be competitive with the proprietary models we have today, which are incredibly capable. I also expect compute for inference to continue getting cheaper.
The current lock in for me is the UX of Claude Code / codex cli, but this is a very small moat that will definitely be commoditized soon.
No worries there, the huge improvements we see today from GPT and Claude, are at their heart just Reinforcement Learning (CoT, chain of thought and thinking tokens are just one example of many). RL is the cheapest kind of training one can perform, as far as I understand. Please correct me if that's not the case.
In the economy the invisible hand manages to produce everything cheaper and better all the time, but in the digital space the open source invisible hand makes everything completely free.
In this case the limitation is the compute. Very few people have the compute required to run AI/LLMs locally or for free (at performance comparable to Claude). So yes, there are plenty of open-source models that can be used locally, but you need to invest in hardware to make that happen, especially if you want the quality that is available from the commercial offerings.
Not to speak of the training of those models. It's all there to make it possible to do this locally, but where's the hardware? AWS? Google? There are hidden costs to the open-source model in this case.
I agree with most of your points, but computation can be transferred from a place where energy is cheap to a place that is expensive. Energy for cooking cannot be transferred that way.
See, for example, the Amazon and Google datacenters in the Gulf region. We've also got a whole continent, Australia, on which to put as many solar panels as we desire. Australia goes dark for half a day, every day? Put solar panels on the opposite side of the planet.
Energy is a concern, for cooking, transportation etc. Energy for computation is not.
Probably there is an issue with how much variety there is in CS - each programming language basically represents a different fundamental approach to coding machines. Each paradigm has its application, even COBOL ;)
Perhaps CS has not - yet - found its fundamental rules and approaches. Unlike other sciences that have hard rules and well trodden approaches - the speed of light is fixed but not the speed of a bit.
> these loss making AI companies will eventually need to recoup
This is true, and while AI spend continues to rise, I'm starting to think that once the dust settles, the true costs emerge, and stable profits are achieved, AI may be expensive enough that it's a limiting force.
I remember having to pay a pretty penny to have a 3 minute conversation with my dad working half way across the world. Now I can video call my nephew for 45 minutes without blinking an eye. What happened?
Why will Intelligence be like Oil and not Broadband?
I would bet a lot of money that the price of LLM assistance will go down, not up, as the hardware and software advance.
Every genre-defining startup seems to go through this same cycle where the naysayers tell us that it's all going to collapse once the investment money runs out. This was definitely true for technologies without use cases (remember the blockchain-all-the-things era?) but it is not true for businesses that have actual users.
Some early players may go bust by chasing market share without a real business plan, like the infamous Webvan grocery delivery service. But even Webvan was directionally correct, with delivery services now a booming business sector.
Uber is another good example. We heard for years that ridesharing was a fad that would go away as soon as the VC money ran out. Instead, Uber became a profitable company and almost nobody noticed because the naysayers moved on to something else.
AI is different because the hardware is always getting faster and cheaper to operate. Even if LLM progress stalled at Opus 4.6 levels today, it would still be very useful and it would get cheaper with each passing year as hardware improved.
> I'm sure the pro-AIs will explain that technology will only get cheaper and better and that fundamentally it ain't an issue. Just like oil prices
Comparing compute costs to oil prices is apples to oranges. Oil is a finite resource that comes out of the ground and the technology to extract it doesn't improve much over decades. AI compute gets better and cheaper every year because the technology advances rapidly. GPU servers that were as expensive as cars a few years ago are now deprecated and available for cheap because the new technology is vastly faster. The next generation will be faster still.
If you're mentally comparing this to things like oil, you're not on the right track
Rideshare costs are much higher than they have been in years past. Everyone noticed
Yes but the chips, hardware, copper cables, silicon and all the rest of the components that make up a server are finite. Unless these magically appear from outer space, we'll face the same resource constraints as everything else that is pulled out of the ground.
These components are also far more fragile to source - see COVID and the collapse of global supply chains. Also, the factories that create these components are expensive to build and fragile to maintain. See the Dutch company that seems to be the sole supplier of certain manufacturing capabilities.[1]
> I would bet a lot of money that the price of LLM assistance will go down, not up, as the hardware and software advance.
My bet would be that it would fuel the profits of AI companies and not make the price of AI come down. Over supply makes price come down but if supply is kept artificially low, then prices stay high.
That's the comparison to OPEC and oil. There is plenty of oil to go around yet the supply is capped and thereby prices kept high. There is no guarantee that savings in hardware or supply will be passed on by AI corps.
Indeed, there is no guarantee that there will be serious competition in the market. OPEC is a cartel, so why not an AI cartel? At the moment, all major players in AI are based in the same geopolitical sphere, making a cartel more likely, IMHO.
In the end, it's all speculation what will happen. It just depends on which fairy tale one believes in.
Raw material cost is not a driver of datacenter GPU costs.
> Over supply makes price come down but if supply is kept artificially low, then prices stay high.
Where are you getting "supply kept artificially low" when we're in the middle of an explosion of datacenter buildouts and AI companies?
We're in a race to the bottom on pricing. I haven't seen a realistic argument for why you think prices are going to go up. You're starting with a conclusion and trying to find reasons it might be true.
Whether a generalized, broadly usable model can be trained within some N multiple of our current compute availability - allowing the price to come down with iterative compute advances - is yet to be seen. With the current race to the top in SOTA models and increasingly smaller iterative improvements over previous generations, I have a feeling the scaling need for compute will outpace improvements in hardware architecture - and that's if Moore's law even holds as we start to reach the bounds of physics rather than engineering.
However, as it stands today, essentially none of these providers are profitable, so it's really a question of whether that disconnect will be resolved within their current runway, or whether they'll be required to increase their price point to stay alive and/or raise more capital. It's pure conjecture either way.
https://www.goodreads.com/quotes/141645-heard-joke-once-man-...
If someone anonymous says "Using coding agents carelessly produces junk results over time" that's a whole lot less interesting to me than someone with a proven track record of designing and implementing coding agents that other people extensively use.
Yes, but we all have insufficient intelligence and knowledge to fully evaluate all arguments in a reasonable timeframe.
Argument from authority is, indeed, a logical fallacy.
But that is not what is happening here. There is a huge difference between someone saying "Trust me, I'm an expert" and a third party saying "Oh, by the way, that guy has a metric shitton of relevant experience."
The former is used in lieu of a valid argument. The latter is used as a sanity check on all the things that you don't have time to verify yourself.
His blog post on pi is here: https://mariozechner.at/posts/2025-11-30-pi-coding-agent/
One thing about the old days of DOS and original MacOS: you couldn't get away with nearly as much of this. The whole computer would crash hard and need to be rebooted, all unsaved work lost. You also could not easily push out an update or patch --- stuff had to work out of the box.
Modern OSes with virtual memory and multitasking and user isolation are a lot more tolerant of shit code, so we are getting more of it.
Not that I want to go back to DOS but Wordperfect 5.1 was pretty damn rock solid as I recall.
It's not the glut of compute resources - we've already accepted bloat in modern software. The new crutch is treating every device as "always online", paired with the mantra of "ship now! push fixes later." It's easier to set up a big, complex CI pipeline that you push fixes into and that OTA-patches the user's system. This way you can justify pushing broken, unfinished products to beat your competitors doing the same.
I still save stuff every few minutes out of habits formed in the 90s.
Old DOS stuff could either be a total nightmare or some of the most brilliant code you had ever seen. That's just the way having no guard rails goes.
Remember when OS uptime was super duper important? Now it's a given that you can basically never restart your computer and be fine.
The sad truth is that now, because of the ease of pushing your fix to everything while requiring little more from the user than that their machine be more or less permanently connected to a network, even an OS is dealt with as casually as an application or game.
The other, arguably far more important output, is the programmer.
The mental model that you, the programmer, build by writing the program.
And -- here's the million dollar question -- can we get away with removing our hands from the equation? You may know that knowledge lives deeper than "thought-level" -- much of it lives in muscle memory. You can't glance at a paragraph of a textbook, say "yeah that makes sense" and expect to do well on the exam. You need to be able to produce it.
(Many of you will remember the experience of having forgotten a phone number, i.e. not being able to speak or write it, but finding that you are able to punch it into the dialpad, because the muscle memory was still there!)
The recent trend is to increase the output called programs, but decrease the output called programmers. That doesn't exactly bode well.
See also: Preventing the Collapse of Civilization / Jonathan Blow (Thekla, Inc)
Perhaps on a related note, I've noticed that a lot of the positive talks about AI are about quantity. On the other hand, there is disproportionately very little deep discussion about quality. And I mean not just short term, local quality, but more long term and holistic quality (e.g. managing complexity under evolving requirements in a complex system with multiple connected parts) at real production scale, where there is much less tolerance for failure.
In all the places I've worked throughout my career, I've felt that there has always been a tension between those who cared more about things like the mental model and holistic quality, and those who seemed to care less or were even oblivious to it. I think one contribution of the current AI hype is that it has given a more concrete shape to this split...
and to me this is so weird, because from what I can tell, quantity hasn't been the winning factor for a very long time now
In software engineering our job is to build reliable systems that scale to meet the needs of our customers.
With the advent of LLMs for generating software, we're simply ignoring many existing tenets of software engineering by assuming greater and greater risk for the hope of some reward of "moving faster" without setting up the proper guard rails we've always had. If a human sends me a PR that has many changes scattered across several concerns, that's an instant rejection to close that PR and tell them to separate those into multiple PRs so it doesn't burn us out reviewing something beyond human comprehension limits. We should be rejecting these risky changes out of hand, with the possible exception when "starting from scratch", but even then I'd suggest a disciplined approach with multiple validation steps and phases.
The hype is snake oil: saying we can and should one-shot everything into existence without human validation, is pure fantasy. This careless use of GenAI is simply a recipe for disasters at scales we've not seen before.
During the Q&A, he responds, "do we really want software written that humans cannot understand?!" His steadfast doubts about the singularity are called into question, at least by his supporting 2019 responses.
Certainly the speaker is correct that modern hardware allows software to be crappily written - I fondly recall the "olden times" recounted about full-access operating systems of yesteryear. Those days are over...
The fact that a modern computer "needs" to be online to install an update is frustrating/concerning (e.g. on macOS, without a USB installer you must be online to update, even with the stand-alone updater downloaded). Just use my local hardware (that I own) and install this software (that I have provided).
And I'm thinking - has anyone actually done that for something meaningful?
Replacing Salesforce as your CRM or replacing Shopify as your e-commerce platform?
I get the hype, but AI doesn't remove accountability, it just moves it up. Oh, you can do with 1 person what 3 people used to do? Great - that 1 person is now accountable for 3 people's jobs. And people are naturally uncomfortable with that: you need to understand what's going on and be able to investigate and fix things. It's different from, say, weaving machines replacing jobs, because weaving machines were consistent - 1 person could confidently produce what x weavers could before. But AI is not, and that variability in output and quality introduces massive friction.
So as of now, in both software and people, there's a real limit to how much AI can replace because the remaining people still are equally accountable.
Current gen agents need to be provided with small, actionable units of work that can _easily_ be reviewed by a human. A code deliverable is made easy to review if the scope of change is small and aligned with a specific feature or task, not sprawled across multiple concerns. The changes must be ONLY related to the task at hand. If a PR is generated that does two very different things, like fix linting errors in preexisting code AND implement feature X, you're doing it wrong. Or rather, you're simply gambling. I'd rather not leave it to chance that I may miss something in that new 10000-LOC PR. It's better that a 10000-LOC PR never existed at all.
YOLOing out massive, sweeping changes with agents exceeds our own (human) "context windows", and as this article points out, we're then left with an inevitable "mess", the untangling of which will take an inordinate amount of time.
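The "small, reviewable unit" rule above can even be enforced mechanically, e.g. by a CI gate on diff size. A minimal sketch that sums changed lines from `git diff --numstat` output - the 400-line limit and the file paths are invented for illustration:

```python
def total_changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output.

    Binary files are reported as '-' in the added/deleted columns
    and are skipped here.
    """
    total = 0
    for line in numstat.strip().splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added == "-":  # binary file, no line counts
            continue
        total += int(added) + int(deleted)
    return total

MAX_REVIEWABLE = 400  # hypothetical team limit on reviewable PR size

sample = (
    "120\t30\tsrc/feature.py\n"
    "-\t-\tassets/logo.png\n"
    "5\t2\ttests/test_feature.py"
)
changed = total_changed_lines(sample)  # 157 lines across two text files
if changed > MAX_REVIEWABLE:
    raise SystemExit(f"PR too large to review: {changed} lines changed")
```

A check like this doesn't catch mixed concerns, but it does make the "10000-LOC PR" impossible to merge silently.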
If AI can enable engineers to move through the organization more effectively, say by allowing them to work through the service mesh as a whole, that could reduce time. But in order to evaluate code contributions to any space well, as far as I can tell, you still have to put in the legwork, even as an experienced engineer, and write some features that expose you to the libraries, quirks, logging/monitoring, language, etc. that make up that specific codebase. (And also to build trust with the people who own that codebase and will be gatekeeping your changes, unless you prefer the Amazon method of having junior engineers YOLO changes onto production codebases without review apparently... holy moly, how did they get to that point in the first place...)
So the gains seem marginal at best in large organizations. I've seen some small organizations move quicker with it, they have less overhead, less complexity, and smaller tasks. Although I've yet to see much besides very small projects/POCs/MVPs from anyone non-technical.
Maybe it'll get to the point where it can handle more complexity, I kind of think we're leveling off on this particular phase of AI, and some headlines seem to confirm that...
- MS starting to make Copilot a bit less prominent in its products and marketing
- Sora shutting down
- Lots of murky, weird, circular deals to fund a money pit with no profits
- Observations at work
It's really kind of crazy how much our entire society can be hijacked by these hype machines. My company did slow roll AI deployment a bit, but it very much feels like the Wild West, and the amount of money spent! I'm sure it's astronomical. Pretty sure we could have hired contractors to create the Chrome plugin and Kafka topic dashboard we've deployed for far cheaper
The problem is that it's VERY easy to overload oneself with the output of these new tools. Human comprehension is the bottleneck, as much as it always has been. Anyone that tells you otherwise is shilling for these companies.
Perhaps so-called AI is slightly different from hypes like NoSQL and microservices, in that those reduced to usages that practically apply to only a fraction of the engineering population (albeit it's still good for anyone to know about them even if we never use them), whereas AI will probably still affect us all even after the dust settles - just in much less spectacular ways than is currently being trumpeted by some groups. It reminded me of No Silver Bullet: "There is no single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity."
There is other tech that did completely change how we do things. CI/CD, Containers, Kubernetes, distributed tracing etc. are considered standard now (but weren’t not that long ago).
As somebody who has been running systems like these for two decades: the software has not changed. What's changed is that before, nobody trusted anything, so a human had to manually do everything. That slowed down the process, which made flaws happen less frequently. But it was all still crap. Just very slow-moving crap, with more manual testing and visual validation. Still plenty of failures, but it doesn't feel like it fails a lot if they're spaced far apart on the status page. The "uptime" is time-driven, not bugs-per-lines-of-code driven.
DevOps' purpose is to teach you that you can move quickly without breaking stuff, but it requires a particular way of working, that emphasizes building trust. You can't just ship random stuff 100x faster and assume it will work. This is what the "move fast and break stuff" people learned the hard way years ago.
And breaking stuff isn't inherently bad - if you learn from your mistakes and make the system better afterward. The problem is, that's extra work that people don't want to do. If you don't have an adult in the room forcing people to improve, you get the disasters of the past month. An example: Google SREs give teams error budgets; the SREs are acting as the adult in the room, forcing the team to stop shipping and fix their quality issues.
One way to deal with this in DevOps/Lean/TPS is the Andon cord. Famously a cord introduced at Toyota that allows any assembly worker to stop the production line until a problem is identified and a fix worked on (not just the immediate defect, but the root cause). This is insane to most business people because nobody wants to stop everything to fix one problem, they want to quickly patch it up and keep working, or ignore it and fix it later. But as Ford/GM found out, that just leads to a mountain of backlogged problems that makes everything worse. Toyota discovered that if you take the long, painful time to fix it immediately, that has the opposite effect, creating more and more efficiency, better quality, fewer defects, and faster shipping. The difference is cultural.
This is real DevOps. If you want your AI work to be both high quality and fast, I recommend following its suggestions. Keep in mind, none of this is a technical issue; it's a business process issue.
Many years ago, I started working for chip companies. It was like a breath of fresh air. Successful chip companies know the costs (both direct money and opportunity) of a failed tapeout, so the metaphorical equivalent of this cord was there.
Find a bug the morning of tapeout? It will be carefully considered and triaged, and maybe delay tapeout. And, as you point out, the cultural aspect is incredibly important, which means that the messenger won't be shot.
In the past with smaller services those services did break all the time, but the outage was limited to a much smaller area. Also systems were typically less integrated with each other so one service being down rarely took out everything.
What leads to more failure is when you don't engineer those consolidated entities to be reliable. Tech companies have none of the legal requirements or incentives to be reliable, the way physical infrastructure companies do. I agree that the tighter integration is an issue, but the root cause is tech companies have no incentive other than profits. If they're making profits, everything's fine.
A service goes down. He tells the agent to debug it and fix it. The agent pulls some logs from $CLOUDPROVIDER, inspects the logs, produces a fix and then automatically updates a shared document with the postmortem.
This got me thinking that it's very hard to internalize both the issue and the solution (updating your mental model of the system involved), because there is not enough friction to make you spend time dealing with the problem (coming up with hypotheses, modifying the code, writing the doc). I thought about my very human limitation of having to write things down on paper so that I can better recall them.
Then I recalled something I read years ago: "Cars have brakes so they can go fast."
Even assuming it is now feasible to produce thousands of lines of quality code, there is a limitation on how much a human can absorb and internalize about the changes introduced to a system. This is why we will need brakes -- so we can go faster.
And at that point, if the autonomous system breaks, realizes it's broken, and fixes itself before you even notice... then do you need to care whether you learn from it? I suppose this could obfuscate some shared root cause that gets worse and worse, but if your system is robust and fault-tolerant _and_ self-heals, then what is there to complain about? Probably plenty, but now you can complain about it one level of abstraction higher.
I've been working on an Ansible codebase in the past few weeks. I manually put it together a few years ago and unleashed codex on it to modernize it and adapt it to a new deployment. It's been great. I have a lot of skills in that repository that explain how to do stuff. I'm also letting codex run the provisioning and do diagnostics. You can't do that unless you have good guard rails. It's actually a bit annoying because it will refuse to take shortcuts (where I maybe would) and sticks to the process.
I actually don't write the skills directly. I generate them. Usually at the end of a session where I stumbled on something that works. I just tell it to update the repo local skills with what we just did. Works great and makes stuff repeatable.
I'm at this point comfortable generating code in languages I don't really use myself. I currently have two Go projects that I'm working on, for example. I'm not going to review a lot of that code ever. But I am going to make sure it has tests that prove it implements detailed specifications. I work at the specification level for this. I think a lot of the industry is going to be transitioning that direction.
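The shift to working "at the specification level" can be made concrete. A minimal sketch in Python (the commenter uses Go; the `slugify` function and its rules are hypothetical): the human writes the specification as executable checks, and the generated implementation merely has to pass them, so nobody ever needs to read it line by line.

```python
import re

# Hypothetical agent-generated implementation; the human never reads it closely.
def slugify(title: str) -> str:
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")

# The human-authored specification, written as executable checks.
def test_spec():
    assert slugify("Hello, World!") == "hello-world"   # punctuation collapses to one dash
    assert slugify("  spaced  out  ") == "spaced-out"  # no leading/trailing dashes
    assert slugify("UPPER") == "upper"                 # output is lowercase
    assert all(c.isalnum() or c == "-" for c in slugify("a&b"))  # restricted alphabet

test_spec()
```

The implementation can be regenerated at will; only the spec is the durable artifact.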
What are you building? Does the tool help or hurt?
People answered this wrong in the Ruby era, they answered it wrong in the PHP era, they answered it wrong in the Lotus Notes and Visual BASIC era.
After five or six cycles it does become a bit fatiguing. Use the tool sanely. Work at a pace where the reality of the mess you and your team are actually building does not exceed your understanding of it, budgets allowing.
This seldom happens, even in solo hobby projects once you cost everything in.
It's not about agile or waterfall or "functional" or abstracting your dependencies via Podman or Docker or VMware or whatever that nix crap is. Or using an agent to catch the bugs in the agent that's talking to an LLM you have next to no control over, which deleted your production database while you slept, then asking it to make illustrations for the postmortem blog post you have it write, which you think elevates your status in the community but probably doesn't.
I'm not even sure building software is an engineering discipline at this point. Maybe it never was.
This x1000. The last 10 years in the software industry in particular seems full of meta-work. New frameworks, new tools, new virtualization layers, new distributed systems, new dev tooling, new org charts. Ultimately so we can build... what exactly? Are these necessary to build what we actually need? Or are they necessary to prop up an unsustainable industry by inventing new jobs?
Hard to shake the feeling that this looks like one big pyramid scheme. I strongly suspect that the vast majority of the "innovation" in recent years has gone straight to supporting the funding model and institution of the software profession, rather than actual software engineering.
> I'm not even sure building software is an engineering discipline at this point. Maybe it never was.
It was, and is. But not universally.
If you formulate questions scientifically and use the answers to make decisions, that's engineering. I've seen it happen. It can happen with LLMs, under the proper guidance.
If you formulate questions based on vibes, ignore the answers, and do what the CEO says anyway, that's not engineering. Sadly, I've seen this happen far too often. And with this mindset comes the Claudiot mindset - information is ultimately useless so fake autogenerated content is just as valuable as real work.
* the ability to find essentially any information ever created by anyone anywhere at anytime,
* the ability to communicate with anyone on Earth over any distance instantaneously in audio, video, or text,
* the ability to order any product made anywhere and have it delivered to our door in a day or two,
* the ability to work with anyone across the world on shared tasks and projects, with no need for centralized offices for most knowledge work.
That was a massive undertaking with many permutations requiring lots of software written by lots of people.
But it's largely done now. Software consumes a significant fraction of all waking hours of almost everyone on Earth. New software mainly just competes with existing software to replace attention. There's not much room left to expand the market.
So it's difficult to see the value of LLMs that can generate even more software even faster. What value is left to provide for users?
LLMs themselves have the potential to offer staggering economic value, but only at huge social cost: replacing human labor on scales never seen before.
All of that to say, maybe this is why more time is being spent on meta-work today than on actual software engineering.
The fundamental ceiling of what an LLM can do when connected to an IDE is incredible, and orders of magnitude higher than the limits of any no-code / low-code platform conceived thus far. "Democratizing" software - where now the only limits are your imagination, tenacity, and ability to keep the bots aligned with your vision, is allowing incredible things that wouldn't have happened otherwise because you now don't strictly need to learn to program for a programming-involved art project to work out.
Should you learn how to code if you're doing stuff like that? Absolutely. But is it letting people who have no idea about computing dabble their feet in and do extremely impressive stuff for the low cost of $20/month? Also yes.
Except from. You know, books. And all the websites die pretty fast. At an insane rate.
>* the ability to communicate with anyone on Earth over any distance instantaneously in audio, video, or text,
No contact info, intentionally.
>* the ability to order any product made anywhere and have it delivered to our door in a day or two,
You can buy the same things from a thousand stores with 99% asking many times what it costs.
>* the ability to work with anyone across the world on shared tasks and projects, with no need for centralized offices for most knowledge work.
Again, in theory yes. I wish it was all true, and it should be. But it isn't, sadly.
In the past two or three days I generated an interactive disk usage crawler tailored to my operating system and my needs. I have audited essentially none of the code, merely providing vision and detailed explanations of the user experience and algorithms that I want, and yet got back an interactive TUI application that does what I want with tons of tests and room to easily expand. I plan to audit the code more deeply soon to get it into a shape I'd be more comfortable open-sourcing. One thing agents suck at is meaningful DRY.
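The non-interactive core of such a crawler is small; a sketch in Python (no TUI, symlinks skipped, all names my own) that rolls file sizes up into per-directory totals by visiting deepest paths first:

```python
from pathlib import Path

def dir_sizes(root: Path) -> dict[Path, int]:
    """Bottom-up pass: each directory's total includes its subdirectories."""
    sizes: dict[Path, int] = {}
    # Deepest paths first, so a directory's children are summed before it is.
    for path in sorted(root.rglob("*"), key=lambda p: len(p.parts), reverse=True):
        if path.is_file() and not path.is_symlink():
            sizes[path.parent] = sizes.get(path.parent, 0) + path.stat().st_size
        elif path.is_dir():
            sizes[path.parent] = sizes.get(path.parent, 0) + sizes.get(path, 0)
    return sizes
```

A real tool would layer an interactive view (curses, Textual, etc.) on top of this dictionary; the traversal is the part an agent gets right almost for free.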
Somehow I doubt that. The monkey is never satisfied.
Maybe agents and AI in general will help with that. Maybe it will just make the problem worse.
I know a half dozen people who've created working software in the past month to solve a problem nothing else solved as well as what they made themselves. Software developers have finally automated themselves out of a job.
(I still think it's interesting that this requires pre-existing languages, libraries, etc, so this might not work in the future. But at least for now, we now have "Visual Basic" without the need for the visual part)
A spreadsheet editor at most a couple of hundred MBs in size that can compete against Excel, for example, while also not eating up RAM. The same goes for a new browser and a new browser engine; it's time for Chrome to have a real competitor, it has become a mess. I can think of other such examples, but these are the two biggest ones.
Everything and anything people actually want or need, whether it’s every day or just for five minutes, that nobody else could be bothered to make.
Today most won’t know what to do with it, just like they didn’t know what to do with a web browser.
But that won’t last.
I can imagine all the people staring at these software projects amazed at the genius it must have taken to create them. :)
https://quoteinvestigator.com/2023/06/23/invented/
The actual quote from 1884 seems to have been: "The advancement of the arts, from year to year, taxes our credulity, and seems to presage the arrival of that period when human improvement must end." - Henry L. Ellsworth
Either way we have a lot of things but it's not quite STTNG yet. There's no limit to how much more we can do.
The overwhelming majority of real jobs are not related to these things you read about on Hacker News.
I help a local group with resume reviews and job search advice. A common theme is that junior devs really want to do work in these new frameworks, tools, libraries, or other trending topics they've been reading about, but discover that the job market is much more boring. The jobs working on those fun and new topics are few and far between, generally reserved for the few developers who are willing to sacrifice a lot to work on them or very senior developers who are preferred for those jobs.
It can seem that the majority of software in the world is about generating clicks and optimising engagement, but that’s just the very loud minority.
Someone here shared an article recently espousing something along the lines of "home garden programming." I see software development moving in this direction, just like machining did: either in a space-age shop that looks more like a lab, with a five-axis machining center, or in the garage with Grandpappy's clapped-out Atlas, and nothing in between.
Feels like there’s a counter to the frequent citation of the Jevons paradox in there somewhere, in the context of LLM impact on the software dev market. Overestimation of external demand for software, or at least any that can be fulfilled by a human-in-the-loop / one-dev-to-many-users model? The end goal of LLMs feels like, in effect, the Last Framework, and the end of (money in) meta-engineering by devs for devs.
I haven't tried this myself, but I'm curious whether an LLM could build a scalable, maintainable app that doesn't use a framework or external libraries. Could be dangerous due to lack of training data, but I think it's important to build stuff that people use, not stuff that people use to build stuff that people use to build stuff that....
Not that meta frameworks aren't valuable, but I think they're often solving the wrong problem.
I think the entire software industry has reached a saturation point. There's not really anything missing anymore. Existing tools do 99% of what we humans could need, so you're just getting recycled and regurgitated versions of existing tools... slap a different logo and a veneer on it, and it's a product.
We still don’t have truly transparent transference in locally-run software. Go anywhere in the world, and your locally running software tags along with precisely preserved state no matter what device you happen to be dragging along with you, with device-appropriate interfacing.
We still don’t have single source documentation with lineage all the way back to the code.
We still don’t treat introspection and observability as two sides of a troubleshooting coin (I think there are more “sides” but want to keep the example simple). We do not have the kind of introspection on modern hardware that Lisp Machines had, and SOTA observability conversations still revolve around sampling enough at the right places to make up for that.
We still don’t have coordination planes, databases, and systems in general capable of absorbing the volume of queries generated by LLMs. Even if LLM models themselves froze their progress as-is, they’re plenty sophisticated enough when deployed en masse to overwhelm existing data infrastructure.
The list is endless.
IMHO our software world has never been so fertile with possibilities.
Don't forget App Stores. Everyone's still trying to build app stores, even if they have nothing to sell in them.
It's almost as if every major company's actual product is their stock price. Every other thing they do is a side quest or some strategic thing they think might convince analysts to make their stock price move.
They are pretty much legally obligated to act in this manner.
It's almost as if we lived under capitalism.
What other thing would they do? They are literally setting the Earth on fire to raise the stock price. No hostages taken.
The true alignment problem behind the ploy of an AGI alignment problem for prêt-à-penser SF philosophers. Or prestidigitators.
This is because all the low-hanging fruit has already been built. CRM. Invoicing. HR. Project/task management. And hundreds of others in various flavors.
Sure everything seems to have gotten better and that's why we now need AIs to understand our code bases - that we created with our great version control tooling.
Fundamentally we're still monkeys at keyboards just that now there are infinitely many digital monkeys.
I don't need an AI to understand my code base, and neither do you. You're smarter than you give yourself credit for.
It has, but we have gotten there by stacking turtles, by building so many layers of abstraction that things no longer make sense.
Think about this: hardware -> hypervisor -> VM -> container -> Python/Node/Ruby runtime, all to compile it back down to bytecode to run on a CPU.
Some layers exist because of the push/pull between systems being single user (PC) and multi user (mainframe). We exacerbated the problem when "installable software" became a "hard problem" and wanted to mix in "isolation".
And most of that software is written on another pile of abstractions. Most codebases have disgustingly large dependency trees. People keep talking about how "no one is reviewing all this AI-generated code"... well, the majority of devs sure as shit aren't reviewing that dependency tree either. Just yesterday there was yet another "supply chain attack".
How do you protect yourself from such a thing? Stack on more software. You can't really use "sub-repositories/modules" in git; it was never built that way because Linus didn't need that. The rest of us really do... so we add something like Artifactory to protect us from the massive pile of stuff you're dependent on but NOT looking at. It's all just more turtles on more piles.
Lots of corporate devs I know are really bad at reviewing code (open source much less so). The PR code review process in many orgs is to find the person who rubber-stamps and avoid the people who only bikeshed. I suspect it's because we have spent the last 20 years on the leetcode interview, where memorizing algorithms and answering brain teasers was the filter, not reading, reviewing, debugging, and stepping through code. Our entire industry is "what is the new thing", "next framework"-pilled because of this.
You are right that it got better, but we got there by doing all the wrong things, and we're going to have to rip a lot of things apart and "do better".
Neither myself nor the vast majority of other “software engineers” in our field are living up to what it should mean to be an “engineer”.
The people that make bridges and buildings, those are the engineers. Software engineers, for the very very most part, are not.
“Developers build things. Engineers build them and keep them running.”
I like the linguistic point from a standpoint of emphasizing a long term responsibility.
Most recently I wrote CloudFormation templates to bring up infra for AWS-based agents. I don't use AI-assisted coding, except Googling, which I acknowledge now returns an AI summary.
A friend of mine is in a toxic company where everyone has to use AI and they're looked down upon if they don't use it. Every minute of their day has to be logged doing something. They're also going to lay off a bunch of people soon since "AI has replaced them." This is in the context of an agency.
Of course, we use that term for something else in the software world, but architecture really has two tiers, the starchitects building super fancy stuff (equivalent to what we’d call software architects) and the much more normal ones working on sundry things like townhomes and strip malls.
That being said I don’t think people want the architecture pay grades in the software fields.
We're engineers.
1. https://en.wikipedia.org/wiki/Engineer#Definition
2. https://www.abet.org/accreditation/accreditation-criteria/cr...
Maybe software tinkerer?
You should see the code that scientists write...
If I engineer a bridge I know the load the bridge is designed to carry. Then I add a factor of safety. When I build a website can anyone on the product side actually predict traffic?
When building a bridge I can consult a book of materials and understand how much a material deforms under load, what its breaking point is, its expected lifespan, etc. Does this exist for servers, web frameworks, network load balancers, etc.?
I actually believe that software “could” be an engineering discipline but we have a long way to go
Hypothetically, could you not? If you engineer a bridge you have no idea what kind of traffic it'll see. But you know the maximum allowable weight for a truck of X length is Y tons and factoring in your span you have a good idea of what the max load will be. And if the numbers don't line up, you add in load limits or whatever else to make them match. Your bridge might end up processing 1 truck per hour but that's ultimately irrelevant compared to max throughput/load.
Likewise, systems in regulated industries have strict controls for how many concurrent connections they're allowed to handle[1], enforced with edge network systems, and are expected to do load testing up to these numbers to ensure the service can handle the traffic. There are entire products built around this concept[2]. You could absolutely do this, you just choose not to.
[1] See NIST 800-53 control SC-7 (3)
[2] https://learn.microsoft.com/en-us/azure/app-testing/load-tes...
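For what it's worth, the bridge-style reasoning above is mechanical once you commit to worst-case numbers. A sketch in Python with entirely hypothetical figures: rate the worst-case "load per vehicle" (peak memory per request), apply a safety factor, and derive the connection limit you then enforce at the edge.

```python
# A sketch of bridge-style capacity math applied to a web service.
# All numbers are hypothetical; the point is the shape of the reasoning.

def max_concurrent_connections(worker_count: int,
                               mem_per_worker_mb: float,
                               peak_mem_per_request_mb: float,
                               safety_factor: float = 2.0) -> int:
    """Like rating a bridge: take the worst-case load per 'vehicle'
    (peak memory per request), apply a safety factor, and derive the
    limit that an edge system then enforces."""
    per_worker = mem_per_worker_mb / (peak_mem_per_request_mb * safety_factor)
    return int(per_worker) * worker_count

limit = max_concurrent_connections(worker_count=8,
                                   mem_per_worker_mb=512,
                                   peak_mem_per_request_mb=16)
print(limit)  # 8 workers * int(512 / 32) = 128
```

Load testing then plays the role of the proof test: drive the service to the derived limit and confirm it holds, rather than guessing at traffic.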
This is tremendously expensive (writing two or more independent copies of the core functionality!) and rapidly becomes intractable if the interaction with the world is not pretty strictly limited. It's rarely worth it, so the vast majority of software isn't what I'd call engineered.
If I need a bridge, and there's a perfectly beautiful bridge one town over that spans the same distance - that's useless to me. Because I need my own bridge. Bridges are partly a design problem but mainly a build problem.
In software, if I find a library that does exactly what I need, then my task is done. I just use that library. Software is purely a design problem.
With agentic coding, we're about to enter a new phase of plenty. If everyone is now a 10x developer then there's going to be more software written in the next few years than in the last few decades.
That massive flurry of creativity will move the industry even further from the calm, rational, constrained world of engineering disciplines.
I think this vastly underestimates how much of the build problem is actually a design problem.
If you want to build a bridge, the fact one already exists nearby covering a similar span is almost meaningless. Engineering is about designing things while using the minimal amount of raw resources possible (because cost of design is lower than the cost of materials). Which means that bridge in the other town is designed only within its local context. What are the properties of the ground it's built on? What local building materials exist? Where local can be as small as only a few miles, because moving vast quantities of material of long distances is really expensive. What specific traffic patterns and loadings it is built for? What time and access constraints existed when it was built?
If you just copied the design of a bridge from a different town, even one only a few miles up the road, you would more than likely end up with a design that either won't stand up in your local context, or simply can't be built. Maybe the other town had plenty of space next to the location of the bridge, making it trivial to bring in heavy equipment and use cranes to move huge pre-fabbed blocks of concrete, but your town doesn't. Or maybe the local ground conditions aren't as stable, and the other towns design has the wrong type of foundation resulting in your new bridge collapsing after a few years.
Engineers in other disciplines don't have the luxury of building for a very uniform, tightly controlled target environment where it's safe to assume that common building blocks will "just work" without issue. As a result engineering is entirely a design problem, i.e., how do you design something that can actually be built? The building part is easy; there's a reason construction contractors get paid comparatively little compared to the engineers and architects who design what they're building.
- license restrictions, relicensing
- patches, especially to fix CVEs, that break assumptions you made in your consumption of the package
- supply chain attacks
- sunsetting
There’s no real “set it and forget it” with software reuse. For that matter, there’s no “set it and forget it” in civil engineering either, it also requires monitoring and maintenance.
In certain mission-critical applications, it is treated as engineering. One example - https://en.wikipedia.org/wiki/DO-178B
I'd propose a definition of engineering that's more or less just "composing tools together to solve problems".
- Edsger Dijkstra, 1988
I think, unfortunately, he may have had us all dead to rights on this one.
Dijkstra was a mathematician. It is a necessary discipline. If it alone were sufficient, then the "program correctness" fans would have simply and inarguably outdone everyone else forty years ago at the peak of their efforts, instead of having resorted to eloquently whiny, but still whiny, thinkpieces (such as the 1988 example [1] quoted here above) about how and why they would like history to understand them as having failed.
[1] https://www.cs.utexas.edu/~EWD/ewd10xx/EWD1036.PDF
[2] I will freely grant that the man both wrote and lettered with rare beauty, which shames me even in this photocopier-burned example when I compare it to the cheerful but largely unrefined loops and scrawls of my own daily hand.
But yes, I think the best rebuttal to Dijkstra-style griping is Perlis' "one can't proceed from the informal to the formal by formal means". That said I also believe kind of like Chesterton's quote about Christianity, they've also mostly not been tried and found wanting but rather found hard and left untried. By myself included, although I do enjoy a spot of the old dependent types (or at least their approximations). There's an economic argument lurking there about how robust most software really needs to be.
Literally nothing else matters, and we (or at least I) have wasted a ton of time getting good at writing software.
I agree, but I'm not sure this says what you think it does.
The people on the car assembly line may know nothing of engineering, and the assembly line has theoretically been set up where that is OK.
The people on the software assembly line may also (and arguably often do) know nothing of engineering, but it's not clear that it is possible to set up the assembly line in such a way so as to make this OK.
Arguably, the use of LLMs will at least have some utility in helping us to figure this out, because a lot of LLMs are now being used on the assembly line.
I once received a "bonsai" seed kit from a former boss during a holiday dinner. I think it was meant as a joke, but even now I'm not so sure. I planted those seeds anyway. I told some people about it and they immediately mocked me saying it was a waste of time and going to take 30 years. This interaction immediately said everything to me about the expectations and attitudes of others.
Obviously, they grew like any other plants and actually quite nicely. Of course they're a commitment, but not a huge one.
I just wanted some plants for my apartment and they fit the bill. In a few years I had good looking plants. A decade later, I still have them and they're now more recognizably "bonsai". My home now looks nicer, I have a story to tell, and I learned a little bit from a very low stakes hobby.
My point is, I think it's nice when people have projects. I think it's nice to see what comes of it. I guess my only regret is ever saying "I planted bonsai" too soon just because that's what the box said. I didn't know how else to describe what I had done that weekend to those people who threw theirs in the trash.
I wouldn't've laughed at you. I view bonsai as a representation of steadfastness, endurance, determination, effort, (and self-mastery?) in the face of tremendous hardship, challenge, and deprivation. That said, I've never been particularly good at any of those things.
IDK if I would've taken you all that seriously either, though. Six months until you move and it's left behind on the curb. Or a year and a half until your cat knocks it off the windowsill. Or three years until some blight infects it and it dies off despite your best efforts. Eight years until, for whatever reason, it just succumbs to some kind of vegetative ennui. Nine years until your significant other overwaters it one too many times and the roots rot.
That's not meant disrespectfully. I just tend to view uncertainty and complexity as opportunities for shit to go sideways. Especially in this case, where it's unlikely you'll wake up to find your tree has spontaneously cloned itself, or has eaten a 1-UP mushroom. Disasters happen all the time, and miracles don't.
I suppose I'm just having a bit of a spiritual crisis right now. But thank you for your comment. It gives me a lot to think about, in a positive sense.
All that is gold does not glitter,
Not all those who wander are lost;
The old that is strong does not wither,
Deep roots are not reached by the frost.
― J.R.R. Tolkien, The Fellowship of the Ring

That's increasingly not possible. This is the first time for me in 20 years where I've had a programming tool rammed down my throat.
There's a crisis of software developer autonomy and it's actually hurting software productivity. We're making worse software, slower, because the C-levels have bought this fairy tale that you can replace 5 development resources with 1 development resource plus some tokens.
In 18 years, AI is the third or fourth tool forced upon a shop/team I've worked in. I will say that of those, it is the first one that has genuinely made me more productive overall, even with the drawbacks.
I think AI really pushes this higher up the abstraction layer:
> What problem are you solving?
I've spent a good amount of my career using engineering and math to solve specific problems, usually adjacent to software teams.
What I've seen happen with agentic coding is that traditional software engineers keep focusing on using it to build software, while ignoring the problem they're trying to solve.
Meanwhile I've seen junior data analysts start interfacing with applications and tools they never dreamed of before, and delivering results to stakeholders in record times. Things that were previously blocked by engineering no longer are.
But many engineers today are not really problem solvers, they're software builders. The idea that solving the end users problem is the goal, not building them software, is incomprehensible.
And so they continue to struggle to use AI effectively because they're trying to build software with it. Which it's not terrible at, but it's really the wrong tool for that job.
Sometimes software is necessary to solve a problem, a few years ago, software was necessary for a fairly large problem surface area (though, to your point, even then a lot of software was not really built to solve those problems). Today that surface area is shrinking, and as economic constraints loom on the horizon, I believe it will increasingly be people who are solving problems (with or without AI) that will be the ones surviving.
The bigger the problem set and context the less helpful an LLM gets.
Other places were "hack it until we don't know of any major bugs, then ship it before someone finds one". And now they're "hey, AI agents - we can use that as a hack-o-matic!" But they were having trouble with sustainability before, and they're going to still, except much faster.
I don't think people were releasing at this pace before, so the failure states come fast and furious and there is just that much more visibility. I think the Microslop Windows failures lately are just them being the same "them" they've always been... just MUCH faster. (They just need to stop monkeying with Windows and stop adding more features on top of an already shaky foundation.) Maybe we need more stories like Anthropic working with Mozilla to squash 5x the bugs in a similar time frame first, AND THEN "vibe a browser together from nothing but specification files and an army of bots in a weekend".
Personally I think that whole Karpathy thing is the slowest thing in the world. I mean you can spin the wheels on a dragster all you like and it is really loud and you can smell the fumes but at some point you realize you're not going anywhere.
My own frustration with the general slowness of computing (iOS 26, file pickers, build systems, build systems, build systems, ...) has been peaking lately and frankly the lack of responsiveness is driving me up the wall. If I wasn't busy at work and loaded with a few years worth of side projects I'd be tearing the whole GUI stack down to the bottom and rebuilding it all to respect hard real time requirements.
> People answered this wrong in the Ruby era, they answered it wrong in the PHP era, they answered it wrong in the Lotus Notes and Visual BASIC era.
I'm assuming you're saying these tools hurt more than help?
In that case I disagree so much that I'm struggling to reply. It's like trying to convince someone that the Earth is not flat, to my mental model.
PHP, Ruby and VB have more successful code written in them than all current academic or disproportionately hyped languages will ever have combined.
And there's STILL software being written in them. I did Visual Basic consulting for a greenfield project last week, despite my current expertise being more with Go, Python, C#, and C. And there's RoR work lined up next. So the presence gap between these helpful tools and other minor but over-indexed tools is still increasing.
It's easy to think that the languages one sees more often on HN are the prevalent ones, but they are just the tip of the iceberg.
Now I barely look at ticket requirements, feed it to an LLM, have it do the work, spend an hour reviewing it, then ship it 3 days later. Plenty of fuck off time, which is time well spent when I know nothing will change anyway. If I'm gonna lose my career to LLMs I may as well enjoy burning shareholder capital. I've optimized my life completely to maximize fuck off time.
At the end of the day they created the environment. It would be criminal to not take advantage of their stupidity.
I was interested in making a semi-autonomous skill improvement program for open code, so I wired up systemd to watch my skills directory; when a new skill appeared, it'd run a command prompt to improve it and conform it to a skill specification.
It was told to make a lock file before making a skill, then remove the lock file. Multiple times it'd ignore that, make the skill, then lock and unlock on the same line. I also wanted to lock the skill from future improvements, but that context overrode the skill's locking, so instead I used the concept of marking the skills as read-only.
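The lock discipline the agent kept botching is, ironically, a few lines of deterministic code. A hedged sketch (the paths and the improve step are placeholders for whatever the watcher actually runs): create the lock atomically before touching the skill, and release it only after the work is done.

```python
import os
from pathlib import Path

LOCK_SUFFIX = ".lock"  # hypothetical convention

def improve_skill(skill: Path) -> None:
    # Stand-in for invoking the agent against the skill specification.
    skill.write_text(skill.read_text())

def process_skill(skill: Path) -> bool:
    """Returns False if another process already holds the lock."""
    lock = skill.with_suffix(skill.suffix + LOCK_SUFFIX)
    try:
        # O_EXCL makes creation atomic: if the lock exists, someone else owns it.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    try:
        os.close(fd)
        improve_skill(skill)
    finally:
        lock.unlink()  # release only after the work is done
    return True
```

The read-only trick from the comment maps onto the same idea: `os.chmod(skill, 0o444)` is state the filesystem enforces, instead of an instruction the model may or may not follow.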
So in reality, agents only exist because of context poisoning and overlap; they're not some magical balm that improves the speed of work or multiplies effort, they simply prevent context poisoning from what are essentially subprocesses.
Once you realize that, you really have to scale back the reality because not only are they just dumb, they're not integrating any real information about what they're doing.
At some point I became so burnt out I couldn't look at an IDE or coloured text for that matter.
I found the way back by just changing my motto and focus... Find good people, do good work. That's it, that's all I want.
I don't care whether the 'property is hot' or what the market is doing anymore, I just build software in my lane, with good people around.
RoR is no longer at its peak, but still has its marginal, stable share of the web, while PHP gets the lion's share[1].
Ok, Lotus Notes is really a relic from another era now. But it's not a PL, so it's not the same kind of beast.
Well, LLMs are also a different beast compared to PLs. They really are the things that best evoke the expression "taming the beast" when you need to deal with them. So it is indeed about as far away from engineering as one can get while still using a computer to build automation. To stay in scientific realms, ethology might be a better starting point than a background in informatics/CS for handling these things.
Aren't you conveniently ignoring the fact that there were people who saw through that and didn't go down those routes?
Or better yet point out the better paths they chose instead. Were they wrestling with Java and "Joda Time"? Talking to AWS via a Python library named after a dolphin? Running .NET code on Linux servers under Mono that never actually worked? Jamming apps into a browser via JQuery? Abstracting it up a level and making 1,400 database calls via ActiveRecord to render a ten item to-do list and writing blog posts about the N+1 problem? Rewriting grep in Rust to keep the ruskies out of our precious LLCs?
Asking the wrong questions, using the wrong tools, then writing dumb blog posts about it is what we do. It's what makes us us.
On one hand there's an approach to computing where it is a branch of mathematics that is universal. There are some creatures that live under the ice on a moon circling a gas giant around another star, and if they have computers they are going to understand the halting problem (even if they formulate it differently), know bubble sort is O(N^2), and know about algorithms that sort in O(N log N).
On the other hand we are divided into communities of practice that don't like one another. For instance there is the "OO sux" brigade, which thinks I suck because I like Java. There still are shops where everything is done in a stored procedure (oddly like the fashionable architecture where you build an API server just because... you have to have an API) and other shops where people would think you were brain-damaged to go anywhere near stored procs, triggers or any of that. It used to be that Linux enthusiasts thought anybody involved in Windows was stupid, and you'd meet Windows admins who were click-click-click-click-clicking over and over again to get IIS somewhat working and who thought IIS was the only web server good enough for "the enterprise".
Now apart from the instinctual hate for the tools, there really are those chronic conceptual problems for which datetime is the poster child. I think every major language has been through multiple datetime libraries, in and out of the standard lib, in the last 20 years, because dates and times just aren't the simple things we wish they were, and the school of hard knocks keeps knocking us into accepting a complicated reality.
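To make the datetime point concrete, here's a small Python sketch of two classic traps that keep forcing these library rewrites (the dates are chosen arbitrarily):

```python
from datetime import datetime, timedelta, timezone

# Trap 1: naive vs. aware datetimes. A value without tzinfo represents
# "some wall-clock time" and refuses to compare with an aware value at all.
naive = datetime(2024, 3, 10, 2, 30)                      # no tzinfo
aware = datetime(2024, 3, 10, 2, 30, tzinfo=timezone.utc)
try:
    naive < aware
except TypeError as e:
    print("comparison fails:", e)

# Trap 2: "add a day" is ambiguous. timedelta adds 24 elapsed hours,
# which is not the same as "same wall-clock time tomorrow" once DST
# transitions get involved.
start = datetime(2024, 3, 9, 12, 0, tzinfo=timezone.utc)
print(start + timedelta(days=1))  # 2024-03-10 12:00:00+00:00
```

Every language rediscovers these two splits (naive/aware, elapsed/calendar) the hard way, which is why the APIs keep churning.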
Pedanticism (or pedantry) is the excessive, tiresome concern for minor details, literal accuracy, or formal rules, often at the expense of understanding the broader context.
I don't think this had anything to do with minor details at all. You're trying to convey a point while ignoring the half of the population who didn't go down that route. The ones who got acquired never really had to stand up to any due-diligence scrutiny on the technical side. Other sides of the business did for sure, but not that side.
Many of you here work for "real" tech companies with the budget and proper skin in the game to actually have real engineers and sane practices. But many of you do not, and I am sure many have seen what I have seen and can attest to this. If someone like the person I mentioned above asks you to join them to help fix their problems, make sure the compensation is tremendous. Slop clean-up is a real profession, but beware.
It feels like this takes on a whole new meaning now we have agents - which I think is the same point you were making
Software engineering is not real engineering because we do not rigorously engineer software the way "real" engineers engineer real things. <--- YOU ARE HERE
Software engineering is real engineering because we "rigorously" engineer software the way "real" engineers engineer real things.
Edit: quotes imply sarcasm.
It is though. Picking the right approaches and tools makes more difference than anything else. Sure, you don't need the right tools if you can make the right choices - but it's much easier to pick a better methodology than to hire smarter people.
I'm watching a team which is producing insane amounts of code for their team size, but the level of thought that has gone into all of the details that would make their product a fit predator to run at scale and solve the underlying business problem has been neglected.
Moving really fast in the wrong direction is no help to anyone.
1. Applied physics - Software is immediately disqualified. Symbols have no physics.
2. Ethics - Lives and livelihoods depend on you getting it right. Software people want to be disqualified because that stuff is so boring, but this is becoming a more serious issue with every passing day.
So most software developers in France are absolutely software engineers.
Many physical processes are controlled by software.
I live in the happy place of negligence. Go software has almost zero maintenance cost, and the toolchain will continue to build my programs in 10 years with zero changes to my codebase being necessary.
I probably will never touch C++ again, even though CGo is the most painful FFI/ABI implementation I've dealt with.
Just today I tried to build a project that's using bergamoth and a shitload of broken C++ dependencies, and decided not to give a damn after 5 hours of trying to fix crappy code that changed for whatever reason between C++14 and C++15. Either the dependencies are broken, or the dependency versions are broken, or the maintainer's code never compiled in the first place... I just don't care.
My hopes were higher during the Conan peak days, but now the ecosystem is just so broken, even with jinja and whatever build framework the new kids are using.
I guess I just really hate the C++ ecosystem, and the lack of self-reflection in there about the self-inflicted pain that shouldn't be necessary in 2026.
In regards to agentic coding: I am toying around with codestral:22b and Xiaomi's MiMo models right now, and am building my own local dev environment, which makes this kinda nice.
It's local and I like it; sometimes I still need to use Claude, but it's getting there. I am delegating only the gruntwork, not decisions, so I usually use a temperature below 0.3. My approach is to sandbox this per folder I run it in, and agents are only allowed to communicate via notes or tasks, so that they are forced to use better documentation. Specific roles don't have write access to certain things, e.g. the coder can't touch tests, and the tester can't touch code.
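To illustrate the write-access idea, the enforcement can be as small as a glob check before any file write. A toy Python sketch (the role names and path patterns are invented for illustration, not my actual config):

```python
import fnmatch

# Hypothetical role -> writable-path-glob mapping. The orchestrator
# consults this before letting an agent role touch a file.
WRITE_RULES = {
    "coder":  ["src/*", "notes/*"],
    "tester": ["tests/*", "notes/*"],
}

def may_write(role: str, path: str) -> bool:
    """True if this role is allowed to modify the given path."""
    return any(fnmatch.fnmatch(path, pattern)
               for pattern in WRITE_RULES.get(role, []))

print(may_write("coder", "src/app.py"))     # True
print(may_write("coder", "tests/test.py"))  # False: coder can't touch tests
```

Unknown roles get no write access at all, which is the safe default when an agent invents a role name you never defined.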
It's a craft.
Just another reason we should cut software jobs and replace them with A(G)I.
If the human "engineers" were never doing anything precisely, why would the robot engineers need to?
It isn't. Show me the licensing requirements to be a "software engineer." There are none. A 12 year old can call himself a software engineer and there are probably some who have managed to get remote work on major projects.
That's assuming the axiom that "engineer" must require licensing requirements. That may be true in some jurisdictions, but it's not axiomatically or definitionally true.
Some kinds of building software may be "engineering", some kinds may not be, but anyone seeking to argue that "licensing requirements" should come into play will have to actually argue that rather than treat it as an unstated axiom.
Depends which definition you're going by. Some people do define "engineer" that way.
For the other countries, though, arguing "some countries do it that way" is as persuasive as "some countries drive on the other side of the road." It's true, but so what? Why should we change to do it their way?
Where specifically? I've been working as a "Software engineer" for multiple decades, across three countries in Europe and 2-3 countries outside of Europe, never been sued or received a "big fine" for this, even given presentations to government teams and similar, and not a single person has reacted to me (or others) calling ourselves "software engineers" this whole time.
I do like the tips on how to work with agents for delegation. Let it do boring things. The deterministic things where you know what the result should look like each time.
Product design has a slightly different problem than engineering, because the speed of development is so high we cannot dogfood and play with new product decisions and features. By the time I’ve realized we made a stupid design choice and it doesn’t really work in the real world, we’ve already built 4 features on top of it. Everyone makes bad product decisions, but it used to be easy and natural to back out of them.
It’s all about how we utilize these things; if we focus on sheer speed it just doesn’t work. You need to own architecture and product decisions. You need to use and test your products with humans (and automate those tests as regression testing). You need to be able to hold all of the product or architecture in your mind and help agents make the right decisions with all the best practice you’ve learned.
I've been building the same AI product for months - a coaching loop that persists across sessions. Every few weeks someone ships a "competitor" in a weekend. Feature list looks similar. The difference is everything that breaks when a real user comes back for session 3 or 4. Context drifts, scores stop calibrating, plans don't adapt. None of that shows up in a demo. You only find it after sitting in the same codebase for weeks, running real sessions, getting confused by your own data. That's the friction the post is talking about and I don't think you can skip it.
Similar to how "tech debt" describes the same mechanism in business terms.
There’s going to be a bottleneck on what is verified because over time we will realize how much tail risk we are creating by simply surrendering our own agency to the agents - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6298838
If there are any common apps which are unhinged please do share your experiences. LinkedIn was never great quality but it's off the charts. Also catching some on Spotify.
Did I miss something? I haven't used it in a minute, but why is the author claiming that it's "uninstallable malware"?
Minimalist alternative with no hooks or dependencies for the curious: https://github.com/wedow/ticket
I don't agree, but the bigger issue to me is that many/most companies don't even know what they want or think about what the purpose is. Whereas in the past, devs coding something gave some throttle or sanity checks; now we just throw shit over the wall even faster.
I'm seeing some LinkedIn lunatics brag about "my idea to production in an hour" and all I can think is: that is probably a terrible feature. No one I've worked with is so good or visionary that such speed even matters.
But in many agent-skeptical pieces, I keep seeing this specific sentiment that “agent-written code is not production-ready,” and that just feels… wrong!
It’s just completely insane to me to look at the output of Claude code or Codex with frontier models and say “no, nothing that comes out of this can go straight to prod — I need to review every line.”
Yes, there are still issues, and yes, keeping mental context of your codebase’s architecture is critical, but I’m sorry, it just feels borderline archaic to pretend we’re gonna live in a world where these agents have to have a human poring over every single line they commit.
Oh, it can't take the phone call and fix the issue? Then I'm reviewing its output before it goes into prod.
You won't always be able to get ahold of someone at 2am. You won't be able to get ahold of me at 2am, for example. It'll throw some notification on my screen and I won't see it until I wake up.
The answer is that it's very easy for bad code to cause more problems than it solves. This:
> Then one day you turn around and want to add a new feature. But the architecture, which is largely booboos at this point, doesn't allow your army of agents to make the change in a functioning way.
is not a hypothetical, but a common failure mode which routinely happens today to teams who don't think carefully enough about what they're merging. I know a team of a half-dozen people who's been working for years to dig themselves out of that hole; because of bad code they shipped in the past, changes that should have taken a couple hours without agentic support take days or weeks even with agentic support.
I'm one-shotting AI code for my website without even looking at it. Straight to prod (well, github->cf worker). It is glorious.
Air Traffic Control software - sure. 99% of other software that is not mission-critical (like Facebook) just punches it to production - "move fast and break shit" was cool way before "AI"
Even if we ignore criticality, things just get really messy and confusing if you push a bunch of broken stuff and only try to start understanding what's actually going on after it's already causing issues.
It's insane to me that someone can arrive at any other conclusion. LLMs very obviously put out bad code, and you have no idea where it is in their output. So you have to review it all.
Does it feel archaic because LLMs are clearly producing output of a quality that doesn't require any review, or because having to review all the code LLMs produce clips the productivity gains we can squeeze out of them?
Fwiw OP isn't an agent skeptic, he wrote one of the most popular agent frameworks.
For an early startup validating their idea, that prod can take it.
For a platform as a service used by millions, nope.
This cuts to the problem and is excellent framing. A rogue employee can achieve the same, but probably less quickly, and we've designed systems to help catch them early.
I would go further and remove that second option. If the code is important, LLM support or not, write it yourself.
At least for me, there is a clear qualitative difference in thinking between typing the code and watching it being typed, even if I follow along with every line.
If I type it, my brain is constantly questioning whether what I'm doing is correct. What are the edge cases here? Is this introducing a vulnerability? Am I getting the right data from the right place?
By watching an agent or someone else code, the mindset is different. I'm checking someone else's work under the implicit assumption that they have some idea of what they're doing and I'm just reviewing mostly for superficial stuff. I can force myself to ask those other questions, but it takes conscious effort and isn't sustainable over long sessions.
I play around with agentic coding, but I'm always shocked at how much worse the result is compared to working in a separate chat and typing (not pasting!) the suggestions. In the direct comparison, it's easy to see how agentic code turns so incredibly shit so ridiculously fast.
I think this is very good take on AI adoption: https://mitchellh.com/writing/my-ai-adoption-journey. I've had tremendous success with roughly following the ideas there.
> The point is: let the agent do the boring stuff, the stuff that won't teach you anything new, or try out different things you'd otherwise not have time for. Then you evaluate what it came up with, take the ideas that are actually reasonable and correct, and finalize the implementation.
That's partially true. I've also had instances where I could have very well done a simple change by myself, but by running it through an agent first I became aware of complexities I wasn't considering and I gained documentation updates for free.
Oh and the best part, if in three months I'm asked to compile a list of things I did, I can just look at my session history, cross with my development history on my repositories and paint a very good picture of what I've achieved. I can even rebuild the decision process with designing the solution.
It's always a win to run things through an agent.
My gut says something simple is missing that makes all of the difference.
One thought I had was that our problem lives between all the things taking something in and spitting something out. Perhaps 90% of the work of writing a "function" should be to formally register it as taking in data type foo 1.54.32 and bar 4.5.2 and returning baz 42.0. The registry will then tell you all the things you can make from baz 42.0 and the other data you have. A comment(?) above the function has a checksum that prevents anyone from changing it.
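The registry idea can be sketched in a few lines of Python. The type names and version strings below are the made-up ones from the comment, and the registry itself is a toy, not a proposal for a real system:

```python
# Toy registry: (versioned input types) -> versioned output type.
REGISTRY: dict[tuple, str] = {}

def register(inputs: tuple, output: str):
    """Decorator that records a function's typed signature in the registry."""
    def wrap(fn):
        REGISTRY[inputs] = output
        return fn
    return wrap

@register(inputs=("foo@1.54.32", "bar@4.5.2"), output="baz@42.0")
def make_baz(foo, bar):
    return (foo, bar)  # stand-in implementation

def producible_from(*have: str) -> list[str]:
    """Ask the registry what can be built from the typed data on hand."""
    return [out for ins, out in REGISTRY.items() if set(ins) <= set(have)]

print(producible_from("foo@1.54.32", "bar@4.5.2"))  # ['baz@42.0']
```

The interesting part is `producible_from`: once every function's typed signature is registered, "what can I make from what I have" becomes a query rather than tribal knowledge.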
But perhaps the solution is something entirely different. Maybe we just need a good set of opcodes and have abstractions represent small groups of instructions that can be combined into larger groups until you have decent higher-level languages. With the only difference being that one can read what the abstraction actually does. The compiler can figure lots of things out, but it won't do architecture.
I think that's the part where it remains difficult. Someone has to convey clearly what the semantics and side effects of the function are. Consumers have to read and understand it. Failing that, you get breakage.
Like the way we say something is an mp3. Why would it be good to have one unifying concept where we pretend a car crash and Beethoven are the same thing? It can be a WAV too!
Do you prefer hard or soft cover books?
I'll try an example; those always have the potential to describe things even worse.
Imagine a type that is an outdoor datetime-temperature in UTC, or a first-name form value, or a solitaire terms-of-service checkbox value. Have both the chewing-gum balls in a dispenser and the total weight of chewing-gum balls in the dispenser, as well as a min-max weight per chewing-gum ball in the dispenser.
Make it just as ridiculous as it sounds. If you can quantify it, a type must be registered. If there is a pair of quantifications to be had, register that too.
The vision just expanded! Make for everything an xml implementation then do a ram drive and make all variables into files.
The idea sounds so ridiculous it might actually work. Think of the employment opportunities!
We have too much code - languages to program machines.
We need a new different language now.
A plan.md, written in what... legalese English? Really? Am I back in 1897? People committing that to vcs, sheesh...
By introducing progressive semantically enriching layers (starting with prose, reasoning and terminology and going all the way into specifying interaction surfaces), we can reduce the dark matter between spec and code, make code more disposable – if your semantics live in the spec layer rather than the implementation, you can throw away and regenerate the implementation without losing understanding – and, critically, give LLMs a way to navigate a graph of knowledge instead of gobbling up walls of text.
https://clayers.com -- https://github.com/CognitiveLayers/clayers
I really want to read people's perspectives on LLMs, but it's just impossible to find quality when everyone wants to give their opinion. This is worst on LinkedIn, where mentioning AI gives you free "brownie points" (I have yet to figure out what managers gained from this). I don't care what you use it for unless you have a new perspective I can ponder.
Regardless, nothing is black and white; most things are a shade of grey. LLMs have leaned more positive for me, making the CTA for working on something a lot simpler. Although I end up refactoring my day away (which I am fine with; I quite enjoy putting the dots on the i's).
This is a great point.
I have been avoiding LLMs for a while now, but realized that I might want to try working on a small PDF-book-to-Markdown conversion project[0]. I like Claude Code because it's command line. I'm realizing you really need to architect with very precise language to avoid mistakes.
I didn't try to have a single prompt do everything at once. I prompted Claude Code to do the conversion process section by section of the document. That seemed to reduce the mistakes the agent would make.
[0]: https://www.scottrlarson.com/publications/publication-my-fir...
But the rough edges are temporary. Coding agents are becoming superhuman along certain dimensions; the progress is staggering. As Andrej Karpathy put it, anything measurable or legible can be optimized by AI. The gaps will close fast.
The harder question is HCI. How do you expose this kind of intelligence in interfaces that actually align with human values? That's the design problem worth obsessing over.
That may be the case where AI leaks into, but not every software developer uses or depends on AI. So not all software has become more brittle.
Personally I try to avoid any contact with software developers using AI. This may not be possible, but I don't want to waste my own time "interacting" with people who aren't really the ones writing code anymore.
AI is the only growth industry of the last decade, and it's the only thing people talk about; we've been so long without growth that people are scared of it now.
I use Aider on my private computers and Copilot at work. Both feel equally powerful when configured with a decent frontier model. Are they really generations apart? What am I missing?
Will it track people down and refuse orders, or give poisoned output?
I think a lot of this is just TypeScript developers. I bet if you removed them from the equation, most of the problems he's writing about go away. TypeScript developers didn't even understand what React was doing without an agent; now they are just one-shot prompting features, web apps, CLIs, desktop apps and spitting them out to the world.
The prime example of this is literally Anthropic. They are pumping out features, apps, CLIs, and EVERY single one of them releases broken.
https://gist.github.com/ontouchstart/d43591213e0d3087369298f...
(Note: pi was written by the author of the post.)
Now it is time to read them carefully without AI.
We are all rabbits.
Integration is the key to the agents. Individual usages don't help AI much because it is confined within the domain of that individual.
Pull the bandaid off quickly, it hurts less.
I'm one of those people and I'm not going to slow down. I want to move on from bullshit jobs.
The only people that fear what is coming are those that lack imagination and think we are going to run out of things to do, or run out of problems to create and solve.
So are you aiming for death poverty? Once those bullshit jobs go, we’re going to find a lot of people incapable of producing anything of value while still costing quite a bit to upkeep. These people will have to be gotten rid of somehow.
> and think we are going to run out of things to do, or run out of problems to create and solve.
There will be plenty of problems to solve. Like who will wipe the ass of the very people that hate you and want to subjugate you.
Again:
The only people that fear what is coming are those that lack imagination and think we are going to run out of things to do, or run out of problems to create and solve.
I use agents all day, every single day. But I also push back, understand what was written, and ensure I read and understand everything I ship.
Does it slow me down? Uh, yup. You bet.
Yes, this article literally advocates for slowing the fuck down, but it also makes the coding agents out to be the problem, but they're not.
Sensible engineers who look at AI as another (potentially powerful) tool in the toolbox "aren't forward-looking enough". I watched this happen in real time at my previous company, where every discussion about quality was interpreted as slowing down progress, and the only thing looked on favorably was the idea of replacing developers with machines, because they are "cheaper and faster".
The logical minds here on HN are less prone to believing in magic and AI fairies, but they are often not the ones setting the rules. And the number of companies being run by people with critical thinking skills is getting smaller by the day.
Yes, humans are accountable for the ultimate output. But so are the people who design and build these automation tools. As the saying goes, the purpose of a system is what it does.
I'm making specific usage patterns out to be the problem, and explaining why those patterns can't work due to the way agents work.
But the LLM regularly makes lots of mistakes (sometimes, due to me, giving it bogus information). I can’t imagine just letting it do the whole thing, as a “black box.”
I’m old enough to remember the advent of ATMs. When they first came out, they were universally free, for years.
Once people got hooked, the fees began to appear.
Agreed with this TLDR. TFA has some good observations, but the repeated use of the word "booboos" (dozens of times) made it almost unreadable.
Maybe some people have already reached that point after so much AI coding and are now warning us; they pushed so hard that they understand the limits. But this is the kind of thing you need to experience on your own.
You need to experiment, learn, test the limits, think for yourself, take as many steps back as you need.
Why? Next week a new version of Claude and GPT will come out and the limits will change again. Are you really fully testing every new version of every LLM agent to see where its limits are?
Those of us old enough to have seen this cycle before know it's a fool's game trying to keep up with the development pace in the initial bubble. It's much better to wait for development and progress to start plateauing; then it's easier to see the wood for the trees.
Reminds me of Carson Gross' very thoughtful post on AI also: https://htmx.org/essays/yes-and/
[Y]ou are going to fall into The Sorcerer’s Apprentice Trap, creating systems you don’t understand and can’t control.
Companies will face the maintenance and availability consequences of these tools but it may take a while for the feedback loop to close
It’s very hard to say right now what happens on the other side of this change.
All these new growing pains are happening in many companies simultaneously, and they are happening at elevated speed. While that change is taking place it can be quite disorienting, and if you want to take a forward-looking view it can be quite unclear how you should behave.
Spotify's CEO recently bragged about the app's code being written almost entirely by AI. Just saying.
It’s time to slow the fuck down!
Oh they even swore in the title.
Oh and of course it's anti-economics and is probably going to hurt whoever actually follows it.
Three for three. It's not logical it's emotional.