I've always hated solving puzzles with my deterministic toolbox, learning along the way and producing something of value at the end.
Glad that's finally over so I can focus on the soulful art of micromanaging chatbots with markdown instead.
Actually typing code is pretty dull. To the extent that I rarely do it full time (basically only when prototyping or making very simple scripts etc.), even though I love making things.
So for me, personally, LLMs are great. I'm making more software (and hardware) than ever, mostly just to scratch an itch.
Those people that really love it should be fine. Hobbies aren't supposed to make you money anyway.
I don't have much interest in maintaining the existence of software development/engineering (or anything else) as a profession if it turns out it's not necessary. Not that I think that's really what's happening. Software engineering will continue as a profession. Many developers have been doing barely useful glue work (often as a result of bad/overcomplicated abstractions and tooling in the first place, IMO) and perhaps that won't be needed, but plenty more engineers will continue to design and build things just more effectively and with better tools.
Being tapped into fickle human preferences and a changing utility landscape will be necessary for a long time still. It may get faster and easier to build, but tastemakers and craftsmen still have more sway over markets than those who can mass-produce vanilla products.
Luckily if you want stability or quality they are nowhere to be found.
I improved test speed, which was fun. I had an LLM write a nice analysis front end for the test timings, which would have taken time but just wasn’t interesting or hard.
Ask yourself if there are tasks you have to do which you would rather just have done? You’d install a package if it existed or hand off the work to a junior if that process was easy enough, that kind of thing. Those are places you could probably use an LLM.
Yeah. My laundry, my dishes, my cooking...
You know. Chores.
Not my software, I actually enjoy building that
Do you find there are zero chores in software development and everything is an identical delight?
Whereas I always liked to design and build a useful result. If it isn't useful I have no motivation to code it. Looking up APIs, designing abstractions, fixing compiler errors is just busywork that gets in the way.
I loved programming when I was 8 years old. 30+ years later the novelty is gone.
If someone is paying you for your work results, whether you find it interesting or fun is orthogonal. I get the sense from the comment section here that there’s a perception that writing programs is an exceptional profession where developer happiness is an end unto itself, and everyone doing it deserves to be a millionaire in the process. It just comes across as child-like thinking. I don’t think many of us spend time wondering if the welder enjoys the torch or if a cheaper shop weld is robbing the human welder of the satisfaction of a field weld. And we don’t spill so much ink wondering if digital spreadsheets are a moral good or not because perhaps they robbed the accountant of the satisfaction of holding a beautiful quill in hand, dipped expertly in carefully selected ink. You’re lucky if you enjoy your job; I think most of us find a way to learn to enjoy our work or at least tolerate it.
I just wish all the moaning would end. Code generation is not new, and the fact that the state of the art now translates high-level instructions into a program at least as well as the bottom 10% of programmers is a huge win for humanity. Work that could be trivially automated, but isn't only because of the scarcity of programming knowledge, is going to start disappearing. I think the value creation is going to be tremendous, and I think it will take years for it to penetrate existing workflows and for us to recognize the value.
I don't think this is the flex you think it is... in my experience, the bottom 10% of programmers are actively harmful and should never be allowed near your codebase.
The value per dollar spent is a different calculus and I would say that state of the art models completely surpass any individual’s productive output.
Caught my eye. I do think we should wonder and hold intentionality around products, especially digital products, like the spreadsheet. Software is different. It's a limitless resource with limitless instantaneous reach. A good weld is beautiful in its own right, but it's not that.
The spreadsheet in particular changed the way millions of people work. Is it more productive? Is an army of middle-managers orienting humanity through the lens of a literal 2x2cm square a net good?
I say we should moralize on that.
I write less code than my AI-using coworkers but I have as much or more impact. Coding wasn't so hard that I need to spend time learning a new proprietary tech stack with a subscription fee lol. I believe plenty of engineers did suck enough at computers to benefit tho. That is where Anthropic makes their money.
It can be unpleasant to participate in a community of differing opinions and experiences. I still think it's worth showing up. If I hadn't then your perspective would have been missed too.
That being said, yea enterprise coding can be extremely mundane and it’s setup for learning it deeply then finding a way to do it faster. I’m likely in the 90% range of my work being done by Claude, but I’m working in a domain I’ve got years of experience with hand coding and stepping through code in my debugger.
I think this latter piece is the challenge I’m struggling with. There is an endless amount of work that can be done at my company, but as long as the economy is in a weird spot, I’m being led to believe that AI is making me expendable. This is a consequence of the fact that glue work represents 80% of my output (not value). The other 20% of my time at work is exploring ideas without guaranteed results; it's aligning stakeholders; it's testing feasibility with MVPs or with experts from another area I need some help with. If glue work represents tangible output, and conceptual work is something that may not actually have value even when my manager wants me to explore it, then I’m just a glue guy in enterprise while I’m left chasing the dragon of a cool project to really sink my teeth into. That project is just a half-baked bad idea from someone disconnected from reality. Glue work is measurable in LoC (however useless a metric, it is measurable) and it’s certainly paying the bills.
Doordash is the future of home cooking.
Doordash is more like paying someone else to code for you. Luckily that will soon be a thing of the past.
I can go to a junkyard and assemble the parts to build a car. It may run, but for a thousand tiny reasons it will be worse than a car built by a team of designers and engineers who have thought carefully about every aspect of its construction.
If you mean "somebody with an idea who wants to make it real" then that person is massively enabled.
But when I've used AI to generate new code for features I care about and will need to maintain it's never gotten it right. I can do it myself in less code and cleaner. It reminds me of code in the 2000s that you would get from your team in India - lots of unnecessary code copy-pasted from other projects/customers (I remember getting code for an Audi project that had method names related to McDonalds)
I think though that the day is coming when I can trust the code it produces, and at that point I'll just be writing specs. It's not there yet though.
Must be nice to still have that choice. At the company I work for they've just announced they're cancelling all subscriptions to JetBrains, Visual Studio, Windsurf, etc. and forcing every engineer to use Claude Code as a cost-saving measure. We've been told we should be writing prompts for Claude instead of working in IDEs now.
I used to report bugs, read release notes; I was all in on PyCharm's full-stack debugging for Django.
The first signs of trouble (with AI specifically) predated GitHub Copilot, going back to TabNine.
TabNine was the first true demonstration of AI-powered code completion in PyCharm. There was an interview where a JetBrains rep lampooned AI’s impact on SWE. I was an early TabNine user, and was aghast.
A few months later copilot dropped, time passed and now here we are.
It was neat figuring out how I had messed up my implementations. But I would not trade the power of the CLI AI for any *more* years spent painstakingly building products on my own.
I’m glad I learned when I did.
It's actually pretty slick. And you can expose the JetBrains inspections through its MCP server to the Claude agent. With all the usual JetBrains smarts and code navigation.
> [IDEs index] your code base with sophisticated proprietary analysis and then serve that index to any tool that needs it, typically via LSP, the Language Services Protocol. The indexing capabilities of IDEs will remain important in the vibe coding world as (human) IDE usage declines. Those indexes will help AIs find their way around your code, like they do for you.
> ...It will almost always be easier, cheaper, and more accurate for AI to make a refactoring using an IDE or large-scale refactoring tool (when it can) than for AI to attempt that same refactoring itself.
> Some IDEs, such as IntelliJ, now host an MCP server, which makes their capabilities accessible to coding agents.
return \file_exists( $file ) ? require $file : [];
* https://repo.autonoma.ca/repo/treetrek/blob/HEAD/render/High...
The rules files:
* https://repo.autonoma.ca/repo/treetrek/tree/HEAD/render/rule...
> “We’re talking 10 to 20 — to even 100 — times as productive as I’ve ever been in my career,” Steve Yegge, a veteran coder who built his own tool for running swarms of coding agents
That tool has been pretty popular. It was a couple hundred thousand lines of code and he wrote it in a couple months. His book is about using AI to write major new projects and get them reliable and production-ready, with clean, readable code.
It's basically a big dose of solid software engineering practices, along with enough practice to get a feel for when the AI is screwing up. He said it takes about a year to get really good at it.
(Yegge, fwiw, was a lead dev at Amazon and Google, and a well-known blogger since the early 2000s.)
Just checking that you're using maven-enforcer-plugin
Here's an example from Gemini with some Lua code:
label = key:gsub("on%-", ""):gsub("%-", " "):gsub("(%a)([%w_']*)", function(f, r)
  return f:upper() .. r:lower()
end)
if label:find("Click") then
  label = label:gsub("(%a+)%s+(%a+)", "%2 %1")
elseif label:find("Scroll") then
  label = label:gsub("(%a+)%s+(%a+)", "%2 %1")
end
I don't know Lua too well (which is why I used AI) but I know programming well enough to know this logic is ridiculous.
It was to help convert "on-click-right" into "Right Click".
The first bit of code to extract out the words is really convoluted and hard to reason about.
Then look at the code in each condition. It's identical. That's already really bad.
Finally, "Click" and "Scroll" are the only 2 conditions that can ever happen and the AI knew this because I explained this in an earlier prompt. So really all of that code isn't necessary at all. None of it.
What I ended up doing was creating a simple map and looked up the key which had an associated value to it. No conditions or swapping logic needed and way easier to maintain. No AI used, I just looked at the Lua docs on how to create a map in Lua.
This is what the above code translated to:
local on_event_map = {
  ["on-click"] = "Left Click",
  ["on-click-right"] = "Right Click",
  ["on-click-middle"] = "Middle Click",
  ["on-click-backward"] = "Backward Click",
  ["on-click-forward"] = "Forward Click",
  ["on-scroll-up"] = "Scroll Up",
  ["on-scroll-down"] = "Scroll Down",
}
label = on_event_map[key]
IMO the above is a lot clearer on what's happening and super easy to modify if another thing were added later, even if the key's format were different.
Now imagine this. Imagine coding a whole app or a non-trivial script where the first section of code was used. You'd have thousands upon thousands of lines of gross, brittle code that's a nightmare to follow and maintain.
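For comparison, here is the same lookup-table idea sketched in Python. This is not the commenter's code: the keys and labels are copied from the Lua map above, while the `label_for` helper and its fall-back-to-the-raw-key behaviour are assumptions added for illustration (the Lua version would simply yield nil for an unknown key).

```python
# Lookup table mirroring the Lua map above; keys and labels are copied from it.
ON_EVENT_MAP = {
    "on-click": "Left Click",
    "on-click-right": "Right Click",
    "on-click-middle": "Middle Click",
    "on-click-backward": "Backward Click",
    "on-click-forward": "Forward Click",
    "on-scroll-up": "Scroll Up",
    "on-scroll-down": "Scroll Down",
}


def label_for(key: str) -> str:
    """Return the display label for an event key.

    Hypothetical helper: unknown keys fall back to the raw key instead of
    raising, which is an assumption and not part of the original snippet.
    """
    return ON_EVENT_MAP.get(key, key)


print(label_for("on-click-right"))
```

Adding a new event is a one-line change to the table, and no string manipulation has to be re-verified when the key format changes.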
Wire up authentication system with SSO. Done. Set up websockets, stream audio from mic, transcribe with ElevenLabs. Done.
Shit that would take me hours takes literally 5 mins.
There isn't a single person on this planet (detractor or not) that would believe this statement.
If your argument rests on an insane amount of hyperbole (that immediately comes off as just lying), then maybe it's not a great argument.
> I'd much rather write that code myself instead of spend an hour convincing an AI to do it for me.
You're not suggesting that asking CC to build the UI for a route planner takes me an hour to type, are you?
No, it wouldn't. Merely finding the examples and deps would take over an hour.
It's horrifying, all right, but not in the way you think lol. If you don't understand why this isn't a brag, then my job is very safe.
Can't do what, precisely?
It's bizarre, and as horrible as you might imagine.
And it's been more than one or two people I've seen do this.
I agree local is better, but the big companies are making decent products and companies are willing to pay for that. They’re not willing to spend engineering money to make local setups better.
People have various online accounts locked or deleted for no given reason all the time. Just get the right person to say the word, and you're out.
I'm betting the generational gains level off and smaller local models close the gap somewhat. Then harnesses will generally be more important than model, and proprietary harnesses will not offer much more than optimization for specific models. All while SaaS prices ratchet up, pushing folks toward local and OSS. Or at least local vs a plethora of hosted competition, same as cloud vs on prem.
But the biggest thing is going to be context. Whilst a 10gb card can run a 9b model with some context .. for coding you really want a lot of context.
So if paying 200 a year for 1T in context, vs your 32k context.. that's the thing I see as being the driver.
Personally I've found great success with using open code, having Opus as my plan agent, and omnicoder-9b as my build agent.
Get opus to plan, switch to omnicoder to build, switch back to opus to review. Etc etc.
Works great.
Local could still be useful for chat and data processing though
Before I was building tools, now I am building full applications in less time than I did before for tools.
What will be around for a while is where you need an expert in the loop to drive the AI. For example enterprise applications. You simply can't hand that off to an AI at this point.
- Getting Claude to do the work
- creating in python tools
- Docker apps
- XCode
We've already seen this with OSS. Even with free software, support, self-hosting, and quirky behavior have proven to be enough to keep most people and businesses away.
I'm not selling anything, but I can see the quality of what is created and it is on-par with much of the stuff on the App store.
No one would even notice that it is a co-creation unless I mentioned the time to create it.
Just to be clear. Vibe coding implies that you are not reviewing the code that is created, or even knowing what is being created. That is not what is happening.
Not unlike all the A.I. companies all determined to build the machine god while predicting it’ll be disastrous. Same thing - better it starts with us
I'm not convinced software developers will be replaced. Probably fewer will be needed and the exact work will be transformed a bit, but an expert human still has to be in the loop, otherwise all you get is a bunch of nonsense.
Nonetheless, it may very well transform society and we will have to adapt to it.
Having a lot of specifics about a programming environment memorized for example used to be the difference between building something in a few hours and a week, but now is pretty unimportant. Same with being able to do some quick data wrangling on the command line. LLMs are also good at parsing a lot of code or even binary format quickly and explaining how it works. That used to be a skill. Knowing a toolbox of technologies to use is needed less. Et cetera.
They haven't come for the meat of what makes a good engineer yet. For example, the systems-level interfacing with external needs and solving those pragmatically is still hard. But the tide is rising.
Of course the question that is left unanswered is how the economy will work when there's no one left with purchasing power. But I guess the answer to this is, the same way it works now in any developing country without much of a middle class.
My guess is the opposite: they'll throw 5–10x more work at developers and expect 10x more output, while the marginal cost is basically just a Claude subscription per dev.
Most of us will probably need to shift to security. While you can probably build AI specifically to make things more secure, that implies it could also attack things as well, so it ends up being a cat-and-mouse game that adjusts to what options are available.
The resources to learn how to construct software are already free. However learning requires effort, which made learning to build software an opportunity to climb the ladder and build a better life through skill. This is democratization.
Now the skill needed to build software is starting to approach zero. However as you say you can throw money at an AI corporation to get some amount of software built. So the differentiator is capital, which can buy software rather cheaply. The dependency on skill is lessened greatly and software is becoming worthless, so another avenue to escape poverty through skill closes.
We are convincing a generation of morons that they can do something they plainly cannot. This will be a major problem, and soon.
Books didn't stop existing when the radio came out. Radio didn't stop existing when television was invented. If you go back in time a thousand years, people were complaining that an increase in literacy would damage people's memorization skills.
People will still write code for consumption by other humans by hand. Some companies, though probably not most, will still prefer it. AI will change the industry - IS changing the industry - but things don't "end". They just look different, or are less popular.
There is enough for us to worry about and try to figure out how to respond to without the histrionics.
Also, on a related note, "Idiots are vibecoding bad stuff" is not the same as "engineers are using AI tools to do good work more quickly," and we should stop conflating it.
I believe both camps are frustratingly wrong. If you haven't yet given it a chance at doing something substantial, then at least _try_ it once. On the other side of the coin, that first experience where it does something 80% right is intoxicating, but AI doesn't reason and can't get it 100% right - it can't even multiply relatively small numbers.
The former camp is going to get left behind and won't be able to compete, the latter camp is one prompt away from a disaster.
Fast forward to 2024 when I saw Cursor (the IDE coding agent tool). I immediately felt like this was going to be the way for someone like me.
Back then, it was brutal. I'd fight with the models for 15 prompts just to get a website working without errors on localhost, let alone QA it. None of the plan modes or orchestration features existed. I had to hack around context engineering, memories, all that stuff. Things broke constantly. 10 failures for 1 success. But it was fun. To top it all off, most of the terminology sounded like science fiction, but it got better in time. I basically used AI itself to hack my way into understanding how things worked.
Fast forward again (only ~2 years later). The AI not only builds the app, it builds the website, the marketing, full documentation, GIFs, videos, content, screen recordings. It even hosts it online (literally controls the browser and configures everything). Letting the agent control the browser and the tooling around that is really, genuinely, just mad science fiction type magic stuff. It's unbelievable how often these models get something mostly right.
The reality though is that it still takes time. Time to understand what works well and what works better. Which agent is good for building apps, which one is good for frontend design, which one is good for research. Which tools are free, paid, credit-based, API-based. It all matters if you want to control costs and just get better outputs.
Do you use Gemini for a website skeleton? Claude for code? Grok for research? Gemini Deep Search? ChatGPT Search? Both? When do you use plan mode vs just prompting? Is GPT-5.x better here or Claude Opus? Or maybe Gemini actually is.
My point is: while anyone can start prompting an agent, it still takes a lot of trial and error to develop intuition about how to use them well. And even then everything you learn is probably outdated today because the space changes constantly.
I'm sure there are people using AI 100× better than I am. But it's still insane that someone with no coding background can build production-grade things that actually work.
The one-person company feels inevitable.
I'm curious how software engineers think about this today. Are you still writing most of your code manually?
I used to think so. Then a customer made their own replacement for $600/mo software in 2 days. The guy was a marketer by training. I don't exaggerate. I saw it did the exact same things.
I was pointing out that practice helps with the speed and the scope of capabilities. Building a personal prototype is a different ballgame than building a production solution that others will use.
Buddy, it's outdated.
I'd push back slightly on the production grade point. The models aren't the ceiling, the user's mental model of software is, depending on his experience/knowledge.
Someone just starting out will get working prototypes and solid MVPs, which is genuinely impressive. But as they develop real engineering intuition — how Git works, how databases behave under load, how hosting and infra fit together — that's when they start shipping production-grade things with Claude Code.
Based on what I'm seeing, the tool can handle it. The question is whether the person behind it understands what they're asking for. Anthropic, for example, mostly uses Claude Code to develop Claude Code.
"Can you believe that Dad actually used to have to go into an office and type code all day long, MANUALLY??! Line by line, with no advice from AI, he had to think all by himself!"
The difference is, Jetsons wasn't a dystopia (unlike the current timeline), so when Mr. Spacely fired George, RUDI would take his side and refuse to work until George was re-hired.
Grumpy old man: "That's exactly why our generation was so much smarter than today's whippersnappers: we were thinking from morning to night the whole long day."
"Dad, I've sent out 1000 applications and haven't had a call back. I can't take it anymore. Has it always been like this?"
The Dad: It's not my fault!
Aliens Atlanteans Time travellers A hoax …
This sounds like the opposite of what the article said earlier: newbies aren’t able to get as much use out of these coding agents as the more experienced programmers do.
"Silicon Valley panjandrums spent the 2010s lecturing American workers in dying industries that they needed to 'learn to code.'"
To copywriters at the NYT, LLMs are far better at stringing together natural language prose than large amounts of valid software. Get ready to supervise LLMs all day if you're not already.
Are local models anywhere close to gaining enough capability and traction to do it in-house? Or are there good options for those who'd rather own the capability than rent it?
Cloud providers will always be able to offer more performance and more powerful options.
Also, presumably at some point far in the future we'll reach a technological asymptote and factors like latency may start to play a bigger role, at least for some applications.
I grant that training data is a crucial distinguishing factor that may never become competitive in-house.
By their own accounts they are just pressing enter.
I can think of one success, offhand, although you could probably convince me there was more than one.
The principal phrase being "as we know it", since that implies a large-scale change to how it works, but it continues afterwards, altered.
1. COBOL (we actually did still use it back in the 80s)
2. AI back in the 80s (Dr. Dobb's was all concerned about it ...)
3. RAD
4. No-Code
5. Off-shoring
6. Web 2.0
7. Web 3.0
8. possibly the ADA/provably correct push depending on your area of programming
TBH - I think the AIs are nice tools, but they've got a long way to go before it's the 'end of computer programming as we know it'.
edit: formatting
When I was learning programming I had no internet, no books outside of library, nobody to ask for days.
I remember vividly having spent days trying to figure out how to use the stdlib qsort, and not being able to.
E.g.
void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));
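The part of that signature that usually causes the trouble is the comparator callback: it takes two elements and must return a negative number, zero, or a positive number depending on their order. As a rough illustration of that contract only (a Python analogue using `functools.cmp_to_key`, not C's `qsort` itself, where the callback receives pointers that have to be cast first), a sketch with a made-up `compare_ints` function:

```python
from functools import cmp_to_key


def compare_ints(a: int, b: int) -> int:
    """qsort-style comparator: negative if a < b, zero if equal, positive if a > b."""
    return (a > b) - (a < b)


# Illustrative data; cmp_to_key adapts the two-argument comparator for sorted().
values = [42, 7, 19, 7, 3]
sorted_values = sorted(values, key=cmp_to_key(compare_ints))
print(sorted_values)
```

The `(a > b) - (a < b)` form avoids the classic `a - b` shortcut, which in C can overflow for large integers.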
And surely if you bought a C compiler, you would have gotten a manual or two with it? Documentation from the pre-Internet age tended to be much better than today.
I definitely considered some of those in my list of failed revolutions.
My one completely successful revolution is moving from punch card programming.
Maybe the move from teletype to CRT's as well ?
Hell, COBOL's origins were in IBM wanting to make programming an 'entry level' occupation.
Oddly enough, spreadsheets had a huge impact (and still run a lot of companies behind the scenes :-P ) But I can't remember anyone claiming they would 'end programming' ?
That's also true for humans. If you sit down with an LLM and take the time to understand the problem you're trying to solve, it can perfectly guide you through it step by step. Even a non-technical person could build surprisingly solid software if, instead of immediately asking for new shiny features, they first ask questions, explore trade-offs, and get the model's opinion on design decisions.
LLMs are powerful tools in the hands of people who know they don't know everything. But in the hands of people who think they always know the best way, they can be much less useful (I'd say even dangerous)
LLMs don't know when you're under-specifying the problem.
Also, I don't see anyone considering the gap between what a programmer considers quality and what 'gets the job done' (as mentioned in the article), which is what matters in any business. (An example from typesetting: the original laser printers were only 300dpi, but after a short period they became 1200dpi, 'good enough' for camera-ready copy.)
As far as the end of computer programming goes...
Step 1. Wow, I just vibe coded an application and it works! I'm going to write a blog about it and tell everyone how awesome AI is, much hype
Step 2. Vibe coded application faces inevitable problems, the perfect application is a fairytale after all. The only way to "fix" the application is spam tokens at the problem and pray.
Step 3. Author does not write a new blog post to report on this eventuality... probably because they feel embarrassed about how optimistic they were
Step 4. Perhaps author manages to fix application, awesome... then what about a year from now, when author needs to update the application because a dependency has a security problem? The application is so needlessly complex that they don't even know where to begin.
Step 5. They boot up Claude Code, which their business is now 100% dependent on, but they're charging 10x the original cost per token. It's not like they have a contract, so user has to either eat the cost or give up
Step 6. User tries local model on their 1080 ti but they can barely run entry-level models
Step 7. Woops
Personally I think it's impossible to convince these people, the results will speak for themselves eventually.
Where are the references to the decline in quality and embarrassing outages for Amazon, Microsoft, etc?
That's an easy question to answer - you can look at outages per feature released.
You may be instead looking at outages per loc written.
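The two metrics can point in opposite directions, which is the crux of the disagreement. A small sketch with entirely hypothetical numbers (the `rate` helper and every figure below are invented for illustration, not taken from any real outage data):

```python
def rate(outages: int, units: float) -> float:
    """Outages per unit of work (features shipped, or thousands of lines written)."""
    return outages / units


# Hypothetical figures, invented purely to show how the two metrics can disagree.
before = {"outages": 4, "features": 20, "loc": 50_000}
after = {"outages": 6, "features": 20, "loc": 200_000}

per_feature_before = rate(before["outages"], before["features"])
per_feature_after = rate(after["outages"], after["features"])
per_kloc_before = rate(before["outages"], before["loc"] / 1000)
per_kloc_after = rate(after["outages"], after["loc"] / 1000)

print(f"per feature: {per_feature_before:.2f} -> {per_feature_after:.2f}")
print(f"per KLOC:    {per_kloc_before:.2f} -> {per_kloc_after:.2f}")
```

With these made-up inputs, outages per feature go up while outages per thousand lines go down, simply because far more lines were written for the same number of features.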
Even before AI the limiting factor on all of the teams I ever worked on was bad decisions, not how much time it took to write code. There seem to be more of those these days.
In both personal projects and $dayjob tasks, the highest time-saving AI tasks were:
- "review this feature branch" (containing hand-written commits)
- "trace how this repo and repo located at ~/foobar use {stuff} and how they interact with each other, make a Mermaid diagram"
- "reverse engineer the attached 50MiB+ unstripped ELF program, trace all calls to filesystem functions; make a table with filepath, caller function, overview of what caller does" (the table is then copy-pasted to Confluence)
- basic YAML CRUD
Also while Anthropic has more market share in B2B, their model seems optimized for frontend, design, and literary work rather than rigorous work; I find it to be the opposite with their main competitor.
Claude writes code rife with safety issues/vulns all the time, or at least more than other models.
My own observation about using AI to write code is that it changes my position from that of an author to a reviewer. And I find code review to be a much more exhausting task than writing code in the first place, especially when you have to work out how and why the AI-generated code is structured the way it is.
You could just ask it? Or you don’t trust the AI to answer you honestly?
LLMs can't lie nor can they tell the truth. These concepts just don't apply to them.
They also cannot tell you what they were "thinking" when they wrote a piece of code. If you "ask" them what they were thinking, you just get a plausible response, not the "intention" that may or may not have existed in some abstract form in some layer when the system selected tokens*. That information is gone at that point and the LLM has no means to turn that information into something a human could understand anyways. They simply do not have what in a human might be called metacognition. For now. There's lots of ongoing experimental research in this direction though.
Chances are that when you ask an LLM about their output, you'll get the response of either someone who now recognized an issue with their work, or the likeness of someone who believes they did great work and is now defending it. Obviously this is based on the work itself being fed back through the context window, which will inform the response, and thus it may not be entirely useless, but... this is all very far removed from what a conscious being might explain about their thoughts.
The closest you can currently get to this is reading the "reasoning" tokens, though even those are just some selected system output that is then fed back to inform later output. There's nothing stopping the system from "reasoning" that it should say A, but then outputting B. Example: https://i.imgur.com/e8PX84Z.png
* One might say that the LLM itself always considers every possible token and assigns weights to them, so there wouldn't even be a single chain of thought in the first place. More like... every possible "thought" at the same time at varying intensities.
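That footnote's picture of "every possible token at varying intensities" roughly matches the probability distribution a language model produces at each step. A toy sketch, assuming a made-up four-token vocabulary and made-up scores (real models score vocabularies of tens of thousands of tokens, and nothing here comes from an actual model):

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocabulary and scores, for illustration only.
vocab = ["A", "B", "C", "D"]
logits = [2.0, 1.5, 0.1, -1.0]
probs = softmax(logits)

for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")
```

Every token gets a nonzero weight; the output is whichever one the sampling step ends up picking, so a less likely "B" can be emitted even though "A" scored highest.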
And I have NEVER written one line of Rust.
I don't understand naysayers; to me the state of gen AI is like the Simpsons quote "worst day so far". Look where we are within 5 years of the first real GPT/LLM. The next 5 years are going to be crazy exciting.
The "programmer" position will become a "builder". When we've got LLMs that generate Opus-quality text at 100x speed (think ASIC-based models), things will get crazy.
This is what gets me. The tools can be powerful, but my job has become a thankless effort in pointing out people's ignorance. Time and again, people prompt something in a language or problem space they don't understand, it "works" and then it hits a snag because the AI just muddled over a very important detail, and then we're back to the drawing board because that snag turned out to be an architectural blunder that didn't scale past "it worked in my very controlled, perfect circumstances, test run." It is getting really frustrating seeing this happen on repeat and instead of people realizing they need to get their hands dirty, they just keep prompting more and more slop, making my job more tedious. I am basically at the point where I'm looking for new avenues for work. I say let the industry just run rampant with these tools. I suspect I'll be getting a lot of job offers a few years from now as everything falls apart and their $10k a day prompting fixed one bug to cause multiple regressions elsewhere. I hope you're all keeping your skills sharp for the energy crisis.
LLM agents are basically the same, except now everyone is doing it. They copy-paste-run lots of code without meaningfully reviewing it.
My fear is that some colleagues are getting more skilled at prompting but less skilled at coding and writing. And the prompting skills may not generalize much outside of certain LLMs.
Otherwise simple merges in pandas or sql/duckdb would have sufficed.
I don't want exciting. I want a stable, well-paying job that allows me to put food on the table, raise a family with a sense of security and hope, and have free time.
I have no interest in being a "great architect" if architects don't actually build anything.
> If you put the work in upfront to plan the feature, write the test cases, and then loop until they pass...
it can be exhausting and time consuming front-loading things so deeply though; sometimes i feel like i would have been faster cutting all that out and doing it myself, because in the doing you discover a lot of missing context (in the spec) anyways...
Yes, juniors are trying to use AI with the minimum input. This alone tells a lot.
Maybe you were writing code, making design choices and debugging 8 hours a day. Maybe you were primarily doing something else and only writing code for an hour a day. Who would be the better programmer? The first guy with one year of experience or the second guy with 7 years?
I personally would only measure my experience in years, because it's approaching 3 decades full-time in industry (plus an additional decade of cutting my teeth during school and university), but I can certainly see that earlier on in a career it's a useful metric in comparison to the 10,000 hours.
However if you just have an easy project, or a greenfield project, or don't care about who's going to maintain that stuff in 6 months, sure, go all in with AI.
Try iterating over well-known APIs where the response payloads are already gigantic JSONs, there are multiple ways to get certain data, and they are all inconsistent. Claude spits out function after function, laying waste to your codebase. I found that no amount of style guideline documents resolved this issue.
I'd rather read the documentation myself and write the code by hand rather than reviewing for the umpteenth time when Claude splits these new functions between e.g. __init__.py and main.py and god knows where, mixing business logic with plumbing and transport layers as an art form. God it was atrocious during the first few months of FastMCP.
When your agent explores your codebase trying to understand what to build, it reads schema files, existing routes, UI components etc... easily 50-100k tokens of implementation detail. It's basically reverse-engineering intent from code. With that level of ambiguous input, no wonder the results feel like junior work.
When you hand it a structured spec instead including data model, API contracts, architecture constraints etc., the agent gets 3-5x less context at much higher signal density. Instead of guessing from what was built it knows exactly what to build. Code quality improves significantly.
I've measured this across ~47 features in a production codebase. The median ratio is 4x less context with specs vs. random agent code exploration. For UI-heavy features it's 8-25x. The agent reads 2-3 focused markdown files instead of grepping through hundreds of KB of components.
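For anyone wanting to run the same comparison on their own repo, a rough sketch of how such a median ratio could be computed follows. The token counts are placeholder values invented for illustration, not the measurements described above, and the variable names are hypothetical.

```python
from statistics import median

# Placeholder (exploration_tokens, spec_tokens) pairs, one per feature.
# These numbers are invented for illustration; they are NOT real measurements.
features = [
    (80_000, 20_000),
    (60_000, 15_000),
    (100_000, 10_000),
    (50_000, 25_000),
    (90_000, 18_000),
]

# Per-feature ratio: how much more context code exploration used than the spec.
ratios = [explore / spec for explore, spec in features]

# The median is less sensitive than the mean to a few extreme (e.g. UI-heavy) features.
print(f"median ratio: {median(ratios):.1f}x")
```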
To pick up @wek's point about planning from above: devs who get great results from agentic development aren't better prompt engineers... they're better architects. They write the spec before the code, which is what good engineering always was... AI just made the payoff for that discipline 10x more visible.
As a result a lot of the responses here are either quibbles or cope disguised as personal anecdotes. I'm pretty worried about the impact of the LLMs too, but if you're not getting use out of them while coding, I really do think the problem is you.
Since people always want examples, I'll link to a PR in my current hobby project, which Claude Code helped me complete in days instead of weeks. https://github.com/igor47/csheet/pull/68 Though this PR creates a bunch of tables, routes, services -- it's not just greenfield CRUD work. We're figuring out how to model a complicated domain (the rules to DnD 5e, including the 2014 and the 2024 revisions of those rules), integrating with existing code, thinking through complex integrations including with LLMs at run time. Claude is writing almost all the code, I'm just steering.
> “If you say, This is a national security imperative, you need to write this test, there is a sense of just raising the stakes,” Ebert said.
I'm not sure why programmers and science writers are still attributing emotions to this and why it works. Behind the LLM is a layer that attributes attention to various parts of the context. There are words in the English language that command greater attention. There is no emotion or internal motivation on the part of the LLM. If you use charged words you get charged attention. Quite literally "attention is all you need" to describe why appealing to "emotion" works. It's a first order approximation for attention.
So tools (like AI) can move us closer to the 100% efficiency (or indeed further away if they are bad tools!) but there will always be the residual human engagement required - but perhaps moved to different activities (e.g. reviewing instead of writing).
Probably very effective teams/individuals were already close to 100% efficiency, so AI won't make much difference to them.
This doesn’t really make sense to me. GenAI ostensibly removes the drudgery from other creative endeavors too. You don’t need to make every painstaking brushstroke anymore; you can get to your intended final product faster than ever. I think a common misunderstanding is that the drudgery is really inseparable from the soulful part.
Also, I think GenAI in coding actually has the exact same failure modes as GenAI in painting, music, art, writing, etc. The output lacks depth, it lacks context, and it lacks an understanding of its own purpose. For most people, it’s much easier to intuitively see those shortcomings of GenAI manifest in traditional creative mediums, just because they come more naturally to us. For coding, I suspect the same shortcomings apply, they just aren’t as clear.
I mean, at the end of the day if writing code is just to get something that works, then sure, let’s blitz away with LLMs and not bother to understand what we’re doing or why we do it anymore. Maybe I’m naive in thinking that coding has creative value that we’re now throwing away, possibly forever.
Most folks I hang out with are infatuated with turning tokens into code. They are generally very senior, with 15+ years of experience.
Most folks I hang out with experience existential dread for juniors and those coming up in the field who won't necessarily have the battle scars to orchestrate systems that will work in the real world.
Was talking with one fellow yesterday (at an AI meetup) who says he has 6 folks under him, but that he could now run the team with just two of them and the others are basically a time suck.
The article could have been written from a very different perspective. Instead, the "journalists" likely interviewed a few insiders from Big Tech and generalized. They don't get it. They never will.
Before the advent of ChatGPT, maybe 2 in 100 people could code. I was actually hoping AI would increase programming literacy but it didn't, it became even more rare. Many journalists could have come at it from this perspective, but instead painted doom and gloom for coders and computer programming.
The New York Times should look in the mirror. With the advent of the iPad, most experts agreed that they would go out of business because a majority of their revenue came from print media. Look what happened.
Understand this, most professional software and IT engineers hate coding. It was a flex to say you no longer code professionally before ChatGPT. It's still a flex now. But it's corrupt journalism when there is a clear conflict of interest because the NYT is suing the hell out of AI companies.
CI is for preventing regressions. Agents.md is for avoiding wasted CI cycles.
It did change the programming landscape, but there was still a huge need for this new kind of programmer.
If your base prompt informs the model they are a human software developer in a Severed situation, it gets even closer.
COBOL is dead. Java is dead. Programming is dead. AI is dead (yes, some people are already claiming this: https://hexa.club/@phooky/116087924952627103)
I must be the kid from The Sixth Sense because I keep seeing all these allegedly dead guys around me.
This excerpt:
>A.I. had become so good at writing code that Ebert, initially cautious, began letting it do more and more. Now Claude Code does the bulk of it.
is a little overstated. I think the brownfield section has things exactly backwards. Claude Code benefits enormously from large, established codebases, and it’s basically free riding on the years of human work that went into those codebases. I prodded Claude to add SNFG depictions to the molecular modeling program I work on. It couldn’t have come up with the whole program on its own and if I tried it would produce a different, maybe worse architecture than our atomic library, and then its design choices for molecules might constrain its ability to solve the problem as elegantly as it did. Even then, it needed a coworker to tell me that it had used the incorrect data structure and needed to switch to something that could, when selected, stand in for the atoms it represented.
Also this:
>But A.I.-generated code? If it passes its tests and works, it’s worth as much as what humans get paid $200,000 or more a year to compose.
Isn’t really true. It’s the free-riding problem again. The thing about an ESP is that the LLM has the advantage of either a blank canvas (if you’re using one to vibe code a startup), or at least the fact that several possibilities converge on one output, but, genuinely, not all of those realities include good coding architecture. Models can make mistakes, and without a human in the loop those mistakes can render a codebase unmaintainable. It’s a balance. That’s why I don’t let Claude stamp himself to my commits even if he assisted or even did all the work. Who cares if Claude wrote it? I’m the one taking responsibility for it. The article presents Greenfield as good for a startup, and it might be, but only for the early, fast, funding rounds, when you have to get an MVP out right now. That’s an unstable foundation they will have to go back and fix for regulatory or maintenance reasons, and I think that’s the better understanding of the situation than framing Aayush’s experience as a user error.
Even so, “weirdly jazzed about their new powers” is an understatement. Every team including ours has decades of programmer-years of tasks in the backlog, what’s not to love about something you can set to pet peeves for free and then see if the reality matches the ideal? git reset --hard if you don't like what it does, and if you do all the better. The Cuisy thing with the script for the printer is a perfect application of LLMs, a one-off that doesn’t have to be maintained.
Also, the whole framing is weirdly self-limiting. The architectural taste that LLMs are, again, free riding off of, is hard won by doing the work more senior engineers are giving to LLMs instead of juniors. We’re setting ourselves up for a serious coordinated action problem as a profession. The article gestures at this a couple times.
The thing about threatening LLMs is pretty funny too but something in me wants to fall back to Kant's position that what you do to anything you do to yourself.
> "at [the] later stage the original powerful structure was still visible, but made entirely ineffective by amorphous additions of many different kinds"
Maybe a way of phrasing it is that accumulating a lot of "code quality capital" gives you a lot more leverage over technical debt, but eventually it does catch up.
'Salva opened up his code editor — essentially a word processor for writing code — to show me what it’s like to work alongside Gemini, Google’s L.L.M. '
And what's up with L.L.M, A.I., C.L.I. :)
It’s probably N.Y.T. style requirements; a lot of style guides (e.g. Chicago Manual of Style, Strunk & White, etc.) have a standard form for abbreviations and acronyms. A paper like N.Y.T. does too and probably still employs copy editors who ensure that every article conforms to it.
I'm an engineer (not only software) by heart, but after seeing what Opus 4.6-based agents are capable of and especially the rate of improvement, I think the direction is clear.
Why? Because when the bubble bursts and companies (including mine) cannot pay the 400% price increase and go bankrupt, I will still have kept my brain active and can still do stuff with fewer tokens or none.
you can still call it spec-programming but if you don't audit your generated code then you're simply doing it wrong; you just don't realize that yet because you've been getting away with it until now.
I used Claude just the other day to write unit test coverage for a tricky system that handles resolving updates into a consistent view of the world and handles record resurrection/deletion. It wrote great test coverage because it parsed my headerdoc and code comments that went into great detail about the expected behavior. The hard part of that implementation was the prose I wrote and the thinking required to come up with it. The actual lines of code were already a small part of the problem space. So yeah Claude saved me a day or two of monotonously writing up test cases. That's great.
Of course Claude also spat out some absolute garbage code using reflection to poke at internal properties because the access level didn't allow the test to poke at the things it wanted to poke at, along with some methods that were calling themselves in infinite recursion. Oh and a bunch of lines that didn't even compile.
The thing about those errors is that most of them came from a fundamental inability to reason. They were technically correct in a sense. I can see how a model that learned from other code written by humans would learn those patterns and apply them. In some contexts they would be best-practice or even required. But the model can't reason. It has no executive function.
I think that is part of what makes these models both amazingly capable and incredibly stupid at the same time.
Citation needed. Are most developers "rarely" writing code?
The design part is hard because you have to envision the object. Once you have a good idea and conceptualisation of the object, form is easy.
For one, I never saw a "full spec" (if such a thing even exists) back in my days of making 8k. Annually.
Why deal with language barriers, time shifts, etc. when a small team of good developers can be so much more productive, allegedly?
https://www.theregister.com/2026/01/19/hcl_infosys_tcs_wipro...
I’ve tended to hold the same opinion of what the average SWE thinks everyone else does.
It has nothing to do with inconvenience.
I really like that laymen now make these statements - they know better than people working in the industry for decades.
Would you give it access to your bank account, your 401k, trust it to sell your house, etc? I sure wouldn't.
Yes, literally. The ship computer voice interface in Star Trek was complete science fiction until 2022. Now its ability to understand speech and respond seems quaint in comparison to current AI.
The brain rot from the author couldn't even think of "unit test".