I've always hated solving puzzles with my deterministic toolbox, learning along the way and producing something of value at the end.
Glad that's finally over so I can focus on the soulful art of micromanaging chatbots with markdown instead.
Actually typing code is pretty dull. To the extent that I rarely do it full time (basically only when prototyping or making very simple scripts etc.), even though I love making things.
So for me, personally, LLMs are great. I'm making more software (and hardware) than ever, mostly just to scratch an itch.
Those people that really love it should be fine. Hobbies aren't supposed to make you money anyway.
I don't have much interest in maintaining the existence of software development/engineering (or anything else) as a profession if it turns out it's not necessary. Not that I think that's really what's happening. Software engineering will continue as a profession. Many developers have been doing barely useful glue work (often as a result of bad/overcomplicated abstractions and tooling in the first place, IMO) and perhaps that won't be needed, but plenty more engineers will continue to design and build things just more effectively and with better tools.
Being tapped into fickle human preference and changing utility landscape will be necessary for a long time still. It may get faster and easier to build, but tastemakers and craftsmen still have heavy sway over markets than can mass-produce vanilla products.
I improved test speed which was fun, I had an llm write a nice analysis front end to the test timing which would have taken time but just wasn’t interesting or hard.
Ask yourself if there are tasks you have to do which you would rather just have done? You’d install a package if it existed or hand off the work to a junior if that process was easy enough, that kind of thing. Those are places you could probably use an LLM.
Yeah. My laundry, my dishes, my cooking...
You know. Chores.
Not my software, I actually enjoy building that
Whereas I always liked to design and build a useful result. If it isn't useful I have no motivation to code it. Looking up APIs, designing abstractions, fixing compiler errors is just busywork that gets in the way.
I loved programming when I was 8 years old. 30+ years later the novelty is gone.
If someone is paying you for your work results, that you find it interesting or fun is orthogonal. I get the sense from the commentary section here that there’s a perception that writing programs is an exceptional profession where developer happiness is an end unto itself, and everyone doing it deserves to be a millionaire in the process. It just comes across as child-like thinking. I don’t think many of us spend time, wondering if the welder enjoys the torch or if a cheaper shop weld is robbing the human welder of the satisfaction of a field weld. And we don’t shed so much ink wondering if digital spreadsheets are a moral good or not because perhaps they robbed the accountant of the satisfaction of holding a beautiful quill in hand dipped expertly in carefully selected ink. You’re lucky if you enjoy your job, I think most of us find a way to learn to enjoy our work or at least tolerate it.
I just wish all the moaning would end. Code generation is not new, and that the state of the art is now as good at translating high-level instructions into a program at least as well as the bottom 10% of programmers is a huge win for humanity. Work that could be trivially automated, but is not only because of the scarcity of programming knowledge is going to start disappearing. I think the value creation is going to be tremendous and I think it will take years for it to penetrate existing workflows and for us to recognize the value.
I don't think this is the flex you think it is... in my experience, the bottom 10% of programmers are actively harmful and should never be allowed near your codebase.
Caught my eye. I do think we should wonder and hold intentionality around products, especially digital products, like the spreadsheet. Software is different. It's a limitless resource with limitless instantaneous reach. A good weld is beautiful in its own right, but it's not that.
The spreadsheet in particular changed the way millions of people work. Is it more productive? Is an army of middle-managers orienting humanity through the lens of a literal 2x2cm square a net good?
I say we should moralize on that.
i write less code than my AI-using coworkers but I have as much or more impact. Coding wasn't so hard that I need to spend time learning a new proprietary tech stack with a subscription fee lol. I believe plenty of engineers did suck enough and computers to benefit tho. That is where Anthropic makes their money.
It can be unpleasant to participate in a community of differing opinions and experiences. I still think it's worth showing up. If I hadn't then your perspective would have been missed too.
That being said, yea enterprise coding can be extremely mundane and it’s setup for learning it deeply then finding a way to do it faster. I’m likely in the 90% range of my work being done by Claude, but I’m working in a domain I’ve got years of experience with hand coding and stepping through code in my debugger.
I think this latter piece is the challenge I’m struggling with. There is an endless amount of work that can be done at my company but as long as the economy is in a weird spot, I’m being led to believe that ai is making me expendable. This is a consequence of the fact that glue work represents 80% of my output (not value). The other 20% of time at work is exploring ideas without guaranteed results, its aligning stakeholders, its testing feasibility with mvps or experts from another area I need some help with. If glue work represents tangible output and conceptual work is something that may not actually have value my manager wants me to explore it, I’m just a glue guy in enterprise while I’m left chasing the dragon of a cool project for me to really sink my teeth into. That project is just a half baked bad idea from someone disconnected with reality. Glue work is measurable in LoC (however useless a metric it is measurable) and it’s certainly paying the bills.
Doordash is the future of home cooking.
I can go to a junkyard and assemble the parts to build a car. It may run, but for a thousand tiny reasons it will be worse than a car built by a team of designers and engineers who have thought carefully about every aspect of its construction.
But when I've used AI to generate new code for features I care about and will need to maintain it's never gotten it right. I can do it myself in less code and cleaner. It reminds me of code in the 2000s that you would get from your team in India - lots of unnecessary code copy-pasted from other projects/customers (I remember getting code for an Audi project that had method names related to McDonalds)
I think though that the day is coming where I can trust the code it produces and at that point I'll just by writing specs. It's not there yet though.
Must be nice to still have that choice. At the company I work for they've just announced they're cancelling all subscriptions to JetBrains, Visual Studio, Windsurf, etc. and forcing every engineer to use Claude Code as a cost-saving measure. We've been told we should be writing prompts for Claude instead of working in IDEs now.
return \file_exists( $file ) ? require $file : [];
* https://repo.autonoma.ca/repo/treetrek/blob/HEAD/render/High...The rules files:
* https://repo.autonoma.ca/repo/treetrek/tree/HEAD/render/rule...
> “We’re talking 10 to 20 — to even 100 — times as productive as I’ve ever been in my career,” Steve Yegge, a veteran coder who built his own tool for running swarms of coding agents
That tool has been pretty popular. It was a couple hundred thousand lines of code and he wrote it in a couple months. His book is about using AI to write major new projects and get them reliable and production-ready, with clean, readable code.
It's basically a big dose of solid software engineering practices, along with enough practice to get a feel for when the AI is screwing up. He said it takes about a year to get really good at it.
(Yegge, fwiw, was a lead dev at Amazon and Google, and a well-known blogger since the early 2000s.)
Just checking that you're using maven-enforcer-plugin
Here's an example from Gemini with some Lua code:
label = key:gsub("on%-", ""):gsub("%-", " "):gsub("(%a)([%w_']*)", function(f, r)
return f:upper() .. r:lower()
end)
if label:find("Click") then
label = label:gsub("(%a+)%s+(%a+)", "%2 %1")
elseif label:find("Scroll") then
label = label:gsub("(%a+)%s+(%a+)", "%2 %1")
end
I don't know Lua too well (which is why I used AI) but I know programming well enough to know this logic is ridiculous.It was to help convert "on-click-right" into "Right Click".
The first bit of code to extract out the words is really convoluted and hard to reason about.
Then look at the code in each condition. It's identical. That's already really bad.
Finally, "Click" and "Scroll" are the only 2 conditions that can ever happen and the AI knew this because I explained this in an earlier prompt. So really all of that code isn't necessary at all. None of it.
What I ended up doing was creating a simple map and looked up the key which had an associated value to it. No conditions or swapping logic needed and way easier to maintain. No AI used, I just looked at the Lua docs on how to create a map in Lua.
This is what the above code translated to:
local on_event_map = {
["on-click"] = "Left Click",
["on-click-right"] = "Right Click",
["on-click-middle"] = "Middle Click",
["on-click-backward"] = "Backward Click",
["on-click-forward"] = "Forward Click",
["on-scroll-up"] = "Scroll Up",
["on-scroll-down"] = "Scroll Down",
}
label = on_event_map[key]
IMO the above is a lot clearer on what's happening and super easy to modify if another thing were added later, even if the key's format were different.Now imagine this. Imagine coding a whole app or a non-trivial script where the first section of code was used. You'd have thousands upon thousands of lines of gross, brittle code that's a nightmare to follow and maintain.
Wire up authentication system with sso. done Setup websockets, stream audio from mic, transcribe with elvenlabs. done.
Shit that would take me hours takes literally 5 mins.
It's horrifying, all right, but not in the way you think lol. If you don't understand why this isn't a brag, then my job is very safe.
Can't do what, precisely?
I agree local is better, but the big companies are making decent products and companies are willing to to pay for that. They’re not willing to spend engineering money to make local setups better.
I'm betting the generational gains level off and smaller local models close the gap somewhat. Then harnesses will generally be more important than model, and proprietary harnesses will not offer much more than optimization for specific models. All while SaaS prices ratchet up, pushing folks toward local and OSS. Or at least local vs a plethora of hosted competition, same as cloud vs on prem.
But the biggest thing is going to be context. Whilst a 10gb card can run a 9b model with some context .. for coding you really want a lot of context.
So if paying 200 a year for 1T in context, vs your 32k context.. that's the thing I see as being the driver.
Personally ive found great success with using open code, having Opus as my plan agent, and omnicoder-9b as my build agent.
Get opus to plan, switch to omnicoder to build, switch back to opus to review. Etc etc.
Works great.
Before I was building tools, now I am building full applications in less time than I did before for tools.
What will be around for a while is where you need an expert in the loop to drive the AI. For example enterprise applications. You simply can't hand that off to an AI at this point.
We've already seen this with OSS. Even with free software, support, self-hosting, and quirky behavior have proven to be enough to keep most people and business away.
Not unlike all the A.I. companies all determined to build the machine god while predicting it’ll be disastrous. Same thing - better it starts with us
I'm not convinced software developers will be replaced - probably less will be needed and the exact work will be transformed a bit, but an expert human still has to be in the loop, otherwise all you get is a bunch of nonsense.
Nonetheless, it may very well transform society and we will have to adapt to it.
Having a lot of specifics about a programming environment memorized for example used to be the difference between building something in a few hours and a week, but now is pretty unimportant. Same with being able to do some quick data wrangling on the command line. LLMs are also good at parsing a lot of code or even binary format quickly and explaining how it works. That used to be a skill. Knowing a toolbox of technologies to use is needed less. Et cetera.
They haven't come for the meat of what makes a good engineer yet. For example, the systems-level interfacing with external needs and solving those pragmatically is still hard. But the tide is rising.
My guess is the opposite: they'll throw 5–10x more work at developers and expect 10x more output, while the marginal cost is basically just a Claude subscription per dev.
Most of us will probably need to shift to security. While you can probably build AI specifically to make things more secure, that implies it could also attack things as well, so it ends up being a cat-and-mouse game that adjusts to what options are available.
The resources to learn how to construct software are already free. However learning requires effort, which made learning to build software an opportunity to climb the ladder and build a better life through skill. This is democratization.
Now the skill needed to build software is starting to approach zero. However as you say you can throw money at an AI corporation to get some amount of software built. So the differentiator is capital, which can buy software rather cheaply. The dependency on skill is lessened greatly and software is becoming worthless, so another avenue to escape poverty through skill closes.
We are convincing a generation of morons that they can do something they plainly cannot. This will be a major problem, and soon.
Books didn't stop existing when the radio came out. Radio didn't stop existing when television was invented. If you go back in time a thousand years, people were complaining that an increase in literacy would damaging peoples' memorization skills.
People will still write code for consumption by other humans by hand. Some companies, though probably not most, will still prefer it. AI will change the industry - IS changing the industry - but things don't "end". They just look different, or are less popular.
There is enough for us to worry about and try to figure out how to respond to without the histrionics.
Also, on a related note, "Idiots are vibecoding bad stuff" is not the same as "engineers are using AI tools to do good work more quickly," and we should stop conflating it.
I believe both camps it frustratingly wrong. If you haven't yet given it a chance at doing something substantial, then at least _try_ it once. On the other side of the coin, that first experience where it does something 80% right is intoxicating, but AI doesn't reason and can't get it 100% right - it can't even multiply relatively small numbers.
The former camp is going to get left behind and won't be able to compete, the latter camp is one prompt away from a disaster.
Fast forward to 2024 when I saw Cursor (the IDE coding agent tool). I immediately felt like this was going to be the way for someone like me.
Back then, it was brutal. I'd fight with the models for 15 prompts just to get a website working without errors on localhost, let alone QA it. None of the plan modes or orchestration features existed. I had to hack around context engineering, memories, all that stuff. Things broke constantly. 10 failures for 1 success. But it was fun. To top it all off, most of the terminology sounded like science fiction, but it got better in time. I basically used AI itself to hack my way into understanding how things worked.
Fast forward again (only ~2 years later). The AI not only builds the app, it builds the website, the marketing, full documentation, GIFs, videos, content, screen recordings. It even hosts it online (literally controls the browser and configures everything). Letting the agent control the browser and the tooling around that is really, genuinely, just mad science fiction type magic stuff. It's unbelievable how often these models get something mostly right.
The reality though is that it still takes time. Time to understand what works well and what works better. Which agent is good for building apps, which one is good for frontend design, which one is good for research. Which tools are free, paid, credit-based, API-based. It all matters if you want to control costs and just get better outputs.
Do you use Gemini for a website skeleton? Claude for code? Grok for research? Gemini Deep Search? ChatGPT Search? Both? When do you use plan mode vs just prompting? Is GPT-5.x better here or Claude Opus? Or maybe Gemini actually is.
My point is: while anyone can start prompting an agent, it still takes a lot of trial and error to develop intuition about how to use them well. And even then everything you learn is probably outdated today because the space changes constantly.
I'm sure there are people using AI 100× better than I am. But it's still insane that someone with no coding background can build production-grade things that actually work.
The one-person company feels inevitable.
I'm curious how software engineers think about this today. Are you still writing most of your code manually?
I used to think so. Then a customer made their own replacement for $600/mo software in 2 days. The guy was a marketer by training. I don't exaggerate. I saw it did the exact same things.
I was pointing out that practice helps with the speed and the scope of capabilities. Building a personal prototype is a different ballgame than building a production solution that others will use.
"Can you believe that Dad actually used to have to go into an office and type code all day long, MAUALLY??! Line by line, with no advice from AI, he had to think all by himself!"
The difference is, Jetsons wasn't a dystopia (unlike the current timeline), so when Mr. Spacely fired George, RUDI would take his side and refuse to work until George was re-hired.
Grumpy old man: "That's exactly why our generation was so much smarter than today's whippersnappers: we were thinking from morning to night the whole long day."
"Dad, I've sent out 1000 applications and haven't had a call back. I can't take it anymore. Has it always been like this?"
The Dad: It's not my fault!
Aliens Atlanteans Time travellers A hoax …
This sounds opposite to what the article said earlier: newbies aren’t able to get as much use out of these coding agents as the more experienced programmers do.
"Silicon Valley panjandrums spent the 2010s lecturing American workers in dying industries that they needed to “learn to code."
To copywriters at the NYT, LLMs are far better at stringing together natural language prose than large amounts of valid software. Get ready to supervise LLMs all day if you're not already.
Are local models anywhere close to gaining enough capability and traction to do it in-house? Or are there good options for those who'd rather own the capability than rent it?
Cloud providers will always be able to offer more performance and more powerful options.
Also, presumably at some point far in the future we'll reach a technological asymptote and factors like latency may start to play a bigger role, at least for some applications.
I grant that training data is crucial distinguishing factor that may never become competitive in-house.
By their own accounts they are just pressing enter.
I can think of one successfully, off hand, although you could probably convince me there was more than one.
the principle phrase being "as we know it", since that implies a large scale change to how it works but it continues afterwards, altered.
1. COBOL (we actually did still use it back in the 80s)
2.AI back in the 80s (Dr. Dobbs was all concerned about it ...)
3. RAD
4. No-Code
5. Off-shoring
6. Web 2.0
7. Web 3.0
8. possibly the ADA/provably correct push depending on your area of programming
TBH - I think the AI's are nice tools, but they got a long way to go before it's the 'end of computer programming as we know it'edit: formatting
When I was learning programming I had no internet, no books outside of library, nobody to ask for days.
I remember vividly having spent days trying to figure out how to use the stdlib qsort, and not being able to.
I definitely considered some of those in my list of failed revolutions.
My one completely successful revolution is moving from punch card programming.
That's also true for humans. If you sit down with an LLM and take the time to understand the problem you're trying to solve, it can perfectly guide you through it step by step. Even a non-technical person could build surprisingly solid software if, instead of immediately asking for new shiny features, they first ask questions, explore trade-offs, and get the model's opinion on design decisions..
LLMs are powerful tools in the hands of people who know they don't know everything. But in the hands of people who think they always know the best way, they can be much less useful (I'd say even dangerous)
LLMs don't know when you're under-specifying the problem.
Also I am not seeing how anyone is considering that what a programmer considers quality and what 'gets the job done' (as mentioned in the article) matters in any business. (Example with typesetting is original laser printers were only 300dpi but after a short period became 1200dpi 'good enough' for camera ready copy).
As far as the end of computer programming goes...
Step 1. Wow, I just vibe coded an application and it works! I'm going to write a blog about it and tell everyone how awesome AI is, much hype
Step 2. Vibe coded application faces inevitable problems, the perfect application is a fairytale after all. The only way to "fix" the application is spam tokens at the problem and pray.
Step 3. Author does not write a new blog post to report on this eventuality... probably because they feel embarrassed about how optimistic they were
Step 4. Perhaps author manages to fix application, awesome... then what about a year from now, author needs to update the application because a dependency has a security problem. The application is so needlessly complex that they don't even know when to begin.
Step 5. They boot up Claude Code, which their business is now 100% dependent on, but they're charging 10x the original cost per token. It's not like they have a contract, so user has to either eat the cost or give up
Step 6. User tries local model on their 1080 ti but they can barely run entry-level models
Step 7. Woops
Personally I think it's impossible to convince these people, the results will speak for themselves eventually.
Where's the references to the decline in quality and embarrassing outages for Amazon, Microsoft, etc?
That's an easy question to answer - you can look at outages per feature released.
You may be instead looking at outages per loc written.
Even before AI the limiting factor on all of the teams I ever worked on was bad decisions, not how much time it took to write code. There seem to be more of those these days.
In both personal projects and $dayjob tasks, the highest time-saving AI tasks were:
- "review this feature branch" (containing hand-written commits)
- "trace how this repo and repo located at ~/foobar use {stuff} and how they interact with each other, make a Mermaid diagram"
- "reverse engineer the attached 50MiB+ unstripped ELF program, trace all calls to filesystem functions; make a table with filepath, caller function, overview of what caller does" (the table is then copy-pasted to Confluence)
- basic YAML CRUD
Also while Anthropic has more market share in B2B, their model seems optimized for frontend, design, and literary work rather than rigorous work; I find it to be the opposite with their main competitor.
Claude writes code rife with safety issues/vulns all the time, or at least more than other models.
My own observations about using AI to write code is that it changes my position from that of an author to a reviewer. And I find code review to be a much more exhausting task than writing code in the first place, especially when you have to work out how and why the AI-generated code is structured the way it is.
You could just ask it? Or you don’t trust the AI to answer you honestly?
An i have NEVER made one line of Rust.
I dont understand nay-sayers, to me the state of gen.AI is like the simpsons quote "worst day so far". Look were we are within 5 years of the first real GPT/LLM. The next 5 years are going to be crazy exciting.
The "programmer" position will become a "builder". When we've got LLMs that generate Opus quality text at 100x speed (think, ASIC based models) , things will get crazy.
This is what gets me. The tools can be powerful, but my job has become a thankless effort in pointing out people's ignorance. Time and again, people prompt something in a language or problem space they don't understand, it "works" and then it hits a snag because the AI just muddled over a very important detail, and then we're back to the drawing board because that snag turned out to be an architectural blunder that didn't scale past "it worked in my very controlled, perfect circumstances, test run." It is getting really frustrating seeing this happen on repeat and instead of people realizing they need to get their hands dirty, they just keep prompting more and more slop, making my job more tedious. I am basically at the point where I'm looking for new avenues for work. I say let the industry just run rampant with these tools. I suspect I'll be getting a lot of job offers a few years from now as everything falls apart and their $10k a day prompting fixed one bug to cause multiple regressions elsewhere. I hope you're all keeping your skills sharp for the energy crisis.
I don't want exciting. I want a stable, well-paying job that allows me to put food on the table, raise a family with a sense of security and hope, and have free time.
I have no interest being a "great architect" if architects don't actually build anything
> If you put the work in upfront to plan the feature, write the test cases, and then loop until they pass...
it can be exhausting and time consuming front-loading things so deeply though; sometimes i feel like i would have been faster cutting all that out and doing it myself because in the doing you discover a lot of missing context (in the spec) anyways...However if you just have an easy project, or a greenfield project, or don't care about who's going to maintain that stuff in 6 months, sure, go all in with AI.
When your agent explores your codebase trying to understand what to build, it read schema files, existing routes, UI components etc... easily 50-100k tokens of implementation detail. It's basically reverse-engineering intent from code. With that level of ambiguous input, no wonder the results feel like junior work.
When you hand it a structured spec instead including data model, API contracts, architecture constraints etc., the agent gets 3-5x less context at much higher signal density. Instead of guessing from what was built it knows exactly what to build. Code quality improves significantly.
I've measured this across ~47 features in a production codebase with amedian ratio: 4x less context with specs vs. random agent code exploration. For UI-heavy features it's 8-25x. The agent reads 2-3 focused markdown files instead of grepping through hundreds of KB of components.
To pick up @wek's point about planning from above: devs who get great results from agentic development aren't better prompt engineers... they're better architects. They write the spec before the code, which is what good engineering always was... AI just made the payoff for that discipline 10x more visible.
As a result a lot of the responses here are either quibbles or cope disguised as personal anecdotes. I'm pretty worried about the impact of the LLMs too, but if you're not getting use out of them while coding, I really do think the problem is you.
Since people always want examples, I'll link to a PR in my current hobby project, which Claude code helped me complete in days instead of weeks. https://github.com/igor47/csheet/pull/68 Though this PR creates a bunch of tables, routes, services -- it's not just greenfield CRUD work. We're figuring out how to model a complicated domain (the rules to DnD 5e, including the 2014 and the 2024 revisions of those rules), integrating with existing code, thinking through complex integrations including with LLMs at run time. Claude is writing almost all the code, I'm just steering
> “If you say, This is a national security imperative, you need to write this test, there is a sense of just raising the stakes,” Ebert said.
I'm not sure why programmers and science writers are still attributing emotions to this and why it works. Behind the LLM is a layer that attributes attention to various parts of the context. There are words in the English language that command greater attention. There is no emotion or internal motivation on the part of the LLM. If you use charged words you get charged attention. Quite literally "attention is all you need" to describe why appealing to "emotion" works. It's a first order approximation for attention.
So tools (like AI) can move us closer to the 100% efficiency (or indeed further away if they are bad tools!) but there will always be the residual human engagement required - but perhaps moved to different activities (e.g. reviewing instead of writing).
Probably very effective teams/individuals were already close to 100% efficiency, so AI won't make much difference to them.
This doesn’t really make sense to me. GenAI ostensibly removes the drudgery from other creative endeavors too. You don’t need to make every painstaking brushstroke anymore; you can get to your intended final product faster than ever. I think a common misunderstanding is that the drudgery is really inseparable from the soulful part.
Also, I think GenAI in coding actually has the exact same failure modes as GenAI in painting, music, art, writing, etc. The output lacks depth, it lacks context, and it lacks an understanding of its own purpose. For most people, it’s much easier to intuitively see those shortcomings of GenAI manifest in traditional creative mediums, just because they come more naturally to us. For coding, I suspect the same shortcomings apply, they just aren’t as clear.
I mean, at the end of the day if writing code is just to get something that works, then sure, let’s blitz away with LLMs and not bother to understand what we’re doing or why we do it anymore. Maybe I’m naive in thinking that coding has creative value that we’re now throwing away, possibly forever.
Most folks I hang out with are infatuated with turning tokens into code. They are generally very senior 15+ years of experience.
Most folks I hang out with experience existential dread for juniors and those coming up in the field who won't necessarily have the battle scars to orchestrate systems that will work in the will world.
Was talking with one fellow yesterday (at an AI meetup) who says he has 6 folks under him, but that he could now run the team with just two of them and the others are basically a time suck.
The article could have been written from a very different perspective. Instead, the "journalists" likely interviewed a few insiders from Big Tech and generalized. They don't get it. They never will.
Before the advent of ChatGPT, maybe 2 in 100 people could code. I was actually hoping AI would increase programming literacy but it didn't, it became even more rare. Many journalists could have come at it from this perspective, but instead painted doom and gloom for coders and computer programming.
The New York Times should look in the mirror. With the advent of the iPad, most experts agreed that they would go out of business because a majority of their revenue came from print media. Look what happened.
Understand this, most professional software and IT engineers hate coding. It was a flex to say you no longer code professionally before ChatGPT. It's still a flex now. But it's corrupt journalism when there is a clear conflict of interest because the NYT is suing the hell out of AI companies.
CI is for preventing regressions. Agents.md is for avoiding wasted CI cycles.
It did change the programming landscape, but there was still a huge need for this new kind of programmers.
If your base prompt informs the model they are a human software developer in a Severed situation, it gets even closer.
COBOL is dead. Java is dead. Programming is dead. AI is dead (yes, some people are already claiming this: https://hexa.club/@phooky/116087924952627103)
I must be the kid from The Sixth Sense because I keep seeing all these allegedly dead guys around me.
This excerpt:
>A.I. had become so good at writing code that Ebert, initially cautious, began letting it do more and more. Now Claude Code does the bulk of it.
is a little overstated. I think the brownfield section has things exactly backwards. Claude Code benefits enormously from large, established codebases, and it’s basically free riding on the years of human work that went into those codebases. I prodded Claude to add SNFG depictions to the molecular modeling program I work on. It couldn’t have come up with the whole program on its own and if I tried it would produce a different, maybe worse architecture than our atomic library, and then its design choices for molecules might constrain its ability to solve the problem as elegantly as it did. Even then, it needed a coworker to tell me that it had used the incorrect data structure and needed to switch to something that could, when selected, stand in for the atoms it represented.
Also this:
>But A.I.-generated code? If it passes its tests and works, it’s worth as much as what humans get paid $200,000 or more a year to compose.
Isn’t really true. It’s the free-riding problem again. The thing about an ESP is that the LLM has the advantage of either a blank canvas (if you’re using one to vibe code a startup), or at least the fact that several possibilities converge on one output, but, genuinely, not all of those realities include good coding architecture. Models can make mistakes, and without a human in the loop those mistakes can render a codebase unmaintainable. It’s a balance. That’s why I don’t let Claude stamp himself to my commits even if he assisted or even did all the work. Who cares if Claude wrote it? I’m the one taking responsibility for it. The article presents Greenfield as good for a startup, and it might be, but only for the early, fast, funding rounds, when you have to get an MVP out right now. That’s an unstable foundation they will have to go back and fix for regulatory or maintenance reasons, and I think that’s the better understanding of the situation than framing Aayush’s experience as a user error.
Even so, “weirdly jazzed about their new powers” is an understatement. Every team including ours has decades of programmer-years of tasks in the backlog, what’s not to love about something you can set to pet peeves for free and then see if the reality matches the ideal? git reset --hard if you don't like what it does, and if you do all the better. The Cuisy thing with the script for the printer is a perfect application of LLMs, a one-off that doesn’t have to be maintained.
Also, the whole framing is weirdly self limiting. The architectural taste that LLMs are, again, free riding off of, is hard won by doing the work more senior engineers are giving to LLMs instead of juniors. We’re setting ourselves up for a serious coordinated action problem as a profession. The article gestures at this a couple times
The thing about threatening LLMs is pretty funny too but something in me wants to fall back to Kant's position that what you do to anything you do to yourself.
> "at [the] later stage the original powerful structure was still visible, but made entirely ineffective by amorphous additions of many different kinds"
Maybe a way of phrasing it is that accumulating a lot of "code quality capital" gives you a lot more leverage over technical debt, but eventually it does catch up.
'Salva opened up his code editor — essentially a word processor for writing code — to show me what it’s like to work alongside Gemini, Google’s L.L.M. '
And what's up with L.L.M, A.I., C.L.I. :)
It’s probably N.Y.T. style requirements; a lot of style guides (eg: Chicago Manual of Style, Strunk & White, etc) have a standard form for abbreviations and acronyms. A paper like N.Y.T. does too and probably still employs copy editors who ensure that every article conforms to it.
I'm an engineer (not only software) by heart, but after seeing what Opus 4.6 based agents are capable of and especially the rate of improvement, i think the direction is clear.
Why? Because when the bubble burst and the companies (including mine) can not pay the 400% price increase and go bankrupt, then I still have keep my brain active and still can do stuff without or less tokens.
you can still call it spec-programming but if you don't audit your generated code then you're simply doing it wrong; you just don't realize that yet because you've been getting away with it until now.
I used Claude just the other day to write unit test coverage for a tricky system that handles resolving updates into a consistent view of the world and handles record resurrection/deletion. It wrote great test coverage because it parsed my headerdoc and code comments that went into great detail about the expected behavior. The hard part of that implementation was the prose I wrote and the thinking required to come up with it. The actual lines of code were already a small part of the problem space. So yeah Claude saved me a day or two of monotonously writing up test cases. That's great.
Of course Claude also spat out some absolute garbage code using reflection to poke at internal properties because the access level didn't allow the test to poke at the things it wanted to poke at, along with some methods that were calling themselves in infinite recursion. Oh and a bunch of lines that didn't even compile.
The thing is about those errors: most of them were a fundamental inability to reason. They were technically correct in a sense. I can see how a model that learned from other code written by humans would learn those patterns and apply them. In some contexts they would be best-practice or even required. But the model can't reason. It has no executive function.
I think that is part of what makes these models both amazingly capable and incredibly stupid at the same time.
Citation needed. Are most developers "rarely" writing code?
I’ve tended to hold the same opinion of what the average SWE thinks everyone else does.
Would you give it access to your bank account, your 401k, trust it to sell your house, etc? I sure wouldn't.
The brain rot from the author couldn't even think of "unit test".