The fulcrum of Weakly's argument is that agents should stay in their lane, offering helpful Clippy-like suggestions and letting humans drive. But what exactly is the value in having humans grovel through logs to isolate anomalies and create hypotheses for incidents? AI tools are fundamentally better at this task than humans are, for the same reason that computers are better at playing chess.
What Weakly seems to be doing is laying out a bright line between advising engineers and actually performing actions --- any kind of action, other than suggestions (and only those suggestions the human driver would want, and wouldn't prefer to learn and upskill on their own). That's not the right line. There are actions AI tools shouldn't perform autonomously (I certainly wouldn't let one run a Terraform apply), but there are plenty of actions where it doesn't make sense to stop them.
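To make the log-triage claim concrete: isolating anomalies in logs is largely mechanical pattern work, which is exactly what machines are good at. A minimal sketch (the masking rule and threshold here are made-up illustrative choices, not any real tool's method):

```python
# Hypothetical sketch: flag rare log "templates" as anomaly candidates.
# Masking out numbers and hex ids turns each line into a template, so
# rare templates stand out against the steady-state noise.
import re
from collections import Counter

def anomaly_candidates(log_lines, max_fraction=0.01):
    """Return lines whose masked template is rare across the corpus."""
    def template(line):
        # Order matters: match hex literals before plain digit runs.
        return re.sub(r"0x[0-9a-f]+|\d+", "<*>", line.lower())

    counts = Counter(template(l) for l in log_lines)
    total = len(log_lines)
    return [l for l in log_lines
            if counts[template(l)] / total <= max_fraction]

logs = ["GET /health 200"] * 500 + ["worker 7 crashed: segfault at 0xdeadbeef"]
print(anomaly_candidates(logs))
# → ['worker 7 crashed: segfault at 0xdeadbeef']
```

A human groveling through those 501 lines gets the same answer, just slower; that speed difference is the whole argument.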
The purpose of incident resolution is to resolve incidents.
Sounds like heuristics added on top of statistics, which is trying to remedy some root problem with another hack.
I guess the chess analogy would be that it makes a lot of sense to analyse positions yourself, even though Leela and Stockfish can do a far more thorough job in much less time. Of course, if you just need to know the best move right now, you would use the AI, and professionals do that all the time.
But as a decently strong chess player I cannot imagine improving without doing this kind of manual practice (at least beyond a basic level of skill like knowing how pieces move). Grandmasters routinely drill tactics exercises, for instance, even though they are "mundane" at that level of ability.
I guess the crux of it - do you think AI+person learns faster than just person for this kind of thing? And why? It's not obvious to me either way (and another question is whether the skill is even relevant any more... I think so, but I know people who don't).
You don’t run analysis of your chess game when the clock is ticking.
I'm curious as to where you would draw the line. Assuming you've adhered to DevOps best practices, most--if not all--changes would require some sort of code commit and promotion through successive environments to reach production. This isn't just application code, of course; it's also your infrastructure. In such a situation, what would you permit an agent to autonomously perform in the course of incident resolution?
Agreed! I think about this using Weakly's own reference to "standing on the shoulders of giants."
To me, building abstractions to handle tedious work is how we do that. We moved from assembly to compilers, and from manual memory management to garbage collectors. That wasn't "deskilling" - it just freed us up to solve more interesting problems at a higher level.
Manually crawling through logs feels like the next thing we should happily give up. It's painful, and I don't know many engineers who enjoy it.
Disclaimer: I'm very biased - working on an agent for this exact use case.
See also: Tool AIs Want To Be Agent AIs.
Predicted almost a decade ago.
https://news.ycombinator.com/item?id=44627910
Basically, environments/platforms that give all the knobs, levers, and throttles to humans while being tightly integrated with AI capabilities. This is hard work that goes far beyond a VS Code fork.
I would love to see interfaces other than chat for interacting with AI.
Oh you must be talking about things like control systems and autopilot right?
Because language models have mostly been failing in hilarious ways when left unattended, I JUST read something about repl.it ...
I know that we can modify CLAUDE.md and maintain that as well as docs. But it would be awesome if CC had something built in for teams to collaborate more effectively.
Suggestions are welcome.
Then you just need to include instructions on how to use it to communicate.
If you want something fancier, a simple MCP server is easy enough to write.
Perhaps it could be implemented as a tool? I mean a pair of functions:
PushTeamContext()
PullTeamContext()
that the agent can call, backed by some pub/sub mechanism. It seems very complicated, though, and I'm not sure we'd gain that much, to be honest.

In private beta right now, but would love to hear a few specific examples of what kind of coordination you're looking for. Email hi [at] nmn.gl
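For what it's worth, the PushTeamContext/PullTeamContext pair above doesn't need to be complicated. A hypothetical sketch, using a shared append-only file as the store (a real version might sit behind an MCP server or Redis pub/sub; all names here are made up):

```python
# Hypothetical sketch of a team-context tool pair an agent could call.
# The backing store is a simple JSONL file; swap in real pub/sub later.
import json, pathlib, time

STORE = pathlib.Path("team_context.jsonl")

def push_team_context(author, note):
    """Append a context note that other agents/teammates can pull."""
    entry = {"ts": time.time(), "author": author, "note": note}
    with STORE.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def pull_team_context(since=0.0):
    """Return all notes newer than `since` (a Unix timestamp)."""
    if not STORE.exists():
        return []
    with STORE.open() as f:
        entries = [json.loads(line) for line in f]
    return [e for e in entries if e["ts"] > since]

push_team_context("alice", "auth refactor in progress, avoid touching session.py")
print([e["note"] for e in pull_team_context()])
```

The agent-facing half is just instructions in CLAUDE.md telling it when to push and when to pull.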
You can enforce these rules in code review after CC finishes writing code.
Email ilya (at) wispbit.com and I'll send you a link to set this up.
That might have been what they tested at IMO.
I've spent the last 15 years doing R&D on (non-programmer) domain-expert-augmenting ML applications and have never delivered an application that follows the principles the author outlines. The fact that I have such a different perspective indicates to me that the design space is probably massive and it's far too soon to say that any particular methodology is "backwards." I think the reality is we just don't know at this point what the future holds for AI tooling.
But I agree that the space is wide enough that different interpretations arise depending on where we stand.
However, I still find it good practice to keep humans (and their knowledge/retrieval) as much in the loop as possible.
I think at its best, ML models give new data-driven capabilities to decision makers (as in the example above), or make decisions that a human could not due to the latency of human decision-making -- predictive maintenance applications like detecting impending catastrophic failure from subtle fluctuations in electrical signals fall into this category.
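The predictive-maintenance idea can be sketched minimally as a rolling z-score over the signal; the window size and threshold below are made-up illustrative values, not anything from a real deployment:

```python
# Hypothetical sketch: flag subtle fluctuations in an electrical signal
# by z-scoring each sample against a rolling baseline window.
import statistics

def flag_anomalies(signal, window=50, z_threshold=4.0):
    """Return indices where a sample deviates sharply from the
    preceding `window` samples."""
    flagged = []
    for i in range(window, len(signal)):
        baseline = signal[i - window:i]
        mu = statistics.fmean(baseline)
        sigma = statistics.pstdev(baseline) or 1e-9  # avoid div-by-zero
        if abs(signal[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# A steady periodic signal with one injected spike at index 120.
sig = [1.0 + 0.01 * ((i * 7) % 5) for i in range(200)]
sig[120] += 0.5
print(flag_anomalies(sig))
# → [120]
```

The point isn't the sophistication of the detector; it's that this runs in microseconds, far below the latency floor of human decision-making.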
I don't think automation inherently "de-skills" humans, but it does change the relative value of certain skills. Coming back to agentic coding, I think we're still in the skeuomorphic phase, and the real breakthroughs will come from leveraging models to do things a human can't. But until we get there, it's all speculation as far as I'm concerned.
My issue with applying this reasoning to AI is that prior technologies addressed bottlenecks in distribution, whereas this more directly attacks the creative process itself. Stratechery has a great post on this, where he argues that AI is attempting to remove the "substantiation" bottleneck in idea generation.
Doing this for creative tasks is fine ONLY IF it does not inhibit your own creative development. Humans only have so much self-control/self-awareness
A better analogy than the printing press, would be synthesizers. Did their existence kill classical music? Does modern electronic music have less creativity put into it than pre-synth music? Or did it simply open up a new world for more people to express their creativity in new and different ways?
"Code" isn't the form our thinking must take. To say that we all will stunt our thinking by using natural language to write code, is to say we already stunted our thinking by using code and compilers to write assembly.
So if the printing press stunted our writing, what will the thinking press stunt?
https://gizmodo.com/microsoft-study-finds-relying-on-ai-kill...
1) I can't remember the last time I wrote something meaningfully long with an actual pen/pencil. My handwriting is beyond horrible.
2) I can no longer find my way driving without a GPS. Reading a map? lol
That's a skill that depends on motor functions of your hands, so it makes sense that it degrades with lack of practice.
> I can no longer find my way driving without a GPS. Reading a map? lol
Pretty sure what that actually means in most cases is "I can go from A to B without GPS, but the route will be suboptimal, and I will have to pay more attention to street names."
If you ever had the joy of printing MapQuest directions or using a paper map, I'm sure these people can still do it; maybe it will just take them longer. I'm good at reading mall maps tho.
Most people would still be able to. But we romanticize the usefulness of paper maps. I remember myself on the Paris circular highway (at the time 110 km/h, not 50 km/h like today), the map on the steering wheel, super dangerous. You say you'd miss GPS features on a paper map, but back then we had the same problems: it didn't speak, didn't have the blinking position, didn't tell you which lane to take, and it simplified details to the point of losing you…
You won’t become less clever with AI: you already have YouTube for that. You’ll just become augmented.
I like this zooming in and zooming out, mentally. At some point I can zoom out another level. I miss coding, even though I still code a lot.
People say the same thing about code but there's been a big conflation between "writing code" and "thinking about the problem". Way too often people are trying to get AI to "think about the problem" instead of simply writing the code.
For me, personally, the writing-the-code part goes pretty quickly. I'm not convinced that's my bottleneck.
Which theoretically could actually be a benefit someday: if your company does many similar customer deployments, you will eventually become more efficient. But if you are writing custom code meant just for your company... there may never be an efficiency increase.
For me, refactoring is really the essence of coding. Getting the initial version of a solution that barely works is necessary but less interesting to me. What's interesting is the process of shaping that v1 into something that's elegant and fits into the existing architecture. Sanding down the rough edges, reducing misfit, etc. It's often too nitpicky for an LLM to get right.
If you write it by hand you don't need to "learn it thoroughly": you wrote it.
There is no way you understand code better by reading it than by creating it. Creating it is how you prove you understand it!
For beginners, I think this is a very important step in learning how to break down problems (into smaller components) and iterate.
There is no doubt in my mind that software quality has taken a nosedive everywhere AI has been introduced. Our entire industry is hallucinating its way into a bottomless pit.
I imagine people can start making code (probably already are) where functions/modules are just boxes in a UI and the code is not visible: test each one with inputs/outputs, then join it to something else.
When I'm tasked to make some CRUD UI I plan out the chunks of work to be done in order and I already feel the rote-ness of it, doing it over and over. I guess that is where AI can come in.
But I do enjoy the process of making something, even a POSh camera GUI/OS, by hand.
If AI tools continue to improve, there will be less and less need for humans to write code. But -- perhaps depending on the application -- I think there will still be need to review code, and thus still need to understand how to write code, even if you aren't doing the writing yourself.
I imagine the only way we will retain these skills is by deliberately choosing to do so. Perhaps not unlike choosing to read books even if not required to do so, or choosing to exercise even if not required to do so.
Maybe, but I don't think it's that easy.
What is different about LLM-created code is that compilers work. Reliably and universally. I can just outsource the job of writing the assembly to them and don't need to think about it again. (That is, unless you are in one of those niches that require hyper-optimized software. Compilers can't reliably give you that last 2x speed-up.)
LLMs, in turn, will never be reliable. Their entire design goal runs opposite to reliability. IMO the losses are still way higher than the gains, and it's an open question whether this architectural premise will ever change.
Of course, then there's Lovable, which spits out the front-end I describe; it's impressively good at that. I just want a starting point, then I get going, and if I get stuck I'll ask clarifying questions. For side projects where I have limited time, LLMs are perfect for me.
On the other hand I do a lot more fundamental coding than the median. I do quite a few game jams, and I am frequently the only one in the room who is not using a game engine.
Doing things like this, I have written so many GUI toolkits from scratch now that it's easy enough for me to make something anew in the middle of a jam.
For example https://nws92.itch.io/dodgy-rocket In my experience it would have been much harder to figure out how to style scrollbars to be transparent with in-theme markings using an existing toolkit than writing a toolkit from scratch. This of course changes as soon as you need a text entry field. I have made those as well, but they are subtle and quick to anger.
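To illustrate why from-scratch can be simpler than fighting a toolkit: the heart of a scrollbar is just mapping content size and scroll offset to thumb geometry, and once you own that code, transparency and in-theme markings are just your own draw calls. A minimal hypothetical sketch (the names and the 16 px minimum thumb length are made up):

```python
# Hypothetical sketch: core geometry of a from-scratch scrollbar.
def thumb_geometry(track_px, content_px, viewport_px, scroll_px):
    """Return (thumb_offset, thumb_length) along the track, in pixels."""
    if content_px <= viewport_px:
        return 0, track_px  # everything fits: full-length thumb
    # Thumb length is proportional to the visible fraction of content.
    length = max(16, round(track_px * viewport_px / content_px))
    # Thumb offset is proportional to how far we've scrolled.
    max_scroll = content_px - viewport_px
    offset = round((track_px - length) * scroll_px / max_scroll)
    return offset, length

print(thumb_geometry(track_px=200, content_px=1000, viewport_px=250, scroll_px=375))
# → (75, 50)
```

Drawing it is then one translucent rectangle at `(offset, length)`, styled however the theme demands.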
I do physics engines the same way, predominantly 2d, (I did a 3d physics game in a jam once but it has since departed to the Flash afterlife). They are one of those things that seem magical until you've done it a few times, then seem remarkably simple. I believe John Carmack experienced that with writing 3d engines where he once mentioned quickly writing several engines from scratch to test out some speculative ideas.
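That "remarkably simple" core really is small. A minimal sketch, assuming a semi-implicit Euler step and a crude ground bounce (all the constants and field names here are made up for illustration):

```python
# Hypothetical sketch: the integration step at the heart of a 2D
# physics engine. Semi-implicit Euler: update velocity, then position.
GRAVITY = (0.0, -9.8)

def step(body, dt):
    """Advance one body by dt seconds; bounce off the ground at y=0."""
    vx = body["vx"] + GRAVITY[0] * dt
    vy = body["vy"] + GRAVITY[1] * dt
    x = body["x"] + vx * dt
    y = body["y"] + vy * dt
    if y < 0.0:  # crude ground collision with 50% restitution
        y, vy = 0.0, -vy * 0.5
    return {"x": x, "y": y, "vx": vx, "vy": vy}

body = {"x": 0.0, "y": 10.0, "vx": 1.0, "vy": 0.0}
for _ in range(100):
    body = step(body, 1 / 60)
print(body)
```

Everything else (broadphase, constraint solving) layers on top of that loop, which is why it stops seeming magical after you've written it a few times.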
I'm not sure if AI presents an inhibitor here any more than using an engine or a framework does. They both put some distance between the programmer and the result, and as a consequence the programmer starts thinking in terms of the interface through which they communicate instead of how the result is achieved.
On the other hand, I am currently using AI to help me write a DMA chaining process. I initially got the AI to write the entire thing. The final code will use none of that emitted output, but it was sufficient for me to see what actually needed to be done. I'm not sure if I could have done this on my own; AI certainly couldn't have done it on its own. Now that I have (almost (I hope)) done it once in collaboration with AI, I think I could write it from scratch myself should I need to do it again.
I think AI, Game Engines, and Frameworks all work against you if you are trying to do something abnormal. I'm a little amazed that Monument Valley got made using an engine. I feel like they must have fought the geometry all the way.
I think this jam game I made https://lerc.itch.io/gyralight would be a nightmare to try and implement in an engine. Similarly I'm not sure if an AI would manage the idea of what is happening here.
A simple UX change makes the difference between educating and dumbing down the users of your service.
The reason I think that is that it often asks about things I already took great care to explicitly type out. I honestly don't think those extra questions add much to the actual searching it does.
I definitely sometimes ask really specialized questions, and in that case I just say "do the search" and ignore the questions, but a lot of the time it helps me determine what I am really asking.
I suspect people with excellent communication abilities might find less utility in the questions.
* Sophisticated find and replace, i.e. highlighting a bunch of struct initialisations and saying "Convert all these to Y". (Regex was always a PITA for this, though it is more deterministic.)
* When in an agentic workflow, treating it as a higher level than ordinary code and not so much as a simulated human. I.e. the more you ask it to do at once, the less it seems to do it well. So instead of "Implement the feature" you'd want to say "Let's make a new file and create stub functions", "Let's complete stub function 1 and have it do x", "Complete stub function 2 by first calling stub function 1 and doing Y", etc.
* Finding something in an unfamiliar codebase or asking how something was done. "Hey copilot, where are all the app's routes defined?" Best part is you can ask a bunch of questions about how a project works, all without annoying some IRC greybeard.
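On the find-and-replace bullet above: the deterministic-but-painful regex version might look like this hypothetical sketch (the struct name, field order, and pattern are all made up, and you need to know the field order per type, which is exactly why it's a PITA):

```python
# Hypothetical: convert positional struct initialisations like
# `Point p = {1, 2};` into designated initialisers `{.x = 1, .y = 2};`.
import re

FIELDS = ["x", "y"]  # must be known per struct type, in declaration order

def to_designated(src):
    def repl(m):
        values = [v.strip() for v in m.group("vals").split(",")]
        inner = ", ".join(f".{f} = {v}" for f, v in zip(FIELDS, values))
        return f"{m.group('decl')}{{{inner}}};"
    return re.sub(r"(?P<decl>Point\s+\w+\s*=\s*)\{(?P<vals>[^}]*)\};", repl, src)

print(to_designated("Point p = {1, 2};"))
# → Point p = {.x = 1, .y = 2};
```

The LLM version of the same task is one sentence of prose, at the cost of determinism.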
But you need to:

* Research problems
* Describe features
* Define API contracts
* Define a basic implementation plan
* Set up credentials
* Provide a testing strategy and an efficient testing setup/teardown
* Define library docs and references, and find legitimate documentation for the AI

Also, the AI makes a lot of mistakes with imports etc. and with long-running processes.
> Humans learn collectively and innovate collectively via copying, mimicry, and iteration on top of prior art. You know that quote about standing on the shoulders of giants? It turns out that it's not only a fun quote, but it's fundamentally how humans work.
Creativity is search. Social search. It's not coming from the brain itself, it comes from the encounter between brain and environment, and builds up over time in the social/cultural layer.
That is why I don't ask myself if LLMs really understand. As long as they search, generating ideas and validating them in the world, it does not matter.
It's also why I don't think substrate matters, only search does. But substrate might have to do with the search spaces we are afforded to explore.
I think it'll be like driving: automatic transmissions, power brakes, and other tech made it more accessible, but in the process we forgot how to drive. That doesn't mean nobody owns a manual anymore, but it's not a growing % of all drivers.
That, combined with having to do it manually, has helped me learn how to do things on my own, compared to when I just copy-paste or use agents.
And the more concepts you can break things into, the better. Lately, I've started projects by working with AI to define "phases" for the project, for testability, traceability, and overall understanding.
My de facto approach has become using AI on my phone, with pictures of screens and voiced questions, to force myself to use it right. When you can't mindlessly copy-paste, even though it might feel annoying in the moment, the learning that happens in that process saves so much time later by avoiding hallucination-holes!
AI requires a holistic revision. When the OSes catch up, we'll have some real fun.
The author is right to call out the differences in UX. Sad that design has always been given less attention.
When I first saw the title, my initial thought was that this may relate to AX, which I think complements the topic very well: https://x.com/gregisenberg/status/1947693459147526179
We are in the middle of a peer-vs-pair sort of abstraction. Is the peer reliable enough to be delegated the task? If not, the pair design pattern should be complementary to the human skill set. I sense the frustration with AI agents comes from their not being fully reliable. That means a human in the loop is absolutely needed, and if there is a human, don't have the AI be good at what the human can do; instead, have it be a good assistant by doing the things the human would need. I agree on that part, though if reliability is ironed out, for most of my tasks I am happy for the AI to do the whole thing. Other frustrations stem from memory, or the lack of it (in research), hallucinations and overconfidence, and lack of situational awareness (somehow situational awareness is what agents market themselves on). If these are fixed, the balance between treating agents as a pair and treating them as a peer might tilt more towards the peer side.
But there are plenty of active investigative steps you'd want to take in generating hypotheses for an outage. Weakly's piece strongly suggests AI tools not take these actions, but rather suggest them to operators. This is a waste of time, and time is the currency of incident resolution.
We should be working to make HITL tools, not HOTL workflows where the humans are expected to just work with the final output. At some point the abstraction will leak.
I was very happy to see AWS release Kiro. It was quite validating to see them release it and follow up with discussions of how this methodology of integrating AI with software development was effective for them.
If these are the priors why would I keep reading?
Why even ask this question?
Seems like the waiter told you how your sausage is made so you left the restaurant, but you'd eat it if you weren't reminded.
However, I could not help but get caught up on this totally bonkers statement, which detracted from the point of the article:
> Also, innovation and problem solving? Basically the same thing. If you get good at problem solving, propagating learning, and integrating that learning into the collective knowledge of the group, then the infamous Innovator’s Dilemma disappears.
This is a fundamental misunderstanding of what the innovator's dilemma is about. It's not about the ability to be creative and solve problems, it is about organizational incentives. Over time, an incumbent player can become increasingly disincentivized from undercutting mature revenue streams. They struggle to diversify away from large, established, possibly dying markets in favor of smaller, unproven ones. This happens due to a defensive posture.
To quote Upton Sinclair, "it is difficult to get a man to understand something when his salary depends upon his not understanding it." There are lots of examples of this in the wild. One famous one that comes to mind is AT&T Bell Labs' invention of magnetic recording & answering machines that AT&T shelved for decades because they worried that if people had answering machines, they wouldn't need to call each other quite so often. That is, they successfully invented lots of things, but the parent organization sat on those inventions as long as humanly possible.
I do wonder if you could write a prompt to force your LLM to always respond like this, and whether that would already be a sort of dirty fix... I'm not so clever at prompting yet :')
MS Clippy was the AI tool we should all aspire to build