However I think it's important to make it clear that given the hardware constraints of many environments the applicability of what's being called software 2.0 and 3.0 will be severely limited.
So instead of being replacements, these paradigms are more like extra tools in the tool belt. Code and prompts will live side by side, being used when convenient, but none a panacea.
Training constraints: you need lots, and lots of data to build complex neural network systems. There are plenty of situations where the data just isn't available to you (whether for legal reasons, technical reasons, or just because it doesn't exist).
Legibility constraints: it is extremely hard to precisely debug and fix those systems. Let's say you build a software system to fill out tax forms - one the "traditional" way, and one that's a neural network. Now your system exhibits a bug where line 58(b) gets sometimes improperly filled out for software engineers who are married, have children, and also declared a source of overseas income. In a traditionally implemented system, you can step through the code and pinpoint why those specific conditions lead to a bug. In a neural network system, not so much.
So totally agreed with you that those are extra tools in the toolbelt - but their applicability is much, much more constrained than that of traditional code.
In short, they excel at situations where we are trying to model an extremely complex system - one that is impossible to nail down as a list of formal requirements - and where we have lots of data available. Signal processing (like self driving, OCR, etc) and human language-related problems are great examples of such problems where traditional programming approaches have failed to yield the kind of results we wanted (ie, beyond human performance) in 70+ years of research and where the modern, neural network approach finally got us the kind of results we wanted.
But if you can define the problem you're trying to solve as formal requirements, then those tools are probably ill-suited.
LLMs give us another tool only this time it's far more accessible and powerful.
Yes, I am talking about formal verification, of course!
That also goes nicely together with "keeping the AI on a tight leash". It seems to clash though with "English is the new programming language". So the question is, can you hide the formal stuff under the hood, just like you can hide a calculator tool for arithmetic? Use informal English on the surface, while some of it is interpreted as a formal expression, put to work, and then reflected back in English? I think that is possible, if you have a formal language and logic that is flexible enough, and close enough to informal English.
Yes, I am talking about abstraction logic [1], of course :-)
So the goal would be to have English (German, ...) as the ONLY programming language, invisibly backed underneath by abstraction logic.
The problem with trying to make "English -> formal language -> (anything else)" work is that informality is, by definition, not a formal specification and therefore subject to ambiguity. The inverse is not nearly as difficult to support.
Much like how a property in an API initially defined as being optional cannot be made mandatory without potentially breaking clients, whereas making a mandatory property optional can be backward compatible. IOW, the cardinality of "0 .. 1" is a strict superset of "1".
That sounds like a paradox.
Formal verification can prove that constraints are held. English cannot. mapping between them necessarily requires disambiguation. How would you construct such a disambiguation algorithm which must, by its nature, be deterministic?
But maybe I just don't understand.
For those who missed it, here's the viral tweet by Karpathy himself: https://x.com/karpathy/status/1617979122625712128
def __main__:
You are a calculator. Given an input expression, you compute the result and print it to stdout, exiting 0.
Should you be unable to do this, you print an explanation to stderr and exit 1.
(and then, perhaps, a bunch of 'DO NOT express amusement when the result is 5318008', etc.)stares at every weird holo-deck episode
def __main__:
You run main(). If there are issues, you edit __file__ to try to fix the errors and re-run it. You are determined, persistent, and never give up.What we have today with ChatGPT and the like (and even IDE integrations and API use) is imperative right, it's like 'answer this question' or 'do this thing for me', it's a function invocation. Whereas the silly calculator program I presented above is (unintentionally) kind of a declarative probabilistic program - it's 'this is the behaviour I want, make it so' or 'I have these constraints and these unknowns, fill in the gaps'.
What if we had something like Prolog, but with the possibility of facts being kind of on-demand at runtime, powered by the LLM driving it?
I can't believe someone would seriously write this and not realize how nonsensical it is.
"indeterministic programming", you seriously cannot come up with a bigger oxymoron.
If it were up to me, which it is not, I would try to optimize the next AISUS for more of this. I felt like I was getting smarter as the talk went on.
It immediately makes me think a LLM that can generate a customized GUI for the topic at hand where you can interact with in a non-linear way.
1. Similar cost structure to electricity, but non-essential utility (currently)?
2. Like an operating system, but with non-determinism?
3. Like programming, but ...?
Where does the programming analogy break down?
The way I see dependency in office ("knowledge") work:
- pre-(computing) history. We are at the office, we work
- dawn of the pc: my computer is down, work halts
- dawn of the lan: the network is down, work halts
- dawn of the Internet: the Internet connection is down, work halts (<- we are basically all here)
- dawn of the LLM: ChatGPT is down, work halts (<- for many, we are here already)
The programming analogy is convenient but off. The joke has always been “the computer only does exactly what you tell it to do!” regarding logic bugs. Prompts and LLMs most certainly do not work like that.
I loved the parallels with modern LLMs and time sharing he presented though.
The primagen reviewed this article[1] a few days ago, and (I think) that's where I heard about it. (Can't re-watch it now, it's members only) 8(
[1] https://medium.com/@drewwww/the-gambler-and-the-genie-08491d...
The best part is that AI-driven systems are fine with running even more tight loops than what a sane human would tolerate.
Eg. running full linting, testing and E2E/simulation suite after any minor change. Or generating 4 versions of PR for the same task so that the human could just pick the best one.
1. People get lazy when presented with four choices they had no hand in creating, and they don’t look over the four and just click one, ignoring the others. Why? Because they have ten more of these on the go at once, diminishing their overall focus.
2. Automated tests, end-to-end sim., linting, etc—tools already exist and work at scale. They should be robust and THOROUGHLY reviewed by both AI and humans ideally.
3. AI is good for code reviews and “another set of eyes” but man it makes serious mistakes sometimes.
An anecdote for (1), when ChatGPT tries to A/B test me with two answers, it’s incredibly burdensome for me to read twice virtually the same thing with minimal differences.
Code reviewing four things that do almost the same thing is more of a burden than writing the same thing once myself.
Even if you tell the git-aware Jules to handle a merge conflict within the context window the patch was generated, it is like sorry bro I have no idea what's wrong can you send me a diff with the conflict?
I find i have to be in the iteration loop at every stage or else the agent will forget what it's doing or why rapidly. for instance don't trust Jules to run your full test suite after every change without handholding and asking for specific run results every time.
It feels like to an LLM, gaslighting you with code that nominally addresses the core of what you just asked while completely breaking unrelated code or disregarding previously discussed parameters is an unmitigated success.
Why would a sane human be averse to things happening instantaneously?
That sounds awful. A truly terrible and demotivating way to work and produce anything of real quality. Why are we doing this to ourselves and embracing it?
A few years ago, it would have been seen as a joke to say “the future of software development will be to have a million monkey interns banging on one million keyboards and submit a million PRs, then choose one”. Today, it’s lauded as a brilliant business and cost-saving idea.
We’re beyond doomed. The first major catastrophe caused by sloppy AI code can’t come soon enough. The sooner it happens, the better chance we have to self-correct.
When I have tried to "pair program" with an LLM, I have found it incredibly tedious, and not that useful. The insights it gives me are not that great if I'm optimising for response speed, and it just frustrates me rather than letting me go faster. Worse, often my brain just turns off while waiting for the LLM to respond.
OTOH, when I work in a more async fashion, it feels freeing to just pass a problem to the AI. Then, I can stop thinking about it and work on something else. Later, I can come back to find the AI results, and I can proceed to adjust the prompt and re-generate, to slightly modify what the LLM produced, or sometimes to just accept its changes verbatim. I really like this process.
The new software world is the massive amount of code that will be burped out by these agents, and it should quickly dwarf the human output.
Software is a world in motion. Software 1.0 was animated by developers pushing it around. Software 3.0 is additionally animated by AI agents.
Gemini found it via screenshot or context: https://clerk.com/
This is what he used for login on MenuGen: https://karpathy.bearblog.dev/vibe-coding-menugen/
and wild... you used gemini to process a screenshot to find the website for a 5 letter word library?
interesting that Waymo could do uninterrupted trips back in 2013, wonder what took them so long to expand? regulation? tailend of driving optimization issues?
noticed one of the slides had a cross over 'AGI 2027'... ai-2027.com :)
Eh, he ran Teslas self driving division and put them into a direction that is never going to fully work.
What they should have done is a) trained a neural net to represent sequence of frames into a physical environment, and b)leveraged Mu Zero, so that self driving system basically builds out parallel simulations into the future, and does a search on the best course of action to take.
Because thats pretty much what makes humans great drivers. We don't need to know what a cone is - we internally compute that something that is an object on the road that we are driving towards is going to result in a negative outcome when we collide with it.
I am writing a hobby app at the moment and I am thinking about its architecture in a new way now. I am making all my model structures comprehensible so that LLMs can see the inside semantics of my app. I merely provide a human friendly GUI over the top to avoid the linear wall-of-text problem you get when you want to do something complex via a chat interface.
We need to meet LLMs in the middle ground to leverage the best of our contributions - traditional code, partially autonomous AI, and crafted UI/UX.
Part of, but not all of, programming is "prompting well". It goes along with understanding the imperative aspects, developing a nose for code smells, and the judgement for good UI/UX.
I find our current times both scary and exciting.
If anything, you're showing a lack of understanding of what he was talking about. The context is this specific time, where we're early in a ecosystem and things are expensive and likely centralized (ala mainframes) but if his analogy/prediction is correct, we'll have a "Linux" moment in the future where that equation changes (again) and local models are competitive.
And while I'm a huge fan of local models run them for maybe 60-70% of what I do with LLMs, they're nowhere near proprietary ones today, sadly. I want them to, really badly, but it's important to be realistic here and realize the differences of what a normal consumer can run, and what the current mainframes can run.
- https://blog.nilenso.com/blog/2025/05/29/ai-assisted-coding/
See also:
I easily see a huge future for agentic assistance in the enterprise, but I struggle mightily to see how many IT leaders would accept the output code of something like a menugen app as production-viable.
Additionally, if you're licensing code from external vendors who've built their own products at least partly through LLM-driven superpowers, how do you have faith that they know how things work and won't inadvertently break something they don't know how to fix? This goes for niche tools (like Clerk, or Polar.sh or similar) as much as for big heavy things (like a CRM or ERP).
I was on the CEO track about ten years ago and left it for a new career in big tech, and I don't envy the folks currently trying to figure out the future of safe, secure IT in the enterprise.
Put another way, when I cause bugs, they are often glaring (more typos, fewer logic mistakes). Plus, as the author it's often straightforward to debug since you already have a deep sense for how the code works - you lived through it.
So far, using LLMs has downgraded my productivity. The bugs LLMs introduce are often subtle logical errors, yet "working" code. These errors are especially hard to debug when you didn't write the code yourself — now you have to learn the code as if you wrote it anyway.
I also find it more stressful deploying LLM code. I know in my bones how carefully I write code, due to a decade of roughly "one non critical bug per 10k lines" that keeps me asleep at night. The quality of LLM code can be quite chaotic.
That said, I'm not holding my breath. I expect this to all flip someday, with an LLM becoming a better and more stable coder than I am, so I guess I will keep working with them to make sure I'm proficient when that day comes.
probably all of the ones at microsoft
Large corporations, which have become governments in all but name, are the only ones with the capability to create ML models of any real value. They're the only ones with access to vast amounts of information and resources to train the models. They introduce biases into the models, whether deliberately or not, that reinforces their own agenda. This means that the models will either avoid or promote certain topics. It doesn't take a genius to imagine what will happen when the advertising industry inevitably extends its reach into AI companies, if it hasn't already.
Even open weights models which technically users can self-host are opaque blobs of data that only large companies can create, and have the same biases. Even most truly open source models are useless since no individual has access to the same large datasets that corporations use for training.
So, no, LLMs are the same as any other technology, and actually make governments and corporations even more powerful than anything that came before. The users benefit tangentially, if at all, but will mostly be exploited as usual. Though it's unsurprising that someone deeply embedded in the AI industry would claim otherwise.
Lack of compute on the Ai2's side also means the context OLMo is trained for is miniscule, the other thing that you need to throw brazillions of dollars at to make model that's maybe useful in the end if you're very lucky. Training needs high GPU interconnect bandwidth, it can't be done in distributed horde in any meaningful way even if people wanted to.
The only ones who have the power now are the Chinese, since they can easily ignore copyright for datasets, patents for compute, and have infinite state funding.
Seems like you could set a LLM loose and like the Google Bot have it start converting all html pages into llms.txt. Man, the future is crazy.
Website too confusing for humans? Add more design, modals, newsletter pop ups, cookie banners, ads, …
Website too confusing for LLMs? Add an accessible, clean, ad-free, concise, high entropy, plain text summary of your website. Make sure to hide it from the humans!
PS: it should be /.well-known/llms.txt but that feels futile at this point..
PPS: I enjoyed the talk, thanks.
1,5 years ago he saw all the tool uses in agent systems as the future of LLMs, which seemed reasonable to me. There was (and maybe still is) potential for a lot of business cases to be explored, but every system is defined by its boundaries nonetheless. We still don't know all the challenges we face at that boundaries, whether these could be modelled into a virtual space, handled by software, and therefor also potentially AI and businesses.
Now it all just seems to be analogies and what role LLMs could play in our modern landscape. We should treat LLMs as encapsulated systems of their own ...but sometimes an LLM becomes the operating system, sometimes it's the CPU, sometimes it's the mainframe from the 60s with time-sharing, a big fab complex, or even outright electricity itself?
He's showing an iOS app, which seems to be, sorry for the dismissive tone, an example for a better looking counter. This demo app was in a presentable state for a demo after a day, and it took him a week to implement Googles OAuth2 stuff. Is that somehow exciting? What was that?
The only way I could interpret this is that it just shows a big divide we're currently in. LLMs are a final API product for some, but an unoptimized generative software-model with sophisticated-but-opaque algorithms for others. Both are utterly in need for real world use cases - the product side for the fresh training data, and the business side for insights, integrations and shareholder value.
Am I all of a sudden the one lacking imagination? Is he just slurping the CEO cool aid and still has his investments in OpenAI? Can we at least agree that we're still dealing with software here?
No, The reality of what these tools can do is sinking in.. The rubber is meeting the road and I can hear some screaching.
The boosters are in 5 stages of grief coming to terms with what was once AGI and is now a mere co-pilot, while the haters are coming to terms with the fact that LLMs can actually be useful in a variety of usecases.
I think the disconnect might come from the fact that Karpathy is speaking as someone who's day-to-day computing work has already been radically transformed by this technology (and he interacts with a ton of other people for whom this is the case), so he's not trying to sell the possibility of it: that would be like trying to sell the possibility of an airplane for someone who's already just cruising around in one every day. Instead the mode of the presentation is more: well, here we are at the dawn of a new era of computing, it really happened. Now how can we relate this to the history of computing to anticipate where we're headed next?
> ...but sometimes an LLM becomes the operating system, sometimes it's the CPU, sometimes it's the mainframe from the 60s with time-sharing, a big fab complex, or even outright electricity itself?
He uses these analogies in clear and distinct ways to characterize separate facets of the technology. If you were unclear on the meanings of the separate analogies it seems like the talk may offer some value for you after all but you may be missing some prerequisites.
> This demo app was in a presentable state for a demo after a day, and it took him a week to implement Googles OAuth2 stuff. Is that somehow exciting? What was that?
The point here was that he'd built the core of the app within a day without knowing the Swift language or ios app dev ecosystem by leveraging LLMs, but that part of the process remains old-fashioned and blocks people from leveraging LLMs as they can when writing code—and he goes on to show concretely how this could be improved.
LLMs are excellent at helping non-programmers write narrow use case, bespoke programs. LLMs don't need to be able to one-shot excel.exe or Plantio.apk so that Christine can easily track when she watered and fed her plants nutrients.
The change that LLMs will bring to computing is much deeper than Garden Software trying to slot in some LLM workers to work on their sprawling feature-pack Plantio SaaS.
I can tell you first hand I have already done this numerous times as a non-programmer working a non-tech job.
This talk is different from his others because it's directed at aspiring startup founders. It's about how we conceptualize the place of an LLM in a new business. It's designed to provide a series of analogies any one of which which may or may not help a given startup founder to break out of the tired, binary talking points they've absorbed from the internet ("AI all the things" vs "AI is terrible") in favor of a more nuanced perspective of the role of AI in their plans. It's soft and squishy rhetoric because it's not about engineering, it's about business and strategy.
I honestly left impressed that Karpathy has the dynamic range necessary to speak to both engineers and business people, but it also makes sense that a lot of engineers would come out of this very confused at what he's on about.
What would the code of an application look like if it was optimized to be efficiently used by LLMs and not humans?
* While LLMs do heavily tend towards expecting the same inputs/outputs as humans because of the training data I don’t think this would inhibit co-evolution of novel representations of software.
If AI is going to write all the code going forward, we can probably dispense with the user friendly part and just make everything efficient as possible for machines.
Karpathy and his peer group are some of the most elitist and anti social people who have ever lived. I wonder how history will remember them.
Isn’t an LLM basically a program that is impossible to virus scan and therefore can never be safely given access to any capable APIs?
For example: I’m a nice guy and spend billions on training LLMs. They’re amazing and free and I hand out the actual models for you all to use however you want. But I’ve trained it very heavily on a specific phrase or UUID or some other activation key being a signal to <do bad things, especially if it has console and maybe internet access>. And one day I can just leak that key into the world. Maybe it’s in spam, or on social media, etc.
How does the community detect that this exists in the model? Ie. How does the community virus scan the LLM for this behaviour?
Great talk like always. I actually disagree on a few things with him. When he said "why would you go to ChatGPT and copy / paste, it makes much more sense to use a GUI that is integrated to your code such as Cursor".
Cursor and the like take a lot of the control from the user. If you optimize for speed then use Cursor. But if you optimize for balance of speed, control, and correctness, then using Cursor might not be the best solution, esp if you're not an expert of how to use it.
It seems that Karpathy is mainly writing small apps these days, he's not working on large production systems where you cannot vibe code your way through (not yet at least)
Vibe vs reality, and anyone actually working in the space daily can attest how brittle these systems are.
Maybe this changes in SWE with more automated tests in verifiable simulators, but the real world is far to complex to simulate in its vastness.
What do you mean "meanwhile", that's exactly (among other things) the kind of stuff he's talking about? The various frictions and how you need to approach it
> anyone actually working in the space
Is this trying to say that Karpathy doesn't "actually work" with LLMs or in the ML space?
I feel like your whole comment is just reacting to the title of the YouTube video, rather than actually thinking and reflecting on the content itself.
Same way programs was way more efficient before and now they are "bloated" with packages, abstractions, slow implementations of algos and scaffolding.
The concept of what is good software development might be changing as well.
LLMs might not write the best code, but they sure can write a lot of it.
> It failed to write out our company name.The rest was flawed with hallucinations also, hardly worth to mention.
I wish this is a rage bait towards others, but what should me feelings be? After all this is the tool thats sold to me, I am expected to work with.
If you're still struggling to make LLMs useful for you by now, you should probably ask someone. Don't let other noobs on HN +1'ing you hold you back.
And don't get me started on my own experiences with these things, and no, I'm not a luddite, I've tried my damndest and have followed all the cutting-edge advice you see posted on HN and elsewhere.
Time and time again, the reality of these tools falls flat on their face while people like Andrej hype things up as if we're 5 minutes away from having Claude become Skynet or whatever, or as he puts it, before we enter the world of "Software 3.0" (coincidentally totally unrelated to Web 3.0 and the grift we had to endure there, I'm sure).
To intercept the common arguments,
- no I'm not saying LLMs are useless or have no usecases
- yes there's a possibility if you extrapolate by current trends (https://xkcd.com/605/) that they indeed will be Skynet
- yes I've tried the latest and greatest model released 7 minutes ago to the best of my ability
- yes I've tried giving it prompts so detailed a literal infant could follow along and accomplish the task
- yes I've fiddled with providing it more/less context
- yes I've tried keeping it to a single chat rather than multiple chats, as well as vice versa
- yes I've tried Claude Code, Gemini Pro 2.5 With Deep Research, Roocode, Cursor, Junie, etc.
- yes I've tried having 50 different "agents" running and only choosing the best output form the lot.
I'm sure there's a new gotcha being written up as we speak, probably something along the lines of "Well for me it doubled my productivity!" and that's great, I'm genuinely happy for you if that's the case, but for me and my team who have been trying diligently to use these tools for anything that wasn't a microscopic toy project, it has fallen apart time and time again.
The idea of an application UI or god forbid an entire fucking Operating System being run via these bullshit generators is just laughable to me, it's like I'm living on a different planet.
What would be interesting to see is what those kids produced with their vibe coding.
A 50-year-old doctor who wants to build a specialized medical tool, a teacher who sees exactly what educational software should look like, a small business owner who knows their industry's pain points better than any developer. These people have been sitting on the sidelines because the barrier to entry was so high.
The "vibe coding" revolution isn't really about kids (though that's cute) - it's about unleashing all the pent-up innovation from people who understand problems deeply but couldn't translate that understanding into software.
It's like the web democratized publishing, or smartphones democratized photography. Suddenly expertise in the domain matters more than expertise in the tools.
No one, including Karpathy in this video, is advocating for "vibe coding". If nothing more, LLMs paired with configurable tool-usage, is basically a highly advanced and contextual search engine you can ask questions. Are you not using a search engine today?
Even without LLMs being able to produce code or act as agents they'd be useful, because of that.
But it sucks we cannot run competitive models locally, I agree, it is somewhat of a "rich people" tool today. Going by the talk and theme, I'd agree it's a phase, like computing itself had phases. But you're gonna have to actually watch and listen to the talk itself, right now you're basically agreeing with the video yet wrote your comment like you disagree.
Also the disconnect for me here is I think back on the cost of electronics, prices for the level of compute have generally gone down significantly over time. The c64 launched around the $5-600 price level, not adjusted for inflation. You can go and buy a Mac mini for that price today.
I think you are referring to what those kids in the vibe coding event produced. Wasn't their output available in the video itself?
GitHub copilot has a free tier.
Google gives you thousands of free LLM API calls per day.
There are other free providers too.
Overall though I really feel like he is selling the idea that we are going to have to pay large corporations to be able to write code. Which is... terrifying.
Also, as a lazy developer who is always trying to make AI do my job for me, it still kind of sucks, and its not clear that it will make my life easier any time soon.
So, No, he’s actually saying it may be everywhere for cheap soon.
I find the talk to be refreshingly intellectually honest and unbiased. Like the opposite of a cringey LinkedIn post on AI.
> Also, as a lazy developer who is always trying to make AI do my job for me, it still kind of sucks, and its not clear that it will make my life easier any time soon.
Every time I have to write a simple self contained couple of functions I try… and it gets it completely wrong.
It's easier to just write it myself rather than to iterate 50 times and hope it will work, considering iterations are also very slow.
Some good nuggets in this talk, specifically his concept that Software 1.0, 2.0 and 3.0 will all persist and all have unique use cases. I definitely agree with that. I disagree with his belief that "anyone can vibe code" mindset - this works to a certain level of fidelity ("make an asteroids clone") but what he overlooks is his ability, honed over many years, to precisely document requirements that will translate directly to code that works in an expected way. If you can't write up a Jira epic that covers all bases of a project, you probably can't vibe code something beyond a toy project (or an obvious clone). LLM code falls apart under its own weight without a solid structure, and I don't think that will ever fundamentally change.
Where we are going next, and a lot of effort is being put behind, is figuring out exactly how to "lengthen the leash" of AI through smart framing, careful context manipulation and structured requests. We obviously can have anyone vibe code a lot further if we abstract different elements into known areas and simply allow LLMs to stitch things together. This would allow much larger projects with a much higher success rate. In other words, I expect an AI Zapier/Yahoo Pipes evolution.
Lastly, I think his concept of only having AI pushing "under 1000 line PRs" that he carefully reviews is more short-sighted. We are very, very early in learning how to control these big stupid brains. Incrementally, we will define sub-tasks that the AI can take over completely without anyone ever having to look at the code, because the output will always be within an accepted and tested range. The revolution will be at the middleware level.
He even said it could be a gateway to actual programming
Prior to LLMs, it was amusing to consider how ML folks and software folks would talk passed each other. It was amusing because both sides were great at what they do, neither side understood the other side, and they had to work together anyway.
After LLMs, we now have lots of ML folks talking about the future of software, so ething previously established to be so outside their expertise that communication with software engineers was an amusing challenge.
So I must ask, are ML folks actually qualified to know the future of software engineering? Shouldnt we be listening to software engineers instead?
A positive video all around, have got to learn a lot from Andrej's Youtube account.
LLMs are really strange, I don't know if I've seen a technology where the technology class that applies it (or can verify applicability) has been so separate or unengaged compared to the non-technical people looking to solve problems.
An experiment to explore Kaparthy ideas
I've been working on this project. I built this in about two days, using it to build itself at the tail end of the effort. It's not perfect, but I see the promise in it. It stops the thrashing the LLMs can do when they're looking for types or trying to resolve anything like that.
What of today's agents work like this? None of the ones I've tried would do something like that, but instead would grep/search the file, then do a smaller edit (different tools do those in different ways).
Overall, it does feel like a strawman argument against "Traditional" when almost none of the tooling actually works like that.
Modern military drones are very much AI agents
Once it was clear how high the demand was for this talk, the team adapted quickly.
That's how it goes sometimes! Future iterations will be different.
This reminds me of the Three Amigos and Grady Booch evangelizing the future of software while ignoring the terrible output from Rational Software and the Unified Process.
At least we got acknowledgment that self-driving remains unsolved: https://youtu.be/LCEmiRjPEtQ?t=1622
And Waymo still requires extensive human intervention. Given Tesla's robotaxi timeline, this should crash their stock valuation...but likely won't.
You can't discuss "vibe coding" without addressing security implications of the produced artifacts, or the fact that you're building on potentially stolen code, books, and copyrighted training data.
And what exactly is Software 3.0? It was mentioned early then lost in discussions about making content "easier for agents."
What I find absent is where do we go from LLMs? More hardware, more training. "This isn't the scientific breakthrough you're looking for".
Amazing!!!
Though, I do not see it being useful as a "gateway drug" (as he says) for kids learning to code. I have seen that children can understand langs and base programming concepts, given the right resources and encouragement. If kids in the 80s/early 90s learned BASIC and grew up to become software engineers; then what we have now (Scratch, Python, even Javascript + something like P5) are perfectly adequate to that task. Vibe coding really just teaches kids how to prompt LLMs properly.
this focus on coding is the wrong level of abstraction
coding is no longer the problem. the problem is getting the right context to the coding agent. this is much, much harder
“vibe coding” is the new “horseless carriage”
the job of the human engineer is “context wrangling”
"Coding" - The art of literally using your fingers to type weird characters into a computer, was never a problem developers had.
The problem has always been understanding and communication, and neither of those have been solved at this moment. If anything, they have gotten even more important, as usually humans can infer things or pick up stuff by experience, but LLMs cannot, and you have to be very precise and exact about what you're telling them.
And so the problem remains the same. "How do I communicate what I want to this person, while keeping the context as small as possible as to not overflow, yet extensive enough to cover everything?" except you're sending it to endpoint A instead of endpoint B.
We have math notation for maths, diagrams for circuits, plans for houses, etc etc. Would hate to have to give long paragraphs of "English" to my house builder and watch what the result could be. Feels like being a lawyer at this point. English can be appropriate and now we also have that in our toolbox.
Describing context at the abstraction level and accuracy you care about has always been the issue. The context of what matters though as you grow and the same system has to deal with more requirements at once together IMV is always the challenge in ANY engineering discipline.
I don't think it's currently possible to ask a model to generate the weights for a model.
The 1.0, 2.0, and 3.0 simply aren't making sense. They imply a kind of a succession and replacement and demonstrate a lack of how programming works. It sounds as marketing oriented as "Web 3.0" that has been born inside an echo chamber. And yet halfway through, the need for determinism/validation is now being reinvented.
The analogies make use of cherry picked properties, which could apply to anything.
Meanwhile, I asked this morning Claude 4 to write a simple EXIF normalizer. After two rounds of prompting it to double-check its code, I still had to point out that it makes no sense to load the entire image for re-orientating if the EXIF orientation is fine in the first place.
Vibe vs reality, and anyone actually working in the space daily can attest how brittle these systems are.
He doesn't say they will fully replace each other (or had fully replaced each other, since his definition of 2.0 is quite old by now)
Yours is the second comment claiming there is "cheering" and "fanboying" in this comment section. What comments are you talking about? I've read through this submission multiple times since yesterday, yet I've seen none of that. What specific comments are the "cheering" ones?
Analogy: how we "moved" from using Google to ChatGPT is an abrupt change, and we still use Google.
Just don't use them, and, outcompete those who do. Or, use them and outcompete those who don't.
Belittling/lamenting on any thread about them is not helpful and akin to spam.
Abstractions don't eliminate the need to understand the underlying layers - they just hide them until something goes wrong.
Software 3.0 is a step forward in convenience. But it is not a replacement for developers with a foundation, but a tool for acceleration, amplification and scaling.
If you know what is under the hood — you are irreplaceable. If you do not know — you become dependent on a tool that you do not always understand.
English is a terrible language for deterministic outcomes in complex/complicated systems. Vibe coders won't understand this until they are 2 years into building the thing.
LLMs have their merits and he sometimes aludes to them, although it almost feels accidental.
Also, you don't spend years studying computer science to learn the language/syntax, but rather the concepts and systems, which don't magically disappear with vibe coding.
This whole direction is a cheeky Trojan horse. A dramatic problem, hidden in a flashy solution, to which a fix will be upsold 3 years from now.
I'm excited to come back to this comment in 3 years.
This whole thing is a religion.
A recurring theme presented, however, is that LLM's are somehow not controlled by the corporations which expose them as a service. The presenter made certain to identify three interested actors (governments, corporations, "regular people") and how LLM offerings are not controlled by governments. This is a bit disingenuous.
Also, the OS analogy doesn't make sense to me. Perhaps this is because I do not subscribe to LLM's having reasoning capabilities nor able to reliably provide services an OS-like system can be shown to provide.
A minor critique regarding the analogy equating LLM's to mainframes:
Mainframes in the 1960's never "ran in the cloud" as it did
not exist. They still do not "run in the cloud" unless one
includes simulators.
Terminals in the 1960's - 1980's did not use networks. They
used dedicated serial cables or dial-up modems to connect
either directly or through stat-mux concentrators.
"Compute" was not "batched over users." Mainframes either
had jobs submitted and ran via operators (indirect execution)
or supported multi-user time slicing (such as found in Unix)./.well-known/ exists for this purpose.
example.com/.well-known/llms.txt
This if anything should be a huge red flag