Andrej Karpathy: Software in the era of AI [video] (opens in new tab)

(youtube.com)

1481 pointssandslash9mo ago783 comments

783 comments

I think it's interesting to juxtapose traditional coding, neural network weights and prompts because in many areas -- like the example of the self driving module having code being replaced by neural networks tuned to the target dataset representing the domain -- this will be quite useful.

However I think it's important to make it clear that given the hardware constraints of many environments the applicability of what's being called software 2.0 and 3.0 will be severely limited.

So instead of being replacements, these paradigms are more like extra tools in the tool belt. Code and prompts will live side by side, being used when convenient, but none a panacea.

karpathy9mo ago

I kind of say it in words (agreeing with you) but I agree the versioning is a bit confusing analogy because it usually additionally implies some kind of improvement. When I’m just trying to distinguish them as very different software categories.

5 more replies

gyomu9mo ago

It's not just the hardware constraints - it's also the training constraints, and the legibility constraints.

Training constraints: you need lots, and lots of data to build complex neural network systems. There are plenty of situations where the data just isn't available to you (whether for legal reasons, technical reasons, or just because it doesn't exist).

Legibility constraints: it is extremely hard to precisely debug and fix those systems. Let's say you build a software system to fill out tax forms - one the "traditional" way, and one that's a neural network. Now your system exhibits a bug where line 58(b) gets sometimes improperly filled out for software engineers who are married, have children, and also declared a source of overseas income. In a traditionally implemented system, you can step through the code and pinpoint why those specific conditions lead to a bug. In a neural network system, not so much.

So totally agreed with you that those are extra tools in the toolbelt - but their applicability is much, much more constrained than that of traditional code.

In short, they excel at situations where we are trying to model an extremely complex system - one that is impossible to nail down as a list of formal requirements - and where we have lots of data available. Signal processing (like self driving, OCR, etc) and human language-related problems are great examples of such problems where traditional programming approaches have failed to yield the kind of results we wanted (ie, beyond human performance) in 70+ years of research and where the modern, neural network approach finally got us the kind of results we wanted.

But if you can define the problem you're trying to solve as formal requirements, then those tools are probably ill-suited.

radicalbyte9mo ago

Weights are code being replaced by data; something I've been making heavy use of since the early 00s. After coding for 10 years you start to see the benefits of it and understand where you should use it.

LLMs give us another tool only this time it's far more accessible and powerful.

dcsan9mo ago

LLMs have already replaced some code directly for me eg NLP stuff. Previously I might write a bunch of code to do clustering now I just ask the LLM to group things. Obviously this is a very basic feature native to LLMs but there will be more first class LLM callable functions over time.

practal9mo ago

Great talk, thanks for putting it online so quickly. I liked the idea of making the generation / verification loop go brrr, and one way to do this is to make verification not just a human task, but a machine task, where possible.

Yes, I am talking about formal verification, of course!

That also goes nicely together with "keeping the AI on a tight leash". It seems to clash though with "English is the new programming language". So the question is, can you hide the formal stuff under the hood, just like you can hide a calculator tool for arithmetic? Use informal English on the surface, while some of it is interpreted as a formal expression, put to work, and then reflected back in English? I think that is possible, if you have a formal language and logic that is flexible enough, and close enough to informal English.

Yes, I am talking about abstraction logic [1], of course :-)

So the goal would be to have English (German, ...) as the ONLY programming language, invisibly backed underneath by abstraction logic.

[1] http://abstractionlogic.com

AdieuToLogic9mo ago

> So the question is, can you hide the formal stuff under the hood, just like you can hide a calculator tool for arithmetic? Use informal English on the surface, while some of it is interpreted as a formal expression, put to work, and then reflected back in English?

The problem with trying to make "English -> formal language -> (anything else)" work is that informality is, by definition, not a formal specification and therefore subject to ambiguity. The inverse is not nearly as difficult to support.

Much like how a property in an API initially defined as being optional cannot be made mandatory without potentially breaking clients, whereas making a mandatory property optional can be backward compatible. IOW, the cardinality of "0 .. 1" is a strict superset of "1".