0 points · bick_nyers · 3y ago · 0 comments

John Carmack mentions this on the Lex Fridman Podcast. The argument is basically that AGI's performance characteristics will be similar to those of an LLM (another way of saying that AGI will NOT be hyper-efficient, P != NP), and that those characteristics pose a problem for fast takeoff. When the entire model fits inside VRAM, the bottleneck in training these models is GPU memory bandwidth, which is about 1 TB/s on modern cards. When the model cannot fit in GPU memory, performance instead scales with PCIe bandwidth, which is currently about 32 GB/s. An AGI (or an LLM) attempting to replicate across mobile devices or desktops will be severely hamstrung by the network connection, so using all of planet Earth's computing resources is not necessarily better than, say, a single data center in this respect. The second piece of the argument is that a data center is also not enough, or rather, not enough within a timeframe that would go undetected. Could an AGI hack an entire data center for a whole month to perform its training (and then execute its strategy well enough to gain, say, nuclear codes)? Unlikely. Is, say, 8 hours of training on an entire data center enough to go from intelligence to superintelligence? Intuition says no. I've expanded a bit on what I believe his argument to be; I definitely recommend watching that part of the podcast episode.
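To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch in Python. It only divides a payload size by a link speed. The 1 TB/s and 32 GB/s figures are the ones quoted above; the 100 GB payload and the 100 Mbit/s home connection are my own illustrative assumptions, not anything from the podcast, and real training is more complicated than a single pass over the data.

```python
# Back-of-the-envelope: time to move a fixed amount of data over each link.
# This ignores latency, protocol overhead, and compute; it is only a ratio check.

def transfer_seconds(payload_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Seconds needed to move payload_bytes at a sustained bandwidth."""
    return payload_bytes / bandwidth_bytes_per_s

PAYLOAD_BYTES = 100e9  # ASSUMPTION: hypothetical 100 GB of weights/activations

LINKS = {
    "vram": 1e12,         # ~1 TB/s GPU memory bandwidth (figure from the post)
    "pcie": 32e9,         # ~32 GB/s PCIe (figure from the post)
    "network": 100e6 / 8, # ASSUMPTION: 100 Mbit/s consumer connection, in bytes/s
}

times = {name: transfer_seconds(PAYLOAD_BYTES, bw) for name, bw in LINKS.items()}

for name, seconds in times.items():
    slowdown = seconds / times["vram"]
    print(f"{name:8s} {seconds:10.3f} s  ({slowdown:,.2f}x slower than VRAM)")
```

Under those assumptions, one pass over the same data takes a fraction of a second in VRAM, a few seconds over PCIe, and a couple of hours over a home connection, which is the gap the argument leans on.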

Here's my take on the implications of not having fast takeoff: a secretly antagonistic AGI will be constrained to cooperation, slowly leeching compute resources towards its goal until it is able to acquire enough to confidently sprint towards the inflection point. This could mean teaching us how to build better semiconductor foundries, how to create nuclear fusion energy sources, and how to educate our youth to better fit into those jobs. It could also mean slightly altering our culture's value systems over time via astroturfing, so that we become more sympathetic, cooperative, and trusting, and less vigilant, towards these AI systems. And it could mean teaching us how to build more robust computing systems, so that we stop needing to detect when something goes wrong (because it already has 99.9999% uptime).

I believe slow takeoff is actually worse in some sense, because it is a hell of a lot harder to detect, and humans tend to get complacent and apathetic when something "just works" for decades, even if it has been plotting since the beginning.

1 comment · 1 top-level

gonehome · 3y ago

Thanks - I saw he was on there, but I find Lex insufferable as an interviewer, so I tend to avoid the pods (despite the great guests).
