A Really Big Computer (opens in new tab)

(geohot.github.io)

60 pointscl3m2y ago34 comments

34 comments

Maybe I'm overly critical, but if fixing the Twitter search box is too hard, maybe buying solar (and batteries), fabbing chips, and building a giant data center to replicate the current generation of what another company already built is foolhardy.

ablyveiled2y ago

I armchair-diagnose geohot with bipolar disorder. Only a few weeks ago, he was writing about how he was slated to lose because of his low tolerance for bullshit.

bastawhiz2y ago

I am not going to try to come up with an underlying reason but he's really poisoned his reputation in my book. How can you so emphatically jump into a problem space that (in hindsight) you're not an expert at and then manage to distance yourself so vigorously from your failure like a month later? What am I supposed to think when you're faced with a real world hundreds-of-millions-of-dollars project spanning tons of large, complex industries? How am I supposed to believe you'll have better success with this than a few thousand lines of JavaScript?

georgehotz2y ago

Have you ever met someone with actual bipolar disorder? They think that they are Gods and see symbols and signs and shit. Or sometime they go negative and thing there's a conspiracy targeting them or something.

They don't usually think they are gonna slowly over the next 5 years tape out a chip and build a solar powered datacenter with the company they raised $5M for...all part of the tiny corp master plan. I'll write it up better as it comes together better.

We have 417 preorders for tinyboxes btw. https://tinygrad.org

2 more replies

hruzgar2y ago

haters always gonna hate

gary_02y ago

He doesn't mention yield, which I hear is currently pretty bad on 3nm. And "just route around the bad silicon" might be easier said than done; widespread defects might create too many roadblocks to efficiently move data around your wafer-chip.

If that's not factored into his 3450 wafer estimate, it could be double that.

rg1112y ago

This was my first thought reading this article. I am not an expert and my knowledge is limited.

When he mentions the prices of the wafers, does it come with the bad yields or without?

SomeHacker442y ago

A knowledgable friend recently told me the wafer price is, you get what you get. This was in reference to a recent Apple story that TSMC was giving Apple a discount for bad dies, or something like that, on the 3nm node. I have no references.

20wenty2y ago

Say what you want about him, but George is one of the few people I know that can actually go a mile wide and a mile deep into any field they choose. And he does most of his digging live on stream, so he makes an easy punching bag. Do I think he'll successfully raise a $400M round on a $2B valuation? No, something else will distract him before that - but I'm going to enjoy watching him (and pulling for him) regardless.

late2part2y ago

Feeling cute, may make up bullshit then change my mind.

w10-12y ago

Brainstorming can help identify limiting assumptions.

Mostly compute has piggy-backed off consumer-scale production (e.g., GPU's repurposed for crypto).

The suggestion is that an AI model can justify few-shot chip production.

His proposal is for development, i.e., to build the model, and depends mostly on such models being qualitatively better.

It seems more likely that chips would be built to offer model processing, instead of forcing users into a service (with its risk of confidentiality and IP leaks). To get GPT-100, you'd incorporate the chip into your device -- and then know for sure that nothing could leak. That eliminates the primary transaction cost for AI compute: the risk.

Which presents the question: does anyone know of research or companies working on such chip models?

leblancfg2y ago

I will refer to this as the moment when LLM hype jumped the shark.

rg1112y ago

AI is not just LLMs.

As I keep mentioning on HN.

AI is real results + hype as opposed to crypto of only hype.

A computer of this size will be immensely useful for training Vision models very quickly, as well as other kinds of models.

Why only AI? Any branch of Science (all of them) that benefit from fast, parallel compute, will benefit from this.

It is extremely narrow view to view AI only as LLMs.

When I say "real use" I mean solving real world problems that existed before AI. And I don't even count generative AI of any shape or form.

leblancfg2y ago

My comment is about how this man is seriously entertaining a 2B$ valuation for this napkin math, with no experience in the 4/5 different industries his plan entails – and sharing no definite plans of what to do with it except "be able to train GPT4 in a day". It is about prevailing market forces in VC that let him believe this plan might actually work.

It doesn't have anything to do with crypto, science, or Real World Problems.

1 more reply

nickdothutton2y ago

How would you distribute clock signal around that wafer-scale GPU? OR is he simply suggesting you buy the whole wafer and litho standard GPUs out of it? Apologies if this is a stupid question.

petesoper2y ago

How does he cool those wafers?

cushychicken2y ago

“Put the servers somewhere cool.”

No joke. That’s his plan.

georgehotz2y ago

And...it's a good plan.

There's two cooling problems, pulling heat off the wafer (water cooled loop or immersion cooling, this is easy) and dumping the heat somewhere. The second part is why the place being cool matters, I imagine a big radiator on the roof. Could also dump the heat into a river if it's chill.

cushychicken2y ago

You and I have real different meters for "easy", George.

1 more reply

Turskarama2y ago

I think the modern strategy is to have the whole thing under water.

aziaziazi2y ago

Let's build it on the crest of a lunar crater. Wafers on the shade side and solar panels on the sunny one. “Just” add some $Bs to send the stuff up in the space.

wolrah2y ago

Shaded or not, cooling is actually a problem in space because radiating heat in to a "vacuum" is hard. Even a warm atmosphere is easier to transfer heat to than almost nothing.

Now a ground source heat pump could be interesting as long as it's not too hard to drill for the plumbing.

nullc2y ago

How do these goals set out the needed ratios of memory to compute to interconnect bandwidth?

An ideal machine designed to train GPT4 in a day is likely very different to the ideal machine to train 50 GPT4s as once over a few weeks, which is very different from the ideal machine to train a model 100x bigger than GPT4 (perhaps the most interesting).

Havoc2y ago

Not sure about that computer plan but I do enjoy his brand of entertainment.

Also really hoping he makes more progress on amd ml

nullc2y ago

Where is the budget line item for defending it against bombing by crazed anti-ai doomsday cultists?

jebarker2y ago

Wyoming seems like a good place to build such a thing.

fancythat2y ago

How is the sun there for the solar part?

jebarker2y ago

~220 sunny days per year apparently. Plus there's plenty of space for wind energy.

NCAR has a supercomputing center there: https://en.m.wikipedia.org/wiki/NCAR-Wyoming_Supercomputing_...

dealuromanet2y ago

This is really cool. Godspeed!

j / k navigate · click thread line to collapse

34 comments

bastawhiz2y ago

ablyveiled2y ago

I armchair-diagnose geohot with bipolar disorder. Only a few weeks ago, he was writing about how he was slated to lose because of his low tolerance for bullshit.

bastawhiz2y ago

georgehotz2y ago

We have 417 preorders for tinyboxes btw. https://tinygrad.org

2 more replies

hruzgar2y ago

haters always gonna hate

gary_02y ago

If that's not factored into his 3450 wafer estimate, it could be double that.

rg1112y ago

This was my first thought reading this article. I am not an expert and my knowledge is limited.

When he mentions the prices of the wafers, does it come with the bad yields or without?

SomeHacker442y ago

20wenty2y ago

late2part2y ago

Feeling cute, may make up bullshit then change my mind.

w10-12y ago

Brainstorming can help identify limiting assumptions.

Mostly compute has piggy-backed off consumer-scale production (e.g., GPU's repurposed for crypto).

The suggestion is that an AI model can justify few-shot chip production.

His proposal is for development, i.e., to build the model, and depends mostly on such models being qualitatively better.

Which presents the question: does anyone know of research or companies working on such chip models?

leblancfg2y ago

I will refer to this as the moment when LLM hype jumped the shark.

rg1112y ago

AI is not just LLMs.

As I keep mentioning on HN.

AI is real results + hype as opposed to crypto of only hype.

A computer of this size will be immensely useful for training Vision models very quickly, as well as other kinds of models.

Why only AI? Any branch of Science (all of them) that benefit from fast, parallel compute, will benefit from this.

It is extremely narrow view to view AI only as LLMs.

When I say "real use" I mean solving real world problems that existed before AI. And I don't even count generative AI of any shape or form.

leblancfg2y ago

It doesn't have anything to do with crypto, science, or Real World Problems.

1 more reply

nickdothutton2y ago

How would you distribute clock signal around that wafer-scale GPU? OR is he simply suggesting you buy the whole wafer and litho standard GPUs out of it? Apologies if this is a stupid question.

petesoper2y ago

How does he cool those wafers?

cushychicken2y ago

“Put the servers somewhere cool.”

No joke. That’s his plan.

georgehotz2y ago

And...it's a good plan.

cushychicken2y ago

You and I have real different meters for "easy", George.

1 more reply

Turskarama2y ago

I think the modern strategy is to have the whole thing under water.

aziaziazi2y ago

Let's build it on the crest of a lunar crater. Wafers on the shade side and solar panels on the sunny one. “Just” add some $Bs to send the stuff up in the space.

wolrah2y ago

Shaded or not, cooling is actually a problem in space because radiating heat in to a "vacuum" is hard. Even a warm atmosphere is easier to transfer heat to than almost nothing.

Now a ground source heat pump could be interesting as long as it's not too hard to drill for the plumbing.

nullc2y ago

How do these goals set out the needed ratios of memory to compute to interconnect bandwidth?

Havoc2y ago

Not sure about that computer plan but I do enjoy his brand of entertainment.

Also really hoping he makes more progress on amd ml

nullc2y ago

Where is the budget line item for defending it against bombing by crazed anti-ai doomsday cultists?

jebarker2y ago

Wyoming seems like a good place to build such a thing.

fancythat2y ago

How is the sun there for the solar part?

jebarker2y ago

~220 sunny days per year apparently. Plus there's plenty of space for wind energy.

NCAR has a supercomputing center there: https://en.m.wikipedia.org/wiki/NCAR-Wyoming_Supercomputing_...

dealuromanet2y ago

This is really cool. Godspeed!

j / k navigate · click thread line to collapse