How AlphaChip transformed computer chip design (opens in new tab)

(deepmind.google)

301 pointsisof4ult1y ago194 comments

194 comments

This work from Google (original Nature paper: https://www.nature.com/articles/s41586-021-03544-w) has been credibly criticized by several researchers in the EDA CAD discipline. These papers are of interest:

- A rebuttal by a researcher within Google who wrote this at the same time as the "AlphaChip" work was going on ("Stronger Baselines for Evaluating Deep Reinforcement Learning in Chip Placement"): http://47.190.89.225/pub/education/MLcontra.pdf

- The 2023 ISPD paper from a group at UCSD ("Assessment of Reinforcement Learning for Macro Placement"): https://vlsicad.ucsd.edu/Publications/Conferences/396/c396.p...

- A paper from Igor Markov which critically evaluates the "AlphaChip" algorithm ("The False Dawn: Reevaluating Google's Reinforcement Learning for Chip Macro Placement"): https://arxiv.org/pdf/2306.09633

In short, the Google authors did not fairly evaluate their RL macro placement algorithm against other SOTA algorithms: rather they claim to perform better than a human at macro placement, which is far short of what mixed-placement algorithms are capable of today. The RL technique also requires significantly more compute than other algorithms and ultimately is learning a surrogate function for placement iteration rather than learning any novel representation of the placement problem itself.

In full disclosure, I am quite skeptical of their work and wrote a detailed post on my website: https://vighneshiyer.com/misc/ml-for-placement/

negativeonehalf1y ago

FD: I have been following this whole thing for a while, and know personally a number of the people involved.

The AlphaChip authors address criticism in their addendum, and in a prior statement from the co-lead authors: https://www.nature.com/articles/s41586-024-08032-5 , https://www.annagoldie.com/home/statement

- The 2023 ISPD paper didn't pre-train at all. This means no learning from experience, for a learning-based algorithm. I feel like you can stop reading there.

- The ISPD paper and the MLcontra paper both used much larger older technology node sizes, which have pretty different physical properties. TPU has a sub 10nm technology node size, whereas ISPD uses 45nm and 12nm. These are really different from a physical design perspective. Even worse, MLcontra uses a truly ancient benchmark with >100nm technology node size.

Markov's paper just summarizes the other two.

(Incidentally, none of ISPD / MLcontra / Markov were peer reviewed - ISPD 2023 was an invited paper.)

There's a lot of other stuff wrong with the ISPD paper and the MLcontra paper - happy to go into it - and a ton of weird financial incentives lurking in the background. Commercial EDA companies do NOT want a free open-source tool like AlphaChip to take over.

Reading your post, I appreciate the thoroughness, but it seems like you are too quick to let ISPD 2023 off the hook for failing to pre-train and using less compute. The code for pre-training is just the code for training --- you train on some chips, and you save and reuse the weights between runs. There's really no excuse for failing to do this, and the original Nature paper described at length how valuable pre-training was. Given how different TPU is from the chips they were evaluating on, they should have done their own pre-training, regardless of whether the AlphaChip team released a pre-trained checkpoint on TPU.

(Using less compute isn't just about making it take longer - ISPD 2023 used half as many GPUs and 1/20th as many RL experience collectors, which may screw with the dynamics of the RL job. And... why not just match the original authors' compute, anyway? Isn't this supposed to be a reproduction attempt? I really do not understand their decisions here.)

isotypic1y ago

Why does pretraining or not matter in the ISPD 2023 paper? The circuit_training repo, as noted in the rebuttal of the rebuttal by the ISPD 2023 paper authors, claims training from scratch is "comparable or better" than fine-tuning the pre-trained model. So no matter your opinion on the importance of the pretraining step, this result isn't replicable, at which point the ball is in Google's court to release code/checkpoints to show otherwise.

negativeonehalf1y ago

The quick-start guide in the repo that said you don't have to pre-train for the sample test case, meaning that you can validate your setup without pre-training. That does not mean you don't need to pre-train! Again, the paper talks at length about the importance of pre-training.

2 more replies

wegfawefgawefg1y ago

In reinforcement learning pre-training reduces peak performance. We can argue about this, but it is not a sufficiently strong point to stop reading from alone.

3170701y ago

Do you have a citation for this? I did my Phd on this topic 8 years ago, and I didn't completely follow the field after. I'm curious to learn more.

1 more reply

negativeonehalf1y ago

See this ISPD 2022 paper where the AlphaChip authors dive more into the value of pre-training (Figure 7, Figure 8): https://dl.acm.org/doi/pdf/10.1145/3505170.3511478

clickwiseorange1y ago

Oh, man... this is the same old stuff from the 2023 Anna Goldie statement (is this Anna Goldie's comment?). This was all addressed by Kahng in 2023 - no valid criticisms. Where do I start?

Kahng's ISPD 2023 paper is not in dispute - no established experts objected to it. The Nature paper is in dispute. Dozens of experts objected to it: Kahng, Cheng, Markov, Madden, Lienig, Swartz objected publically.

The fact that Kahng's paper was invited doesn't mean it wasn't peer reviewed. I checked with ISPD chairs in 2023 - Kahng's paper was thoroughly reviewed and went through multiple rounds of comments. Do you accept it now? Would you accept peer-reviewed versions of other papers?

Kahng is the most prominent active researcher in this field. If anyone knows this stuff, it's Kahng. There were also five other authors in that paper, including another celebrated professor, Cheng.

The pre-training thing was disclaimed in the Google release. No code, data or instructions for pretraining were given by Google for years. The instructions said clearly: you can get results comparable to Nature without pre-training.

The "much older technology" is also a bogus issue because the HPWL scales linearly and is reported by all commercial tools. Rectangles are rectangles. This is textbook material. But Kahng etc al prepared some very fresh examples, including NVDLA, with two recent technologies. Guess what, RL did poorly on those. Are you accepting this result?

The bit about financial incentives and open-source is blatantly bogus, as Kahng leads OpenROAD - the main open-source EDA framework. He is not employed by any EDA companies. It is Google who has huge incentives here, see Demis Hassabis tweet "our chips are so good...".

The "Stronger Baselines" matched compute resources exactly. Kahng and his coauthors performed fair comparisons between annealing and RL, giving the same resources to each. Giving greater resources is unlikely to change results. This was thoroughly addressed in Kahng's FAQ - if you only could read that.

The resources used by Google were huge. Cadence tools in Kahng's paper ran hundreds times faster and produced better results. That is as conclusive as it gets.

It doesn't take a Ph.D. to understand fair comparisons.

3 more replies

dogleg771y ago

The problem with the Google Nature paper is that its results were not reproduced outside Google. You can attack attempts to reproduce but that only reinforces the point: those claimed results cannot be trusted.

Other commenters already addressed the pre-training issue. Please kindly include a link to Kahng's 2023 discussion addressing your complaints. Otherwise, you are unfairly supporting those people you know.

Kahng's placer is open-source and was used in the Nature paper. It does not make sense to accuse Kahng of colluding with companies against open-source.

negativeonehalf1y ago

For a more thorough discussion on pre-training, see this ISPD 2022 paper by the AlphaChip people: https://dl.acm.org/doi/pdf/10.1145/3505170.3511478

As for external usage of the method - MediaTek is one of the largest chip design companies in the world, and they built on AlphaChip. There's a quote from a MediaTek SVP at the bottom of the GDM blog post:

"AlphaChip's groundbreaking AI approach revolutionizes a key phase of chip design. At MediaTek, we've been pioneering chip design's floorplanning and macro placement by extending this technique in combination with the industry's best practices. This paradigm shift not only enhances design efficiency, but also sets new benchmarks for effectiveness, propelling the industry towards future breakthroughs."

2 more replies

porphyra1y ago

The Deepmind chess paper was also criticized for unfair evaluation, as they were using an older version of Stockfish for comparison. Apparently, the gap between AlphaZero and that old version of Stockfish (about 50 elo iirc) was about the same as the gap between consecutive versions of Stockfish.

lacker1y ago

Indeed, six years later, the AlphaZero algorithm is not the best performing algorithm for chess. LCZero (uses AlphaZero algorithm) won some TCECs after it came out but for the past few years Stockfish (does not use AlphaZero algorithm) has been winning consistently.

https://en.wikipedia.org/wiki/Top_Chess_Engine_Championship

So perhaps the critics had a point there.

2 more replies

Workaccount21y ago

To be fair, some of these criticisms are a few years old. Which normally would be fair game, but the progress in AI has been breakneck. Criticism of other AI tech from 2021 or 2022 are pretty dated today.

jeffbee1y ago

It certainly looks like the criticism at the end of the rebuttal that DeepMind has abandoned their EDA efforts is a bit stale in this context.

anna-gabriella1y ago

Dated or not, if half of the criticisms are right, the original paper may need to be retracted. No progress on RL for chip design was published by Google since 2022, as far as I can tell. So, it looks like most if not all criticisms remain valid.

2 more replies

jeffbee1y ago

It seems like this is multiple parties pursuing distinct arguments. Is Google saying that this technique is applicable in the way that the rebuttals are saying it is not? When I read the paper and the update I did not feel as though Google claimed that it is general, that you can just rip it off and run it and get a win. They trained it to make TPUs, then they used it to make TPUs. The fact that it doesn't optimize whatever "ibm14" is seems beside the point.

clickwiseorange1y ago

Good question. It's not just ibm14, but everything people outside Google tried shows that RL is much worse than prior methods. NVDLA, BlackParrot, etc. There is a strong possibility that Google pre-trained RL on certain TPU designs then tested in them, and submitted to Nature.

smokel1y ago

I don't really understand all the fuss about this particular paper. Nearly all papers on AI techniques are pretty much impossible to reproduce, due to details that the authors don't understand or are trying to cover up.

This is what you get if you make academic researchers compete for citation counts.

Pretraining seems to be an important aspect here, and it makes sense that such pretraining requires good examples, which unfortunately for the free lunch people, is not available to the public.

That's what you get when you let big companies do fundamental research. Would it be better if the companies did not publish anything about their research at all?

It all feels a bit unproductive to attack one another.

1 more reply

gdiamos1y ago

Criticism is an important part of the scientific process.

Whichever approach ends up winning is improved by careful evaluation and replication of results

s-macke1y ago

When I first read about AlphaChip yesterday, my first question was how it compares to other optimization algorithms such as genetic algorithms or simulated annealing. Thank you for confirming that my questions are valid.

nemonemo1y ago

What is your opinion of the addendum? I think the addendum and the pre-trained checkpoint are the substance of the announcement, and it is surprising to see little mention of those here.

dogleg771y ago

Fair point. I looked at the addendum, and it doesn't really address the critiques. They show an "ablation study" on one additional circuit without describing how big it is, etc. The few numbers they give suggest that this circuit is unlike those in the Nature paper: possibly a newer technology (good) but much smaller in the total length of wires and hence components (bad!). They are trying to debunk the Kahng work, but they aren't addressing many other complaints. Without showing results on public benchmarks in a reproducible way, they are just rehashing some excuses. Maybe their results are no good, maybe they stopped working on this. But with claimed runtimes under 6 hours, they don't need 3 years to add thorough benchmarking results. Anyone who designs competitive chips is doing benchmarking, but Google isn't. Draw your own conclusions.

bsder1y ago

EDA claims in the digital domain are fairly easy to evaluate. Look at the picture of the layout.

When you see a chip that has the datapath identified and laid out properly by a computer algorithm, you've got something. If not, it's vapor.

So, if your layout still looks like a random rat's nest? Nope.

If even a random person can see that your layout actually follows the obvious symmetric patterns from bit 0 to bit 63, maybe you've got something worth looking at.

Analog/RF is a little tougher to evaluate because the smaller number of building blocks means you can use Moore's Law to brute force things much more exhaustively, but if things "looks pretty" then you've got something. If it looks weird, you don't.

glitchc1y ago

That doesn't mean the fabricated netlist doesn't work. I'm not supporting Google by any means, but the test should be: Does it fabricate and function as intended? If not, clearly gibberish. If so, we now have computers building computers, which is one step closer to SkyNet. The truth is probably somewhere in between. But even if some of the samples, with the terrible layouts, are actually functional, then we might learn something new. Maybe the gibberish design has reduced crosstalk, which would be fascinating.

lordswork1y ago

Some interesting context on this work: 2 researchers were bullied to the point of leaving Google for Anthropic by a senior researcher (who has now been terminated himself): https://www.wired.com/story/google-brain-ai-researcher-fired...

They must feel vindicated by their work turning out to be so fruitful now.

gabegobblegoldi1y ago

Vindicated indeed. The senior researcher and others on the project were bullied for raising concerns of fraud by the two researchers [1]. They filed a lawsuit against Google that has a lot of detailed allegations of fraud [2].

[1] https://www.theregister.com/AMP/2023/03/27/google_ai_chip_pa...

[2] https://regmedia.co.uk/2023/03/26/satrajit_vs_google.pdf

negativeonehalf1y ago

You are now using multiple new accounts based on the name of one of the authors (Anna Goldie) and her husband (Gabriel). First this one ('gabegobblegoldi'), and then 'anna-gabriella'.

I think it is time for you to take a deep breath and think about what you are doing and why.

You seem to be obsessed with the idea that this work is overrated. MediaTek and Google don't think so, and use it in production for their chips, including TPU, Dimensity, Axion, and others. If you're right and they're wrong, using this method loses them money. If it's the other way around, then using this method makes them gain money.

Please read PG's post and ask yourself if it applies to you: https://www.paulgraham.com/fh.html

Chatterjee settled his case. He has moved on. This is not some product being sold -- it is a free, open-source tool. People who see value in it use it; others don't, and so they don't. This is how it always works, and it's fine.

3 more replies

clickwiseorange1y ago

It's actually not clear who was bullied. The two researchers ganged up on Chatterjee and got him fired because he used the word "fraud" - wrongful termination of a whistleblower. Only recently Google settled with Chatterjee for an undisclosed amount.

hinkley1y ago

TSMC made a point of calling out that their latest generation of software for automating chip design has features that allow you to select logic designs for TDP over raw speed. I think that’s our answer to keep Dennard scaling alive in spirit if not in body. Speed of light is still going to matter, so physical proximity of communicating components will always matter, but I wonder how many wins this will represent versus avoiding thermal throttling.

therealcamino1y ago

EDA software has long allowed trading off power, delay, and area during optimization . But TSMC doesn't produce those tools, as far as I'm aware.

hinkley1y ago

https://www.tsmc.com/english/dedicatedFoundry/oip/eda_allian...

They don’t produce but they are tailored for them just the same. “We have” doesn’t have to mean “we made”. They don’t say it as such here but elsewhere they refer to the IP they can make available, which can also be made in house or cross licensed and still count as “we have”.

1 more reply

pfisherman1y ago

Questions for those in the know about chip design. How are they measuring the quality of a chip design? Does the metric that Google is reporting make sense? Or is it just something to make themselves look good?

Without knowing much, my guess is that “quality” of a chip design is multifaceted and heavily dependent on the use case. That is the ideal chip for a data center would look very different from those for a mobile phone camera or automobile.

So again what does “better” mean in the context of this particular problem / task.

Drunk_Engineer1y ago

I have not read the latest paper, but their previous work was really unclear about metrics being used. Researchers trying to replicate results had a hard time getting reliable details/benchmarks out of Google. Also, my recollection is that Google did not even compute timing, just wirelength and congestion; i.e. extremely primitive metrics.

Floorplanning/placement/synthesis is a billion dollar industry, so if their approach were really revolutionary they would be selling the technology, not wasting their time writing blog posts about it.

rossjudson1y ago

Like when Google wasted its time writing publicly about Spanner?

https://research.google/pubs/spanner-googles-globally-distri...

or Bigtable?

https://research.google/pubs/bigtable-a-distributed-storage-...

or GFS?

or MapReduce?

or Borg?

or...I think you get the idea.

thenoblesunfish1y ago

I am not sure these publications were intended to generate sales of these technologies. My assumption is that they mostly help the company in terms of recruitment. This lets potential employees see cool stuff Google is doing, and see them as an industry leader.

1 more reply

bushbaba1y ago

Spanner research paper was in 2012. Bigtable was in 2006. GFS 2003. The last decade has been a 'lost decade' of google. Not much innovation to be honest.

1 more reply

IshKebab1y ago

> Floorplanning/placement/synthesis is a billion dollar industry

Maybe all together, but I don't think automatic placement algorithms are a billion dollar industry. There's so much more to it than that.

1 more reply

negativeonehalf1y ago

The original paper reports P&R metrics (WNS, TNS, area, power, wirelength, horizontal congestion, vertical congestion) - https://www.nature.com/articles/s41586-021-03544-w

(no paywall): https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2021_2022/...

Drunk_Engineer1y ago

From what I saw in the rebuttal papers, the Google cost-function is wirelength based. You can still get good TNS from that if your timing is very simplistic -- or if you choose your benchmark carefully.

2 more replies

q3k1y ago

This is just floorplanning, which is a problem with fairly well defined quality metrics (max speed and chip area used).

Drunk_Engineer1y ago

Oh man, if only it were that simple. A floorplanner has to guestimate what the P&R tools are going to do with the initial layout. That can be very hard to predict -- even if the floorplanner and P&R tool are from the same vendor.

thesz1y ago

Eurisco [1], if I remember correctly, was once used to perform placement-and-route task and was pretty good at it.

[1] https://en.wikipedia.org/wiki/Eurisko

What's more, Eurisco was then used in designing Traveler TCS' game fleet of battle spaceships. And Eurisco used symmetry-based placement learned from VLSI design in the design of the spaceships' fleet.

Can AlphaChip's heuistics be used anywhere else?

gabegobblegoldi1y ago

Doesn’t look like it. In fact the original paper claimed that their RL method could be used for all sorts of combinatorial optimization problems. Yet they chose an obscure problem in chip design and showed their results on proprietary data instead of standard public benchmarks.

Instead they could have demonstrated their amazing method on any number of standard NP hard optimization problems e.g. traveling salesman, bin packing, ILP, etc. where we can generate tons of examples and verify easily whether it produces better results than other solvers or not.

This is why many in the chip design and optimization community felt that the paper was suspicious. Even with this addendum they adamantly refuse to share any results that can be independently verified.

AshamedCaptain1y ago

> Yet they chose an obscure problem in chip design

It is not obscure (in chip design). If anything it is one of the most easily reachable problems. Almost every other PhD student in the field has implemented a macro placer, even if just for fun, and there are frequent academic competitions. A lot of design houses also roll their own macro placers since it's not a difficult problem and generally adding a bit of knowledge of your design style can help you gain an extra % over the generic commercial tools.

It does not surprise me at all that they decided to start with this for their foray into chip EDA. It's the minimum effort route.

gabegobblegoldi1y ago

Sorry. I meant obscure relative to the large space of combinatorial optimization problems not just chip design.

Most design houses don’t write their own macro placers but customize commercial flows for their designs.

The problem with macro placement as an RL technology demonstrator is that to evaluate quality you need to go through large parts of the design flow which involves using other commercial tools. This makes it incredibly hard to evaluate superiority since all those steps and tools add noise.

Easier problems would have been to use RL to minimize the number of gates in a logic circuit or just focus on placement with half perimeter wirelength (I think this is what you mean with your grad student example). Essentially solving point problems in the design flow and evaluating quality improvements locally.

They evaluated quality globally and only globally and that destroys credibility in this business due to the noise involved unless you have lots of examples, can show statistical significance, and (unfortunately for the authors) also local improvements.

That’s what the follow on studies did and that’s why the community has lost faith in this particular algorithm.

1 more reply

AshamedCaptain1y ago

What is Google doing here? At best, the quality of their "computer chip design" work can be described as "controversial" https://spectrum.ieee.org/chip-design-controversy . What is there to gain by just making a PR now without doing anything new?

negativeonehalf1y ago

In the blog post, they announce MediaTek's widespread usage, the deployment in multiple generations of TPU with increasing performance each generation, Axion, etc.

Chips designed with the help of AlphaChip are in datacenters and Samsung phones, right now. That's pretty neat!

1 more reply

yeahwhatever101y ago

Why do they keep saying "superhuman"? Algorithms are used for these tasks, humans aren't laying out trillions of transistors by hand.

fph1y ago

My state-of-art bubblesort implementation is also superhuman at sorting numbers.

xanderlewis1y ago

Nice. Do you offer API access for a monthly fee?

1 more reply

HPsquared1y ago

Nice. Still true though! We are in the bubble sort era of AI.

kevindamm1y ago

When we get better quantum computers we can start using spaghetti sort.

jeffbee1y ago

This is floorplanning the blocks, not every feature. We are talking dozens to hundreds of blocks, not billions-trillions of gates and wires.

I assume that the human benchmark is a human using existing EDA tools, not a guy with a pocket protector and a roll of tape.

yeahwhatever101y ago

Floorplanning algorithms and solvers already exist https://limsk.ece.gatech.edu/course/ece6133/slides/floorplan...

1 more reply

thomasahle1y ago

Believe it or not, but there was a time where algorithms were worse than humans at layout out transistors. In particular at the higher level design decisions.

justsid1y ago

That’s somewhat still the case, humans could do a much better job at efficient layouting. The problem is that humans don’t scale as well, laying out billions of transistors is hard for humans. But computers can do it if you forego some efficiency by switching to standard cells and then throw compute at the problem.

epistasis1y ago

Google is good at many things, but perhaps their strongest skill is media positioning.

jonas211y ago

I feel like they're particularly bad at this, especially compared to other large companies.

pinewurst1y ago

Familiarity breeds contempt. They've been pushing the Google==Superhuman thing since the Internet Boom with declining efficacy.

1 more reply

lordswork1y ago

The media hates Google.

epistasis1y ago

It a love/hate relationship. Which benefits Google and the media greatly.

deelowe1y ago

I read the paper. Superhuman is a metric they defined in the paper which has to do with how long it takes a human to do certain tasks.

anna-gabriella1y ago

Does this make any sense, really? - Define some common words and then let the media run wild with them. How about we redefine "better" and "revolutionize"? Oh, wait, I think people are doing that already...

negativeonehalf1y ago

Prior to AlphaChip, macro placement was done manually by human engineers in any production setting. Prior algorithmic methods especially struggled to manage congestion, resulting in chips that weren't manufacturable.

AshamedCaptain1y ago

> macro placement was done manually by human engineers in any production setting

To quote certain popular TV series .... Sorry, are you from the past? Do your "production" chips only have a couple dozen macros or what?

jayd161y ago

"superhuman or comparable"

What nonsense! XD

Upvoter331y ago

To me, there is an underlying issue: why are so many DeepX papers being sent to Nature, instead of appropriate CS forums? If you are doing better work in chip design, send it to IPSD or ISCA or whatever, and then you will get the types of reviews needed for this work. I have no idea what Nature does with a paper like this.

negativeonehalf1y ago

Chips are the limiting factor for AI, and now we have AIs making chips better than human engineers. This feels like an infinite compute cheat code, or at least a way to get us very, very quickly to the physical optimum.

pptr1y ago

It's 6% shorter wire length. Hardly an infinite compute glitch.

negativeonehalf1y ago

6% is just the latest one - this is a real-deal engineering task in the chip design process, that an AI can do better than a human expert, and the gap is growing with time. I'm sure there's a limit, but we don't know what it is yet, especially as they hand over more of the chip design process to AI.

cobrabyte1y ago

I'd love a tool like this for PCB design/layout

onjectic1y ago

First thing my mind went to as well, I’m sure this is already being worked on, I think it would be more impactful than even this.

bgnn1y ago

why do you think that?

bittwiddle1y ago

Far more people / companies are designing PCBs than there are designing custom chips.

1 more reply

dsv3099i1y ago

https://www.quilter.ai/

dreamcompiler1y ago

Looks like this is only about placement. I wonder if it can be applied to routing?

amelius1y ago

Exactly what I was thinking.

Also: when is this coming to KiCad? :)

PS: It would also be nice to apply a similar algorithm to graph drawing (e.g. trying to optimize for human readability instead of electrical performance).

tdullien1y ago

The issue is that in order to optimize for human readability you'll need a huge number of human evaluations of graphs?

amelius1y ago

Maybe start with minimization of some metric based on number of edge crossings, edge lengths and edge bends?

ilaksh1y ago

How far are we from memory-based computing going from research into competitive products? I get the impression that we are already well passed the point where it makes sense to invest very aggressively to scale up experiments with things like memristors. Because they are talking about how many new nuclear reactors they are going to need just for the AI datacenters.

mikewarot1y ago

The cognitive mismatch between Von Neumann's folly and other compute architectures is vast. He slowed down the ENIAC by 66% when he got ahold of it.

We're in the timeline that took the wrong path. The other world has isolinear memory, which can be used for compute, or as memory, down to the LUT level. Everything runs at a consistent speed, and hardware faults LUTs can be routed around easily.

sroussey1y ago

The problem is that the competition (our current von neumann architecture) has billions of dollars of R&D per year invested.

Better architectures without the yearly investment train will no longer be better quite quickly.

You would need to be 100x to 1000x better in order to pull the investment train onto your tracks.

Don’t has been impossible for decades.

Even so, I think we will see such a change in my lifetime.

AI could be that use case that has a strong enough demand pull to make it happen.

We will see.

therealcamino1y ago

If you don't worry about the programming model, it's pretty easy to be way better than than existing methodologies in terms of pure compute.

But if you do pay attention to the programming model, they're unusable. You'll see that dozens of these approaches have come and gone, because it's impossible to write software for them.

sanxiyn1y ago

GPGPU is instructive. It is not easy, but possible to write software for it. That's why it succeeded.

ilaksh1y ago

I think it's just ignorance and timidity on the part of investors. Memristor or memory-computing startups are surely the next trend in investing within a few years.

I don't think it's necessarily demand or any particular calculation that makes things happen. I think people including investors are just herd animals. They aren't enthusiastic until they see the herd moving and then they want in.

foota1y ago

I don't think it's ignorant to not invest in something that has a decade long path towards even having a market, much less a large market.

1 more reply

HPsquared1y ago

And think of the embedded applications.

ninetyninenine1y ago

What occupation is there that is purely intellectual that has no chance of an AI ever progressing to a point where it can take it over?

Zamiel_Snawley1y ago

I think only sentimentality can prevent take over by a sufficiently competent AI.

I don’t want art that wasn’t made by a human, no matter how visually stunning or indistinguishable it is.

ninetyninenine1y ago

Discounting fraud... what if the AI produces something genuinely better. Genuinely moving you to tears? What then?

Imagine your favorite movie, the most moving book. You read it, it changed you, then you found out it was an AI that generated it in a mere 10 seconds.

Artificial sentimentality is useless in the face of reality. That human endeavor is simply data points along an multi-dimensional best fit curve.

Zamiel_Snawley1y ago

That’s a challenging hypothetical.

I think it would feel hollowed out, disingenuous.

It feels too close to being a rat with a dopamine button, meaningless hedonism.

I haven’t thought it through particularly thoroughly though, I’d been interested in hearing other opinions. These philosophical questions quickly approach unanswerable.

1 more reply

alexyz121y ago

anything that needs very real-time info. AI's will always be limited by us feeding them info, or them collecting it themselves. But humans can travel to more places than an AI can, until robots are everywhere too I suppose

mirchiseth1y ago

I must be old because first thing I thought reading AlphaChip was why is deepmind talking about chips in DEC Alpha :-) https://en.wikipedia.org/wiki/DEC_Alpha.

sedatk1y ago

I first used Windows NT on a PC with a DEC Alpha AXP CPU.

lamontcg1y ago

I miss Digital Unix, too (I don't really miss the "Tru64" rebrand...)

kQq9oHeAz6wLLS1y ago

Same!

mdtancsa1y ago

haha, same!

red75prime1y ago

I hope I'll still be alive when they'll announce AlephZero.

QuadrupleA1y ago

How good are TPUs in comparison with state of the art Nvidia datacenter GPUs, or Groq's ASICs? Per watt, per chip, total cost, etc.? Is there any published data?

jeffbee1y ago

MLPerf is a good place to start. The only problem is you don't have any verifiable information about TPU energy consumption. https://mlcommons.org/benchmarks/inference-datacenter/

wslh1y ago

I have some company notes from early 2024 which cannot be accurate but could help,

TPU v5e [1]: not available for purchase, only through GCP, storage=5B, LLM-Model=7B, efficiency=393TFLOP.

[1] https://cloud.google.com/tpu/docs/v5e

FrustratedMonky1y ago

So AI designing it's own chips. Now that is moving towards exponential growth. Like at the end of "Colossus" the movie.

Forget LLM's. What DeepMind is doing seems more like how an AI will rule, in the world. Building real world models, and applying game logic like winning.

LLM's will just be the text/voice interface to what DeepMind is building.

anna-gabriella1y ago

I can tell you get excited by SciFi, that's where Google's work belongs - people have been unable to reproduce it outside Google by a long shot.

FrustratedMonky1y ago

Alpha-GO was not sci-fi. And that was 2016

Protein Folding? That was against a defined data set and other organizations.

Nobody can re-produce? Isn't that the definition of a competitive advantage?

They are building something others can't, and that is bad? That is what companies do.

1 more reply

ur-whale1y ago

Seems to me the article is claiming a lot of things, but is very light on actual comparisons that matter to you and me, namely: how does one of those fabled AI-designed chop compare to their competition ?

For example, how much better are these latest gen TPU's when compared to NVidia's equivalent offering ?

gabegobblegoldi1y ago

Good question. I thought the tpus were a way for Google to apply pricing pressure to nvidia by having an alternative. They are not particularly better (it’s hard to get utilization), and I believe Google continues to be a big buyer of nvidia chips.

colesantiago1y ago

A marvellous achievement from DeepMind as usual, I am quite surprised that Google acquired them for a significant discount of $400M, when I would have expected it to be in the range of $20BN, but then again Deepmind wasn’t making any money back then.

dharma11y ago

it was very early. probably one of their all time best acquisitions in addition to YouTube.

Re:using RL and other types AI assistance for chip design, Nvidia and others are doing this too

sroussey1y ago

Applied Semantics for $100m which gave them their advertising business seems like their best deal.

hanwenn1y ago

don't forget Android.

loandbehold1y ago

Every generation of chips is used to design next generation. That seems to be the root of exponential growth in Moore's law.

AshamedCaptain1y ago

I'm only tangential to the area, but my impression over the decades is that what is going to happen is that, eventually, designing the next generation is going to require more resources than the current generation can provide, thereby putting a hard stop at the exponential growth stage.

I'd even dare to claim we are already at the point where the growth has stopped, but even then you will only see the effect in a decade or so as there are still many small low-hanging fruits you can fix, but no big improvements.

negativeonehalf1y ago

Definitely a big part of it. Chips enable better EDA tools, which enable better chips. First it was analytic solvers and simulated annealing, now ML. Exciting times!

bgnn1y ago

That's wrong. Chip design and Moore's law have nothing to do with each other.

smaddox1y ago

To clarify what the parent is getting at: Moore's law is an observation about the density (and, really about the cost) of transistors. So it's about the fabrication process, not about the logic design.

Practically speaking, though, maintaining Moore's law would have been economically prohibitive if circuit design and layout had not been automated.

1 more reply

bankcust083851y ago

Technology singularity is around the corner as soon as the chips (mostly) design themselves. There will be a few engineers, zillions of semiskilled maintenance people making a pittance, and most of the world will be underemployed or unemployed. Technical people better understand this and unionize or they will find themselves going the way of piano tuners and Russian physicists. Slow boiling frog...

amelius1y ago

Can this be abstracted and generalized into a more generally applicable optimization method?

kayson1y ago

I'm pretty sure Cadence and Synopsys have both released reinforcement-learning-based placing and floor planning tools. How do they compare...?

RicoElectrico1y ago

Synopsys tools can use ML, but not for the layout itself, rather tuning variables that go into the physical design flow.

> Synopsys DSO.ai autonomously explores multiple design spaces to optimize PPA metrics while minimizing tradeoffs for the target application. It uses AI to navigate the design-technology solution space by automatically adjusting or fine-tuning the inputs to the design (e.g., settings, constraints, process, flow, hierarchy, and library) to find the best PPA targets.

negativeonehalf1y ago

Unfortunately, commercial EDA companies generally have restrictive licensing agreements that prohibit direct public comparison.

Still, the fact that Google uses it for TPU is pretty telling - this is a multi-billion dollar, mission-critical chip design effort, and there's no way they'd make TPU worse just to prop up a research paper. MediaTek's production use is also a good indicator.

hulitu1y ago

They don't. You cannot compare reality (Cadence, Synopsys) with hype (Google).

pelorat1y ago

So you're basically saying that Google should have used existing tools to layout their chip designs, instead of their ML solution, and that these existing tools would have produced even better chips than the ones they are actually manufacturing?

hulitu1y ago

> So you're basically saying that Google should have used existing tools to layout their chip designs, instead of their ML solution

Did they tested their ML solution ? With real world chips ? Are there any "benchmarks" that show that their chip performs better ?

dsv3099i1y ago

It’s more like no one outside of Google has been able to reproduce Google’s results. And not for lack of trying. So if you’re outside of Google, at this moment, it’s vapor.

bachback1y ago

Deepmind is producing science vapourware while OpenAI is changing the world

2 more replies

idunnoman12221y ago

So one other designer plus Google is using alpha chip for their layouts? - not sure on that title, call me when amd and nvidia are using it

7e1y ago

Did it, though? Google’s chips still aren’t very good compared with competitors.

negativeonehalf1y ago

There's a lot of... passionate discussion in this thread, but we shouldn't lose sight of the big picture -- Google has used AlphaChip in multiple generations of TPU, their flagship AI accelerator. This is a multi-billion dollar project that is strategically critical for the success of the company. The idea that they're secretly making TPUs worse in order to prop up a research paper is just absurd. Google has even expanded their of AlphaChip use to other chips (e.g. Axion).

Meanwhile, MediaTek built on AlphaChip and is using it widely, and announced that it was used to help design Dimensity 5G (4nm technology node size).

I can understand that, when this open-source method first came out, there were some who were skeptical, but we are way beyond that now -- the evidence is just overwhelming.

I'm going to paste here the quotes from the bottom of the blog post, as it seems like a lot of people have missed them:

“AlphaChip’s groundbreaking AI approach revolutionizes a key phase of chip design. At MediaTek, we’ve been pioneering chip design’s floorplanning and macro placement by extending this technique in combination with the industry’s best practices. This paradigm shift not only enhances design efficiency, but also sets new benchmarks for effectiveness, propelling the industry towards future breakthroughs.” --SR Tsai, Senior Vice President of MediaTek

“AlphaChip has inspired an entirely new line of research on reinforcement learning for chip design, cutting across the design flow from logic synthesis to floor planning, timing optimization and beyond. While the details vary, key ideas in the paper including pretrained agents that help guide online search and graph network based circuit representations continue to influence the field, including my own work on RL for logic synthesis. If not already, this work is poised to be one of the landmark papers in machine learning for hardware design.” --Siddharth Garg, Professor of Electrical and Computer Engineering, NYU

"AlphaChip demonstrates the remarkable transformative potential of Reinforcement Learning (RL) in tackling one of the most complex hardware optimization challenges: chip floorplanning. This research not only extends the application of RL beyond its established success in game-playing scenarios to practical, high-impact industrial challenges, but also establishes a robust baseline environment for benchmarking future advancements at the intersection of AI and full-stack chip design. The work's long-term implications are far-reaching, illustrating how hard engineering tasks can be reframed as new avenues for AI-driven optimization in semiconductor technology." --Vijay Janapa Reddi, John L. Loeb Associate Professor of Engineering and Applied Sciences, Harvard University

“Reinforcement learning has profoundly influenced electronic design automation (EDA), particularly by addressing the challenge of data scarcity in AI-driven methods. Despite obstacles including delayed rewards and limited generalization, research has proven reinforcement learning's capability in complex electronic design automation tasks such as floorplanning. This seminal paper has become a cornerstone in reinforcement learning-electronic design automation research and is frequently cited, including in my own work that received the Best Paper Award at the 2023 ACM Design Automation Conference.” --Professor Sung-Kyu Lim, Georgia Institute of Technology

"There are two major forces that are playing a pivotal role in the modern era: semiconductor chip design and AI. This research charted a new path and demonstrated ideas that enabled the electronic design automation (EDA) community to see the power of AI and reinforcement learning for IC design. It has had a seminal impact in the field of AI for chip design and has been critical in influencing our thinking and efforts around establishing a major research conference like IEEE LLM-Aided Design (LAD) for discussion of such impactful ideas." --Ruchir Puri, Chief Scientist, IBM Research; IBM Fellow

1 more reply

DrNosferatu1y ago

Yet, their “frontier” LLM lags all the others…

abc-11y ago

Why aren’t they using this technique to design better transformer architectures or completely novel machine learning architectures in general? Are plain or mostly plain transformers really peak? I find that hard to believe.

jebarker1y ago

Because chip placement and the design of neural network architectures are entirely different problems, so this solution won't magically transfer from one to the other.

abc-11y ago

And AlphaGo is trained to play Go? The point is training a model through self play to build neural network architectures. If it can play Go and architect chip placements, I don’t see why it couldn’t be trained to build novel ML architectures.

jebarker1y ago

Sure, they could choose to work on that problem. But why do you think that's a more important/worthwhile problem than chip design or any other problem they might choose to work on? My point was that it's not trivial to make self-play for some other problem work, so given all the problems in the world why did you single our neural network architecture design? Especially since it's not the transformer architecture that is really holding back AI progress.

1 more reply

mikewarot1y ago

I understand the achievement, but can't square it with my belief that uniform systolic arrays will prove to be the best general purpose compute engine for neural networks. Those are almost trivial to route, by nature.

ilaksh1y ago

Isn't this already the case for large portions of GPUs? Like, many of the blocks would be systolic arrays?

I think the next step is arrays of memory-based compute.

mikewarot1y ago

Imagine a bit level systolic array. Just a sea of LUTs, with latches to allow the magic of graph coloring to remove all timing concerns by clocking everything in 2 phases.

GPUs still treat memory as separate from compute, they just have wider bottlenecks than CPUs.

j / k navigate · click thread line to collapse

194 comments

vighneshiyer1y ago

- The 2023 ISPD paper from a group at UCSD ("Assessment of Reinforcement Learning for Macro Placement"): https://vlsicad.ucsd.edu/Publications/Conferences/396/c396.p...

In full disclosure, I am quite skeptical of their work and wrote a detailed post on my website: https://vighneshiyer.com/misc/ml-for-placement/

negativeonehalf1y ago

FD: I have been following this whole thing for a while, and know personally a number of the people involved.

- The 2023 ISPD paper didn't pre-train at all. This means no learning from experience, for a learning-based algorithm. I feel like you can stop reading there.

Markov's paper just summarizes the other two.

(Incidentally, none of ISPD / MLcontra / Markov were peer reviewed - ISPD 2023 was an invited paper.)

isotypic1y ago

negativeonehalf1y ago

2 more replies

wegfawefgawefg1y ago

In reinforcement learning pre-training reduces peak performance. We can argue about this, but it is not a sufficiently strong point to stop reading from alone.

3170701y ago

Do you have a citation for this? I did my Phd on this topic 8 years ago, and I didn't completely follow the field after. I'm curious to learn more.

1 more reply

negativeonehalf1y ago

See this ISPD 2022 paper where the AlphaChip authors dive more into the value of pre-training (Figure 7, Figure 8): https://dl.acm.org/doi/pdf/10.1145/3505170.3511478

clickwiseorange1y ago

Oh, man... this is the same old stuff from the 2023 Anna Goldie statement (is this Anna Goldie's comment?). This was all addressed by Kahng in 2023 - no valid criticisms. Where do I start?

Kahng is the most prominent active researcher in this field. If anyone knows this stuff, it's Kahng. There were also five other authors in that paper, including another celebrated professor, Cheng.

The resources used by Google were huge. Cadence tools in Kahng's paper ran hundreds times faster and produced better results. That is as conclusive as it gets.

It doesn't take a Ph.D. to understand fair comparisons.

3 more replies

dogleg771y ago

Kahng's placer is open-source and was used in the Nature paper. It does not make sense to accuse Kahng of colluding with companies against open-source.

negativeonehalf1y ago

For a more thorough discussion on pre-training, see this ISPD 2022 paper by the AlphaChip people: https://dl.acm.org/doi/pdf/10.1145/3505170.3511478

2 more replies

porphyra1y ago

lacker1y ago

https://en.wikipedia.org/wiki/Top_Chess_Engine_Championship

So perhaps the critics had a point there.

2 more replies

Workaccount21y ago

jeffbee1y ago

It certainly looks like the criticism at the end of the rebuttal that DeepMind has abandoned their EDA efforts is a bit stale in this context.

anna-gabriella1y ago

2 more replies

jeffbee1y ago

clickwiseorange1y ago

smokel1y ago

This is what you get if you make academic researchers compete for citation counts.

Pretraining seems to be an important aspect here, and it makes sense that such pretraining requires good examples, which unfortunately for the free lunch people, is not available to the public.

That's what you get when you let big companies do fundamental research. Would it be better if the companies did not publish anything about their research at all?

It all feels a bit unproductive to attack one another.

1 more reply

gdiamos1y ago

Criticism is an important part of the scientific process.

Whichever approach ends up winning is improved by careful evaluation and replication of results

s-macke1y ago

nemonemo1y ago

What is your opinion of the addendum? I think the addendum and the pre-trained checkpoint are the substance of the announcement, and it is surprising to see little mention of those here.

dogleg771y ago

bsder1y ago

EDA claims in the digital domain are fairly easy to evaluate. Look at the picture of the layout.

When you see a chip that has the datapath identified and laid out properly by a computer algorithm, you've got something. If not, it's vapor.

So, if your layout still looks like a random rat's nest? Nope.

If even a random person can see that your layout actually follows the obvious symmetric patterns from bit 0 to bit 63, maybe you've got something worth looking at.

glitchc1y ago

lordswork1y ago

They must feel vindicated by their work turning out to be so fruitful now.

gabegobblegoldi1y ago

[1] https://www.theregister.com/AMP/2023/03/27/google_ai_chip_pa...

[2] https://regmedia.co.uk/2023/03/26/satrajit_vs_google.pdf

negativeonehalf1y ago

You are now using multiple new accounts based on the name of one of the authors (Anna Goldie) and her husband (Gabriel). First this one ('gabegobblegoldi'), and then 'anna-gabriella'.

I think it is time for you to take a deep breath and think about what you are doing and why.

Please read PG's post and ask yourself if it applies to you: https://www.paulgraham.com/fh.html

3 more replies

clickwiseorange1y ago

hinkley1y ago

therealcamino1y ago

EDA software has long allowed trading off power, delay, and area during optimization . But TSMC doesn't produce those tools, as far as I'm aware.

hinkley1y ago

https://www.tsmc.com/english/dedicatedFoundry/oip/eda_allian...

1 more reply

pfisherman1y ago

So again what does “better” mean in the context of this particular problem / task.

Drunk_Engineer1y ago

rossjudson1y ago

Like when Google wasted its time writing publicly about Spanner?

https://research.google/pubs/spanner-googles-globally-distri...

or Bigtable?

https://research.google/pubs/bigtable-a-distributed-storage-...

or GFS?

or MapReduce?

or Borg?

or...I think you get the idea.

thenoblesunfish1y ago

1 more reply

bushbaba1y ago

Spanner research paper was in 2012. Bigtable was in 2006. GFS 2003. The last decade has been a 'lost decade' of google. Not much innovation to be honest.

1 more reply

IshKebab1y ago

> Floorplanning/placement/synthesis is a billion dollar industry

Maybe all together, but I don't think automatic placement algorithms are a billion dollar industry. There's so much more to it than that.

1 more reply

negativeonehalf1y ago

The original paper reports P&R metrics (WNS, TNS, area, power, wirelength, horizontal congestion, vertical congestion) - https://www.nature.com/articles/s41586-021-03544-w

(no paywall): https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2021_2022/...

Drunk_Engineer1y ago

2 more replies

q3k1y ago

This is just floorplanning, which is a problem with fairly well defined quality metrics (max speed and chip area used).

Drunk_Engineer1y ago

thesz1y ago

Eurisco [1], if I remember correctly, was once used to perform placement-and-route task and was pretty good at it.

[1] https://en.wikipedia.org/wiki/Eurisko

Can AlphaChip's heuistics be used anywhere else?

gabegobblegoldi1y ago

AshamedCaptain1y ago

> Yet they chose an obscure problem in chip design

It does not surprise me at all that they decided to start with this for their foray into chip EDA. It's the minimum effort route.

gabegobblegoldi1y ago

Sorry. I meant obscure relative to the large space of combinatorial optimization problems not just chip design.

Most design houses don’t write their own macro placers but customize commercial flows for their designs.

That’s what the follow on studies did and that’s why the community has lost faith in this particular algorithm.

1 more reply

AshamedCaptain1y ago

negativeonehalf1y ago

In the blog post, they announce MediaTek's widespread usage, the deployment in multiple generations of TPU with increasing performance each generation, Axion, etc.

Chips designed with the help of AlphaChip are in datacenters and Samsung phones, right now. That's pretty neat!

1 more reply

yeahwhatever101y ago

Why do they keep saying "superhuman"? Algorithms are used for these tasks, humans aren't laying out trillions of transistors by hand.

fph1y ago

My state-of-art bubblesort implementation is also superhuman at sorting numbers.

xanderlewis1y ago

Nice. Do you offer API access for a monthly fee?

1 more reply

HPsquared1y ago

Nice. Still true though! We are in the bubble sort era of AI.

kevindamm1y ago

When we get better quantum computers we can start using spaghetti sort.

jeffbee1y ago

This is floorplanning the blocks, not every feature. We are talking dozens to hundreds of blocks, not billions-trillions of gates and wires.

I assume that the human benchmark is a human using existing EDA tools, not a guy with a pocket protector and a roll of tape.

yeahwhatever101y ago

Floorplanning algorithms and solvers already exist https://limsk.ece.gatech.edu/course/ece6133/slides/floorplan...

1 more reply

thomasahle1y ago

Believe it or not, but there was a time where algorithms were worse than humans at layout out transistors. In particular at the higher level design decisions.

justsid1y ago

epistasis1y ago

Google is good at many things, but perhaps their strongest skill is media positioning.

jonas211y ago

I feel like they're particularly bad at this, especially compared to other large companies.

pinewurst1y ago

Familiarity breeds contempt. They've been pushing the Google==Superhuman thing since the Internet Boom with declining efficacy.

1 more reply

lordswork1y ago

The media hates Google.

epistasis1y ago

It a love/hate relationship. Which benefits Google and the media greatly.

deelowe1y ago

I read the paper. Superhuman is a metric they defined in the paper which has to do with how long it takes a human to do certain tasks.

anna-gabriella1y ago

negativeonehalf1y ago

AshamedCaptain1y ago

> macro placement was done manually by human engineers in any production setting

To quote certain popular TV series .... Sorry, are you from the past? Do your "production" chips only have a couple dozen macros or what?

jayd161y ago

"superhuman or comparable"

What nonsense! XD

Upvoter331y ago

negativeonehalf1y ago

pptr1y ago

It's 6% shorter wire length. Hardly an infinite compute glitch.

negativeonehalf1y ago

cobrabyte1y ago

I'd love a tool like this for PCB design/layout

onjectic1y ago

First thing my mind went to as well, I’m sure this is already being worked on, I think it would be more impactful than even this.

bgnn1y ago

why do you think that?

bittwiddle1y ago

Far more people / companies are designing PCBs than there are designing custom chips.

1 more reply

dsv3099i1y ago

https://www.quilter.ai/

dreamcompiler1y ago

Looks like this is only about placement. I wonder if it can be applied to routing?

amelius1y ago

Exactly what I was thinking.

Also: when is this coming to KiCad? :)

PS: It would also be nice to apply a similar algorithm to graph drawing (e.g. trying to optimize for human readability instead of electrical performance).

tdullien1y ago

The issue is that in order to optimize for human readability you'll need a huge number of human evaluations of graphs?

amelius1y ago

Maybe start with minimization of some metric based on number of edge crossings, edge lengths and edge bends?

ilaksh1y ago

mikewarot1y ago

The cognitive mismatch between Von Neumann's folly and other compute architectures is vast. He slowed down the ENIAC by 66% when he got ahold of it.

sroussey1y ago

The problem is that the competition (our current von neumann architecture) has billions of dollars of R&D per year invested.

Better architectures without the yearly investment train will no longer be better quite quickly.

You would need to be 100x to 1000x better in order to pull the investment train onto your tracks.

Don’t has been impossible for decades.

Even so, I think we will see such a change in my lifetime.

AI could be that use case that has a strong enough demand pull to make it happen.

We will see.

therealcamino1y ago

If you don't worry about the programming model, it's pretty easy to be way better than than existing methodologies in terms of pure compute.

But if you do pay attention to the programming model, they're unusable. You'll see that dozens of these approaches have come and gone, because it's impossible to write software for them.

sanxiyn1y ago

GPGPU is instructive. It is not easy, but possible to write software for it. That's why it succeeded.

ilaksh1y ago

I think it's just ignorance and timidity on the part of investors. Memristor or memory-computing startups are surely the next trend in investing within a few years.

foota1y ago

I don't think it's ignorant to not invest in something that has a decade long path towards even having a market, much less a large market.

1 more reply

HPsquared1y ago

And think of the embedded applications.

ninetyninenine1y ago

What occupation is there that is purely intellectual that has no chance of an AI ever progressing to a point where it can take it over?

Zamiel_Snawley1y ago

I think only sentimentality can prevent take over by a sufficiently competent AI.

I don’t want art that wasn’t made by a human, no matter how visually stunning or indistinguishable it is.

ninetyninenine1y ago

Discounting fraud... what if the AI produces something genuinely better. Genuinely moving you to tears? What then?

Imagine your favorite movie, the most moving book. You read it, it changed you, then you found out it was an AI that generated it in a mere 10 seconds.

Artificial sentimentality is useless in the face of reality. That human endeavor is simply data points along an multi-dimensional best fit curve.

Zamiel_Snawley1y ago

That’s a challenging hypothetical.

I think it would feel hollowed out, disingenuous.

It feels too close to being a rat with a dopamine button, meaningless hedonism.

I haven’t thought it through particularly thoroughly though, I’d been interested in hearing other opinions. These philosophical questions quickly approach unanswerable.

1 more reply

alexyz121y ago

mirchiseth1y ago

I must be old because first thing I thought reading AlphaChip was why is deepmind talking about chips in DEC Alpha :-) https://en.wikipedia.org/wiki/DEC_Alpha.

sedatk1y ago

I first used Windows NT on a PC with a DEC Alpha AXP CPU.

lamontcg1y ago

I miss Digital Unix, too (I don't really miss the "Tru64" rebrand...)

kQq9oHeAz6wLLS1y ago

Same!

mdtancsa1y ago

haha, same!

red75prime1y ago

I hope I'll still be alive when they'll announce AlephZero.

QuadrupleA1y ago

How good are TPUs in comparison with state of the art Nvidia datacenter GPUs, or Groq's ASICs? Per watt, per chip, total cost, etc.? Is there any published data?

jeffbee1y ago

MLPerf is a good place to start. The only problem is you don't have any verifiable information about TPU energy consumption. https://mlcommons.org/benchmarks/inference-datacenter/

wslh1y ago

I have some company notes from early 2024 which cannot be accurate but could help,

TPU v5e [1]: not available for purchase, only through GCP, storage=5B, LLM-Model=7B, efficiency=393TFLOP.

[1] https://cloud.google.com/tpu/docs/v5e

FrustratedMonky1y ago

So AI designing it's own chips. Now that is moving towards exponential growth. Like at the end of "Colossus" the movie.

Forget LLM's. What DeepMind is doing seems more like how an AI will rule, in the world. Building real world models, and applying game logic like winning.

LLM's will just be the text/voice interface to what DeepMind is building.

anna-gabriella1y ago

I can tell you get excited by SciFi, that's where Google's work belongs - people have been unable to reproduce it outside Google by a long shot.

FrustratedMonky1y ago

Alpha-GO was not sci-fi. And that was 2016

Protein Folding? That was against a defined data set and other organizations.

Nobody can re-produce? Isn't that the definition of a competitive advantage?

They are building something others can't, and that is bad? That is what companies do.

1 more reply

ur-whale1y ago

For example, how much better are these latest gen TPU's when compared to NVidia's equivalent offering ?

gabegobblegoldi1y ago

colesantiago1y ago

dharma11y ago

it was very early. probably one of their all time best acquisitions in addition to YouTube.

Re:using RL and other types AI assistance for chip design, Nvidia and others are doing this too

sroussey1y ago

Applied Semantics for $100m which gave them their advertising business seems like their best deal.

hanwenn1y ago

don't forget Android.

loandbehold1y ago

Every generation of chips is used to design next generation. That seems to be the root of exponential growth in Moore's law.

AshamedCaptain1y ago

negativeonehalf1y ago

Definitely a big part of it. Chips enable better EDA tools, which enable better chips. First it was analytic solvers and simulated annealing, now ML. Exciting times!

bgnn1y ago

That's wrong. Chip design and Moore's law have nothing to do with each other.

smaddox1y ago

Practically speaking, though, maintaining Moore's law would have been economically prohibitive if circuit design and layout had not been automated.

1 more reply

bankcust083851y ago

amelius1y ago

Can this be abstracted and generalized into a more generally applicable optimization method?

kayson1y ago

I'm pretty sure Cadence and Synopsys have both released reinforcement-learning-based placing and floor planning tools. How do they compare...?

RicoElectrico1y ago

Synopsys tools can use ML, but not for the layout itself, rather tuning variables that go into the physical design flow.

negativeonehalf1y ago

Unfortunately, commercial EDA companies generally have restrictive licensing agreements that prohibit direct public comparison.

hulitu1y ago

They don't. You cannot compare reality (Cadence, Synopsys) with hype (Google).

pelorat1y ago

hulitu1y ago

> So you're basically saying that Google should have used existing tools to layout their chip designs, instead of their ML solution

Did they tested their ML solution ? With real world chips ? Are there any "benchmarks" that show that their chip performs better ?

dsv3099i1y ago

It’s more like no one outside of Google has been able to reproduce Google’s results. And not for lack of trying. So if you’re outside of Google, at this moment, it’s vapor.

bachback1y ago

Deepmind is producing science vapourware while OpenAI is changing the world

2 more replies

idunnoman12221y ago

So one other designer plus Google is using alpha chip for their layouts? - not sure on that title, call me when amd and nvidia are using it

7e1y ago

Did it, though? Google’s chips still aren’t very good compared with competitors.

negativeonehalf1y ago

Meanwhile, MediaTek built on AlphaChip and is using it widely, and announced that it was used to help design Dimensity 5G (4nm technology node size).

I can understand that, when this open-source method first came out, there were some who were skeptical, but we are way beyond that now -- the evidence is just overwhelming.

I'm going to paste here the quotes from the bottom of the blog post, as it seems like a lot of people have missed them:

1 more reply

DrNosferatu1y ago

Yet, their “frontier” LLM lags all the others…

abc-11y ago

jebarker1y ago

Because chip placement and the design of neural network architectures are entirely different problems, so this solution won't magically transfer from one to the other.

abc-11y ago

jebarker1y ago

1 more reply

mikewarot1y ago

ilaksh1y ago

Isn't this already the case for large portions of GPUs? Like, many of the blocks would be systolic arrays?

I think the next step is arrays of memory-based compute.

mikewarot1y ago

Imagine a bit level systolic array. Just a sea of LUTs, with latches to allow the magic of graph coloring to remove all timing concerns by clocking everything in 2 phases.

GPUs still treat memory as separate from compute, they just have wider bottlenecks than CPUs.

j / k navigate · click thread line to collapse