Edit: Better at chain of thought, long-running agentic tasks, and following rigid directions.
Figures that any article written on LLM limits is immediately out of date. I'll write an update piece to summarize new findings.
It's very hard to evaluate whether one model is better than another; doing it in a scientifically sound way is especially time-consuming and hard.
This is why I find comments like "model X is so much better than model Y" to be about as useful as "chocolate ice cream is so much better than vanilla".
I'm no expert in the matter, but for "holistic" things (where there are a lot of cross-connections and inter-dependencies) it feels like a diffusion-based generative structure would be better-suited than next-token-prediction. I've felt this way about poetry-generation, and I feel like it might apply in these sorts of cases as well.
Additionally, this is a highly-specialized field. From the conclusion of the article:
> Overall we have some promising directions. Using LLMs for circuit board design looks a lot like using them for other complex tasks. They work well for pulling concrete data out of human-shaped data sources, they can do slightly more difficult tasks if they can solve that task by writing code, but eventually their capabilities break down in domains too far out of the training distribution.
> We only tested the frontier models in this work, but I predict similar results from the open-source Llama or Mistral models. Some fine tuning on netlist creation would likely make the generation capabilities more useful.
I agree with the authors here.
While it's nice to imagine that AGI would be able to generalize skills to work competently in domain-specific tasks, I think this shows very clearly that we're not there yet, and if one wants to use LLMs in such an area, one would need to fine-tune for it. I'd like to see a round 2 of this using a fine-tuning approach.
But I think there's also a bitter lesson to be learned here: when people say LLMs won't do well on a task, they are often surprised, either immediately or a few months later.
Overall not sure what to expect, but fine tuning experiments would be interesting regardless.
I have my own library of nuances, but how would you even fine-tune anything to understand the black-box abstraction of an IC well enough to work out whether a nuance applies between it and a load, or what a transmission line or edge would look like between the IC and the load?
This is where understanding trumps generative AI instantly.
Heh. This is very true. I think perhaps the thing I'm most amazed by is that simple next-token prediction seems to work unreasonably well for a great many tasks.
I just don't know how well that will scale into more complex tasks. With simple next-token prediction there is little mechanism for the model to iterate or to revise or refine as it goes.
There have been some experiments with things like speculative generation (where multiple branches are evaluated in parallel) to give a bit of a lookahead effect and help avoid the LLM locking itself into dead-ends, but they don't seem super popular overall -- people just prefer to increase the power and accuracy of the base model and keep chugging forward.
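The branching idea can be sketched in a few lines. This is a toy: the hard-coded `SCORES` table is a hypothetical stand-in for an LLM's next-token probabilities, and the point is only to show how keeping several branches alive avoids the greedy dead-end.

```python
import heapq

# Hypothetical hard-coded "model": next-token probabilities per prefix.
# In a real system these would come from an LLM's output distribution.
SCORES = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {"cat": 0.5, "dog": 0.5},
    ("a",): {"cat": 0.9, "dog": 0.1},
    ("the", "cat"): {"sat": 1.0},
    ("the", "dog"): {"ran": 1.0},
    ("a", "cat"): {"sat": 1.0},
    ("a", "dog"): {"ran": 1.0},
}

def lookahead_decode(steps=3, beam=2):
    """Keep the `beam` best branches alive at each step instead of
    greedily committing to the single most likely next token."""
    beams = [(1.0, ())]  # (cumulative probability, prefix)
    for _ in range(steps):
        candidates = []
        for score, prefix in beams:
            for tok, p in SCORES.get(prefix, {}).items():
                candidates.append((score * p, prefix + (tok,)))
        if not candidates:
            break
        beams = heapq.nlargest(beam, candidates)
    return beams[0][1]

# Greedy decoding commits to "the" (p=0.6) and ends at probability 0.30;
# keeping two branches alive finds "a cat sat" at probability 0.36.
```

Greedy decoding locks itself into the locally-best "the" and never recovers; the lookahead finds the globally better continuation.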
I can't help feeling like a fundamental shift toward something more akin to a diffusion-based approach would be helpful for such things. I just want some sort of mechanism where the model can "think" longer about harder problems. If you present a simple chess board or a complex board to an LLM and ask it to generate the next move, it always responds in the same amount of time. That alone should tell us that LLMs are not intelligent, that they are not "thinking", and that they will be insufficient for this going forward.
I believe Yann LeCun is right -- simply scaling LLMs is not going to get us to AGI. We need a fundamental structural shift to something new, but until we stop seeing such insane advancements in the quality of generation with LLMs (looking at you, Claude!!), I don't think we will move beyond. We have to get bored with LLMs first.
There is one posted on HN every week. How many more do we need before we accept that this tech is not what it's sold as, and that we're bored waiting for it to get good? I'm not saying "get better", because it keeps getting better, but somehow it doesn't get good.
It's frustrating because it's infantilizing, it derails the potential of an interesting technical discussion (e.g., here, diffusion), and it misses the mark substantially.
At the end of the day, it's useful in a thousand ways day to day, and the vast majority of people feel this way. The only people I see vehemently arguing the opposite seem to assume only things with 0 error rate are useful or are upset about money in some form.
But is that really it? I'm all ears. I'm on a 5-hour flight. I'm genuinely unclear on what's going on that leads people to take this absolutist position, that they're waiting for ??? to admit ??? about LLMs.
Yes, the prose machine didn't nail circuit design, but that doesn't mean whatever "They" you're imagining needs to give up and accept ???
Soon everything you see and hear will be built up through a myriad of AI models and pipelines.
If you are interested, I highly recommend this + your favorite LLM. It does not do everything, but it is far superior to some highly expensive tools in flexibility and repeatability. https://github.com/devbisme/skidl
One thing I've been personally really intrigued by is the possibility of using self-play and adversarial learning as a way to advance beyond our current stage of imitation-only LLMs.
Having a strong rules-based framework for measuring the quality and correctness of solutions is necessary for any RL training setup. I think skidl could be a really nice framework to be part of an RL-trained LLM's curriculum!
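A minimal sketch of what such a rules-based scorer could look like. The netlist format and rule set here are made up for illustration; a real setup would run an actual electrical rules check or a SPICE simulation instead.

```python
def connectivity_reward(netlist, required_pairs):
    """Score a candidate design against hard connectivity rules.

    netlist: dict mapping net name -> set of pins, e.g. {"VCC": {"U1.8", "C1.1"}}
    required_pairs: list of (pin_a, pin_b) pairs that must share a net.
    Returns the fraction of required connections satisfied, in [0, 1].
    """
    if not required_pairs:
        return 1.0
    satisfied = sum(
        1 for a, b in required_pairs
        if any(a in pins and b in pins for pins in netlist.values())
    )
    return satisfied / len(required_pairs)

# A design that ties the decoupling cap to power and ground but forgets
# a feedback resistor scores 2/3 -- a graded signal an RL loop can climb.
design = {
    "VCC": {"U1.8", "C1.1"},
    "GND": {"U1.4", "C1.2"},
}
rules = [("U1.8", "C1.1"), ("U1.4", "C1.2"), ("U1.1", "R1.2")]
```

A graded reward like this (rather than pass/fail) is what lets an RL curriculum make incremental progress on partially-correct designs.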
I've written down a bunch of thoughts [1] on using games or code-generation in an adversarial training setup, but I could see circuit design being a good training ground as well!
As for the topic: it is impossible to synthesize STEM things other than the way an engineer does it. I mean, thou shalt know some typical solutions and have calculations for everything that's happening in the schematic being developed.
Textbooks are not a joke, no matter who you are: a human or a device.
Yes, as well as dealing with a variable-length window.
When generating images with diffusion, one specifies the image size ahead of time. When generating text with diffusion, it's a bit more open-ended. How long do we want this paragraph to go? Well, that depends on what goes into it -- so how do we adjust for that? Do we use a hierarchical tree-structure approach? Chunk it and do a chain of overlapping fixed-length segments (possibly combined with a transformer model)?
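To make the fixed-length issue concrete, here's a toy discrete-diffusion-style decoder (the mechanics are entirely hypothetical): it starts from an all-MASK canvas whose length must be chosen up front, and reveals positions over several passes, where each fill can look at the whole canvas rather than just the prefix.

```python
import random

def iterative_unmask(length, fill_fn, passes=4, seed=0):
    """Toy diffusion-style text decoder.

    Start from a fixed-length canvas of MASK tokens and reveal a chunk
    of positions on each pass. `fill_fn(canvas, i)` proposes a token for
    position i given the entire current canvas -- unlike next-token
    prediction, it can condition on both left and right context.
    The catch: `length` must be committed to before generation starts.
    """
    rng = random.Random(seed)
    canvas = ["<MASK>"] * length
    masked = list(range(length))
    for p in range(passes):
        rng.shuffle(masked)
        k = max(1, len(masked) // (passes - p))  # reveal a chunk per pass
        for i in masked[:k]:
            canvas[i] = fill_fn(canvas, i)
        masked = masked[k:]
        if not masked:
            break
    return canvas
```

The `fill_fn` here is a placeholder for a learned denoiser; the point is structural: every open question in the comment above (how long? hierarchical? overlapping chunks?) is about how to escape that fixed `length` parameter.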
Hard to say what would finally work in the end, but I think this is the sort of thing that YLC is talking about when he encourages students to look beyond LLMs. [1]
I cannot help but think there are some similarities between large model generative AI and human reasoning abilities.
For example, if I ask a physician with a really high IQ some general questions about, say, fixing the shocks on my minivan, he may have some better ideas than me.
However, he may be wrong, since he specialized in medicine, although he may have provided some good overall info.
Let's take a lower-IQ mechanic who has worked as a mechanic for 15 years. Despite having a lower IQ and less overall knowledge of general topics, he gives a much better answer about fixing my shocks.
So with LLMs, fine-tuning looks to be key, as it is with human beings: large data sets that are filtered and summarized with specific fields as the focus.
> The AI generated circuit was three times the cost and size of the design created by that expert engineer at TI. It is also missing many of the necessary connections.
Exactly what I expected.
Edit: to clarify this is even below the expectations of a junior EE who had a heavy weekend on the vodka.
- https://www.damninteresting.com/on-the-origin-of-circuits/
- https://www.sciencedirect.com/science/article/abs/pii/S03784...
It's a distinction I fear many people will have trouble keeping in-mind, faced with the misleading eloquence of LLM output.
What natural language processing does is just make a much smarter (and dumber, in many ways) parser that can make an attempt to infer the intent, as well as be instructed how to recover from mistakes.
Personally, I'm a skeptic, since I've seen some hilariously bad hallucinations in generated code (and unlike a human engineer, who will say "idk but I think this might work", an LLM says "yessir, this is the solution!"). If you have to double-check every output manually, it's not that much better than learning yourself. However, at least with programming tasks, LLMs are fantastic at giving wrong answers with the right vocabulary -- which makes it possible to check and find a solution through authoritative sources and references instead of blindly analyzing a problem or paying a human a lot of money to tell you the answer to your query.
For example, I don't use LLMs to give me answers. I use them to help explore a design space, particularly by giving me the vocabulary to ask better questions. And that's the real value of a conversational model today.
The AI is happy because the circuit worked for the first 10 ns of the cycle.
Agree with OP that the raw models aren't that useful for schematic/pcb design.
It's why we built Flux from the ground up to provide the models with the right context. The models are great moderators but poor sources of knowledge.
Here are some great use cases:
https://www.youtube.com/watch?v=XdH075ClrYk
https://www.youtube.com/watch?v=J0CHG_fPxzw&t=276s
https://www.youtube.com/watch?v=iGJOzVf0o7o&t=2s
and here's a great example of leveraging AI to go from idea to full design: https://x.com/BuildWithFlux/status/1804219703264706578
It kind of grosses me out that we are entering a world where programming could be just testing (to me) random permutations of programs for correctness.
Most people are wrong when they say AI won't be able to do this soon. Just as you can't expect an AI to generate a website in assembly but CAN expect it to generate one with React/Tailwind, you can't expect an AI to generate circuits without strong functional blocks to work with.
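The "functional blocks" framing can be sketched in a few lines. This is a hypothetical representation, not any real tool's API: primitives compose into higher-level blocks much like React components, so a model would only have to wire blocks together instead of raw pins.

```python
def resistor(name, value):
    """Primitive block: a component record plus its two pin names."""
    return {"ref": name, "type": "R", "value": value,
            "pins": [f"{name}.1", f"{name}.2"]}

def voltage_divider(name, r_top, r_bottom):
    """Compose primitives into a higher-level block.

    Returns (components, nets), where nets maps each exposed net name
    to the list of pins connected to it. A model generating at this
    level of abstraction never touches individual pin numbering.
    """
    rt = resistor(f"{name}_RT", r_top)
    rb = resistor(f"{name}_RB", r_bottom)
    nets = {
        f"{name}.IN":  [rt["pins"][0]],
        f"{name}.OUT": [rt["pins"][1], rb["pins"][0]],
        f"{name}.GND": [rb["pins"][1]],
    }
    return [rt, rb], nets
```

The divider exposes only IN/OUT/GND; the internal pin-level wiring is fixed and correct by construction, which is exactly what "strong functional blocks" buys you.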
Great work from the author studying existing solutions/models- I'll post some of my findings soon as well! The more you play with it, the more inevitable it feels!
Can you? Because last time I tried (probably around February), it still wasn't a thing.
The industry does not like sharing, and the openly available datasets are full of mistakes. As a junior EE you learn quite quickly to never trust third-party symbols and footprints - if you can find them at all. Even when they come directly from the manufacturer there's a decent chance they don't 100% agree with the datasheet PDF. And good luck if that datasheet is locked behind a NDA!
If we can't even get basic stuff like that done properly, I don't think we can reasonably expect manufacturers to provide ready-to-use "building blocks" any time soon. It would require the manufacturers to invest a lot of engineer-hours into manually writing those, for essentially zero gain to them. After all, the information is already available to customers via the datasheet...
Are you able to accomplish this with prompt-engineering, or are you doing fine-tuning of LLMs / custom-trained models?
I don't know how feasible it is. This would probably take low millions of dollars of training, data collection, and research to get results that aren't trash.
I'd certainly love it for trying to diagnose circuits.
It's probably not really possible even at higher-end consumer-grade 1200 dpi.
And the devices, in this case, bluetooth aux transceivers, they all do the same things. They've even more or less converged on all being 3 buttons. When optimizing for cost reduction with the commodity chips that everyone is using to do the same things, the manufacturer variation isn't that vast.
In the same way you can get 3d models from 2d photos because you can identify the object based on a database of samples and then guess the 3d contours, the hypothesis to test is whether with enough scans and schematics, a sufficiently large statistical model will be good enough to make decent guesses.
If you've got, say, 40 devices with 80% of the same chips doing the same things for the same purpose, a 41st device might have lots of guessable things that you can't necessarily capture on a cheap flatbed.
This will probably work, but it's a couple million dollars away from becoming a reality. There are shortcuts that might make this a couple-$100,000s project (essentially data contracts with bespoke chip printers), but I'd have to make those connections. And even then, it's just a hobbyist product. The chances of recouping that investment are probably zero, although the tech would certainly be cool and useful. Just not "I'll pay you money" level useful.
They are already far ahead of many others with respect to next generation EE CAD.
Judicious application of AI would be a big win for them.
Edit: adding "TL;DRN'T" to my vocabulary XD
Adding Skynetn't to company charter...
"If we make a really really good specialty text-prediction engine, it could be able to productively mimic an imaginary general AI, and if it can do that then it can productively mimic other specialty AIs, because it's all just intelligence, right?"
Few really understand what the limits of the tech are, or whether it will even unlock the use cases for which it is being touted.
TLDR: We test LLMs to figure out how helpful they are for designing a circuit board. We focus on the utility of frontier models (GPT-4o, Claude 3 Opus, Gemini 1.5) across a set of design tasks to find where they are and are not useful. They look pretty good for building skills, writing code, and getting useful data out of datasheets.
TLDRN'T: We do not explore any proprietary copilots, or how to apply things like a diffusion model to the place-and-route problem.
* Failed to properly understand and respond to the requirements for component selection, which were already pretty generic.
* Succeeded in parsing the pinout for an IC but produced an incomplete footprint with incorrect dimensions.
* Added extra components to a parsed reference schematic.
* Produced very basic errors in a description of filter topologies and chose the wrong one given the requirements.
* Generated utterly broken schematics for several simple circuits, with missing connections and aggressively-incorrect placement of decoupling capacitors.
Any one of these failures, individually, would break the entire design. The article's conclusion for this section buries the lede slightly:
> The AI generated circuit was three times the cost and size of the design created by that expert engineer at TI. It is also missing many of the necessary connections.
Cost and size are irrelevant if the design doesn't work. LLMs aren't a third as good as a human at this task, they just fail.
The LLMs do much better converting high-level requirements into (very) high-level source code. This makes sense (it's fundamentally a language task), but it also isn't very useful. Turning "I need an inverting amplifier with a gain of 20" into "amp = inverting_amplifier('amp1', gain=-20.0)" is pretty trivial.
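For a sense of how trivial that mapping is, the entire "design" behind such a call could be a hypothetical helper like this (mirroring the `inverting_amplifier` call above; not the article's actual implementation):

```python
def inverting_amplifier(name, gain, r_in=1_000.0):
    """Pick resistor values for an ideal op-amp inverting stage.

    Gain = -Rf / Rin, so Rf = |gain| * Rin. Everything hard about the
    amplifier -- op-amp selection, bandwidth, stability, tolerances,
    supply constraints -- is left unaddressed.
    """
    if gain >= 0:
        raise ValueError("an inverting stage has negative gain")
    return {"name": name, "R_in": r_in, "R_f": abs(gain) * r_in}
```

With `gain=-20.0` and the default 1 kΩ input resistor this picks a 20 kΩ feedback resistor, which is the part a novice could do by hand; the part that matters is everything the helper ignores.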
The fact that LLMs apparently perform better if you literally offer them a cookie is, uh... something.
But the bottom line is that it's a task that a novice could have solved with a Google search or two, and the LLM fumbled it in ways that'd be difficult for a non-expert to spot and rectify. LLMs are generally pretty good at information retrieval, so it's quite disappointing.
The cookie thing... well, they learn statistical patterns. People on the internet often try harder if there is a quid-pro-quo, so the LLMs copy that, and it slips past RLHF because "performs as well with or without a cookie" is probably not one of the things they optimize for.
The number of times I've had to entirely redo a circuit because of one misplaced connection, yeah, none of those circuits worked for any price before I fixed every single error.
I think Gemini could definitely do that microphone study. Good test case! I remember spending 8 hours on DigiKey in the bad old times, looking for an audio jack that was 0.5mm shorter.