An FPGA Is an Impoverished Accelerator

(homes.cs.washington.edu)

75 points · samps · 11y ago · 39 comments

bsder · 11y ago

What a useless article.

The big problem is software people thinking that they have any concept of actual hardware design.

If they understood hardware, they would understand that an FPGA is the least efficient way to accomplish anything.

Routing is sparser than any chip. You burn 10-100x the transistors to do the same task. FPGAs are hot and slow.

Even for signal processing, an FPGA is going to be quite hard pressed to beat a 2.0GHz ARM with NEON extensions unless it is very expensive and your algorithm is very dataflow oriented. How many ARMs can I put on a board for $10,000-$100,000 (the very highest end FPGAs)?

You use an FPGA because you have a low-volume application that you can't do any other way, and your application has enough margin that you can eat the cost of the FPGA. And you are always looking to wipe out that FPGA and replace it with a microprocessor because it is so much cheaper and easier to deal with.

analognoise · 11y ago

"FPGA is the least efficient way to accomplish anything" Define 'efficient'. If you're talking about cost, it costs far less than an ASIC below a certain volume, and it certainly costs less in tooling and development.

"FPGAs are hot and slow" - compared to a full custom IC? Sure. FPGAs improve with every process generation (like all silicon devices) and an ASIC design won't intrinsically take advantage of those advances; an FPGA design that didn't meet its power or thermal envelope 5 years ago might easily do so now, without incurring the NRE of the ASIC - add to that the fact that the first stage of the ASIC design can be prototyped via the FPGA, and you have a viable product without the risk of a bad ASIC.

I've been involved in converting several Virtex-2 designs to newer devices - the huge reduction in power and increase in available logic has led to some extremely impressive gains. There is work to do in such a conversion, but it is understood work - there's no real mystery to updating the CoreGen components.

Agree it is a useless article though, because digital logic design is not programming (it is architectural work). There is no 'abstracting' that away - all attempts thus far have failed miserably (Vivado HLS, for example, turns out designs that work but are HUGE compared to what even a passable designer can do).

aninhumer · 11y ago

>There is no 'abstracting' that away

While hardware design often has awkward constraints that make generalised abstractions tricky, there is still a lot that can be done to improve over Verilog or VHDL. I've been working in Bluespec for a couple of years now, and the difference is night and day. Having a modern type system in our HDL makes experimentation and iteration so much easier.

minthd · 11y ago

You can still abstract lots of it. For example, Chisel[1] does so through high-level abstractions and parametrized generators, while still offering results equal to Verilog. It's also used for the open source RISC-V architecture[4], the recent highly efficient parallel CPU[2] done by a small start-up (2 guys), and a floating-point ALU generator[3] that explores the design space of FPUs and finds optimal designs.

And reading about these projects, one gets the impression Chisel was critical for them.

[1] https://chisel.eecs.berkeley.edu/chisel-dac2012.pdf

[2] http://www.eetimes.com/document.asp?doc_id=1324759

[3] http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=654588...

[4] http://www.eetimes.com/author.asp?doc_id=1323406
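
For anyone unfamiliar with the "parametrized generator" idea, here is a rough sketch of the concept. This is not Chisel (which is embedded in Scala) and not anything from the linked papers; it is just a plain Python function, with the hypothetical name `shift_register_verilog`, that takes parameters and emits Verilog text for a shift register of any width and depth. Real generator frameworks do far more (type checking, elaboration, optimization), so treat this only as an illustration of "a program that writes hardware".

```python
def shift_register_verilog(name: str, width: int, depth: int) -> str:
    """Return Verilog source for a `depth`-stage, `width`-bit shift register.

    Hypothetical helper for illustration; module and signal names are made up.
    """
    if width < 1 or depth < 1:
        raise ValueError("width and depth must be positive")
    msb = width - 1
    lines = [
        f"module {name} (",
        "  input  wire clk,",
        f"  input  wire [{msb}:0] d,",
        f"  output wire [{msb}:0] q",
        ");",
    ]
    # One register per pipeline stage.
    for i in range(depth):
        lines.append(f"  reg [{msb}:0] stage{i};")
    lines.append("  always @(posedge clk) begin")
    lines.append("    stage0 <= d;")
    # Each later stage samples the one before it.
    for i in range(1, depth):
        lines.append(f"    stage{i} <= stage{i - 1};")
    lines.append("  end")
    lines.append(f"  assign q = stage{depth - 1};")
    lines.append("endmodule")
    return "\n".join(lines)


if __name__ == "__main__":
    print(shift_register_verilog("delay_line", width=8, depth=3))
```

Changing `width` or `depth` regenerates the whole module, which is the point: the design space is explored by a program rather than by hand-editing HDL.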

bsder · 11y ago

> it costs far less than an ASIC below a certain volume

And that's it. That is the ONLY place where an FPGA makes sense--"I'd like to have an ASIC, but I can't afford the NRE."

There is no technological axis in which an FPGA is superior to anything.
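
To put rough numbers on the "below a certain volume" argument, here is a minimal sketch of the break-even calculation. The model is an assumption: ASIC total cost is NRE plus a per-unit cost, FPGA total cost is per-unit only. The dollar figures are invented for illustration and do not come from anyone in this thread.

```python
import math


def break_even_volume(asic_nre, asic_unit_cost, fpga_unit_cost):
    """Smallest unit count at which the ASIC's total cost is no higher than the FPGA's.

    Assumed cost model:
        ASIC total = asic_nre + n * asic_unit_cost
        FPGA total = n * fpga_unit_cost
    Returns None if the ASIC never catches up.
    """
    saving_per_unit = fpga_unit_cost - asic_unit_cost
    if saving_per_unit <= 0:
        return None
    return math.ceil(asic_nre / saving_per_unit)


if __name__ == "__main__":
    # Hypothetical figures: $2M NRE, $10 per ASIC, $210 per FPGA.
    print(break_even_volume(asic_nre=2_000_000, asic_unit_cost=10, fpga_unit_cost=210))
```

Below the returned volume the FPGA is cheaper in total; above it the NRE has been amortized and the ASIC wins on cost.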

wavegeek · 11y ago

He is right in the sense that for someone from a programming background, using FPGAs is far harder than it could be. I have looked into it a few times and it is indeed horrible.

It looks to me like a typical situation of legacy tools and some degree of oligopoly. I imagine a hardware person going into programming would have a similar experience (in reverse).

smilekzs · 11y ago

Legacy tools, IMHO, are the biggest problem. This article got it right -- you can't just get rid of them.

gaze · 11y ago

Okay then. I wanna know of a better way to take four 1 GSPS signals, demodulate them, and pump out another two 1 GSPS signals which encode decisions made every four samples on the incoming signals. That's ONE of the problems in quantum control for stabilizing one qubit. We did it with an FPGA. If you know of a DSP or systolic array or processor or what have you for doing this, I'm ALL ears. Oh and the timing must be COMPLETELY deterministic down to the nanosecond.

In fact, if you know of ANY general purpose hardware that will talk to gigasample ADCs/DACs, I'd love to know about it.
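
A back-of-the-envelope sketch of why this workload is awkward for a sequential processor. The 1 GSPS rate, four input channels, and decision every four samples are from the comment above; the 250 MHz fabric clock is an assumption picked for illustration, and the function names are hypothetical.

```python
GSPS = 1_000_000_000  # one gigasample per second


def samples_per_clock(sample_rate_hz: int, fabric_clock_hz: int) -> int:
    """Samples of one channel that must be consumed in each fabric clock cycle."""
    # Ceiling division on integers avoids floating-point rounding.
    return -(-sample_rate_hz // fabric_clock_hz)


def decision_period_ns(sample_rate_hz: int, samples_per_decision: int) -> float:
    """Time between successive decisions, in nanoseconds."""
    return samples_per_decision * 1e9 / sample_rate_hz


if __name__ == "__main__":
    fabric_clock = 250_000_000  # assumed fabric clock, not from the thread
    per_channel = samples_per_clock(GSPS, fabric_clock)
    print("samples per clock, per channel:", per_channel)
    print("samples per clock, four channels:", 4 * per_channel)
    print("ns between decisions:", decision_period_ns(GSPS, 4))
```

Under those assumptions the logic has to process several samples from every channel in parallel on every clock edge, with a fresh decision due every few nanoseconds, which is the kind of fixed-latency parallelism that fabric provides naturally.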

bsder · 11y ago

ASIC. The timing will be better. Less power will be burned.

The only thing an FPGA wins on is NRE (non-recurring engineering).

The real problem is that the giga-sample DACs/ADCs aren't willing to speak one of the actual high-speed interfaces or put a DSP directly on the ADC/DAC. So, everybody needs to use an FPGA to shoehorn the data into a useful form.

If somebody put an actual DSP on their ADC/DAC, FPGAs would evaporate for this application like they have evaporated for so many others.

Any FPGA application with volume eventually gets subsumed by special purpose hardware on a microcontroller. For example, people used to use FPGAs for PWM, motor control, etc. Now those blocks are standard on microcontrollers.

mng2 · 11y ago

The grandparent is responding to the article, which is talking about general purpose computation. There's no question that your application is a prime use case for an FPGA.

minthd · 11y ago

I'm not sure FPGAs are the least efficient everywhere. Aren't there micro+FPGA chips (like Xilinx Zynq) used in high/medium-volume embedded systems? And wouldn't better tools increase volumes, leading to lower costs?

retroencabulato · 11y ago

I wish he would comment more on what he finds wrong with HDLs.

I fail to understand why using an HDL for a digital ASIC is fine, but using one for an FPGA in the context of acceleration is not.

jjoonathan · 11y ago

He's annoyed that the HDL doesn't describe the entirety of a typical FPGA "program". Some things just can't be emulated efficiently by the FPGA fabric (or they are common enough patterns that it would be wasteful to do so) and the workaround that has become the de-facto standard is to include "ASIC chunks" in the middle of all the programmable gates. For instance, you might have a serial output that runs at 10Gbps while the rest of the FPGA runs at 500MHz. To bridge the gap between the slower programmable logic and the fast transceiver you need a shift register. The way you specify this in code is by importing a vendor-specific "library" -- except it's not really an HDL library at all, it's a black box that the proprietary back-end hooks up to the "ASIC chunks" at compile time.

It's like compiling against a binary library, except that the binary isn't another piece of software, it's an etched pattern on your FPGA's wafer. Even if you did have the "source code" it wouldn't do you any good unless you have a foundry in your backyard :-)

I'm skeptical of the calls for a higher level of abstraction. How are you going to abstract away the fact that the FPGA has exactly 2 embedded memory controllers that have precisely A, B, and C inputs and X, Y, and Z outputs? Either you come up with a solution that's effectively just as ugly as what we have now (because it exposes the FPGA's resources explicitly) or you come up with a solution that hides these details and as a result becomes enormously fragile because it's easy to accidentally change something that prevents the compiler from inferring which embedded ASIC chunk you meant to use. You need to be aware of limitations to work within them, and the limitations seem to be stuck with us for the foreseeable future.
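
As a rough behavioral illustration of the shift-register bridge mentioned earlier in this comment (10Gbps serial line, 500MHz fabric), here is a minimal software model of a parallel-in, serial-out stage and its inverse. The two rates are from the comment; the LSB-first bit order and the names `serialize` and `deserialize` are assumptions, and a real transceiver block is a hard macro with its own vendor-defined interface, not anything like this code.

```python
LINE_RATE_BPS = 10_000_000_000   # 10 Gbps serial side (from the comment)
FABRIC_CLOCK_HZ = 500_000_000    # 500 MHz parallel side (from the comment)

# Bits the fabric must hand over on every clock to keep the line busy.
WORD_WIDTH = LINE_RATE_BPS // FABRIC_CLOCK_HZ


def serialize(words, width):
    """Parallel-in, serial-out: yield each word's bits, LSB first (assumed order)."""
    for word in words:
        for i in range(width):
            yield (word >> i) & 1


def deserialize(bits, width):
    """Serial-in, parallel-out: regroup a bit stream into `width`-bit words."""
    word, count = 0, 0
    for bit in bits:
        word |= bit << count
        count += 1
        if count == width:
            yield word
            word, count = 0, 0


if __name__ == "__main__":
    print("bits per fabric clock:", WORD_WIDTH)
    words = [0xABCDE, 0x00001, 0x80000]
    assert list(deserialize(serialize(words, WORD_WIDTH), WORD_WIDTH)) == words
```

The ratio of the two rates fixes the word width, and that fixed, resource-specific number is exactly the kind of detail a higher-level abstraction would have to either expose or infer.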

al2o3cr · 11y ago

"How are you going to abstract away the fact that the FPGA has exactly 2 embedded memory controllers that have precisely A, B, and C inputs and X, Y, and Z outputs?"

The same thing we do every time, Pinky: assume the existence of a sufficiently advanced compiler. ;)

mng2 · 11y ago

Wouldn't an ASIC have hard macros anyway? It sounds to me more like the author is annoyed that "it's 2014 and we are still using HDL", and also that FPGAs are somehow supposed to be readily exploitable for general-purpose computation.

reacweb · 11y ago

The same way we have 3D printers now, I dream of having a foundry in my backyard.

ryanmk · 11y ago

It sounds like he doesn't think the level of abstraction offered is high enough. If you are using an FPGA as a general purpose device, and not for prototyping, then a higher level of abstraction would be helpful. Otherwise, if you are prototyping, you may want a lower level of abstraction that may offer a closer approximation to your end goal.

thisrod · 11y ago

I've heard several computational physicists make this complaint to NVIDIA sales reps. The standard response, which I'm sure is correct, goes as follows.

Designing a fast processor is very expensive, far beyond the means of the research community. The only way anyone can afford it is to sell millions of the things to gamers. To put $1 of special hardware on your numerical card, we have to put it on 1000 graphics cards too, so you'd have to pay $1000 for it. Bad luck: scientists are destined to hack hardware that was designed for larger markets.

jjoonathan · 11y ago

Yet somehow AMD manages to consistently offer better hardware (wrt double-precision floating-point performance) for a lower price. I'm sure it's because the fine folks at AMD are silicon wizards and not because of NVIDIA's cozy monopoly position due to shrewd marketing of CUDA + their early-mover advantage in academic markets.

duaneb · 11y ago

The tools are more valuable than raw performance, which can be bought with time and money.

JanneVee · 11y ago

> Bad luck: scientists are destined to hack hardware that was designed for larger markets.

Good luck: You get the performance of what was considered a supercomputer a little over two decades ago, by hacking a consumer-technology product.

sklogic · 11y ago

Yes, the RTL level of abstraction is way too low, even for most ASIC things. Yes, we need higher-level HDLs (more abstract than the said Chisel and Bluespec). I'm working on it, stay tuned.

But what I cannot get from this article is what exactly is wrong with current FPGA designs. They've got DSP slices (i.e., ALU macros), they've got block RAMs and all the routing facilities one can imagine. For the dataflow stuff it's more than enough.

Of course it would have been much better if the vendors published the detailed datasheets for all the available cells and the interconnect, for the bitfile formats, etc. - to make it possible for the alternative, open source toolchains to appear. Yes, their existing toolchains are, well, clumsy. But it is still quite possible to abstract away from the peculiarities of these toolchains.

minthd · 11y ago

Best of luck with your project. I'm curious about it and I'll wait. Instead I'll ask: what is your opinion regarding embedded/MCU software tools? Do you see something better than Rust that can automate the dev process?

sklogic · 11y ago

Thanks. I've been a Forth fan, but recently, looking at the advances in static code analysis, I suspect that higher-level languages have a chance to become very useful in the resource-limited MCU environment too. Rust is a nice attempt, and there is also a possibility that something doing proper region analysis can kick in (looking at languages like Harlan, I would not say it's impossible).

smilekzs · 11y ago

I once had a vision, but now I see Rust, once matured, as the best contender here.

nullc · 11y ago

FPGAs would be more attractive if they weren't so overpriced... good thing that patents are around to almost completely eliminate competition in that space.

Alphasite_ · 11y ago

Or, from the other view, they turn entering the market from an extremely lengthy and risky R&D venture into a known fee, which you can account for and which drastically lowers risk.

sliverstorm · 11y ago

On what grounds are they overpriced? Because they are expensive?

socceroos · 11y ago

Are there any attempts out there to build a better open standard than FPGA? I'd be interested to look into them if there were.

minthd · 11y ago

I believe Menta licenses FPGA cores (LUT architectures). But FPGAs have so much that isn't LUT which is critical for performance.

gioele · 11y ago

> FPGAs are legacy baggage in the same way that GPGPUs are.

I hoped the author would expand on this point.

It is also my impression that GPGPUs are just "a hack": they should have been normal coprocessors to the main CPU, just like the FPU and the vector units are. It seems that now we are finally reaching that model (in Linux the graphics device is almost completely separated from the computational device, although they are on the same physical device most of the time), but we are still far from the "Coprocessor extension" opcode space of MIPS processors or the "brain and arms" of CELL (1 generic CPU, many specialized coprocessors).
