JoshTriplett 9y ago

You can't just find hotspots; you have to find hotspots whose data set is sufficiently disjoint between the FPGA and the CPU. With the right architecture (such as Intel's Xeon+FPGA package), you can have an incredibly fast interconnect, but it's still not the speed of the CPU's register file, so you can't hand off data with that granularity. You can get more than enough bandwidth, but the latency would crater your performance. You want to stream larger amounts of data at a time, or let the FPGA directly access the data.
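To put rough numbers on the latency point, here is a minimal sketch of how a fixed per-transfer latency eats into usable bandwidth. The 1 µs latency and 10 GB/s bandwidth are made-up illustrative figures, not measurements of any real interconnect, and `effective_throughput` is a hypothetical helper name.

```python
def effective_throughput(chunk_bytes, latency_s, bandwidth_bytes_per_s):
    """Bytes per second actually achieved when each transfer of
    chunk_bytes pays a fixed latency on top of the wire time."""
    transfer_time = latency_s + chunk_bytes / bandwidth_bytes_per_s
    return chunk_bytes / transfer_time


# Assumed, illustrative link parameters (not from any datasheet).
LATENCY_S = 1e-6          # 1 microsecond per handoff
BANDWIDTH = 10e9          # 10 GB/s peak

if __name__ == "__main__":
    for chunk in (8, 64, 4096, 1 << 20):
        rate = effective_throughput(chunk, LATENCY_S, BANDWIDTH)
        print(f"{chunk:>8} B per handoff: {rate / BANDWIDTH:.2%} of peak")
```

Under these assumptions, register-sized handoffs use well under 1% of the link, while megabyte-sized buffers get close to peak, which is why streaming larger amounts at a time matters.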

For instance, AES-NI accelerates encryption on a CPU by adding an instruction to process a step of the encryption algorithm. Compression or encryption offloading to an FPGA streams a buffer (or multiple buffers) to the FPGA. Entirely different approach. (GPU offloading has similar properties; you don't offload data to a GPU word-by-word either.)
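As a back-of-the-envelope sketch of why the two approaches differ: offloading a buffer only wins once the buffer is large enough that the accelerator's higher throughput pays back the fixed handoff latency. The model below assumes a simple "latency plus bytes over throughput" cost and uses invented numbers; `break_even_bytes` is a hypothetical helper, not part of any real toolkit.

```python
def break_even_bytes(latency_s, cpu_bytes_per_s, accel_bytes_per_s):
    """Smallest buffer size (in bytes) at which offloading matches the CPU.

    Solves  n / cpu = latency + n / accel  for n.
    Returns None when the accelerator is not faster than the CPU,
    since offloading then never pays off in this model.
    """
    if accel_bytes_per_s <= cpu_bytes_per_s:
        return None
    return latency_s / (1.0 / cpu_bytes_per_s - 1.0 / accel_bytes_per_s)


if __name__ == "__main__":
    # Assumed figures for illustration only.
    n = break_even_bytes(latency_s=1e-6,
                         cpu_bytes_per_s=1e9,
                         accel_bytes_per_s=4e9)
    print(f"offload breaks even at roughly {n:.0f} bytes per buffer")
```

With these made-up figures the break-even lands above a kilobyte per buffer, far from the per-instruction granularity at which something like AES-NI operates.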

But even if you find such hotspots, that still isn't the hardest part. You then have to generate an FPGA design that can beat optimized CPU code, without writing that design by hand. That's one of the holy grails of FPGA tool designers.

Right now, the state of the art there is writing code for a generic accelerator architecture (e.g. OpenCL, not C) and generating offloaded code with reasonable efficiency (beating the CPU, though not hitting the limits of the FPGA hardware).

daxfohl 9y ago

It's cool to know it's an area of active research. I wonder if there are also power consumption ramifications though. While e.g. AES-NI is incomparable performance-wise, my novice (perhaps incorrect) understanding is that ARM beats x86 power consumption by having a drastically simpler instruction set.

Could a simple ARM-like instruction set plus a generic "synthesize and send this loopy junk to FPGA" have power implications without a major performance impact on cloud servers? (Yeah I know this is likely a topic for hundreds of PhD theses, but is that something being investigated too?)
