To the best of our knowledge, state-of-the-art performance for forward propagation of CNNs on FPGAs was achieved by a team at Microsoft. Ovtcharov et al. have reported a throughput of 134 images/second on the ImageNet 1K dataset [28], which amounts to roughly 3x the throughput of the next closest competitor, while operating at 25 W on a Stratix V D5 [30]. This performance is projected to increase by using top-of-the-line FPGAs, with an estimated through- put of roughly 233 images/second while consuming roughly the same power on an Arria 10 GX1150. This is com- pared to high-performing GPU implementations (Caffe + cuDNN), which achieve 500-824 images/second, while con- suming 235 W. Interestingly, this was achieved using Micros oft- designed FPGA boards and servers, an experimental project which integrates FPGAs into datacenter applications.