Abstract: FPGAs are traditionally considered hard to program, but recent efforts aim to let developers program them with high-level models originally intended for multi-core CPUs and GPUs. For example, both Intel and Xilinx now provide OpenCL-to-FPGA toolchains. However, because GPUs and FPGAs offer different parallelism models, OpenCL code optimized for GPUs can prove inefficient on FPGAs in terms of both performance and hardware resource utilization.
In this poster, we explore this problem on an emerging workload: finite state automata traversal. Specifically, we explore a set of structural code changes, custom optimizations, and best-practice optimizations to retarget an OpenCL NFA engine designed for GPUs to FPGAs. Our evaluation, which covers traversal throughput and resource utilization, shows that on a single execution pipeline our optimizations yield speedups of up to 4x over an already optimized baseline, which uses one of the proposed code changes to fit the original code on the FPGA.
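For readers unfamiliar with the workload, the following is a minimal sketch of NFA traversal over an input stream. It is not the poster's OpenCL engine: the function name `traverse`, the dictionary-based transition table, and the toy automaton are all assumptions made for illustration. It only shows the per-symbol update of the active state set, which is the kind of work a parallel engine would distribute across states.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical transition table: (state, symbol) -> set of next states.
Transitions = Dict[Tuple[int, str], Set[int]]


def traverse(transitions: Transitions, start: Set[int],
             accepting: Set[int], text: str) -> List[int]:
    """Return the input positions at which an accepting state becomes active."""
    active = set(start)
    matches = []
    for pos, symbol in enumerate(text):
        # Compute every state reachable from the active set on this symbol.
        reached: Set[int] = set()
        for state in active:
            reached |= transitions.get((state, symbol), set())
        if reached & accepting:
            matches.append(pos)
        # Assumption: start states stay active so a match may begin anywhere.
        active = reached | start
    return matches


# Toy automaton for the pattern "ab": 0 -a-> 1 -b-> 2
toy: Transitions = {(0, "a"): {1}, (1, "b"): {2}}
print(traverse(toy, {0}, {2}, "xabab"))  # prints [2, 4]
```

The inner loop over `active` is independent per state, so it is the natural place to apply data parallelism or pipelining, depending on the target device.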
Best Poster Finalist (BP): no
Poster summary: PDF