SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets


Authors: Ang Li (Pacific Northwest National Laboratory (PNNL)), Tong Geng (Boston University), Tianqi Wang (Boston University), Martin Herbordt (Boston University), Shuaiwen Song (University of Sydney, University of Washington), Kevin Barker (Pacific Northwest National Laboratory (PNNL))

Abstract: We propose the binarized-soft-tensor-core (BSTC), a software-hardware co-design approach that builds bit-manipulation capability for modern GPUs so they can effectively harvest the bit-level parallelism emerging from binarized neural networks (BNNs) and a variety of other domains. We also propose intra- and inter-layer fusion techniques so that the entire BNN inference process can be realized in a single GPU kernel, which we call the Singular-Binarized-Neural-Network. Experiments show that our design achieves over 1000x speedup in raw inference latency and 10x in inference throughput over state-of-the-art full-precision simulated BNN inference for AlexNet on ImageNet.
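To illustrate the bit-level parallelism the abstract refers to, the following is a minimal Python sketch of the widely used XNOR-and-popcount formulation of a binarized dot product. It is not the BSTC implementation, which the abstract does not detail; the function names `pack_bits` and `binary_dot` are hypothetical and chosen only for this example. It assumes each weight and activation is +1 or -1 and is stored as a single bit (1 for +1, 0 for -1).

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two length-n {+1, -1} vectors given as packed bit words.

    XNOR marks the positions where the signs agree (product +1); every other
    position contributes -1, so the result is matches - (n - matches).
    """
    mask = (1 << n) - 1                    # keep only the n valid bit positions
    agree = ~(a_word ^ b_word) & mask      # XNOR restricted to n bits
    matches = bin(agree).count("1")        # population count
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack_bits(a), pack_bits(b), len(a)))  # prints 0
```

A single XNOR and population count replaces n multiply-accumulate operations, which is the general reason bit-based networks map well onto hardware with fast bit-manipulation instructions.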



