SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Analytical Cache Modeling and Tilesize Optimization for Tensor Contractions


Authors: Rui Li (University of Utah), Aravind Sukumaran-Rajam (Ohio State University), Richard Veras (Louisiana State University), Tze Meng Low (Carnegie Mellon University), Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)), Atanas Rountev (Ohio State University), P. Sadayappan (University of Utah)

Abstract: Data movement between the processor and the memory hierarchy is a fundamental bottleneck that limits the performance of many applications on modern computer architectures. Tiling and loop permutation are key techniques for improving data locality. However, selecting effective tile sizes and loop permutations is particularly challenging for tensor contractions due to the large number of loops. Even state-of-the-art compilers usually produce sub-optimal tile sizes and loop permutations, as they rely on naïve cost models. In this paper, we provide an approach based on an analytical model for multi-level tile size optimization and permutation selection for tensor contractions. Our experimental results show that this approach achieves comparable or better performance than state-of-the-art frameworks and libraries for tensor contractions.



