Workshop: Enabling Low-Overhead Communication in Multi-Threaded OpenSHMEM Applications Using Contexts
Abstract: As the number of shared-memory cores per node in modern High Performance Computing (HPC) machines continues to grow, hybrid programming models such as MPI+threads are becoming a preferred choice for scientific applications. Although threads in hybrid applications use computation resources efficiently, they often compete with one another for communication resources, which degrades performance. The OpenSHMEM distributed programming model provides communication context objects that give threads isolated access to the network, thus reducing contention. In this work, we present a design and implementation of OpenSHMEM contexts that supports hybrid multi-threaded applications, and we evaluate the performance of the implementation. In all of our micro-benchmarks, threads achieve communication performance nearly identical to that of single-threaded OpenSHMEM processes. By using contexts in hybrid benchmarks, we achieve up to a 43.1% performance improvement for 3D halo exchange, a 339% improvement for all-to-all communication, and a 35.4% improvement for inter-node load balancing.