SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Abstract: Understanding the performance characteristics of applications in modern HPC environments is becoming more challenging as architectural and programming complexity increases. HPC software developers rely on sources such as hardware counters and event traces to infer performance problems while designing mitigation strategies. The large number of in-house tools in the community indicates replicated effort. This paper presents a customizable framework for analyzing performance measurements and visualizing them through a web-based interactive dashboard for exploring a large volume of hierarchical information. We analyze three ECP applications as use cases, and identify and optimize problematic resource utilization behaviors exposed by our visualizations. This framework is a step towards a unified platform for visual identification of performance scaling bottlenecks, easing collaboration between application developers, performance analysts, and hardware vendors.

