Presentation
Hatchet: Pruning the Overgrowth in Parallel Profiles
Session: Performance Tools
Event Type: Paper
TP
Benchmarks
Data Analytics
Data Management
Memory
Performance
System Software
Tools
Visualization
Time: Tuesday, 19 November 2019, 2:30pm - 3pm
Location: 401-402-403-404
Description: Performance analysis is critical for eliminating scalability bottlenecks in parallel codes. Many profiling tools can instrument codes and gather performance data, but general, easy-to-use, and programmable analytics and visualization tools are limited. In this paper, we focus on the analytics of structured profiling data, such as that obtained from calling context trees or nested region timers in code. We present a set of techniques and operations that build on the pandas data analysis library to enable parallel profile analysis. We have implemented these techniques in a Python-based library called Hatchet, which allows structured data to be filtered, aggregated, and pruned. Using performance datasets obtained from profiling parallel codes, we demonstrate how common performance analyses can be performed reproducibly with only a few lines of Hatchet code. Hatchet brings the power of modern data science tools to bear on performance analysis.
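To make the three operations named in the description concrete, the following is a minimal sketch in plain pandas. It does not use Hatchet's API. The call paths, rank counts, timings, column names, and thresholds are all hypothetical, and the calling context tree is approximated by slash-separated path strings rather than a real graph structure.

```python
import pandas as pd

# Hypothetical profile data: one row per (call path, MPI rank),
# with inclusive time in seconds. Not taken from the paper.
records = [
    ("main", 0, 10.0),
    ("main", 1, 12.0),
    ("main/solve", 0, 7.0),
    ("main/solve", 1, 9.0),
    ("main/solve/MPI_Allreduce", 0, 0.5),
    ("main/solve/MPI_Allreduce", 1, 2.5),
    ("main/io", 0, 0.2),
    ("main/io", 1, 0.1),
]
profile = pd.DataFrame(records, columns=["path", "rank", "time"])
profile = profile.set_index(["path", "rank"])

# Aggregate: collapse the rank level, keeping mean and max time per call path.
per_node = profile.groupby(level="path")["time"].agg(["mean", "max"])

# Filter: keep only call paths whose mean time is at least 1 second
# (the threshold is an arbitrary choice for illustration).
hot = per_node[per_node["mean"] >= 1.0]

# Prune: drop everything deeper than one level below the root,
# using the number of "/" separators as the node depth.
depth = per_node.index.str.count("/")
pruned = per_node[depth <= 1]

print(hot)
print(pruned)
```

Here the rank level of the index plays the role of the per-process dimension, so aggregation is an ordinary `groupby`, while filtering and pruning are boolean selections on the per-node table.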