Authors:
Abstract: Production support for high-performance computing requires thorough testing to evaluate whether a system is fit for production-grade workloads. Each phase of a system’s life cycle must be assessed, and each calls for different methods of evaluating performance and correctness. Because HPC systems are distributed and often unique, testing them requires tools that can automatically collect and assess test results while interacting with schedulers and programming-environment software. This calls for a customizable, extensible, and lightweight system for managing concurrent testing. Beginning with the recently refactored codebase of Pavilion 1.0, we assisted with the finishing touches needed to ready the software for open-source release and production use. Pavilion 2.0 is a Python 3-based testing framework for HPC clusters that facilitates building, running, and analyzing tests through an easy-to-use, flexible, YAML-based configuration system. Users can write their own tests by wrapping them in Pavilion’s well-defined format.
Best Poster Finalist (BP): no
Poster: PDF
Poster summary: PDF