SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Revisiting I/O Behavior in Large-Scale Storage Systems: The Expected and the Unexpected

Authors: Tirthak Patel (Northeastern University), Suren Byna (Lawrence Berkeley National Laboratory), Glenn K. Lockwood (Lawrence Berkeley National Laboratory), Devesh Tiwari (Northeastern University)

Abstract: Large-scale applications typically spend a significant fraction of their execution time performing I/O to a parallel storage system. However, with rapid progress in the compute and storage system stacks of large-scale systems, it is critical to investigate and update our understanding of the I/O behavior of large-scale applications. Toward that end, we monitor, collect, and analyze a year's worth of storage system data from the NERSC parallel storage system, which serves NERSC's two largest supercomputers, Cori and Edison. We perform temporal, spatial, and correlative analyses of the system as a whole and of individual I/O and metadata servers, and uncover surprising patterns that defy existing assumptions about HPC I/O and have important implications for future systems.
