Revisiting I/O Behavior in Large-Scale Storage Systems: The Expected and the Unexpected
Time: Thursday, 21 November 2019, 11:30am - 12pm
Description: Large-scale applications typically spend a significant fraction of their execution time performing I/O to a parallel storage system. However, with rapid progress in the compute and storage stacks of large-scale systems, it is critical to investigate and update our understanding of the I/O behavior of large-scale applications. Toward that end, in this work, we monitor, collect, and analyze a year's worth of storage system data from the NERSC parallel storage system, which serves NERSC's two largest supercomputers, Cori and Edison. We perform temporal, spatial, and correlative analysis of the system as a whole and of individual I/O and metadata servers, and uncover surprising patterns that defy existing assumptions about HPC I/O and have important implications for future systems.