Description: Leadership high-performance computing (HPC) systems can execute workflows of scientific, research, or industry applications. Complex HPC workflows can have significant data transfer and I/O requirements. Heterogeneous storage systems in supercomputers, equipped with cutting-edge non-volatile storage devices, can be leveraged to handle these data transfer and I/O requirements efficiently.
In this poster, we describe our efforts to extract the I/O characteristics of various HPC workflows and to develop strategies that improve I/O performance by leveraging heterogeneous storage systems. We have implemented an emulator that mimics the different types of I/O requirements posed by HPC application workflows. We have also analyzed the workflow of the Cancer Moonshot Pilot 2 (CMP2) project to identify possible I/O inefficiencies. To date, we have performed a systematic characterization and evaluation of the workloads generated by the workflow emulator and by a small-scale adaptation of the CMP2 workflow.
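To make the idea of a workflow I/O emulator concrete, the following is a minimal Python sketch, not the emulator described in this poster. It assumes a workflow can be approximated as a sequence of phases, each writing or reading a fixed number of files with a fixed block size. The names `IOPhase`, `run_phase`, and `emulate` are hypothetical and chosen only for illustration.

```python
import os
import tempfile
import time
from dataclasses import dataclass


@dataclass
class IOPhase:
    """One emulated workflow stage (hypothetical description format)."""
    name: str        # label used to report results
    mode: str        # "write" or "read"
    file_count: int  # number of files touched in this phase
    file_size: int   # bytes per file (used by write phases)
    block_size: int  # bytes per individual I/O call


def run_phase(phase, directory):
    """Run one phase in `directory`; return (bytes moved, seconds elapsed)."""
    if phase.block_size <= 0:
        raise ValueError("block_size must be positive")
    payload = b"\0" * phase.block_size
    moved = 0
    start = time.perf_counter()
    for i in range(phase.file_count):
        path = os.path.join(directory, f"data_{i}.bin")
        if phase.mode == "write":
            with open(path, "wb") as f:
                remaining = phase.file_size
                while remaining > 0:
                    written = f.write(payload[:min(remaining, phase.block_size)])
                    moved += written
                    remaining -= written
                # Push data to the device so the timing is not
                # limited to the cost of filling the page cache.
                f.flush()
                os.fsync(f.fileno())
        elif phase.mode == "read":
            # Reads the files produced by an earlier write phase.
            with open(path, "rb") as f:
                while chunk := f.read(phase.block_size):
                    moved += len(chunk)
        else:
            raise ValueError(f"unknown mode: {phase.mode!r}")
    return moved, time.perf_counter() - start


def emulate(phases, directory):
    """Run phases in order; map each phase name to (bytes, seconds)."""
    return {phase.name: run_phase(phase, directory) for phase in phases}


if __name__ == "__main__":
    # Assumed toy workload: a producer stage followed by a consumer stage.
    workload = [
        IOPhase("produce", "write", file_count=4, file_size=1 << 20, block_size=64 << 10),
        IOPhase("consume", "read", file_count=4, file_size=1 << 20, block_size=4 << 10),
    ]
    with tempfile.TemporaryDirectory() as scratch:
        for name, (moved, seconds) in emulate(workload, scratch).items():
            print(f"{name}: {moved} bytes in {seconds:.4f} s")
```

Under these assumptions, pointing `directory` at mount points backed by different storage tiers would give a rough per-phase comparison. Read timings in this sketch may still be served from the operating system's page cache, so a real characterization would need to control for caching.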