SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Poster 110: Hierarchical Data Prefetching in Multi-Tiered Storage Environments

Authors: Hariharan Devarajan (Illinois Institute of Technology), Anthony Kougkas (Illinois Institute of Technology), Xian-He Sun (Illinois Institute of Technology)

Abstract: In the era of data-intensive computing, accessing data with high throughput and low latency is imperative. Data prefetching hides read latency by requesting data before it is needed, moving it from a high-latency medium to a low-latency one. However, existing solutions do not consider multi-tiered storage, and they suffer from under-utilization of prefetching resources and unnecessary evictions. Additionally, existing approaches implement a client-pull model, in which an understanding of each application's I/O behavior drives prefetching decisions. Moving toward exascale, where machines run multiple applications concurrently and those applications access files as part of a workflow, a more data-centric approach resolves challenges such as cache pollution and redundancy. In this study, we present HFetch, a truly hierarchical data prefetcher that adopts a server-push approach to data prefetching. We demonstrate the benefits of such an approach. Results show 10-35% performance gains over existing prefetchers and over 50% when compared to systems with no prefetching.
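
To make the data-centric idea concrete, the following is a minimal illustrative sketch, not HFetch's actual algorithm: a server-side component aggregates access events from all applications per data segment and pushes the most frequently accessed segments into the fastest tiers. The function name `place_segments`, the tier names, the segment identifiers, and the count-based ranking are all assumptions made for illustration.

```python
from collections import Counter


def place_segments(access_counts, tier_capacities):
    """Map the most frequently accessed segments to the fastest tiers.

    access_counts: dict of segment id -> accesses observed across all applications.
    tier_capacities: list of (tier_name, capacity_in_segments), fastest tier first.
    Segments that fit in no tier are omitted (they stay on the backing store).
    """
    # Rank by descending access count; break ties by segment id for determinism.
    ranked = sorted(access_counts, key=lambda seg: (-access_counts[seg], seg))
    placement = {}
    start = 0
    for tier_name, capacity in tier_capacities:
        for seg in ranked[start:start + capacity]:
            placement[seg] = tier_name
        start += capacity
    return placement


# Hypothetical access events (application, segment) collected on the server side.
events = [
    ("app1", "f.dat:0"), ("app2", "f.dat:0"), ("app1", "f.dat:1"),
    ("app2", "f.dat:2"), ("app1", "f.dat:0"), ("app2", "f.dat:1"),
]
# Counting per segment, not per application, is what makes the view data-centric.
counts = Counter(seg for _, seg in events)
placement = place_segments(counts, [("ram", 1), ("nvme", 1)])
```

Because the ranking is computed per segment across all applications, a segment shared by several applications occupies a single slot in the hierarchy, which is the intuition behind avoiding redundancy and cache pollution.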

Best Poster Finalist (BP): yes

