SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Making Speculative Scheduling Robust to Incomplete Data

Workshop: Making Speculative Scheduling Robust to Incomplete Data

Abstract: This work studies the robustness of speculative scheduling to incomplete data. Speculative scheduling was introduced as a way to incorporate future types of applications into the design of HPC schedulers, specifically applications whose runtime is not perfectly known but can be modeled with a probability distribution. Preliminary studies show the importance of speculative scheduling for stochastic applications when the application runtime model is completely known. In this work, we show how to extract enough information from incomplete data on the behavior of HPC applications for speculative scheduling to perform well.

Specifically, we show that for synthetic runtimes that follow usual probability distributions, such as the truncated normal distribution, as few as 10 previous runs provide enough data to be within 5% of the solution that has all the exact information. For real application traces, the performance with 10 data points varies with the application (within 20% of the full-knowledge solution) but converges quickly (within 5% with 100 previous samples).

Finally, a side effect of this study is to show the importance of the theoretical results obtained on continuous probability distributions for speculative scheduling. Indeed, we observe that the solutions for such distributions are more robust to incomplete data than the solutions for discrete distributions.
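The kind of comparison the abstract describes can be illustrated with a small Python sketch. It is not the authors' algorithm: the cost model (pay for every reservation tried until one is long enough), the toy strategy (reserve the mean, then add one standard deviation per retry, then fall back to a known upper bound), and all parameter values are assumptions made for illustration. It fits a continuous model to a short history of runtimes drawn from a truncated normal distribution and compares the resulting cost with the cost obtained from the true parameters.

```python
import random
import statistics


def sample_truncated_normal(rng, mu, sigma, lo, hi):
    """Draw one runtime from normal(mu, sigma) truncated to [lo, hi] by rejection."""
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x


def reservation_sequence(mean, sd, hi, steps=3):
    """Toy strategy (assumption): reserve mean, mean+sd, ..., then the upper bound hi."""
    candidates = [min(mean + k * sd, hi) for k in range(steps)] + [hi]
    seq = []
    for t in candidates:
        if not seq or t > seq[-1]:  # keep the sequence strictly increasing
            seq.append(t)
    return seq


def cost(seq, runtime):
    """Assumed cost model: pay each reservation tried, up to the first that fits."""
    total = 0.0
    for t in seq:
        total += t
        if runtime <= t:
            return total
    raise ValueError("runtime exceeds the last reservation")


def expected_cost(seq, runtimes):
    """Monte Carlo estimate of the expected cost of a reservation sequence."""
    return statistics.fmean([cost(seq, x) for x in runtimes])


def compare(n_history=10, n_eval=20000, seed=0):
    """Return cost(strategy from n_history samples) / cost(strategy from true parameters)."""
    rng = random.Random(seed)
    # Hypothetical parameters (hours); truncation at [0, 20] barely alters mu and sigma.
    mu, sigma, lo, hi = 8.0, 2.0, 0.0, 20.0
    history = [sample_truncated_normal(rng, mu, sigma, lo, hi) for _ in range(n_history)]
    evals = [sample_truncated_normal(rng, mu, sigma, lo, hi) for _ in range(n_eval)]
    full = reservation_sequence(mu, sigma, hi)
    partial = reservation_sequence(statistics.fmean(history), statistics.stdev(history), hi)
    return expected_cost(partial, evals) / expected_cost(full, evals)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(compare(n_history=n), 4))
```

Because the toy strategy is not optimal, the ratio may fall slightly below 1 for some histories; the sketch only shows how the gap to the full-knowledge strategy can be measured as the number of previous runs grows, and it does not attempt to reproduce the percentages reported in the abstract.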

Back to 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems Archive Listing
