SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Abstract: Reproducibility is a core tenet of the scientific process, yet it remains elusive for much of the sophisticated analysis required in modern science. In this paper, we describe how reproducibility is addressed in KBase, a web-based platform for performing complex analysis of biological data with the goal of enabling reproducible, predictive biology. We give an overview of the architecture and some of the key design considerations. Containers play a key role in the KBase design and in how it achieves a measure of strong reproducibility. We explain how containers are used in the platform and describe additional considerations that support the goal of reproducibility. Finally, we compare KBase with similar platforms and systems and discuss future plans.

