SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Cloud and Open Infrastructure Solutions To Run HPC Workloads

Authors: Martial Michel (Data Machines Corporation), Stig Telfer (StackHPC Ltd), Blair Bethwaite (New Zealand eScience Infrastructure), Micheal Lowe (Indiana University), Timothy Randles (Los Alamos National Laboratory), Robert Budden (NASA Goddard Space Flight Center, ASRC Federal), Chris Monson (Data Machines Corporation)

Abstract: Virtualization and containers have seen increasingly prominent use in HPC. Adopting these tools has enabled IT organizations to reduce costs while making it easier to manage large pools of compute, storage, and networking resources. However, performance overheads, networking integrations, and system complexity pose daunting architectural challenges.

OpenStack, containers, and container orchestration each bring their own benefits and challenges. This BoF is aimed at architects, administrators, software engineers, and scientists interested in designing and deploying cloud infrastructure solutions to run HPC workloads.

Long Description: Cloud Computing represents one of the most significant shifts in IT, and the group of projects that comprise Open Infrastructure clouds is the new standard for putting cloud technologies and methodologies within reach. The level of interest in the application of OpenStack, Container and Container Orchestration technologies in the High-Performance and Research Computing space reflects the already strong representation of scientific cloud deployments amongst the research community.

As cloud computing has matured and adoption has increased, the industry has turned to the challenges posed by application portability and service orchestration across multiple IaaS platforms and other cloud platforms. Containers and container-orchestration platforms provide solutions to these problems that are well suited to web-service developers and operators but often feel alien to HPC users and operators. However, several HPC-centric container solutions exist, and industry leaders such as Docker are now paying attention to HPC use cases and needs. Significant challenges remain, though, when considering the production deployment and support of these technologies alongside the typical shared data infrastructure found in supercomputing environments.

The intent of this BoF is to give the broader HPC community an overview of the challenges of supporting HPC workloads with OpenStack, containers, and Kubernetes, among others, and to introduce the best practices adopted by members of the scientific cloud community.

HPC-centric topics revolve around accounting and scheduling, including practical resource allocation approaches with the on-demand IaaS model. Through an open and thoughtful exchange, we intend to begin developing a shared understanding and vision of how open cloud computing solutions can best support existing and emerging uses in a range of research disciplines.

This BoF was previously held at SC16, SC17, and SC18, first as an OpenStack BoF and more recently as the Cloud Infrastructure Solutions To Run HPC Workloads BoF. This year we intend to continue the conversation we started last year and discuss Open Infrastructures.
