SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Machine-Learning Hardware: Architecture, System Interfaces, and Programming Models


Authors: Pete Beckman (Argonne National Laboratory), Prasanna Balaprakash (Argonne National Laboratory), Swann Perarnau (Argonne National Laboratory), Valentin Reis (Argonne National Laboratory)

Abstract: Recent years have seen a surge of investment in AI chip companies worldwide. However, these companies mostly target applications outside the scientific computing community. As the use of ML accelerates within HPC itself, there is growing concern that the scientific community should have a voice in the design of this new specialized hardware. In this BoF, we propose to let the community and select vendors engage on questions of programming models, system interfaces, and the architecture trade-offs of these chips.

Long Description: The main goal of this session is to foster exchange between ML hardware (also called AI chip) companies and their potential HPC users. There are many open design questions in this highly heterogeneous chip landscape. We believe that both industrial players and the HPC field would benefit from early discussions about how these chips can fit scientific HPC workloads. Indeed, it is potentially risky for the HPC community as a whole to let industrial machine-learning users dominate the discussion with nascent hardware vendors. HPC ML workloads may have different needs, which we hope this session will help identify. We will focus the session on technical aspects of the hardware and software associated with these chips. The session will be structured in two parts.

First, we will have speakers from select AI chip companies present their architectures with a focus on HPC use. We have secured the participation of Cerebras, SambaNova, and NextSilicon, and are in discussions with several other companies in that space (Graphcore, Groq, and others). This means working with the speakers in advance to help them outline the aspects that we, the HPC community, are most often interested in. Among these specifics, we are asking them to describe their architecture and its power/performance trade-offs, the programming models and compiler toolchains they envision for their platforms, and the low-level interfaces (explicit memory discoverability and access, RDMA, etc.) they plan to expose.

Second, we will encourage scientific computing attendees to give the speakers feedback on the relevance of these architectures to their scientific HPC workloads. The session organizers will drive informal discussion of how these architectures could benefit HPC, encompassing both the needs of the workloads themselves and the software stacks and programming models that future accelerators would have to interact with in the HPC space. The goal is to identify gaps in the current design of ML hardware and its software interfaces, and to discuss how the scientific computing community could help close those gaps. In particular, we would like to identify relevant HPC efforts that could provide benchmarks, design feedback, and integration with vendor APIs.

As an outcome of the session, we will publish the slides to the extent possible and share a short technical report on the programming models and hardware capabilities presented in the BoF. More importantly, we hope that this BoF will spark in-depth interactions between the AI chip companies and the scientific computing community at large, and that such discussions will drive forward R&D efforts on both sides.
