Workshop: Performance Portable Implementation of a Kinetic Plasma Simulation Mini-App
Abstract: Performance portability is considered to be an inevitable re- quirement in the exascale era. We explore a performance portable ap- proach for fusion plasma turbulence simulation code employing kinetic model, namely GYSELA code. For this purpose, we extract the key features of GYSELA such as high dimensionality and Semi-Lagrangian scheme, and encapsulate them into a mini-application which solves the similar but simplified Vlasov-Poisson system. We implement the mini- app with a mixed OpenACC/OpenMP and Kokkos implementation, where we suppress unnecessary duplications of code lines. For a reference case with the problem size of 128 to the 4, the Skylake (Kokkos), Nvidia Tesla P100 (OpenACC), and P100 (Kokkos) versions achieve an acceleration of 1.45, 12.95, and 17.83, respectively, with respect to the baseline OpenMP version on Intel Skylake. In addition to the performance portability, we discuss the code readability and productivity of each implementation. Based on our experience, Kokkos can offer a readable and productive code at the cost of initial porting efforts, which would be enormous for a large scale simulation code like GYSELA.