Presentation

Poster 73: Accelerating Large-Scale GW Calculations on Hybrid CPU-GPU Architectures

Session: Research Posters Display

Event Type: Posters, Research Posters

Time: Thursday, 21 November 2019, 8:30am - 5pm

Location: E Concourse

Description: In this poster, we present the strategy, progress, and performance of the GPU port of epsilon, one of the major modules of the electronic structure code BerkeleyGW. Epsilon contains the most time-consuming routines in the BerkeleyGW workflow for large-scale materials science simulations. The porting and optimization strategies include changing our original data layout to make efficient use of libraries such as cuBLAS and cuFFT; implementing dedicated CUDA kernels that keep data on the device and minimize copies between host and device; using streams efficiently to achieve high concurrency on the device; using asynchronous memory copies; and overlapping (MPI) communication on the host with computation on the device. Preliminary results are presented in terms of speedup compared to the CPU-only implementation, strong and weak scaling, and power efficiency. Excellent acceleration is demonstrated: up to 30x for specific kernels. Our port also exhibits good scalability and about 16x higher FLOPs/watt efficiency compared to the CPU-only implementation.
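
The stream and asynchronous-copy strategy mentioned in the description can be illustrated with a minimal CUDA sketch. This is not code from BerkeleyGW: the kernel `scale_kernel` and the helper `scale_with_streams` are hypothetical stand-ins, and the simple element-wise scaling replaces the real epsilon kernels. The sketch assumes a single GPU and uses pinned host memory, which asynchronous copies need in order to overlap with kernel execution.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a real compute kernel: scales each element in place.
__global__ void scale_kernel(float* data, int n, float alpha) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= alpha;
}

// Scales `values` by `alpha` on the GPU. The array is split into chunks, one
// per stream, so the copy of one chunk can overlap with the kernel of another.
// Returns false on any CUDA error (error handling is deliberately abbreviated).
bool scale_with_streams(std::vector<float>& values, float alpha, int n_streams) {
    if (n_streams < 1) return false;
    const int n = static_cast<int>(values.size());
    if (n == 0) return true;
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);

    // Pinned (page-locked) host buffer: required for truly asynchronous copies.
    float* h_pinned = nullptr;
    float* d_data = nullptr;
    if (cudaMallocHost(&h_pinned, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_data, bytes) != cudaSuccess) {
        cudaFreeHost(h_pinned);
        return false;
    }
    std::memcpy(h_pinned, values.data(), bytes);

    std::vector<cudaStream_t> streams(n_streams);
    for (auto& s : streams) cudaStreamCreate(&s);

    const int chunk = (n + n_streams - 1) / n_streams;
    const int threads = 256;
    for (int k = 0; k < n_streams; ++k) {
        const int offset = k * chunk;
        if (offset >= n) break;
        const int count = std::min(chunk, n - offset);
        const size_t chunk_bytes = static_cast<size_t>(count) * sizeof(float);
        const int blocks = (count + threads - 1) / threads;

        // Copy in, compute, copy out: ordered within a stream,
        // free to overlap with the work queued on other streams.
        cudaMemcpyAsync(d_data + offset, h_pinned + offset, chunk_bytes,
                        cudaMemcpyHostToDevice, streams[k]);
        scale_kernel<<<blocks, threads, 0, streams[k]>>>(d_data + offset, count, alpha);
        cudaMemcpyAsync(h_pinned + offset, d_data + offset, chunk_bytes,
                        cudaMemcpyDeviceToHost, streams[k]);
    }

    // Wait for all streams, then collect any launch or execution error.
    cudaError_t status = cudaDeviceSynchronize();
    if (status == cudaSuccess) status = cudaGetLastError();
    if (status == cudaSuccess) std::memcpy(values.data(), h_pinned, bytes);

    for (auto& s : streams) cudaStreamDestroy(s);
    cudaFree(d_data);
    cudaFreeHost(h_pinned);
    return status == cudaSuccess;
}
```

Calling `scale_with_streams(v, 3.0f, 4)` on a `std::vector<float>` would scale every element by 3 using four streams. In a production port the host would typically also perform MPI communication while the streams are busy, and data would stay resident on the device between kernels rather than being copied back after each step.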

Authors

Mauro Del Ben

Lawrence Berkeley National Laboratory

Charlene Yang

National Energy Research Scientific Computing Center (NERSC)

Felipe Jornada

University of California, Berkeley

Lawrence Berkeley National Laboratory

Steven G. Louie

University of California, Berkeley

Lawrence Berkeley National Laboratory

Jack Deslippe

National Energy Research Scientific Computing Center (NERSC)