BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20200129T163229Z
LOCATION:205-207
DTSTART;TZID=America/Denver:20191120T161500
DTEND;TZID=America/Denver:20191120T170000
UID:submissions.supercomputing.org_SC19_sess236_gb102@linklings.com
SUMMARY:Fast, Scalable and Accurate Finite-Element Based Ab Initio Calcula
tions Using Mixed Precision Computing: 46 PFLOPS Simulation of a Metallic
Dislocation System
DESCRIPTION:ACM Gordon Bell Finalist, Awards Presentation\n\nFast, Scalabl
e and Accurate Finite-Element Based Ab Initio Calculations Using Mixed Pre
cision Computing: 46 PFLOPS Simulation of a Metallic Dislocation System\n\
nDas, Motamarri, Gavini, Turcksin, Li...\n\nAccurate large-scale first pri
nciples calculations based on density functional theory (DFT) in metallic
systems are prohibitively expensive due to the asymptotic cubic scaling o
f computational complexity with the number of electrons. Using algorithmi
c advances in employing finite-element discretization for DFT (DFT-FE) i
n conjunction with efficient computational methodologies and mixed precisi
on strategies, we delay the onset of this cubic scaling by significantly r
educing the
computational prefactor while increasing the arithmetic intensity and low
ering the data movement costs. This has enabled fast, accurate, and massiv
ely parallel DFT calculations on large-scale metallic systems on both many
-core and heterogeneous architectures, with time-to-solution being an orde
r of magnitude faster than state-of-the-art plane-wave DFT codes. We demon
strate an unprecedented sustained performance of 46 PFLOPS (27.8% of pea
k FP64 performance) on a dislocation system in magnesium containing 105,08
0 electrons using 3,800 GPU nodes of the Summit supercomputer, which is th
e highest performance to date among DFT codes.\n\nRegistration Category: T
ech Program Reg Pass
URL:https://sc19.supercomputing.org/presentation/?id=gb102&sess=sess236
END:VEVENT
END:VCALENDAR