SC19 Proceedings

Paper Presentations

  1. Adaptive Neural Network-Based Approximation to Accelerate Eulerian Fluid Simulation Wenqian Dong, Jie Liu, Zhen Xie, and Dong Li (University of California, Merced)

  2. Addressing Data Resiliency for Staging Based Scientific Workflows Shaohua Duan, Pradeep Subedi, Philip E. Davis, and Manish Parashar (Rutgers University, Rutgers Discovery Informatics Institute)

  3. Almost Deterministic Work Stealing Shumpei Shiina and Kenjiro Taura (University of Tokyo)

  4. Analytical Cache Modeling and Tilesize Optimization for Tensor Contractions Rui Li (University of Utah), Aravind Sukumaran-Rajam (Ohio State University), Richard Veras (Louisiana State University), Tze Meng Low (Carnegie Mellon University), Fabrice Rastello (French Institute for Research in Computer Science and Automation (INRIA)), Atanas Rountev (Ohio State University), and P. Sadayappan (University of Utah)

  5. Assessing the Impact of Timing Errors on HPC Applications Chun-Kai Chang, Wenqi Yin, and Mattan Erez (University of Texas)

  6. AutoFFT: A Template-Based FFT Codes Auto-Generation Framework for ARM and X86 CPUs Zhihao Li, Haipeng Jia, Yunquan Zhang, Tun Chen, Liang Yuan, Luning Cao, and Xiao Wang (Institute of Computing Technology, Chinese Academy of Sciences)

  7. Bandwidth Steering for HPC Using Silicon Nanophotonics George Michelogiannakis (Lawrence Berkeley National Laboratory, Stanford University); Yiwen Shen, Min Yee Teh, Xiang Meng, and Benjamin Aivazi (Columbia University); Taylor Groves and John Shalf (Lawrence Berkeley National Laboratory); Madeleine Glick (Columbia University); Manya Ghobadi (Massachusetts Institute of Technology (MIT)); Larry Dennison (Nvidia Corporation); and Keren Bergman (Columbia University)

  8. BinFI: An Efficient Fault Injector for Safety-Critical Machine Learning Systems Zitao Chen, Guanpeng Li, and Karthik Pattabiraman (University of British Columbia) and Nathan DeBardeleben (Los Alamos National Laboratory)

  9. BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets Ang Li (Pacific Northwest National Laboratory (PNNL)); Tong Geng, Tianqi Wang, and Martin Herbordt (Boston University); Shuaiwen Song (University of Sydney, University of Washington); and Kevin Barker (Pacific Northwest National Laboratory (PNNL))

  10. CARE: Compiler-Assisted Recovery from Soft Failures Chao Chen, Greg Eisenhauer, and Santosh Pande (Georgia Institute of Technology) and Qiang Guan (Kent State University)

  11. Channel and Filter Parallelism for Large-Scale CNN Training Nikoli Dryden (University of Illinois, Lawrence Livermore National Laboratory); Naoya Maruyama, Tim Moon, and Tom Benson (Lawrence Livermore National Laboratory); Marc Snir (University of Illinois); and Brian Van Essen (Lawrence Livermore National Laboratory)

  12. Code Generation for Massively Parallel Phase-Field Simulations Martin Bauer (University of Erlangen-Nuremberg); Johannes Hötzer (University of Applied Science Karlsruhe, Karlsruhe Institute of Technology); Dominik Ernst and Julian Hammer (University of Erlangen-Nuremberg); Marco Seiz and Henrik Hierl (Karlsruhe Institute of Technology); Jan Hönig, Harald Köstler, and Gerhard Wellein (University of Erlangen-Nuremberg); Britta Nestler (Karlsruhe Institute of Technology, University of Applied Science Karlsruhe); and Ulrich Rüde (University of Erlangen-Nuremberg, CERFACS)

  13. ComDetective: A Lightweight Communication Detection Tool for Threads Muhammad Aditya Sasongko (Koc University, Turkey); Milind Chabbi (Scalable Machines Research); and Palwisha Akhtar and Didem Unat (Koc University, Turkey)

  14. Compiler Assisted Hybrid Implicit and Explicit GPU Memory Management Under Unified Address Space Lingda Li (Brookhaven National Laboratory) and Barbara Chapman (Brookhaven National Laboratory, Stony Brook University)

  15. Conflict-Free Symmetric Sparse Matrix-Vector Multiplication on Multicore Architectures Athena Elafrou, Georgios Goumas, and Nectarios Koziris (National Technical University of Athens)

  16. Consensus Equilibrium Framework for Super-Resolution and Extreme-Scale CT Reconstruction Xiao Wang (Harvard Medical School, Boston Children's Hospital); Venkatesh Sridhar (Purdue University); Zahra Ronaghi (Nvidia Corporation); Rollin Thomas, Jack Deslippe, and Dilworth Parkinson (Lawrence Berkeley National Laboratory); Gregery T. Buzzard (Purdue University); Samuel P. Midkiff (Purdue University, Boston Children's Hospital); Charles A. Bouman (Purdue University); and Simon K. Warfield (Harvard Medical School, Boston Children's Hospital)

  17. A Constraint-Based Approach to Automatic Data Partitioning for Distributed Memory Execution Wonchan Lee and Manolis Papadakis (Stanford University), Elliott Slaughter (SLAC National Accelerator Laboratory), and Alex Aiken (Stanford University)

  18. D2P: From Recursive Formulations to Distributed-Memory Codes Nikhil Hegde (Indian Institute of Technology Dharwad, Purdue University) and Qifan Chang and Milind Kulkarni (Purdue University)

  19. A Data-Centric Approach to Extreme-Scale Ab Initio Dissipative Quantum Transport Simulations Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernandez, Timo Schneider, Mathieu Luisier, and Torsten Hoefler (ETH Zurich)

  20. Diogenes: Looking for an Honest CPU/GPU Performance Measurement Tool Benjamin R. Welton and Barton P. Miller (University of Wisconsin)

  21. Distributed Enhanced Suffix Arrays: Efficient Algorithms for Construction and Querying Patrick Flick and Srinivas Aluru (Georgia Institute of Technology)

  22. An Early Evaluation of Intel’s Optane DC Persistent Memory Module and Its Impact on High-Performance Scientific Applications Michèle Weiland (Edinburgh Parallel Computing Centre), Holger Brunst (Technical University Dresden), Tiago Quintino (European Centre for Medium-Range Weather Forecasts), Nick Johnson (Edinburgh Parallel Computing Centre), Olivier Iffrig and Simon Smart (European Centre for Medium-Range Weather Forecasts), Christian Herold (Technical University Dresden), Antonino Bonanni (European Centre for Medium-Range Weather Forecasts), and Adrian Jackson and Mark Parsons (Edinburgh Parallel Computing Centre)

  23. An Efficient Mixed-Mode Representation of Sparse Tensors Israt Nisa (Ohio State University), Jiajia Li (Pacific Northwest National Laboratory (PNNL)), Aravind Sukumaran-Rajam and Prashant Rawat (Ohio State University), Sriram Krishnamoorthy (Pacific Northwest National Laboratory (PNNL)), and P. (Saday) Sadayappan (University of Utah)

  24. End-to-End I/O Portfolio for the Summit Supercomputing Ecosystem Sarp Oral (Oak Ridge National Laboratory, OpenSFS Inc) and Sudharshan S. Vazhkudai, Feiyi Wang, Christopher Zimmer, Christopher Brumgard, Jesse Hanley, George Markomanolis, Ross Miller, Dustin Leverman, Scott Atchley, and Verónica G. Melesse Vergara (Oak Ridge National Laboratory)

  25. Etalumis: Bringing Probabilistic Programming to Scientific Simulators at Scale Atilim Gunes Baydin (University of Oxford), Lei Shao (Intel Corporation), Wahid Bhimji (Lawrence Berkeley National Laboratory), Lukas Heinrich (European Organization for Nuclear Research (CERN)), Lawrence F. Meadows (Intel Corporation), Jialin Liu (Lawrence Berkeley National Laboratory), Andreas Munk and Saeid Naderiparizi (University of British Columbia), Bradley Gram-Hansen (University of Oxford), Gilles Louppe (University of Liege), Mingfei Ma and Xiaohui Zhao (Intel Corporation), Philip Torr (University of Oxford), Victor Lee (Intel Corporation), Kyle Cranmer (New York University), Mr Prabhat (Lawrence Berkeley National Laboratory), and Frank Wood (University of British Columbia)

  26. An Evaluation of the CORAL Interconnects Christopher Zimmer and Scott Atchley (Oak Ridge National Laboratory); Ramesh Pankajakshan (Lawrence Livermore National Laboratory); Brian E. Smith (Oak Ridge National Laboratory); Ian Karlin, Matt Leininger, Adam Bertsch, Brian S. Ryujin, and Jason Burmark (Lawrence Livermore National Laboratory); André Walker-Loud (Lawrence Berkeley National Laboratory); M. A. Clark (Nvidia Corporation); and Olga Pearce (Lawrence Livermore National Laboratory)

  27. Exploiting Reuse and Vectorization in Blocked Stencil Computations on CPUs and GPUs Tuowen Zhao (University of Utah), Protonu Basu (Facebook), Samuel Williams (Lawrence Berkeley National Laboratory), Mary Hall (University of Utah), and Hans Johansen (Lawrence Berkeley National Laboratory)

  28. Fast, Scalable and Accurate Finite-Element Based Ab Initio Calculations Using Mixed Precision Computing: 46 PFLOPS Simulation of a Metallic Dislocation System Sambit Das, Phani Motamarri, and Vikram Gavini (University of Michigan); Bruno Turcksin (Oak Ridge National Laboratory); Ying Wai Li (Los Alamos National Laboratory, Oak Ridge National Laboratory); and Brent Leback (Nvidia Corporation)

  29. From Facility to Application Sensor Data: Modular, Continuous and Holistic Monitoring with DCDB Alessio Netti and Micha Mueller (Leibniz Supercomputing Centre, Technical University Munich); Axel Auweter (MEGWARE Computer); Carla Guillen, Michael Ott, and Daniele Tafani (Leibniz Supercomputing Centre); and Martin Schulz (Technical University Munich)

  30. From Piz Daint to the Stars: Simulation of Stellar Mergers Using High-Level Abstractions Gregor Daiß (University of Stuttgart), Parsa Amini (Louisiana State University), John Biddiscombe (Swiss National Supercomputing Centre (CSCS)), Patrick Diehl and Juhan Frank (Louisiana State University), Kevin Huck (University of Oregon), Hartmut Kaiser and Dominic Marcello (Louisiana State University), and David Pfander and Dirk Pflüger (University of Stuttgart)

  31. FT-iSort: Efficient Fault Tolerance for Introsort Sihuan Li, Hongbo Li, and Xin Liang (University of California, Riverside); Jieyang Chen (Oak Ridge National Laboratory); Elisabeth Giem, Kaiming Ouyang, and Kai Zhao (University of California, Riverside); Sheng Di and Franck Cappello (Argonne National Laboratory); and Zizhong Chen (University of California, Riverside)

  32. Full-State Quantum Circuit Simulation by Using Data Compression Xin-Chuan Wu (University of Chicago); Sheng Di (Argonne National Laboratory); Emma Maitreyee Dasgupta (University of Chicago); Franck Cappello, Hal Finkel, and Yuri Alexeev (Argonne National Laboratory); and Frederic T. Chong (University of Chicago)

  33. Fully Integrated FPGA Molecular Dynamics Simulations Chen Yang, Tong Geng, Tianqi Wang, Rushi Patel, Qingqing Xiong, Ahmed Sanaullah, and Chunshu Wu (Boston University); Jiayi Sheng (Falcon Computing Solutions Inc); Charles Lin, Vipin Sachdeva, and Woody Sherman (Silicon Therapeutics); and Martin Herbordt (Boston University)

  34. GPCNeT: Designing a Benchmark Suite for Inducing and Measuring Contention in HPC Networks Sudheer Chunduri (Argonne National Laboratory), Taylor Groves (Lawrence Berkeley National Laboratory), Peter Mendygral (Cray Inc), Brian Austin (Lawrence Berkeley National Laboratory), Jacob Balma and Krishna Kandalla (Cray Inc), Kalyan Kumaran (Argonne National Laboratory), Glenn Lockwood (Lawrence Berkeley National Laboratory), Scott Parker (Argonne National Laboratory), Steven Warren and Nathan Wichmann (Cray Inc), and Nicholas Wright (Lawrence Berkeley National Laboratory)

  35. GPU Acceleration of Extreme Scale Pseudo-Spectral Simulations of Turbulence Using Asynchronism Kiran Ravikumar (Georgia Institute of Technology), David Appelhans (IBM Corporation), and P.K. Yeung (Georgia Institute of Technology)

  36. GraphM: An Efficient Storage System for High Throughput of Concurrent Graph Processing Jin Zhao, Yu Zhang, and Xiaofei Liao (Huazhong University of Science and Technology); Ligang He (University of Warwick); Bingsheng He (National University of Singapore); and Hai Jin, Haikun Liu, and Yicheng Chen (Huazhong University of Science and Technology)

  37. Hatchet: Pruning the Overgrowth in Parallel Profiles Abhinav Bhatele (University of Maryland, Lawrence Livermore National Laboratory) and Stephanie Brink and Todd Gamblin (Lawrence Livermore National Laboratory)

  38. High Performance Monte Carlo Simulation of Ising Model on TPU Clusters Kun Yang, Yi-Fan Chen, Georgios Roumpos, Chris Colby, and John Anderson (Google LLC)

  39. HyperX Topology: First At-Scale Implementation and Comparison to the Fat-Tree Jens Domke and Satoshi Matsuoka (RIKEN Center for Computational Science (R-CCS), RIKEN); Ivan Radanov Ivanov, Yuki Tsushima, Tomoya Yuki, Akihiro Nomura, and Shin'ichi Miura (Tokyo Institute of Technology); and Nic McDonald, Dennis Lee Floyd, and Nicolas Dubé (Hewlett Packard Enterprise)

  40. iFDK: A Scalable Framework for Instant High-Resolution Image Reconstruction Peng Chen (Tokyo Institute of Technology, National Institute of Advanced Industrial Science and Technology (AIST)); Mohamed Wahib, Shinichiro Takizawa, and Ryousei Takano (National Institute of Advanced Industrial Science and Technology (AIST)); and Satoshi Matsuoka (RIKEN Center for Computational Science (R-CCS), Tokyo Institute of Technology)

  41. INCA: In-Network Compute Assistance Whit Schonbein, Ryan E. Grant, and Matthew G. F. Dosanjh (Sandia National Laboratories) and Dorian Arnold (Emory University)

  42. Large-Batch Training for LSTM and Beyond Yang You (University of California, Berkeley; Google LLC); Jonathan Hseu and Chris Ying (Google LLC); James Demmel and Kurt Keutzer (University of California, Berkeley); and Cho-Jui Hsieh (University of California, Los Angeles (UCLA); Google LLC)

  43. A Large-Scale Study of MPI Usage in Open-Source HPC Applications Ignacio Laguna (Lawrence Livermore National Laboratory); Ryan Marshall (University of Tennessee, Chattanooga); Kathryn Mohror (Lawrence Livermore National Laboratory); Martin Ruefenacht and Anthony Skjellum (University of Tennessee, Chattanooga); and Nawrin Sultana (Auburn University)

  44. Legate NumPy: Accelerated and Distributed Array Computing Michael Bauer and Michael Garland (Nvidia Corporation)

  45. Local-Global Merge Tree Computation with Local Exchanges Arnur Nigmetov (Graz University of Technology) and Dmitriy Morozov (Lawrence Berkeley National Laboratory)

  46. LPCC: Hierarchical Persistent Client Caching for Lustre Yingjin Qian, Xi Li, and Shuichi Ihara (DataDirect Networks (DDN)); Andreas Dilger (Whamcloud Inc); Carlos Thomaz and Shilong Wang (DataDirect Networks (DDN)); Wen Cheng, Chunyan Li, Lingfang Zeng, Fang Wang, and Dan Feng (Huazhong University of Science and Technology); and Tim Suesst and Andre Brinkmann (Johannes Gutenberg University Mainz)

  47. A Massively Parallel Infrastructure for Adaptive Multiscale Simulations: Modeling RAS Initiation Pathway for Cancer Francesco Di Natale, Harsh Bhatia, and Timothy S. Carpenter (Lawrence Livermore National Laboratory); Chris Neale (Los Alamos National Laboratory); Sara Kokkila Schumacher (IBM Research); Tomas Oppelstrup (Lawrence Livermore National Laboratory); Liam Stanton (San Jose State University); Xiaohua Zhang, Shiv Sundram, Thomas R. W. Scogland, Gautham Dharuman, Michael P. Surh, and Yue Yang (Lawrence Livermore National Laboratory); Claudia Misale (IBM Research); Lars Schneidenbach, Carlos Costa, Changhoan Kim, and Bruce D'Amora (IBM Corporation); Sandrasegaram Gnanakaran (Los Alamos National Laboratory); Dwight V. Nissley (Frederick National Laboratory for Cancer Research); and Fred Streitz, Felice C. Lightstone, Peer-Timo Bremer, James N. Glosli, and Helgi I. Ingolfsson (Lawrence Livermore National Laboratory)

  48. MemXCT: Memory-Centric X-Ray CT Reconstruction with Massive Parallelization Mert Hidayetoglu (University of Illinois); Tekin Bicer (Argonne National Laboratory); Simon Garcia de Gonzalo (University of Illinois); Bin Ren (College of William & Mary); Doga Gursoy, Rajkumar Kettimuthu, and Ian T. Foster (Argonne National Laboratory); and Wen-Mei W. Hwu (University of Illinois)

  49. MIQS: Metadata Indexing and Querying Service for Self-Describing File Formats Wei Zhang (Texas Tech University), Suren Byna and Houjun Tang (Lawrence Berkeley National Laboratory), and Brody Williams and Yong Chen (Texas Tech University)

  50. Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing Daniele De Sensi (University of Pisa, ETH Zurich) and Salvatore Di Girolamo and Torsten Hoefler (ETH Zurich)

  51. Moment Representation in the Lattice Boltzmann Method on Massively Parallel Hardware Madhurima Vardhan (Duke University), John Gounley (Oak Ridge National Laboratory), Luiz Hegele (Santa Catarina State University), Erik Draeger (Lawrence Livermore National Laboratory), and Amanda Randles (Duke University)

  52. Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication Daichi Fujiki (University of Michigan); Niladrish Chatterjee and Donghyuk Lee (Nvidia Corporation); and Mike O'Connor (Nvidia Corporation, University of Texas)

  53. Network-Accelerated Non-Contiguous Memory Transfers Salvatore Di Girolamo (ETH Zurich, Cray Inc); Konstantin Taranov, Andreas Kurth, Michael Schaffner, and Timo Schneider (ETH Zurich); Jakub Beranek (IT4Innovations, Czech Republic); Maciej Besta and Luca Benini (ETH Zurich); Duncan Roweth (Cray Inc); and Torsten Hoefler (ETH Zurich)

  54. OpenKMC: a KMC Design for a Hundred-Billion-Atom Simulation Using Millions of Cores on Sunway Taihulight Kun Li (Institute of Computing Technology, Chinese Academy of Sciences; Chinese Academy of Sciences); Honghui Shang and Yunquan Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Shigang Li (ETH Zurich); Baodong Wu (Institute of Computing Technology, Chinese Academy of Sciences; SenseTime Research); Dong Wang (Dalian Ocean University); Libo Zhang and Fang Li (Wuxi Jiangnan Institute of Computing Technology); Dexun Chen (National Supercomputing Center, Wuxi); and Zhiqiang Wei (Qingdao National Laboratory for Marine Science and Technology)

  55. Optimizing the Data Movement in Quantum Transport Simulations via Data-Centric Parallel Programming Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernandez, Timo Schneider, Mathieu Luisier, and Torsten Hoefler (ETH Zurich)

  56. Parallel Transport Time Dependent Density Functional Theory Calculations with Hybrid Functional on Summit Weile Jia (University of California, Berkeley); Lin-Wang Wang (Lawrence Berkeley National Laboratory); and Lin Lin (University of California, Berkeley; Lawrence Berkeley National Laboratory)

  57. Performance Optimality or Reproducibility: That Is the Question Tapasya Patki and Jayaraman J. Thiagarajan (Lawrence Livermore National Laboratory) and Alexis Ayala and Tanzima Z. Islam (Western Washington University)

  58. Pinpointing Performance Inefficiencies via Lightweight Variance Profiling Pengfei Su and Shuyin Jiao (College of William & Mary), Milind Chabbi (Scalable Machines Research), and Xu Liu (College of William & Mary)

  59. PoDD: Power-Capping Dependent Distributed Applications Huazhe Zhang and Henry Hoffmann (University of Chicago)

  60. Practical and Efficient Incremental Adaptive Routing for HyperX Networks Nic McDonald (Google LLC), Mikhail Isaev (Georgia Institute of Technology), Adriana Flores (Nvidia Corporation), Al Davis (Hewlett Packard Enterprise), and John Kim (Korea Advanced Institute of Science and Technology (KAIST))

  61. Predicting Faults in High Performance Computing Systems: An In-Depth Survey of the State-of-the-Practice David Jauk, Dai Yang, and Martin Schulz (Technical University Munich)

  62. Preparation and Optimization of a Diverse Workload for a Large-Scale Heterogeneous System Ian Karlin (Lawrence Livermore National Laboratory); Yoonho Park (IBM Research); Bronis R. de Supinski (Lawrence Livermore National Laboratory); Peng Wang (Nvidia Corporation); Bert Still, David Beckingsale, and Robert Blake (Lawrence Livermore National Laboratory); Tong Chen, Guojing Cong, Carlos Costa, Johann Dahm, and Giacomo Domeniconi (IBM Research); Thomas Epperly and Aaron Fisher (Lawrence Livermore National Laboratory); Sara Kokkila Schumacher (IBM Research); Steven Langer and Hai Le (Lawrence Livermore National Laboratory); Eun Kyung Lee (IBM Research); Naoya Maruyama (Lawrence Livermore National Laboratory); Xinyu Que (IBM Research); David Richards, Bjorn Sjogreen, Jonathan Wong, Carol Woodward, Ulrike Yang, Xiaohua Zhang, and Bob Anderson (Lawrence Livermore National Laboratory); David Appelhans (IBM Research); Levi Barnes (Nvidia Corporation); Peter Barnes, Sorin Bastea, David Boehme, Jamie A. Bramwell, and Jim Brase (Lawrence Livermore National Laboratory); Jose Brunheroto (IBM Research); Barry Chen, Charway R. Cooper, Tony DeGroot, Rob Falgout, Todd Gamblin, David Gardner, and James Glosli (Lawrence Livermore National Laboratory); John Gunnels (IBM Research); Max Katz (Nvidia Corporation); Tzanio Kolev, I-Feng W. Kuo, Matthew P. Legendre, Ruipeng Li, and Pei-Hung Lin (Lawrence Livermore National Laboratory); Shelby Lockhart (University of Illinois); Kathleen McCandless (Lawrence Livermore National Laboratory); Claudia Misale and Jaime Moreno (IBM Research); Rob Neely, Jarom Nelson, and Rao Nimmakayala (Lawrence Livermore National Laboratory); Kathryn O'Brien and Kevin O'Brien (IBM Research); Ramesh Pankajakshan and Roger Pearce (Lawrence Livermore National Laboratory); Slaven Peles (Pacific Northwest National Laboratory (PNNL)); Phil Regier (Lawrence Livermore National Laboratory); Steve Rennich (Nvidia Corporation); Martin Schulz (Technical University Munich); Howard Scott (Lawrence Livermore National Laboratory); James Sexton (IBM Research); Kathleen Shoga and Shiv Sundram (Lawrence Livermore National Laboratory); Guillaume Thomas-Collignon (Nvidia Corporation); Brian van Essen and Alexey Voronin (Lawrence Livermore National Laboratory); Bob Walkup (IBM Research); Lu Wang (Lawrence Livermore National Laboratory); Chris Ward (IBM Research, UK); Hui-Fang Wen (IBM Research); Dan White and Christopher Young (Lawrence Livermore National Laboratory); Cyril Zeller (Nvidia Corporation); and Ed Zywicz (Lawrence Livermore National Laboratory)

  63. PruneTrain: Fast Neural Network Training by Dynamic Sparse Model Reconfiguration Sangkug Lym, Esha Choukse, and Siavash Zangeneh (University of Texas); Wei Wen (Duke University); and Sujay Sanghavi and Mattan Erez (University of Texas)

  64. Red-Blue Pebbling Revisited: Near Optimal Parallel Matrix Multiplication Grzegorz Kwasniewski (ETH Zurich); Marko Kabic (ETH Zurich, Swiss National Supercomputing Centre (CSCS)); Maciej Besta (ETH Zurich); Raffaele Solca and Joost VandeVondele (ETH Zurich, Swiss National Supercomputing Centre (CSCS)); and Torsten Hoefler (ETH Zurich)

  65. Regularizing Irregularly Sparse Point-to-Point Communications Oguz Selvitopi (Lawrence Berkeley National Laboratory) and Cevdet Aykanat (Bilkent University, Turkey)

  66. Replication Is More Efficient Than You Think Anne Benoit (ENS Lyon); Thomas Herault (University of Tennessee); Valentin Le Fèvre (ENS Lyon); and Yves Robert (ENS Lyon, University of Tennessee)

  67. Revisiting I/O Behavior in Large-Scale Storage Systems: The Expected and the Unexpected Tirthak Patel (Northeastern University), Suren Byna and Glenn K. Lockwood (Lawrence Berkeley National Laboratory), and Devesh Tiwari (Northeastern University)

  68. Scalable Generation of Graphs for Benchmarking HPC Community-Detection Algorithms George M. Slota (Rensselaer Polytechnic Institute (RPI)) and Jonathan W. Berry, Simon D. Hammond, Stephen L. Olivier, Cynthia A. Phillips, and Sivasankaran Rajamanickam (Sandia National Laboratories)

  69. Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research Prasanna Balaprakash, Romain Egele, Misha Salim, Stefan Wild, Venkatram Vishwanath, Fangfang Xia, Tom Brettin, and Rick Stevens (Argonne National Laboratory)

  70. Scalable Simulation of Realistic Volume Fraction Red Blood Cell Flows through Vascular Networks Libin Lu and Matthew J. Morse (New York University, Courant Institute of Mathematical Sciences); Abtin Rahimian (University of Colorado); and Georg Stadler and Denis Zorin (New York University, Courant Institute of Mathematical Sciences)

  71. Semantic Query Transformations for Increased Parallelization in Distributed Knowledge Graph Query Processing HyeongSik Kim (Robert Bosch LLC) and Abhisha Bhattacharyya and Kemafor Anyanwu (North Carolina State University)

  72. Significantly Improving Lossy Compression Quality Based on an Optimized Hybrid Prediction Model Xin Liang (University of California, Riverside); Sheng Di (Argonne National Laboratory); Sihuan Li (University of California, Riverside); Dingwen Tao (University of Alabama); Bogdan Nicolae (Argonne National Laboratory); Zizhong Chen (University of California, Riverside); and Franck Cappello (Argonne National Laboratory)

  73. Slack Squeeze Coded Computing for Adaptive Straggler Mitigation Krishna Giri Narra, Zhifeng Lin, Mehrdad Kiamari, Salman Avestimehr, and Murali Annavaram (University of Southern California)

  74. SLATE: Design of a Modern Distributed and Accelerated Linear Algebra Library Mark Gates, Jakub Kurzak, Ali Charara, and Asim YarKhan (University of Tennessee) and Jack Dongarra (University of Tennessee; Oak Ridge National Laboratory, University of Manchester)

  75. Slim Graph: Practical Lossy Graph Compression for Approximate Graph Processing, Storage, and Analytics Maciej Besta, Simon Weber, Lukas Gianinazzi, Robert Gerstenberger, Andrey Ivanov, Yishai Oltchik, and Torsten Hoefler (ETH Zurich)

  76. Solving PDEs in Space-Time: 4D Tree-Based Adaptivity, Mesh-Free and Matrix-Free Approaches Masado Ishii and Milinda Fernando (University of Utah); Kumar Saurabh, Biswajit Khara, and Baskar Ganapathysubramanian (Iowa State University); and Hari Sundar (University of Utah)

  77. SparCML: High-Performance Sparse Communication for Machine Learning Cedric Renggli (ETH Zurich); Saleh Ashkboos (Institute of Science and Technology Austria); Mehdi Aghagolzadeh (Microsoft Corporation); Dan Alistarh (Institute of Science and Technology Austria, Neural Magic); and Torsten Hoefler (ETH Zurich)

  78. Spread-n-Share: Improving Application Performance and Cluster Throughput with Resource-Aware Job Placement Xiongchao Tang (Tsinghua University, China; Sangfor Technologies Inc.); Haojie Wang (Tsinghua University, China); Xiaosong Ma (Qatar Computing Research Institute); Nosayba El-Sayed (Emory University); Jidong Zhai and Wenguang Chen (Tsinghua University, China); and Ashraf Aboulnaga (Qatar Computing Research Institute)

  79. SSD Failures in the Field: Symptoms, Causes, and Prediction Models Jacob Alter and Ji Xue (College of William & Mary), Alma Dimnaku (NetApp Inc), and Evgenia Smirni (College of William & Mary)

  80. Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures Tal Ben-Nun, Johannes de Fine Licht, Alexandros Nikolaos Ziogas, Timo Schneider, and Torsten Hoefler (ETH Zurich)

  81. Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware Tiziano De Matteis and Johannes de Fine Licht (ETH Zurich); Jakub Beránek (IT4Innovations, Czech Republic); and Torsten Hoefler (ETH Zurich)

  82. SW_GROMACS: Accelerate GROMACS on SUNWAY TaihuLight Tingjian Zhang (Shandong University; National Supercomputing Center, Wuxi); Yuxuan Li (Tsinghua University, China; National Supercomputing Center, Wuxi); Ping Gao, Qi Shao, Mingshan Shao, and Meng Zhang (Shandong University; National Supercomputing Center, Wuxi); Jinxiao Zhang (Shandong University); Xiaohui Duan (Shandong University; National Supercomputing Center, Wuxi); Zhao Liu, Lin Gan, Haohuan Fu, and Wei Xue (Tsinghua University, China; National Supercomputing Center, Wuxi); Weiguo Liu (Shandong University; National Supercomputing Center, Wuxi); and Guangwen Yang (Tsinghua University, China; National Supercomputing Center, Wuxi)

  83. Swift Machine Learning Model Serving Scheduling: A Region Based Reinforcement Learning Approach Heyang Qin and Syed Zawad (University of Nevada, Reno); Yanqi Zhou (Google Brain); and Lei Yang, Dongfang Zhao, and Feng Yan (University of Nevada, Reno)

  84. Topology-Custom UGAL Routing on Dragonfly Md Shafayat Rahman, Saptarshi Bhowmik, Yevgeniy Ryasnianskiy, and Xin Yuan (Florida State University) and Michael Lang (Los Alamos National Laboratory)

  85. TriEC: Tripartite Graph Based Erasure Coding NIC Offload Haiyang Shi and Xiaoyi Lu (Ohio State University)

  86. Uncore Power Scavenger: A Runtime for Uncore Power Conservation on HPC Systems Neha Gholkar and Frank Mueller (North Carolina State University) and Barry Rountree (Lawrence Livermore National Laboratory)

  87. Understanding Congestion in High Performance Interconnection Networks Using Sampling Philip A. Taffet (Rice University, Lawrence Livermore National Laboratory) and John M. Mellor-Crummey (Rice University)

  88. Understanding Priority-Based Scheduling of Graph Algorithms on a Shared-Memory Platform Serif Yesil and Azin Heidarshenas (University of Illinois), Adam Morrison (Tel Aviv University), and Josep Torrellas (University of Illinois)

  89. A Versatile Software Systolic Execution Model for GPU Memory Bound Kernels Peng Chen (Tokyo Institute of Technology, National Institute of Advanced Industrial Science and Technology (AIST)); Mohamed Wahib, Shinichiro Takizawa, and Ryousei Takano (National Institute of Advanced Industrial Science and Technology (AIST)); and Satoshi Matsuoka (RIKEN Center for Computational Science (R-CCS), Tokyo Institute of Technology)
