Bulletin of the American Physical Society

APS March Meeting 2022

Volume 67, Number 3

Monday–Friday, March 14–18, 2022; Chicago

Session B48: Building the Bridge to Exascale: Applications and Opportunities in Materials, Chemistry, and Biology I

Sponsoring Units: DCOMP DCP DPOLY DMP
Chair: Jack Deslippe, LBNL
Room: McCormick Place W-471A

Monday, March 14, 2022 11:30AM - 12:06PM	B48.00001: Towards a Realistic Description of H₃O⁺ and OH⁻ Transport through Confined Environments using Machine Learning and an Order-N Framework for Condensed-Phase Hybrid Density Functional Theory Invited Speaker: Robert A Distasio By including a fraction of exact exchange (EXX), hybrid functionals reduce the self-interaction error in semi-local density functional theory (DFT), and thereby furnish a more accurate and reliable description of the electronic structure in systems throughout chemistry, physics, and materials science. However, the high computational cost associated with hybrid DFT limits its applicability when treating large-scale and complex condensed-phase systems. To overcome this limitation, we have devised a highly accurate and linear-scaling (order-N) approach based on a local representation of the occupied space that exploits sparsity when evaluating the EXX interaction in real space, and recently extended this framework to treat heterogeneous systems without the need for system-dependent parameters. In this work, we use this approach to generate high-quality training data at the dispersion-inclusive hybrid DFT level for training a reactive machine-learned potential to study how confinement affects the diffusion of H₃O⁺(aq) and OH⁻(aq) at experimentally relevant length and time scales. To enable such large-scale data generation, we have performed a comprehensive overhaul of our software to exploit next-generation high-performance computing architectures, including a three-pronged strategy to improve the computation (including GPU acceleration), communication, and workload balance. With these developments, this work brings us closer to understanding H₃O⁺/OH⁻ transport through confined aqueous environments, which is of fundamental importance in the energy sciences (e.g., transport/conductivity in alkaline fuel cells).
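For readers less familiar with the terminology, the "fraction of exact exchange" mentioned in the abstract can be written out as follows. This is the generic one-parameter global hybrid form (for example, PBE0 uses a = 1/4), given here as an illustration and not as the specific functional used by the speakers. The symbols a (mixing fraction), SL (semi-local), and φ_i (occupied spin orbitals, atomic units, spin integration implied) are notation assumed for this sketch.

```latex
E_{xc}^{\mathrm{hyb}} = a\,E_{x}^{\mathrm{EXX}} + (1-a)\,E_{x}^{\mathrm{SL}} + E_{c}^{\mathrm{SL}},
\qquad
E_{x}^{\mathrm{EXX}} = -\frac{1}{2}\sum_{i,j}^{\mathrm{occ}} \iint
\frac{\phi_i^{*}(\mathbf{r})\,\phi_j(\mathbf{r})\,\phi_j^{*}(\mathbf{r}')\,\phi_i(\mathbf{r}')}
     {|\mathbf{r}-\mathbf{r}'|}\,
d\mathbf{r}\,d\mathbf{r}'
```

The double sum over orbital pairs is the source of the high cost. Because the EXX energy is unchanged by a unitary rotation of the occupied orbitals, it can be evaluated with localized orbitals, and a pair (i, j) whose product φ_i*(r) φ_j(r) vanishes everywhere contributes nothing. That is the kind of sparsity a real-space, local-orbital approach can exploit.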
Monday, March 14, 2022 12:06PM - 12:18PM	B48.00002: Fueling a Data-Driven Machine Learning Model for H₃O⁺ and OH⁻ Transport through Confined Aqueous Environments: A High-Throughput Order-N Framework for Condensed-Phase Hybrid Density Functional Theory at Work Hsin-Yu Ko, Marcos F Calegari Andrade, Zachary M Sparrow, Brian G Ernst, Jalen A Harris, Robert A Distasio By including a fraction of exact exchange (EXX), hybrid functionals reduce the self-interaction error in semi-local density functional theory (DFT), and thereby furnish a more accurate and reliable description of the electronic structure in systems throughout chemistry, physics, and materials science. In particular, it has been demonstrated that dispersion-inclusive hybrid DFT can provide a semi-quantitative description of H₃O⁺ and OH⁻ transport in bulk aqueous solutions. However, the high computational cost associated with hybrid DFT limits its applicability when treating such large-scale and complex condensed-phase systems. To overcome this limitation, we have developed a highly accurate linear-scaling (order-N) approach for treating finite-gap (homogeneous and heterogeneous) systems without system-dependent parameters. Furthermore, we have devised a GPU-accelerated implementation of this framework to generate high-quality dispersion-inclusive hybrid DFT data for building a deep neural network potential for aqueous H₃O⁺ and OH⁻ in bulk and confined environments. With these developments, this work brings us closer to understanding H₃O⁺ and OH⁻ transport through confined aqueous environments, which is of fundamental importance in the energy sciences.
Monday, March 14, 2022 12:18PM - 12:30PM	B48.00003: Optimization and performance of RMG DFT-based electronic structure software on exascale architectures Emil Briggs, Wenchang Lu, Jerry Bernholc Exascale computer architectures are close to deployment, with several systems expected to come online within the next 1-2 years. Exploiting their full computational power to study challenging scientific problems requires careful attention to machine design and its interaction with the algorithms used in scientific codes. The RMG software package for electronic structure calculations has been optimized for pre-exascale machines such as Summit at ORNL, and we describe the work being done to port RMG to exascale-class machines. RMG uses a real-space formulation of the Kohn-Sham equations that maps well to distributed node architectures via domain decomposition. However, using mixed CPU-GPU architectures efficiently requires careful work to hide or reduce the latencies associated with CPU-GPU and internode data transfers. We discuss the methods used to address these issues in base RMG and those emerging when implementing advanced features, such as hybrid functionals, semi-local pseudopotentials, and spin-orbit coupling. Tests on AMD testbeds for the exascale Frontier show very promising performance. RMG source code and build scripts for pre-exascale Summit, Cray XE-XK, clusters, Linux, and Windows workstations are at www.rmgdft.org, together with help files and examples.
Monday, March 14, 2022 12:30PM - 12:42PM	B48.00004: CPU-GPU optimization and performance of RMG linear-scaling module with optimally localized orbitals Wenchang Lu, Emil Briggs, Jerry Bernholc The open-source RMG (Real-Space Multigrid) package solves the Kohn-Sham equations directly on a real-space grid using multigrid acceleration. With a careful data-structure design, RMG is massively parallel on supercomputers with and without GPU accelerators. We recently released a new RMG module that can potentially lead to linear scaling with the system size, i.e., the number of atoms or electrons. In contrast to the main module, in which the wave functions are delocalized and directly represented on real-space grids, the localized-orbitals module expands the wave functions as a linear combination of strictly localized orbitals that are optimized variationally. For GPU acceleration, we implemented explicit memory management for multiple GPUs and CPU cores per node, which can be easily adapted to either the CUDA (NVIDIA) or the HIP (AMD) programming environment. Timings on the new AMD testbed for the exascale Frontier show very good scalability and GPU speed-up.
Monday, March 14, 2022 12:42PM - 12:54PM	B48.00005: Large-Scale Materials Science Codes Porting Strategies for Next Generation Exascale Architectures Mauro Del Ben, Steven G Louie, Jack R Deslippe Due to the intensive computational workload of their applications, materials science codes have been, and still are, among those that benefit most from leadership-class HPC facilities. In current state-of-the-art systems, graphics processing units (GPUs) dominate the HPC paradigm and force developers to actively maintain and optimize core compute kernels going forward. In this talk, we will focus on our experiences navigating this portability effort for the BerkeleyGW software package. BerkeleyGW is a massively parallel software package employed to study the excited state properties of electrons in materials by using the GW and the GW plus Bethe-Salpeter Equation (GW-BSE) methods, and beyond. The code is capable of scaling out to tens of thousands of nodes and effectively utilizing strong-scaling GPU architectures. We will discuss our experiences porting BerkeleyGW to three different GPU programming models (CUDA, OpenACC, OpenMP Target) and to various GPU vendor architectures, as well as the challenges to true performance portability that we encountered along the way. Special attention will be paid to code modernization practices which we found useful in the porting pipeline.
Monday, March 14, 2022 12:54PM - 1:06PM	B48.00006: GPU-Acceleration of Large-Scale Full-Frequency GW Calculations Victor Yu, Marco Govoni Many-body perturbation theory is a powerful method to simulate electronic excitations in molecules and materials starting from the output of density functional theory calculations. However, its widespread application to large systems has been hindered by the high computational cost. We present a GPU acceleration study of the full-frequency GW method for periodic systems, as implemented in the WEST code [http://west-code.org]. We discuss the use of (1) optimized GPU libraries, e.g., cuFFT and cuBLAS, (2) a hierarchical parallelization strategy that minimizes CPU-GPU, GPU-GPU, and CPU-CPU data transfer operations, (3) asynchronous GPU kernels that overlap with MPI communications, and (4) mixed precision in selected portions of the code. We demonstrate a substantial speedup of the GPU-accelerated version of WEST with respect to its CPU version, and we show good strong and weak scaling using up to 25,920 GPUs on the OLCF/Summit supercomputer. The GPU version of WEST yields electronic structures using the full-frequency GW method for realistic nanostructures and interfaces comprising up to 10,368 electrons. This work was supported by MICCoM, as part of the Computational Materials Sciences Program funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences.
Monday, March 14, 2022 1:06PM - 1:18PM	B48.00007: QMCPACK performance portability on NVIDIA and AMD GPUs Ye Luo, Peter Doak, Paul Kent As exascale supercomputers are being deployed in the U.S., QMCPACK (https://qmcpack.org) developers have migrated the code base to a performance-portable implementation for science production on these powerful machines with an out-of-the-box experience. With a fresh design of the code architecture, historically divergent code paths for CPUs and GPUs have been unified, and a core set of features is available today on all computing platforms, including CPUs and GPUs. With the portable OpenMP target offload programming model and high-quality vendor linear algebra libraries, impressive performance has been achieved with minimal vendor-specific customization. We show current performance for materials calculations on NVIDIA and AMD GPUs with a broad range of electron counts and analyze the remaining inefficiencies.
Monday, March 14, 2022 1:18PM - 1:30PM	B48.00008: Using high accuracy many-body methods (QMC and sCI) to describe ground state and excited state properties of strongly correlated battery cathodes Anouar Benali, Kevin E Gasperich, Tomas Rojas, Vijay R Singh, Pallab Barai, Anh T Ngo, Hyeondeok Shin Since the commercialization of the first Lithium Ion Batteries (LIBs) with LiCoO₂ as the active material in the cathodes, its electrochemical performance has been extensively studied both experimentally and theoretically. Increasing the capacity of LIBs with high voltage leads to oxygen loss and surface reconstruction, resulting in rapid capacity fade and impedance growth. Using a surface coating treatment stabilizes the surface and minimizes reactions with electrolytes but requires a good understanding of the electronic structure and properties of LiCoO₂. In the past decades, many studies using Density Functional Theory (DFT) corrected for strong correlation with an ad-hoc Hubbard U energy were published, reproducing many important properties of the material. While DFT+U can reproduce a known property (band gap or lattice parameter), the value of U needs to be updated for each new property, making the approach non-predictive. In this talk we use a combination of Diffusion Monte Carlo (DMC) and selected Configuration Interaction (sCI) for solids to describe the orbitals of LiCoO₂, leading to a better description of the band gaps of strongly correlated transition metal oxide materials and opening the path to more reliable and trial-wavefunction-invariant DMC calculations.
Monday, March 14, 2022 1:30PM - 1:42PM	B48.00009: Forces, stresses and related properties in solids by plane-wave auxiliary-field quantum Monte Carlo Siyuan Chen, Fengjie Ma, Shiwei Zhang We present accurate calculations of interatomic forces and stresses in solid state systems, using the plane-wave auxiliary-field quantum Monte Carlo (PW-AFQMC) method [1]. AFQMC has been shown to be an excellent many-body total energy method. Computation of observables other than the ground-state energy requires back-propagation [2], which we have implemented in the PW-AFQMC framework and used to compute accurate charge densities [3]. Here we present results on computing derivatives of the total energy, including forces and stresses. Accurate AFQMC interatomic forces and stresses can be applied for a full geometry optimization in solids, which we demonstrate in the silicon beta-tin structure and the molybdenum disulfide (MoS₂) monolayer. Further, we generalize the correlated-sampling technique [4] to compute observables, and demonstrate it by computing the phonon spectrum from the force fields at a low cost. This paves the way for ab initio many-body computation of thermodynamic properties.
Monday, March 14, 2022 1:42PM - 1:54PM	B48.00010: Ab initio Calculations in Atoms, Molecules, and Solids, Treating Spin-Orbit Coupling and Electron Interaction on Equal Footing Brandon Eskridge, Henry Krakauer, Hao Shi, Shiwei Zhang Understanding the interplay between electron-electron interaction and spin-orbit coupling in molecules and materials is key to answering various fundamental and technological questions. An unbiased theoretical treatment requires that material specificity, electron correlation, and spin-orbit coupling (SOC) be captured accurately and on equal footing. We have incorporated an explicit, non-perturbative treatment of SOC into ab initio auxiliary-field quantum Monte Carlo (AFQMC) calculations. The approach allows a general computational framework for molecular and bulk systems in which material specificity, electron correlation, and SOC effects can be captured accurately, with favorable computational scaling versus system size. We adopt relativistic effective-core potentials which have been obtained by fitting to fully relativistic data and which have demonstrated a high degree of reliability and transferability in molecular systems. This results in a 2-component spin-coupled Hamiltonian, which is then treated by generalizing the ab initio AFQMC approach. We demonstrate the method by computing the electron affinity in Pb and the bond dissociation energy in Br₂ and I₂, and by applying it to solid Bi.
Monday, March 14, 2022 1:54PM - 2:06PM	B48.00011: Electronic structure calculations at finite temperature using the piecewise interaction picture density matrix quantum Monte Carlo approach William Z Van Benschoten, James J Shepherd In this work, we present a modification to the propagator in density matrix quantum Monte Carlo (DMQMC), a method to sample the N-body density matrix for an electronic Hamiltonian. Because it starts from the interaction picture (IP), IP-DMQMC samples only one temperature during a single calculation; it was developed to alleviate the under-sampling of important energy states. The new approach combines the IP-DMQMC and DMQMC propagators in a piecewise fashion (PIP-DMQMC), allowing for near-continuous temperature sampling. We benchmark this method by comparing to the sum over full configuration interaction (FCI) states, IP-DMQMC, and DMQMC. We find equivalent or improved energy estimates for the benchmark systems. We then use initiator PIP-DMQMC to simulate the water and methane molecules in the cc-pVDZ basis. Finally, we compare the cost of this method to that of the original IP-DMQMC method. We believe this method will extend the size of systems which can be accurately treated at finite temperature and will provide useful comparison data for other finite temperature methods.
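The "sum over FCI states" benchmark mentioned in the abstract refers to the exact canonical-ensemble average over a known spectrum. A minimal sketch of that reference quantity is given below, assuming the eigenvalues are already available as a list; the function name `thermal_energy` and the two-level example are hypothetical and are not taken from the authors' code.

```python
import math


def thermal_energy(eigenvalues, beta):
    """Canonical average <E> = sum_i E_i exp(-beta E_i) / sum_i exp(-beta E_i).

    eigenvalues: iterable of energy levels (each degenerate level listed once
                 per state), assumed here to come from an exact diagonalization.
    beta:        inverse temperature 1 / (k_B T), in units inverse to the energies.
    """
    levels = list(eigenvalues)
    # Shift by the lowest level so the exponentials cannot overflow;
    # the shift cancels between numerator and denominator.
    e_min = min(levels)
    weights = [math.exp(-beta * (e - e_min)) for e in levels]
    partition = sum(weights)
    return sum(w * e for w, e in zip(weights, levels)) / partition


if __name__ == "__main__":
    # Hypothetical two-level system with energies 0 and 1 (arbitrary units).
    for beta in (0.0, 1.0, 10.0):
        print(beta, thermal_energy([0.0, 1.0], beta))
```

At beta = 0 every state is equally weighted, and as beta grows the average approaches the ground-state energy. A stochastic finite-temperature method can be checked against this kind of sum only for systems small enough to diagonalize exactly.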
Monday, March 14, 2022 2:06PM - 2:18PM	B48.00012: Rapid Generation of Optimal Generalized Monkhorst-Pack Grids Yunzhe Wang, Pandu Wisesa, Adarsh Balasubramanian, Shyam Dwaraknath, Tim Mueller The calculation of properties of crystalline materials can often be accelerated by using k-point grids to estimate an integral in reciprocal space. We have shown that a generalization of the widely-used Monkhorst-Pack method for k-point grid generation can reduce the number of symmetrically distinct k-points required to reach a given level of accuracy, and thus accelerate computational throughput, by roughly a factor of two for well-converged density functional theory calculations. To facilitate the widespread adoption of this approach, we have developed algorithms that enable dynamic generation of optimal generalized Monkhorst-Pack grids within seconds. We present these algorithms and several software tools in which they are implemented, including a C++ library with a Python interface designed for integration with third-party software packages.
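As background for the abstract above, the conventional (non-generalized) Monkhorst-Pack grid places N points along each reciprocal lattice vector at fractional coordinates u_r = (2r - N - 1) / (2N), r = 1..N. The sketch below generates only that standard grid; the function name `monkhorst_pack` is chosen for illustration. It does not reproduce the authors' generalized grids or their symmetry reduction, and it is unrelated to their C++ library and its Python interface.

```python
from itertools import product


def monkhorst_pack(n1, n2, n3):
    """Return the standard Monkhorst-Pack grid in fractional reciprocal coordinates.

    Each axis with N points uses u_r = (2r - N - 1) / (2N) for r = 1..N,
    so the points are evenly spaced and symmetric about the origin.
    """
    def axis(n):
        return [(2 * r - n - 1) / (2 * n) for r in range(1, n + 1)]

    return list(product(axis(n1), axis(n2), axis(n3)))


if __name__ == "__main__":
    grid = monkhorst_pack(2, 2, 2)
    print(len(grid), "k-points")
    for k in grid:
        print(k)
```

An even N gives a grid that avoids the zone center, while an odd N includes it. In practice symmetry is then used to reduce the full set to the symmetrically distinct k-points, and the count of those distinct points is the quantity the abstract reports reducing.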
