Bulletin of the American Physical Society
APS March Meeting 2017
Volume 62, Number 4
Monday–Friday, March 13–17, 2017; New Orleans, Louisiana
Session C7: Computational Physics at the Petascale and Beyond III (Focus Session)
Sponsoring Units: DCOMP DMP-DCMP DCP-DBIO Chair: Jack Deslippe, Lawrence Berkeley National Laboratory Room: 266 |
Monday, March 13, 2017 2:30PM - 2:42PM |
C7.00001: New implementations for large scale DFT and GW calculations with numeric atom-centered orbital basis functions on many-core architectures Alvaro Vazquez Mayagoitia Recent advances in the FHI-aims code (https://aimsclub.fhi-berlin.mpg.de/) that exploit many-core architectures with multi-level parallelism, by combining Message Passing Interface programming and OpenMP threads, are presented. FHI-aims, an all-electron code that uses a numeric atom-centered orbital basis, was modified to use thousands of cores effectively and to perform large scale electronic structure calculations of solids and clusters. These advances enabled high-throughput and data-analytics-driven approaches to materials discovery. Extensive benchmark results on many-core petascale computers, using DFT calculations with dispersion-corrected hybrid functionals and GW approaches in molecular crystals, are also presented. |
Monday, March 13, 2017 2:42PM - 2:54PM |
C7.00002: A Unified Software Interface to Solve or Circumvent the Kohn-Sham Eigenvalue Problem: ELSI Victor Yu, William Huhn, Björn Lange, Volker Blum, Fabiano Corsetti, Lin Lin, Jianfeng Lu, Alvaro Vazquez-Mayagoitia, Chao Yang Solving or circumventing a generalized eigenvalue problem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory (KS-DFT). This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to simplify the access to existing strategies to address the KS eigenvalue problem. Supported algorithms include the massively parallel dense eigensolver ELPA ($O(N^3)$), the orbital minimization method in libOMM ($O(N^3)$ with a reduced prefactor), and the Pole EXpansion and Selected Inversion (PEXSI) approach with lower computational complexity (at most $O(N^2)$). The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by a) optional automatic selection of the correct solver depending on the specific problem; b) reasonable default parameters for a chosen solver; and c) automatic conversion between input and internal working matrix formats. Benchmarks are shown for all-electron Hamiltonian and overlap matrices for system sizes up to several thousand atoms. This work is supported by the National Science Foundation under Grant Number 1450280. |
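For readers unfamiliar with the generalized eigenvalue problem $HC = SC\epsilon$ mentioned in this abstract, the following is a minimal sketch of the textbook dense route: reduce to a standard symmetric problem with a Cholesky factorization of the overlap matrix, diagonalize, and back-transform. It is not the ELSI, ELPA, libOMM, or PEXSI interface; the function name `solve_generalized_eigenproblem` and the random test matrices are assumptions made purely for illustration.

```python
import numpy as np

def solve_generalized_eigenproblem(H, S):
    """Solve H C = S C diag(eps) for symmetric H and symmetric positive-definite S.

    Hypothetical helper for illustration only (not part of any named library).
    """
    L = np.linalg.cholesky(S)          # S = L L^T with L lower triangular
    Y = np.linalg.solve(L, H)          # Y = L^{-1} H
    A = np.linalg.solve(L, Y.T).T      # A = L^{-1} H L^{-T}
    A = 0.5 * (A + A.T)                # remove round-off asymmetry
    eps, V = np.linalg.eigh(A)         # standard problem A V = V diag(eps)
    C = np.linalg.solve(L.T, V)        # back-transform: C = L^{-T} V
    return eps, C

# Small synthetic matrices (assumed test data, not from an electronic structure code).
rng = np.random.default_rng(0)
n = 6
G = rng.standard_normal((n, n))
H = 0.5 * (G + G.T)                    # symmetric "Hamiltonian"
B = rng.standard_normal((n, n))
S = B @ B.T + n * np.eye(n)            # symmetric positive-definite "overlap"

eps, C = solve_generalized_eigenproblem(H, S)
print("residual norm:", np.linalg.norm(H @ C - (S @ C) * eps))
```

Every step here is a dense $O(N^3)$ operation, which is the scaling the abstract attributes to dense eigensolvers and the reason lower-complexity alternatives are of interest for large systems.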
Monday, March 13, 2017 2:54PM - 3:06PM |
C7.00003: Advances in Real Space Methods to Solve the Kohn-Sham Equation Charles Lena, James R. Chelikowsky, Ariel Biller, Leeor Kronik We will discuss advances in solving the Kohn-Sham equation using pseudopotentials implemented in real space. A solution is often limited by the high computational demand of solving an eigenvalue problem at each self-consistent-field iteration. Our code replaces the explicit eigenvalue calculations by an approximation of the desired invariant subspace, refined with well-selected Chebyshev polynomial filters. The subspace iteration at each step is notably faster than solving a corresponding eigenproblem by the most efficient algorithms. Moreover, the subspace iteration reaches self-consistency within roughly the same number of steps as an eigensolver-based approach. We will discuss some advances in data partitioning. These advances include improvements to sparse matrix vector multiplication, which makes up the bulk of the computation in the filtering process. We demonstrate an efficient scaling of the Laplacian component to more than $10^8$ grid points and of the Rayleigh-Ritz component to $10^5$ electronic states. As an application, we consider nanocrystals of silicon containing over 10,000 atoms. |
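The Chebyshev-filtered subspace iteration described in this abstract can be sketched in a few lines. The version below is a simplified dense illustration, not the authors' real-space code: the function names, the buffer of `n_extra` states, the filter degree, and the crude infinity-norm bound on the spectrum are all assumptions chosen for brevity.

```python
import numpy as np

def chebyshev_filter(H, X, degree, a, b):
    """Return T_degree((H - c) / e) X, where [a, b] maps onto [-1, 1].

    Components with eigenvalues in [a, b] stay bounded by 1 in magnitude;
    components with eigenvalues below a are amplified.
    """
    e = 0.5 * (b - a)                  # half-width of the damped interval
    c = 0.5 * (b + a)                  # center of the damped interval
    Y = (H @ X - c * X) / e            # T_1 term
    for _ in range(2, degree + 1):
        Y_next = 2.0 * (H @ Y - c * Y) / e - X   # three-term recurrence
        X, Y = Y, Y_next
    return Y

def filtered_subspace_iteration(H, n_states, n_extra=4, degree=8, n_iter=20, seed=0):
    """Approximate the lowest n_states eigenpairs of a symmetric matrix H."""
    n = H.shape[0]
    m = min(n, n_states + n_extra)     # block size including buffer states
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((n, m)))
    b = 1.01 * np.abs(H).sum(axis=1).max()       # safe upper bound on the spectrum
    for _ in range(n_iter):
        theta, Q = np.linalg.eigh(X.T @ H @ X)   # Rayleigh-Ritz in the subspace
        X = X @ Q
        a = theta[-1]                  # damp everything above the largest Ritz value
        X = chebyshev_filter(H, X, degree, a, b)
        X, _ = np.linalg.qr(X)         # re-orthonormalize the filtered block
    theta, Q = np.linalg.eigh(X.T @ H @ X)
    return theta[:n_states], (X @ Q)[:, :n_states]
```

The only operations on the full matrix are products `H @ X`, which is why the abstract singles out sparse matrix-vector multiplication as the main cost of filtering; the dense `eigh` call acts only on the small projected matrix.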
Monday, March 13, 2017 3:06PM - 3:42PM |
C7.00004: Accelerating large scale Kohn-Sham density functional theory calculations with semi-local functionals and hybrid functionals Invited Speaker: Lin Lin The computational cost of standard Kohn-Sham density functional theory (KSDFT) calculations scales cubically with respect to the system size, which limits their use in large scale applications. In recent years, we have developed an alternative procedure called the pole expansion and selected inversion (PEXSI) method [1-2]. The PEXSI method solves KSDFT without computing any eigenvalues or eigenvectors, and directly evaluates physical quantities including electron density, energy, atomic force, density of states, and local density of states. The overall algorithm scales at most quadratically for all materials including insulators, semiconductors and the difficult metallic systems. The PEXSI method can be efficiently parallelized over 10,000 - 100,000 processors on high performance machines. The PEXSI method has been integrated into a number of community electronic structure software packages such as ATK, BigDFT, CP2K, DGDFT, FHI-aims and SIESTA, and has been used in a number of applications with 2D materials beyond 10,000 atoms [3]. The PEXSI method works for LDA, GGA and meta-GGA functionals. The mathematical structure for hybrid functional KSDFT calculations is significantly different. I will also discuss recent progress on using the adaptively compressed exchange method for accelerating hybrid functional calculations [4]. References: [1] L. Lin, J. Lu, L. Ying, R. Car and W. E, Fast algorithm for extracting the diagonal of the inverse matrix with application to the electronic structure analysis of metallic systems, Commun. Math. Sci. 7, 755, 2009 [2] L. Lin, M. Chen, C. Yang and L. He, Accelerating atomic orbital-based electronic structure calculation via pole expansion and selected inversion, J. Phys. Condens. Matter 25, 295501, 2013 [3] W. Hu, L. Lin, C. Yang, J. Dai and J. Yang, Edge-modified phosphorene nanoflake heterojunctions as highly efficient solar cells, Nano Lett. 16, 1675, 2016 [4] L. Lin, Adaptively compressed exchange operator, J. Chem. Theory Comput. 12, 2242, 2016 |
Monday, March 13, 2017 3:42PM - 3:54PM |
C7.00005: RMG: An Open Source Electronic Structure Code for Multi-Petaflops Calculations Emil Briggs, Wenchang Lu, Miroslav Hodak, Jerzy Bernholc RMG (Real-space Multigrid) is an open source density functional theory code for quantum simulations of materials. It solves the Kohn-Sham equations on real-space grids, which allows for natural parallelization via domain decomposition. Either subspace or Davidson diagonalization, coupled with multigrid methods, is used to accelerate convergence. RMG is a cross-platform package that has been used in the study of a wide range of systems, including semiconductors, biomolecules, and nanoscale electronic devices. It can optionally use GPU accelerators to improve performance on systems where they are available. The recently released versions (> 2.0) support multiple GPUs per compute node and offer improved performance and scalability, enhanced accuracy, and support for additional hardware platforms. New versions of the code are regularly released at http://www.rmgdft.org. The releases include binaries for Linux, Windows and Macintosh systems, automated builds for clusters using cmake, as well as versions adapted to the major supercomputing installations and platforms. Several recent, large-scale applications of RMG will be discussed. |
Monday, March 13, 2017 3:54PM - 4:06PM |
C7.00006: GPU-Accelerated Large-Scale Electronic Structure Theory on Titan with a First-Principles All-Electron Code William Paul Huhn, Björn Lange, Victor Yu, Volker Blum, Seyong Lee, Mina Yoon Density-functional theory has been well established as the dominant quantum-mechanical computational method in the materials community. Large, accurate simulations become very challenging on small to mid-scale computers and require high-performance compute platforms to succeed. GPU acceleration is one promising approach. In this talk, we present a first implementation of all-electron density-functional theory in the FHI-aims code for massively parallel GPU-based platforms. Special attention is paid to the update of the density and to the integration of the Hamiltonian and overlap matrices, realized in a domain decomposition scheme on non-uniform grids. The initial implementation scales well across nodes on ORNL's Titan Cray XK7 supercomputer (8 to 64 nodes, 16 MPI ranks/node) and shows an overall speed up in runtime due to utilization of the K20X Tesla GPUs on each Titan node of 1.4x, with the charge density update showing a speed up of 2x. Further acceleration opportunities will be discussed. |
Monday, March 13, 2017 4:06PM - 4:18PM |
C7.00007: GPU Acceleration of the Locally Selfconsistent Multiple Scattering Code for First Principles Calculation of the Ground State and Statistical Physics of Materials Markus Eisenbach The Locally Self-consistent Multiple Scattering (LSMS) code solves the first principles density functional theory Kohn-Sham equation for a wide range of materials with a special focus on metals, alloys and metallic nano-structures. It has traditionally exhibited near perfect scalability on massively parallel high performance computer architectures. We present our efforts to exploit GPUs to accelerate the LSMS code to enable first principles calculations of O(100,000) atoms and statistical physics sampling of finite temperature properties. Using the Cray XK7 system Titan at the Oak Ridge Leadership Computing Facility, we achieve a sustained performance of 14.5 PFlop/s and a speedup of 8.6 compared to the CPU-only code. This work has been sponsored by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Material Sciences and Engineering Division and by the Office of Advanced Scientific Computing. This work used resources of the Oak Ridge Leadership Computing Facility, which is supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC05-00OR22725. |
Monday, March 13, 2017 4:18PM - 4:30PM |
C7.00008: Performance optimization of Qbox and WEST on Intel Knights Landing Huihuo Zheng, Christopher Knight, Giulia Galli, Marco Govoni, Francois Gygi We present the optimization of electronic structure codes Qbox and WEST targeting the Intel® Xeon Phi™ processor, codenamed Knights Landing (KNL). Qbox is an ab-initio molecular dynamics code based on plane wave density functional theory (DFT) and WEST is a post-DFT code for excited state calculations within many-body perturbation theory. Both Qbox and WEST employ highly scalable algorithms which enable accurate large-scale electronic structure calculations on leadership class supercomputer platforms beyond 100,000 cores, such as Mira and Theta at the Argonne Leadership Computing Facility. In this work, features of the KNL architecture (e.g. hierarchical memory) are explored to achieve higher performance in key algorithms of the Qbox and WEST codes and to develop a road-map for further development targeting next-generation computing architectures. In particular, the optimizations of the Qbox and WEST codes on the KNL platform will target efficient large-scale electronic structure calculations of nanostructured materials exhibiting complex structures and prediction of their electronic and thermal properties for use in solar and thermal energy conversion devices. |
Monday, March 13, 2017 4:30PM - 4:42PM |
C7.00009: Enabling Hybrid Density Functional Theory Based Ab Initio Molecular Dynamics in Large-Scale Condensed-Phase Systems Junteng Jia, Alvaro Vazquez-Mayagoitia, Robert A. DiStasio Jr. Hybrid density functional theory (DFT) provides an accurate and reliable quantum mechanical model for studying condensed-phase systems. However, this accuracy is often accompanied by a prohibitively large computational cost associated with evaluating the exact exchange ($E_{\rm xx}$) energy and corresponding orbital-dependent potential. In this work, we report some of our recent theoretical and algorithmic developments that significantly reduce the time to solution in condensed-phase hybrid DFT on modern supercomputer architectures. By utilizing maximally localized Wannier functions (MLWFs) for the occupied orbitals, we formally reduce the computational scaling to $O(N)$ in the system size. By developing novel preconditioning techniques in conjunction with extensive code optimization/vectorization, we achieve an additional order of magnitude boost in performance when migrating from BG/Q to KNL during the solution of the Poisson equation (the computational cornerstone of our $E_{\rm xx}$ algorithm). A novel task-based distribution scheme that we have recently developed to minimize communication overhead and maximize load balance will also be discussed as a potential strategy for ensuring favorable scalability and portability of our algorithm to future petascale architectures. |
Monday, March 13, 2017 4:42PM - 4:54PM |
C7.00010: Large scale electronic structure calculations of nanosystems on supercomputers Lin-Wang Wang Using the linear scaling three-dimensional fragment (LS3DF) method, and running on the Oak Ridge Leadership Computing Facility, we have studied the electronic structures of nanoscale systems with tens of thousands of atoms. LS3DF is a divide-and-conquer method, which divides a large system into many small fragments and solves each fragment using quantum mechanical methods. The LS3DF method can run on computers like Titan with tens of thousands of cores and on GPU processors. We have studied the electron localization of a MoS2/MoSe2 bilayer. Such a bilayer forms a moiré pattern from the atomic structure point of view. We show that the electronic state can also be localized following such an atomic moiré pattern. We have also studied the wave function localization in the hybrid perovskite. We show that the electrostatic fluctuation caused by the random orientation of the organic molecule CH3NH3 is sufficient to localize the electron wave function and significantly reduce its carrier mobility. Finally, we have also studied various nanoscale vortices in a ferroelectric system. Such ferroelectric nanoscale structures can significantly alter the electronic structures of the systems and change their transport properties. This work is supported by BES/SC/DOE, through the Theory of Material program (KC2301) and by INCITE for computer time. |
Monday, March 13, 2017 4:54PM - 5:06PM |
C7.00011: Zirconia and its allotropes; A Quantum Monte Carlo study. Andrea Jokisaari, Anouar Benali, Hyeondeok Shin, Ye Luo, Alejandro Lopez Bezanilla, Laura Ratcliff, Peter Littlewood, Olle Heinonen With high strength and stability at elevated temperatures, Zirconia (zirconium dioxide) is one of the best corrosion-resistant and refractive materials used in metallurgy, and is used in structural ceramics, catalytic converters, oxygen sensors, the nuclear industry, and in chemically passivating surfaces. The wide range of applications of ZrO2 has motivated a large number of electronic structure studies of its known allotropes (monoclinic, tetragonal and cubic). Density Functional Theory has been successful at reproducing some of the fundamental properties of some of the allotropes, but these results remain dependent on the specific combination of exchange-correlation functional and type of pseudopotentials, making any type of structural prediction or defect analysis uncertain. Quantum Monte Carlo (QMC) is a many-body quantum theory that treats electronic correlations explicitly, allowing the reproduction and prediction of materials’ properties with a limited number of controlled approximations. In this study, we use QMC to revisit the energetic stability of Zirconia's allotropes and compare our results with those obtained from density functional theory. |
Monday, March 13, 2017 5:06PM - 5:18PM |
C7.00012: Massively Parallel Spectrum Slicing Eigensolver for Ab Initio Calculations Murat Keceli, Hong Zhang, Fabiano Corsetti, Carmen Campos, Jose Roman, Alvaro Vazquez-Mayagoitia, Peter Zapol, Albert Wagner Hartree-Fock or density functional theory (DFT) based methods require the self-consistent solution of the generalized eigenvalue problem, which means solving a similar problem many times until a convergence criterion is met. Computation of the eigensolutions (matrix diagonalization) becomes the bottleneck when the number of basis functions reaches thousands. We developed and benchmarked a PETSc and SLEPc based sparse eigensolver suitable for such applications. The eigensolver makes use of shift-and-invert parallel spectral transformations, and we integrated it into the Siesta ab initio molecular dynamics package. By performing DFT energy and gradient calculations for water clusters, boron nitride films, and polyethylene, we demonstrated up to seven-fold speed-up and better strong scaling efficiency compared to the default eigensolver in Siesta. There are three main advantages of our solver compared to dense solvers: 1) reduced memory footprint and computational complexity (exploits sparsity); 2) less global communication (does not require fast interconnects); 3) job balance and performance improvement at subsequent iterations. Moreover, in contrast to diagonalization-free methods, it provides eigensolutions and it is applicable to both metals and insulators. |
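The shift-and-invert spectral transformation named in this abstract maps eigenvalues $\lambda$ of the pencil $(H, S)$ near a shift $\sigma$ to the largest-magnitude eigenvalues $1/(\lambda - \sigma)$ of $(H - \sigma S)^{-1} S$, so each shift can recover one slice of the spectrum independently. The sketch below illustrates the idea with dense NumPy and plain subspace iteration; it is not the PETSc/SLEPc implementation described by the authors, and the function name, the buffer size `n_extra`, and the iteration count are assumptions for illustration.

```python
import numpy as np

def eigenpairs_near_shift(H, S, sigma, k, n_extra=4, n_iter=40, seed=0):
    """Approximate the k eigenpairs of H c = lambda S c closest to sigma.

    H is symmetric and S is symmetric positive definite; sigma must not be
    an exact eigenvalue. Hypothetical helper for illustration only.
    """
    n = H.shape[0]
    m = min(n, k + n_extra)            # block size including buffer vectors
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, m))
    K = H - sigma * S                  # a real solver would factorize K once
    for _ in range(n_iter):
        X = np.linalg.solve(K, S @ X)  # apply (H - sigma S)^{-1} S
        X, _ = np.linalg.qr(X)         # keep the block well conditioned
    # Rayleigh-Ritz for the generalized problem in the converged subspace.
    Hs = X.T @ H @ X
    Ss = X.T @ S @ X
    L = np.linalg.cholesky(Ss)
    A = np.linalg.solve(L, np.linalg.solve(L, Hs).T).T
    A = 0.5 * (A + A.T)
    lam, V = np.linalg.eigh(A)
    order = np.sort(np.argsort(np.abs(lam - sigma))[:k])   # k closest, ascending
    C = X @ np.linalg.solve(L.T, V[:, order])
    return lam[order], C

# Synthetic pencil with generalized eigenvalues 0, 1, ..., 39 (assumed test data).
rng = np.random.default_rng(0)
n = 40
M = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
S = M @ M.T
H = M @ np.diag(np.arange(n, dtype=float)) @ M.T

# Each shift handles its own slice; the slices could be processed in parallel.
for sigma in (5.3, 20.3, 33.3):
    vals, _ = eigenpairs_near_shift(H, S, sigma, k=3)
    print(sigma, vals)
```

Because each shift requires only its own factorization and linear solves, the work divides naturally across groups of processes, which is consistent with the reduced global communication the abstract lists as an advantage.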
Monday, March 13, 2017 5:18PM - 5:30PM |
C7.00013: SNAP: Automated Generation of High-Accuracy Interatomic Potentials using Quantum Data Aidan Thompson, Mitchell Wood, Simon Phillpot Molecular dynamics simulation is a powerful computational method for bridging between macroscopic continuum models and quantum models treating a few hundred atoms, but it is limited by the accuracy of the interatomic potential. Sound physical and chemical understanding has led to good potentials for certain systems, but it is difficult to extend them to new materials and properties. The solution is obvious but challenging: develop more complex potentials that reproduce large quantum datasets. The growing availability of large data sets has made it possible to use automated machine-learning approaches for interatomic potential development. In the SNAP approach, the interatomic potential depends on a very general set of atomic neighborhood descriptors, based on the bispectrum components of the density projected onto the surface of the unit 3-sphere. Previously, this approach was demonstrated for tantalum, reproducing the screw dislocation Peierls barrier. In this talk, it will be shown that the SNAP method is capable of reproducing a wide range of energy landscapes relevant to diverse material science applications: i) point defects in indium phosphide, ii) stability of tungsten surfaces at high temperatures, and iii) formation of intrinsic defects in uranium. |