Bulletin of the American Physical Society
71st Annual Meeting of the APS Division of Fluid Dynamics
Volume 63, Number 13
Sunday–Tuesday, November 18–20, 2018; Atlanta, Georgia
Session E31: High-Performance Computing
Chair: Sharath Girimaji, Texas A&M University
Room: Georgia World Congress Center B403
Sunday, November 18, 2018 5:10PM - 5:23PM
E31.00001: Extreme-scale computing for pseudo-spectral codes using GPUs and fine-grained asynchronism, with application to turbulence
Kiran Ravikumar, David Appelhans, Pui-Kuen Yeung
As computing advances to the pre-exascale era dominated by accelerators such as Graphics Processing Units (GPUs), a substantial rethinking is necessary for many communication-intensive applications, including turbulence simulations based on pseudo-spectral methods. We have developed an asynchronous algorithm with one-dimensional domain decomposition optimized for machines with large CPU memory and fast GPUs, in particular SUMMIT at the Oak Ridge National Laboratory, which consists of IBM Power-9 CPUs and NVIDIA V100 GPUs. Data located in the CPU memory are processed in a fine-grained (batched) manner by overlapping high-bandwidth NVLINK transfers with fast GPU computations and communication over the high-bandwidth system interconnect, allowing a much larger problem to be run than the much smaller GPU memory might suggest. Pinned memory and zero-copy approaches are used to transfer strided data between the GPU and CPU, obtaining high NVLINK throughput. Several advanced communication protocols are explored in order to obtain maximum network throughput for collective communication. Benchmarks at the scale of $12288^3$ grid points on 1024 SUMMIT nodes show good weak scaling, with a speedup of over 3X compared to the multi-threaded CPU-only algorithm.
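To give a sense of why batching from CPU memory matters at this problem size, the following back-of-the-envelope sketch computes the per-node footprint of a single 3D variable for the quoted benchmark. The single-precision storage (4 bytes per value) and the function name `per_node_footprint` are assumptions made for illustration; the abstract does not state the precision or the number of variables stored.

```python
def per_node_footprint(n=12288, nodes=1024, bytes_per_value=4):
    """Footprint of ONE 3D variable on an n^3 grid split evenly over `nodes`.

    bytes_per_value=4 (single precision) is an assumption, not from the abstract.
    """
    total_points = n ** 3
    bytes_per_node = total_points * bytes_per_value // nodes
    # With a one-dimensional (slab) decomposition, each node holds whole planes.
    planes_per_node = n // nodes
    return total_points, bytes_per_node, planes_per_node


if __name__ == "__main__":
    total, per_node, planes = per_node_footprint()
    print(f"grid points in total : {total:,}")
    print(f"bytes per node       : {per_node:,} ({per_node / 2**30:.2f} GiB)")
    print(f"planes per node      : {planes}")
```

Under these assumptions each 3D variable occupies 6.75 GiB per node. A pseudo-spectral solver keeps several such arrays at once, so the working set would quickly exceed the memory of a single GPU, which is consistent with the abstract's choice to keep the data in CPU memory and stream it through the GPUs in batches.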
Sunday, November 18, 2018 5:23PM - 5:36PM
E31.00002: Toward asynchronous computations: Proxy equation approach for elliptic systems
Ankita Mittal, Sharath S Girimaji
Computational solutions of transport equations using traditional schemes scale poorly on massively parallel architectures due to mandatory mathematical synchronization between processing elements (PEs). Asynchronous computations relax the synchronization requirement and scale well on massively parallel architectures, but at the cost of reduced accuracy. The proxy equation approach [1] reasonably recovers this loss in accuracy by modifying the flow parameters of the original equation. However, this approach is currently limited to hyperbolic (advection) and parabolic (diffusion) processes. In this study we extend the proxy equation approach to elliptic systems. We utilize a pseudo-compressible approach to develop asynchronous schemes for the elliptic pressure field in the incompressible Navier-Stokes equations. The new scheme is validated using numerical simulations of benchmark problems.
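The accuracy cost of relaxing synchronization can be seen in a toy setting. The sketch below is not the proxy-equation scheme of the abstract; it is a minimal illustration, assuming a 1D periodic diffusion equation, an explicit central-difference scheme, and a hypothetical fixed `delay` (in time levels) applied to every value read across a PE boundary. All names and parameter values are illustrative.

```python
import math
from collections import deque


def max_error_with_delay(delay, n_points=64, n_pes=4, r=0.1, alpha=1.0, t_end=0.5):
    """Explicit 1D diffusion on a periodic grid split among `n_pes` PEs.

    Values read across a PE boundary are `delay` time levels old;
    delay=0 recovers the usual synchronous scheme.
    Returns the maximum error against the exact solution exp(-alpha*t)*sin(x).
    """
    dx = 2.0 * math.pi / n_points
    dt = r * dx * dx / alpha          # r = alpha*dt/dx^2 is the diffusion number
    n_steps = int(round(t_end / dt))
    per_pe = n_points // n_pes

    u = [math.sin(i * dx) for i in range(n_points)]
    # history[0] is the oldest retained time level (at most `delay` levels back).
    history = deque([u[:]], maxlen=delay + 1)

    for _ in range(n_steps):
        stale = history[0]
        new = [0.0] * n_points
        for i in range(n_points):
            left = (i - 1) % n_points
            right = (i + 1) % n_points
            # Neighbours owned by another PE are read from the stale level.
            u_left = stale[left] if left // per_pe != i // per_pe else u[left]
            u_right = stale[right] if right // per_pe != i // per_pe else u[right]
            new[i] = u[i] + r * (u_right - 2.0 * u[i] + u_left)
        u = new
        history.append(u[:])

    t = n_steps * dt
    return max(
        abs(u[i] - math.exp(-alpha * t) * math.sin(i * dx)) for i in range(n_points)
    )


if __name__ == "__main__":
    for delay in (0, 1, 2):
        print(f"delay = {delay}: max error = {max_error_with_delay(delay):.3e}")
```

In this toy problem the stale boundary values introduce an error that grows with the delay, which is the kind of accuracy loss that approaches such as the proxy equation aim to recover.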
Sunday, November 18, 2018 5:36PM - 5:49PM
E31.00003: Soleil-X: Turbulence, Particles, and Radiation in the Regent Programming Language
Hilario Torres, Gianluca Iaccarino
Soleil-X is a multiphysics solver that is being developed at Stanford University as a part of the Predictive Science Academic Alliance Program. The goals of our application are to investigate particle-laden turbulent flows in a radiation environment for solar energy receiver applications using high-fidelity computations, as well as to explore exascale computing strategies for multiphysics simulations. This talk will give an overview of the software architecture of Soleil-X, in particular how we are utilizing the Legion programming system, via the Regent language, to enable performance and portability across a variety of different machine architectures, including GPUs. Soleil-X scaling results, performance profiles, multiphysics simulation results, and ensemble capability will also be discussed.
Sunday, November 18, 2018 5:49PM - 6:02PM
E31.00004: A scalable multi-GPU solver for direct numerical simulation of boundary layer flow
Sanghyun Ha, Junshin Park, Donghyun You
A flow solver scalable on multiple Graphics Processing Units (GPUs) for direct numerical simulation of boundary layer flow is presented. The solver builds on a previously reported work (J. Comp. Physics, vol. 352 (2018), pp. 246-264), which proposes a semi-implicit fractional-step method on a single GPU. Extension of this work to accommodate multiple GPUs is found to be difficult, mainly due to the global transpose required in the Alternating Direction Implicit (ADI) and Fourier-transform-based direct methods. The present study suggests a new strategy for designing a multi-GPU solver by applying an algorithm known as the Parallel Diagonal Dominant (PDD) method to the previous work. Implementation details and performance results are compared with those of a flow solver using global transpose. Results from test simulations of turbulent and transitional boundary layers on multiple Tesla P100 GPUs are presented.
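The idea behind PDD can be sketched for a single tridiagonal system of the kind that arises in ADI sweeps. The code below is a serial emulation written for illustration, not the authors' multi-GPU implementation: the system is cut into blocks, each block is solved independently, and neighbouring blocks are coupled through small 2x2 interface systems. The approximation consists of neglecting the influence of a block's far-end coupling, which decays rapidly when the matrix is diagonally dominant. The function names and the test matrix are assumptions.

```python
def thomas(a, b, c, d):
    """Serial tridiagonal solve: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    a[0] and c[-1] are ignored (they lie outside the matrix).
    """
    n = len(b)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def pdd_solve(a, b, c, d, n_blocks):
    """Approximate block-parallel solve (PDD idea); len(b) must divide evenly."""
    m = len(b) // n_blocks
    blocks = []
    for k in range(n_blocks):  # each iteration is independent (one per PE/GPU)
        s, e = k * m, (k + 1) * m
        ab, bb, cb = a[s:e], b[s:e], c[s:e]
        x_tilde = thomas(ab, bb, cb, d[s:e])       # local solution, coupling ignored
        rhs_v = [0.0] * m
        rhs_w = [0.0] * m
        if k > 0:
            rhs_v[0] = a[s]                        # coupling to previous block
        if k < n_blocks - 1:
            rhs_w[-1] = c[e - 1]                   # coupling to next block
        blocks.append((x_tilde, thomas(ab, bb, cb, rhs_v), thomas(ab, bb, cb, rhs_w)))

    last = [0.0] * n_blocks    # approximate x at the last point of each block
    first = [0.0] * n_blocks   # approximate x at the first point of each block
    for k in range(n_blocks - 1):                  # one 2x2 system per interface
        xt_k, _, w_k = blocks[k]
        xt_n, v_n, _ = blocks[k + 1]
        det = 1.0 - w_k[-1] * v_n[0]
        last[k] = (xt_k[-1] - w_k[-1] * xt_n[0]) / det
        first[k + 1] = (xt_n[0] - v_n[0] * xt_k[-1]) / det

    x = []
    for k, (xt, v, w) in enumerate(blocks):        # local correction step
        prev_last = last[k - 1] if k > 0 else 0.0
        next_first = first[k + 1] if k < n_blocks - 1 else 0.0
        x.extend(xt[i] - prev_last * v[i] - next_first * w[i] for i in range(m))
    return x


if __name__ == "__main__":
    n = 64
    a, b, c = [-1.0] * n, [4.0] * n, [-1.0] * n    # diagonally dominant test matrix
    d = [float(i % 7) for i in range(n)]
    exact = thomas(a, b, c, d)
    approx = pdd_solve(a, b, c, d, n_blocks=4)
    print("max difference:", max(abs(p - q) for p, q in zip(approx, exact)))
```

Only the interface values need to be exchanged between neighbouring blocks, which is why this kind of method avoids the global transpose mentioned in the abstract.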
Sunday, November 18, 2018 6:02PM - 6:15PM
E31.00005: Numerical investigation on optimizing the high performance computing resources required for finite element analysis
Harika Gurram, Lars Koesterke, Maytal Dahan, Tracy Brown
In this investigation, finite element analysis is performed to analyze the fluid flow over macro and micro structures. A series of simulations is performed on Stampede2 Knights Landing (KNL) and Skylake (SKX) compute nodes at TACC. LS-DYNA is used to generate and solve the grids. This study focuses on developing a mathematical model for choosing the optimal number of processors to run various grid structures on complex geometries. Two different geometries were considered: the belly flap of a blended wing body and a flexible thin optical wire of diameter 5 microns. Three different sizes of structured and unstructured grids (coarse, medium and fine) are generated on both geometries. The numerical equations for optimized computing time are calculated based on the geometry, grid size, viscous layers, flow speed, and the KNL and SKX compute nodes. This study also attempts to model the relation between the processor's clock rate and the number of grid elements (the KNL processor clock rate is 1.4 GHz, whereas the SKX processor clock rate is 2.1 GHz). The ultimate goal of this research is to give insight into how to optimize the usage of high performance computing clusters, which will, in turn, save computing cost, time and resources.
Sunday, November 18, 2018 6:15PM - 6:28PM
E31.00006: A unified framework for synchronous and asynchronous optimized finite differences for exascale fluid flow simulations
Komal Kumari, Raktim Bhattacharya, Diego A. Donzis
We present an integrated framework for the development of optimized spatial finite difference schemes that includes formal order of accuracy, spectral resolution and stability. We show how this framework explicitly exposes the tradeoffs between these three elements. This allows us to, for example, construct difference schemes that remain stable for very large time steps, at the expense of spectral accuracy at selected wavenumbers. We extend this framework to derive optimized temporal schemes that exhibit numerical properties similar to schemes with higher order of accuracy. The optimization framework is further extended for the development of asynchronous schemes which closely mimic the properties of synchronous schemes. By relaxing communications, asynchronous schemes provide significant advantages at extreme levels of parallelism and thus provide a suitable path towards effective use of future exascale systems. Asynchronous simulations using asynchrony-tolerant and optimized asynchronous finite difference schemes will be discussed.
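Spectral resolution of a finite difference scheme is commonly measured by its modified wavenumber. As a minimal sketch, the code below evaluates it for the standard second- and fourth-order central first-derivative stencils; these are textbook schemes chosen for illustration, not the optimized schemes of the abstract, and the function name `modified_wavenumber` is an assumption. For an antisymmetric stencil u'(x_i) ≈ (1/h) Σ_j c_j (u_{i+j} - u_{i-j}), applying it to exp(ikx) gives k'h = 2 Σ_j c_j sin(j kh).

```python
import math

# Standard central stencils for the first derivative (coefficients c_1, c_2, ...).
SECOND_ORDER = [1.0 / 2.0]
FOURTH_ORDER = [2.0 / 3.0, -1.0 / 12.0]


def modified_wavenumber(coeffs, kh):
    """Return k'h for an antisymmetric stencil with one-sided coefficients `coeffs`.

    An exact derivative would give k'h == kh; the gap measures the spectral error.
    """
    return 2.0 * sum(c * math.sin((j + 1) * kh) for j, c in enumerate(coeffs))


if __name__ == "__main__":
    for kh in (0.25 * math.pi, 0.5 * math.pi, 0.75 * math.pi):
        k2 = modified_wavenumber(SECOND_ORDER, kh)
        k4 = modified_wavenumber(FOURTH_ORDER, kh)
        print(f"kh = {kh:.4f}: second order k'h = {k2:.4f}, fourth order k'h = {k4:.4f}")
```

Both schemes resolve small kh well and fall away from the exact line k'h = kh at higher wavenumbers. An optimization framework of the kind described in the abstract trades formal order of accuracy against this spectral error and against stability.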