Bulletin of the American Physical Society
APS March Meeting 2018
Monday–Friday, March 5–9, 2018; Los Angeles, California
Session B34: Petascale Science and Beyond: Applications and Opportunities for Materials, Chemical, and Bio Physics II
11:15 AM–2:15 PM, Monday, March 5, 2018
LACC Room: 409A
Sponsoring Units: DCOMP DBIO DCP DCMP
Chair: Jack Wells, Oak Ridge National Lab
Abstract: B34.00001 : Data-driven Molecular Engineering of Solar-Powered Windows using Materials Database Auto-Generation Tools with Large-Scale Data-Mining*
11:15 AM–11:51 AM
Large-scale data-mining workflows are increasingly able to successfully predict new chemicals that possess a targeted functionality. The success of such materials discovery approaches is nonetheless contingent upon having the right data source to mine, adequate supercomputing facilities and workflows to enable this mining, and algorithms that suitably encode structure-function relationships as data-mining workflows which progressively shortlist data toward the prediction of a lead material for experimental validation.
This talk describes how to meet these data science requirements via a large-scale data-mining case study that aims to discover new materials for solar-powered windows. In particular, the presentation shows how to auto-generate large material databases of photovoltaic-relevant experimental information from documents, using natural language processing and machine learning, via our ChemDataExtractor tool. Machine learning is then employed to populate any missing experimental data. A workflow that executes large-scale electronic structure calculations to afford a computational counterpart to these experimental data is then described. These wavefunction calculations are used to extend knowledge beyond experiment. The resulting large database of chemical structures and their optical properties is then mined for materials discovery using algorithms that are encoded forms of structure-function relationships. These molecular design rules progressively filter the parent set of chemicals until a lead candidate appears, which is experimentally validated.
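The progressive filtering step described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the two example rules, the numeric window, and the four records are made-up placeholders, not the design rules or data used in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical database record; the field names are illustrative only."""
    name: str
    absorption_max_nm: float  # placeholder optical property
    has_anchor_group: bool    # placeholder structural flag


Rule = Callable[[Candidate], bool]


def shortlist(
    candidates: Iterable[Candidate],
    rules: List[Tuple[str, Rule]],
) -> Tuple[List[Candidate], List[Tuple[str, int]]]:
    """Apply each rule in order and record how many candidates survive each stage."""
    remaining = list(candidates)
    history = [("parent set", len(remaining))]
    for label, rule in rules:
        remaining = [c for c in remaining if rule(c)]
        history.append((label, len(remaining)))
    return remaining, history


# Made-up records standing in for the mined database.
parent_set = [
    Candidate("dye-A", 420.0, True),
    Candidate("dye-B", 530.0, True),
    Candidate("dye-C", 510.0, False),
    Candidate("dye-D", 610.0, True),
]

# Assumed design rules, each an encoded structure-function criterion.
rules = [
    ("has anchor group", lambda c: c.has_anchor_group),
    ("absorbs at 500-600 nm", lambda c: 500.0 <= c.absorption_max_nm <= 600.0),
]

leads, history = shortlist(parent_set, rules)
for label, count in history:
    print(f"{label}: {count} candidates")
print("lead candidates:", [c.name for c in leads])
```

Keeping the rules as an ordered list of labelled predicates makes the shrinking candidate count at each stage easy to audit, which mirrors the funnel-style shortlisting the abstract describes.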
*J.M.C. thanks the 1851 Royal Commission of the Great Exhibition for the 2014 Fellowship in Design, hosted by Argonne National Laboratory, where the work was supported by the DOE Office of Science, Office of Basic Energy Sciences, and used research resources of the Argonne Leadership Computing Facility, which