Bulletin of the American Physical Society
APS March Meeting 2020
Volume 65, Number 1
Monday–Friday, March 2–6, 2020; Denver, Colorado
Session U24: Statistical Physics Meets Machine Learning
Sponsoring Units: GSNP GDS
Chair: David Schwab
Room: 401
Thursday, March 5, 2020 2:30PM - 2:42PM
U24.00001: A nonlinear and statistical physics approach to machine learning electronic hardware
Daniel Lathrop, Liam Shaughnessy, Brian Hunt, Heidi Komkov, Alessandro Restelli
As the uses of machine learning continue to grow in science and industry, there is a need to reduce power consumption and increase processing speed. As in earlier efforts, hardware co-processors can take over some of these tasks. We present research developing novel machine learning hardware that relies on a large network of nonlinear electronic nodes to instantiate a reservoir computer. We characterize the behaviors of these networks and find a critical point as we adjust their sensitivity. Moreover, we find that their machine learning performance, in terms of accuracy, depends on the sensitivity of the network.
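The abstract does not specify the network model, so as a rough software analogue, here is a minimal echo-state reservoir sketch in which the "sensitivity" knob is modeled as the spectral radius rho of the recurrent coupling matrix (rho near 1 is the usual edge-of-stability regime for echo-state networks); all sizes and constants are illustrative assumptions:

    import numpy as np

    # Minimal echo-state reservoir sketch (software analogue of a
    # hardware reservoir). "Sensitivity" is modeled as the spectral
    # radius rho of the recurrent coupling matrix.
    rng = np.random.default_rng(0)
    N = 200                                  # number of nonlinear nodes

    def make_reservoir(rho):
        W = rng.normal(size=(N, N)) / np.sqrt(N)
        W *= rho / max(abs(np.linalg.eigvals(W)))  # rescale spectral radius
        return W

    def run(W, u, w_in):
        x = np.zeros(N)
        states = []
        for ut in u:                          # drive the network with input u
            x = np.tanh(W @ x + w_in * ut)    # nonlinear node update
            states.append(x.copy())
        return np.array(states)

    u = rng.normal(size=1000)
    w_in = rng.normal(size=N)
    for rho in (0.5, 1.0, 1.5):
        s = run(make_reservoir(rho), u, w_in)
        print(rho, np.linalg.norm(s[-1]))     # activity level rises as rho crosses ~1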
Thursday, March 5, 2020 2:42PM - 2:54PM
U24.00002: Reservoir Computer Optimization for Parity Checking
Wendson Barbosa, Guilhem Ribeill, Minh-Hai Nguyen, Thomas A Ohki, Graham E Rowlands, Daniel J Gauthier
In recent years, the Reservoir Computing (RC) approach, a recurrent-neural-network-based scheme for Machine Learning (ML), has been used extensively to solve tasks such as time-series prediction, nonlinear system control, and classification. A benchmark problem for the latter is the parity check of a random sequence of bits. Although it appears simple at first glance, it is known to be a difficult task for ML techniques. We discuss the optimization of the reservoir computer's hyperparameters and the exploration of different architectures for inputting data to the reservoir to improve parity-classification performance, as well as paths toward high-speed hardware implementation.
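For concreteness, a sketch of the standard n-bit parity benchmark (an assumption about the setup; the talk's exact protocol may differ): the target at time t is the parity of the last n input bits, learned by a linear readout trained on reservoir states with ridge regression.

    import numpy as np

    rng = np.random.default_rng(1)
    T, n, N = 5000, 3, 100
    bits = rng.integers(0, 2, T)
    target = np.array([bits[t-n+1:t+1].sum() % 2 for t in range(n-1, T)])

    # echo-state reservoir driven by the bit stream (mapped to +/-1)
    W = rng.normal(size=(N, N)) / np.sqrt(N)
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))
    w_in = rng.normal(size=N)
    x, states = np.zeros(N), []
    for b in bits:
        x = np.tanh(W @ x + w_in * (2*b - 1))
        states.append(x.copy())
    X = np.array(states)[n-1:]

    # ridge-regression readout; threshold at 0.5 to recover the bit
    lam = 1e-4
    w = np.linalg.solve(X.T @ X + lam*np.eye(N), X.T @ target)
    acc = np.mean((X @ w > 0.5) == target)
    print("parity accuracy:", acc)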
Thursday, March 5, 2020 2:54PM - 3:06PM
U24.00003: Using Machine Learning to Infer Composition of Complex Chemical Mixtures
Unab Javed, Kannan P Ramaiyan, Cortney R Kreller, Eric L Brosha, Rangachary Mukundan, Alexandre Morozov
Predicting the concentration of each constituent in a complex gas or liquid mixture is an important challenge in many fields of science and technology, ranging from real-time monitoring of automotive exhaust to detecting potentially toxic substances in the air. We employ an array of solid-state sensors to test gas mixtures in a controlled laboratory environment, recording voltage responses from the sensor array. The sensors in the array typically react to more than one gas in the mixture and their voltage responses are non-linear, making the task of decoding compositions of gas mixtures highly non-trivial. We have developed a Bayesian algorithm which, given a set of readings from the array, identifies and quantifies all gases present in the system. The Bayesian nature of our approach allows us to estimate the uncertainty of the predictions in a rigorous manner and to carry out model selection. Our machine learning framework can be used to model any non-linear system with correlations between inputs and has applications in a wide variety of settings.
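A hedged sketch of this kind of Bayesian mixture inference follows. The forward model f(c) below (saturating, cross-sensitive sensor responses) is a stand-in assumption, not the authors' calibrated model; the point is only that a posterior over concentrations c, sampled here with random-walk Metropolis, yields both estimates and uncertainties.

    import numpy as np

    rng = np.random.default_rng(2)
    S = np.array([[1.0, 0.4], [0.3, 1.2], [0.8, 0.8]])  # cross-sensitivities

    def forward(c):                 # nonlinear sensor response (assumed form)
        return np.tanh(S @ c)

    c_true = np.array([0.6, 0.3])
    v_obs = forward(c_true) + 0.01 * rng.normal(size=3)

    def log_post(c, sigma=0.01):
        if np.any(c < 0):
            return -np.inf          # flat prior on c >= 0
        r = v_obs - forward(c)
        return -0.5 * (r @ r) / sigma**2

    c, samples = np.array([0.5, 0.5]), []
    lp = log_post(c)
    for _ in range(20000):          # random-walk Metropolis over c
        prop = c + 0.05 * rng.normal(size=2)
        lp_new = log_post(prop)
        if np.log(rng.uniform()) < lp_new - lp:
            c, lp = prop, lp_new
        samples.append(c)
    samples = np.array(samples)[5000:]   # discard burn-in
    print("posterior mean:", samples.mean(0), "+/-", samples.std(0))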
Thursday, March 5, 2020 3:06PM - 3:18PM
U24.00004: Deep generative spin-glass models with normalizing flows
Masoud Mohseni, Gavin Hartnett
We develop and train a novel universal class of deep spin-glass models that can learn to represent multiscale phenomena in physics and computer science, including critical phenomena, discrete optimization, and probabilistic inference in graphical models. To this end, we first provide a continuous formulation of spin-glasses, converting the discrete Boltzmann distributions into physically equivalent continuous distributions. We then use recent deep learning techniques known as normalizing flows to generate new low-energy states of such complex systems below the spin-glass phase transition. In particular, we demonstrate that real non-volume-preserving (real NVP) flows can be successfully trained to generate complex spin-glass distributions. We explore two alternative methods for training the normalizing flow, based on minimizing the reverse and forward Kullback-Leibler divergences. Moreover, we show how the problem of mode collapse for such deep generative models can be overcome at or below a critical point.
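For reference, the two training objectives mentioned here take their standard forms (a sketch in generic notation, with q_theta the flow density and p the target continuous spin-glass distribution; the talk's conventions may differ):

    \[
      D_{\mathrm{KL}}(q_\theta \,\|\, p) \;=\; \mathbb{E}_{x \sim q_\theta}\!\left[ \log q_\theta(x) - \log p(x) \right]
      \quad \text{(reverse KL: sample from the flow; mode-seeking)}
    \]
    \[
      D_{\mathrm{KL}}(p \,\|\, q_\theta) \;=\; \mathbb{E}_{x \sim p}\!\left[ \log p(x) - \log q_\theta(x) \right]
      \quad \text{(forward KL: maximum likelihood on samples from } p \text{; mass-covering)}
    \]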
Thursday, March 5, 2020 3:18PM - 3:30PM
U24.00005: A Continuous Formulation of Discrete Spin-Glass Systems
Gavin Hartnett, Masoud Mohseni
We introduce a new, continuous formulation of discrete spin-glasses in which the discrete Boltzmann distribution is replaced by a continuous probability density over the real numbers. This formulation applies to any discrete spin-glass with Ising spins coupled through 2-body interaction terms. A major benefit of working with such a continuous formulation is that the energy landscape may be studied directly using tools from differential geometry and topology. In particular, we show that for a given set of couplings there is a critical temperature above which the energy landscape is convex. Below this temperature the landscape becomes non-convex due to the appearance of multiple critical points. In general, this convex/non-convex transition is distinct from phase transitions to the spin-glass or ferromagnetic phases. In this talk, we introduce our general formalism and theoretically establish the similarities and differences with mean-field models and the Thouless-Anderson-Palmer (TAP) equations. We then provide details for a few specific cases, including the Sherrington-Kirkpatrick model and random restricted Boltzmann machines.
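One standard route to such a continuous formulation is the Gaussian (Hubbard-Stratonovich) identity, sketched below; it assumes the coupling matrix J has been shifted to be positive-definite, and the conventions may differ from the talk's:

    \[
      Z \;=\; \sum_{s \in \{\pm 1\}^N} e^{\frac{\beta}{2}\, s^{T} J s}
        \;\propto\; \int d^{N}x \, \exp\!\left[ -\frac{\beta}{2}\, x^{T} J x \right]
        \prod_{i=1}^{N} 2\cosh\!\big(\beta\, (J x)_i\big),
    \]
    \[
      E(x) \;=\; \frac{1}{2}\, x^{T} J x \;-\; \frac{1}{\beta} \sum_{i=1}^{N} \log\!\left[ 2\cosh\!\big(\beta\, (J x)_i\big) \right].
    \]

At small beta the convex quadratic term dominates; lowering the temperature strengthens the log-cosh term, which can create the additional critical points behind the convex/non-convex transition described above.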
Thursday, March 5, 2020 3:30PM - 3:42PM
U24.00006: Machine-learning the DFT of a classical statistical-mechanical system
Petr Yatsyshin, Andrew Duncan, Serafim Kalliadasis
We apply machine learning (ML) to the construction of mean-field theories of classical statistical-mechanical systems. In the density functional formulation of classical statistical physics, the Helmholtz free energy generates a hierarchy of many-body direct correlation functions, of which the one-body density is the first member. In equilibrium, the latter minimises the free energy functional of the system; thus, knowing the free energy functional allows one to solve classical statistical mechanics. In this talk, we address the inverse problem of finding the free energy functional, given particle data corresponding to the system in equilibrium. Introducing an adversarial ML methodology, we reformulate the learning problem as a two-player game, with the best-fitting parameters obtained as the solution of a minimax problem. As a proof of concept, we consider the Percus model of a 1D fluid, consisting of hard rods on a line, for which the exact functional is known. We emphasize the physics-informed aspect of ML, where physical constraints, including "physical intuition", are combined with ML methods to obtain meaningful results.
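For orientation, the variational principle being inverted is the standard one of classical density functional theory, sketched here in grand-canonical textbook form (our notation, not necessarily the talk's):

    \[
      \Omega[\rho] \;=\; F[\rho] + \int d\mathbf{r}\, \rho(\mathbf{r}) \left[ V_{\mathrm{ext}}(\mathbf{r}) - \mu \right],
      \qquad
      \frac{\delta \Omega[\rho]}{\delta \rho(\mathbf{r})}\bigg|_{\rho_{\mathrm{eq}}} = 0
      \;\;\Longrightarrow\;\;
      \frac{\delta F[\rho]}{\delta \rho(\mathbf{r})}\bigg|_{\rho_{\mathrm{eq}}} = \mu - V_{\mathrm{ext}}(\mathbf{r}).
    \]

The inverse problem is then to recover F[rho] from equilibrium densities observed at known chemical potential and external potential.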
Thursday, March 5, 2020 3:42PM - 3:54PM
U24.00007: Dynamical loss functions for Machine Learning
Miguel Ruiz Garcia, Ge Zhang, Samuel Schoenholz, Andrea Jo-Wei Liu
Current deep learning approaches rely on a wide variety of architectures, largely arrived at through trial-and-error design. This has triggered great interest in improving the theoretical understanding of machine learning, and the structure of the loss-function landscape and the way it affects algorithm performance have received recent attention. Loss functions penalize incorrect identifications, and the focus has largely been on the optimization algorithms (e.g., stochastic gradient descent) that navigate the landscapes the loss functions define. We take a different approach by exploring new loss functions. In particular, we explore the effect of dynamical loss functions, in which the weight given to each training example changes during training. Preliminary results show that this new approach can outperform static loss functions in particular cases.
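An illustrative sketch of a dynamical loss function follows (the sinusoidal weight schedule is our assumption for illustration, not the authors' recipe): per-example weights w_i(t) modulate a cross-entropy loss during gradient descent, so the landscape itself changes in time.

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)      # toy binary labels
    theta = np.zeros(2)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for t in range(500):
        # dynamical weights: each example's importance oscillates in time
        w = 1.0 + 0.5 * np.sin(0.05*t + 2*np.pi*np.arange(len(y))/len(y))
        p = sigmoid(X @ theta)
        grad = X.T @ (w * (p - y)) / len(y)        # weighted cross-entropy gradient
        theta -= 0.5 * grad
    print("accuracy:", np.mean((sigmoid(X @ theta) > 0.5) == y))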
Thursday, March 5, 2020 3:54PM - 4:06PM
U24.00008: A mechanical model for supervised learning
Menachem Stern, Chukwunonso Arinze, Leron Perez, Stephanie Palmer, Arvind Murugan
A broad goal of engineering is to make functional machines with specific, programmed input-output responses. When inputs are specified in advance and few in number, this goal is sought through rational design, changing the system elements to obtain desired responses. In the supervised learning framework of computer science, system parameters (synapses) are modified in response to observed examples of the correct input-output mapping (classification).
Thursday, March 5, 2020 4:06PM - 4:18PM
U24.00009: Quantifying statistical mechanical learning in a many-body system with machine learning
Weishun Zhong, Jacob M Gold, Sarah Marzen, Jeremy L England, Nicole Yunger Halpern
Far-from-equilibrium many-body systems, from soap bubbles to suspensions to polymers, learn the drives that push them. This learning has been characterized with thermodynamic properties, such as work dissipation and strain. We move beyond these macroscopic properties, first defined for equilibrium contexts, and quantify statistical mechanical learning with machine learning. Our strategy relies on a parallel that we identify between representation learning and statistical mechanics in the presence of a drive. We apply this parallel to measure novelty detection, classification, and memory capacity. Numerical simulations of a spin glass illustrate our technique. This toolkit exposes self-organization that eludes detection by thermodynamic measures, identifying and quantifying learning by matter more reliably and precisely.
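A rough sketch of one diagnostic in this spirit (an illustration we constructed, not the authors' pipeline): if configurations of a driven system can be classified by which drive produced them, the system has stored information about its drive. Here fake spin configurations under two drives are scored with a nearest-centroid classifier; chance accuracy (0.5) would mean no detectable learning.

    import numpy as np

    rng = np.random.default_rng(4)
    N, M = 50, 200                         # spins, samples per drive

    def sample(bias):                      # stand-in for a driven spin glass
        return np.sign(bias + 0.8 * rng.normal(size=(M, N)))

    A, B = sample(+0.2), sample(-0.2)      # configurations under two drives
    train, test = slice(0, M // 2), slice(M // 2, M)
    cA, cB = A[train].mean(0), B[train].mean(0)

    def classify(x):                       # nearest centroid in spin space
        return np.linalg.norm(x - cA, axis=1) < np.linalg.norm(x - cB, axis=1)

    acc = 0.5 * (classify(A[test]).mean() + (~classify(B[test])).mean())
    print("drive-classification score:", acc)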
Thursday, March 5, 2020 4:18PM - 4:30PM
U24.00010: Information-bottleneck renormalization group for self-supervised representation learning
Vudtiwat Ngampruetikorn, William S Bialek, David J. Schwab
While highly successful, most deep learning applications rely on supervised learning, which requires a large set of manually labelled data. But labelled data are not always available, and effective learning from unlabelled datasets - self-supervised learning - has the potential to greatly expand the scope of deep learning applications. Here we propose a self-supervised learning method that combines the concepts of the information bottleneck and the renormalization group. More specifically, we use the information bottleneck to regularize a coarse-graining procedure by encouraging a representation to discard locally specific information (bottleneck) while retaining long-wavelength features (implicitly assumed to be relevant for downstream tasks). We use variational and noise-contrastive approaches to scale up our method for large systems, and we demonstrate our implementation on datasets from machine learning.
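The underlying objective has the standard information-bottleneck form (textbook notation; identifying the relevance variable with long-wavelength features is our gloss on the abstract):

    \[
      \mathcal{L}_{\mathrm{IB}} \;=\; I(Z; X_{\mathrm{local}}) \;-\; \beta\, I(Z; Y_{\mathrm{long\ wavelength}}),
    \]

minimized over the encoder p(z | x_local): the first term pushes the representation Z to discard locally specific information, while the beta-weighted term rewards retaining long-wavelength features. The variational and noise-contrastive steps mentioned above replace the two mutual-information terms with tractable bounds.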
Thursday, March 5, 2020 4:30PM - 4:42PM
U24.00011: On matching symmetries and information between training time series and machine dynamics
Jan Engelbrecht, Owen Tong Yang, Renato Mirollo
Recurrent networks and some deep feed-forward networks in machine learning effectively construct a very high-dimensional dynamical system that classifies objects through its asymptotic dynamics. Many training inputs of the same class are used to construct a machine whose similar trajectories flow to the same fixed-point attractor; the attractor itself carries no information/entropy. A specific example in reservoir computing is to train a single-layer machine on a trajectory of a known chaotic dynamical system and construct a linear projection from machine variables back to the chaotic system that reproduces the training chaotic trajectory (validation) and "predicts" a bit into the future (testing). Our perspective is to go beyond validation/testing of a particular trajectory and analyze the symmetry and information in the general asymptotic dynamics of the trained machine. For well-trained machines we can get the information and symmetries of the machine dynamics to approximately match those of the training dynamical system. The machine's dynamics then has its own strange attractor, and the machine can generate new time series of the same class, i.e., time series that differ from, but are equivalent to, the training data.
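A sketch of the closed-loop construction described above (standard echo-state practice, with illustrative sizes and constants we chose): train a linear projection from reservoir states back to a chaotic trajectory, then feed the projection back in place of the input, so the trained machine runs autonomously and generates new time series.

    import numpy as np

    rng = np.random.default_rng(5)

    def lorenz(T, dt=0.01):                # Euler-integrated Lorenz system
        u = np.array([1.0, 1.0, 1.0])
        out = []
        for _ in range(T):
            x, y, z = u
            u = u + dt * np.array([10*(y - x), x*(28 - z) - y, x*y - 8*z/3])
            out.append(u)
        return np.array(out)

    data = lorenz(6000) / 20.0             # crude normalization
    N = 400
    W = rng.normal(size=(N, N)) / np.sqrt(N)
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))
    W_in = 0.5 * rng.normal(size=(N, 3))

    r, states = np.zeros(N), []
    for u in data[:-1]:                    # drive with the training trajectory
        r = np.tanh(W @ r + W_in @ u)
        states.append(r.copy())
    R = np.array(states)
    W_out = np.linalg.solve(R.T @ R + 1e-6*np.eye(N), R.T @ data[1:]).T

    # Autonomous ("generative") mode: the readout replaces the input.
    u = data[-1]
    for _ in range(1000):
        r = np.tanh(W @ r + W_in @ u)
        u = W_out @ r                      # new Lorenz-like time series
    print("sample generated state:", u * 20.0)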
Thursday, March 5, 2020 4:42PM - 4:54PM
U24.00012: Deep Learning on the 2-Dimensional Ising Model to Extract the Crossover Region
Nicholas Walker, Ka-Ming Tam, Mark Jarrell
The 2-dimensional square Ising model is investigated with a variational autoencoder in the non-vanishing-field case for the purpose of extracting the crossover region between the ferromagnetic and paramagnetic phases. The encoded latent-variable space is found to provide suitable metrics for tracking order and disorder in the Ising configurations, extending to the extraction of a crossover region consistent with expectations. The extracted results yield an accurate prediction of the critical point, compare favorably to the configurational energetics of the model, and agree with previously published results on its configurational magnetizations. The performance of this method provides encouragement for the use of machine learning to extract meaningful structural information from complex physical systems with no known order parameters.
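A minimal VAE skeleton for L x L Ising configurations is sketched below (an illustrative stand-in we wrote in PyTorch; sizes, depths, and the 2D latent are assumptions, not the authors' architecture). The latent mean mu learned per configuration is the kind of quantity one tracks across temperature and field to locate the crossover region.

    import torch
    import torch.nn as nn

    L, D = 16, 2                            # lattice size, latent dimension

    class VAE(nn.Module):
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(nn.Flatten(), nn.Linear(L*L, 128), nn.Tanh())
            self.mu, self.logvar = nn.Linear(128, D), nn.Linear(128, D)
            self.dec = nn.Sequential(nn.Linear(D, 128), nn.Tanh(),
                                     nn.Linear(128, L*L))

        def forward(self, x):
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.dec(z), mu, logvar

    def loss_fn(x, recon, mu, logvar):
        # Bernoulli reconstruction of spins mapped to {0,1}, plus KL term
        bce = nn.functional.binary_cross_entropy_with_logits(
            recon, x.flatten(1), reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu**2 - logvar.exp())
        return bce + kl

    model = VAE()
    x = (torch.rand(32, L, L) < 0.5).float()   # placeholder configurations
    recon, mu, logvar = model(x)
    print(loss_fn(x, recon, mu, logvar).item())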
Thursday, March 5, 2020 4:54PM - 5:06PM
U24.00013: Training and classification using Restricted Boltzmann Machine (RBM) on the D-Wave 2000Q
Vivek Dixit, Sabre Kais, Muhammad A Alam
Training and classification with a restricted Boltzmann machine (RBM) have been performed using the D-Wave system. An RBM is an energy-based model, which assigns low energy values to the configurations of interest. The D-Wave 2000Q is an adiabatic quantum computer, which has been used here to obtain samples for gradient learning. Two datasets, 'bars and stripes' (BAS) and 'solar farm' (PV), have been used. For the BAS dataset, the objective is to classify a given pattern as bars or stripes, while for the PV dataset the goal is to predict "efficiency degradation" from model parameters. Results are compared with an RBM trained using standard contrastive divergence, and classification and data reconstruction are also presented. Estimated classification accuracies indicate comparable performance of the two methods. D-Wave training appears to result in smaller weights, thus reducing overfitting.
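For reference, a sketch of the RBM gradient update (standard form, biases omitted for brevity; the D-Wave enters only as the source of the negative-phase samples, for which one step of Gibbs sampling stands in below). All sizes are illustrative.

    import numpy as np

    rng = np.random.default_rng(6)
    n_v, n_h, lr = 16, 8, 0.05
    W = 0.01 * rng.normal(size=(n_v, n_h))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def update(v_data):
        # positive phase: hidden activations clamped to the data
        ph = sigmoid(v_data @ W)
        # negative phase: model samples; a quantum annealer (or Gibbs
        # sampling, as here) supplies the model configurations
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = sigmoid(h @ W.T)
        v_model = (rng.random(pv.shape) < pv).astype(float)
        ph_model = sigmoid(v_model @ W)
        # log-likelihood gradient: <v h>_data - <v h>_model
        return lr * (v_data.T @ ph - v_model.T @ ph_model) / len(v_data)

    v = (rng.random((32, n_v)) < 0.5).astype(float)   # placeholder batch
    W += update(v)
    print("weight norm after one step:", np.linalg.norm(W))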
Thursday, March 5, 2020 5:06PM - 5:18PM
U24.00014: Statistical Physics Analysis of Training of Restricted Boltzmann Machines
Sangchul Oh, Abdelkader Baggag
A restricted Boltzmann machine (RBM) is a generative probabilistic graphical model in which the probability of finding the network in a given configuration follows the Boltzmann distribution. It has wide applications, from image generation to the neural-network representation of quantum many-body states. We analyze the training process of the restricted Boltzmann machine in the context of statistical physics. For a small RBM trained on the bars-and-stripes pattern, thermodynamic quantities such as free energy, internal energy, work, and entropy are calculated as a function of training epochs. We also investigate the Jarzynski equality, which connects the work done during training to the difference in free energies before and after training. It is found that, even after long training, the probabilities of the possible outcomes do not become equal. Some possible sources of this imperfect training are discussed.
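For reference, the Jarzynski equality invoked here reads (standard form; treating the epoch-by-epoch parameter updates as the nonequilibrium protocol is the interpretive step):

    \[
      \left\langle e^{-\beta W} \right\rangle \;=\; e^{-\beta \Delta F},
    \]

where W is the work done along one realization of the protocol, Delta F is the free-energy difference between the final and initial states, and the average runs over realizations.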
Thursday, March 5, 2020 5:18PM - 5:30PM
U24.00015: Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines
Haik Manukian, Yan Ru Pei, Sean Bearden, Massimiliano Di Ventra
Restricted Boltzmann machines (RBMs) are a powerful class of unsupervised models well known in machine learning. However, unlike their more popular supervised counterparts, their training requires computing a gradient that is notoriously difficult even to approximate. In this work we show that properly combining standard gradient approximations with an off-gradient direction, constructed from samples of the RBM ground state (mode), improves their training dramatically over the standard methods. This approach, which we call "mode training", promotes faster training and stability, in addition to lowering the converged relative entropy (KL divergence). We report promising preliminary results with small models on synthetic data sets and discuss extensions to more realistic scenarios, where a physics-based approach, memcomputing [1], is used.
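A schematic of the idea (our paraphrase, not the authors' algorithm): with some probability, the negative-phase statistics in the RBM update are taken at the model's mode, the lowest-energy joint configuration, which for this tiny model we find by brute force; at scale, that search is the role a physics-based solver such as memcomputing would play.

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(7)
    n_v, n_h = 4, 3
    W = 0.1 * rng.normal(size=(n_v, n_h))

    def energy(v, h):
        return -v @ W @ h

    def mode():
        # exhaustive search over all (v, h); feasible only for tiny models
        best, e_best = None, np.inf
        for v in product([0, 1], repeat=n_v):
            for h in product([0, 1], repeat=n_h):
                e = energy(np.array(v), np.array(h))
                if e < e_best:
                    best, e_best = (np.array(v, float), np.array(h, float)), e
        return best

    v_data = (rng.random((16, n_v)) < 0.5).astype(float)
    ph = 1.0 / (1.0 + np.exp(-(v_data @ W)))      # positive phase
    v_m, h_m = mode()                              # mode-driven negative phase
    grad = v_data.T @ ph / len(v_data) - np.outer(v_m, h_m)
    W += 0.05 * grad
    print("updated weight norm:", np.linalg.norm(W))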