Bulletin of the American Physical Society
APS March Meeting 2022
Volume 67, Number 3
Monday–Friday, March 14–18, 2022; Chicago
Session K09: Physics of Machine Learning II
Sponsoring Units: GSNP, GDS, DCOMP, DSOFT
Chair: Yuhai Tu, IBM T. J. Watson Research Center
Room: McCormick Place W-180 |
Tuesday, March 15, 2022 3:00PM - 3:12PM |
K09.00001: Finding Spin Glass Ground States Through Deep Reinforcement Learning Mutian Shen, Zohar Nussinov, Yang-Yu Liu, Changjun Fan, Yizhou Sun, Zhong Liu Spin glasses are disordered magnets with random interactions that are, generally, in conflict with each other. Finding the ground states of spin glasses is not only essential for understanding the nature of disordered magnetic and other physical systems, but also useful for solving a broad array of hard combinatorial optimization problems across multiple disciplines. Despite decades-long efforts, an algorithm with both high accuracy and high efficiency is still lacking. Here we introduce DIRAC, a deep reinforcement learning framework that can be trained purely on small-scale spin glass instances and then applied to arbitrarily large ones. |
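The abstract's target is the Ising spin-glass Hamiltonian H(s) = -Σ_{i<j} J_ij s_i s_j. As a point of reference only (this is not the DIRAC method, just the energy function it minimizes), a minimal greedy single-flip baseline on a hypothetical random instance might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy instance: n Ising spins with random couplings J_ij.
n = 16
J = np.triu(rng.normal(size=(n, n)), k=1)   # keep each pair i<j once
s = rng.choice([-1, 1], size=n)

def energy(s):
    """Spin-glass energy H(s) = -sum_{i<j} J_ij s_i s_j."""
    return -s @ J @ s

e_init = energy(s)

# Greedy descent: flip any spin that lowers the energy, until none does.
# This gets stuck in local minima -- the hardness that DIRAC-like methods target.
improved = True
while improved:
    improved = False
    for i in range(n):
        e0 = energy(s)
        s[i] *= -1
        if energy(s) < e0:
            improved = True   # keep the flip
        else:
            s[i] *= -1        # revert
```

The loop terminates because each kept flip strictly lowers the energy over a finite state space; the final state is a local (not necessarily global) minimum.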
Tuesday, March 15, 2022 3:12PM - 3:24PM |
K09.00002: The random energy landscape of soft-spin networks and its application to combinatorial optimizations Atsushi Yamamura, Hideo Mabuchi, Surya Ganguli Building neuromorphic analog hardware to solve large-scale optimization problems has been a central challenge spanning physics and computer science. We analyze the computational performance of one such machine, a network of coupled optical parametric oscillators that act as soft spins confined in a double-well potential. Their remarkable capacity for solving Ising-encodable combinatorial optimization problems has been intensively analyzed in many optical experiments and computer simulations. However, we lack theoretical insights into why such networks can be efficient optimizers. We tackle this question by elucidating the geometry of the network's high-dimensional energy landscape through the Kac-Rice formula with full replica symmetry and supersymmetry breaking. Our analysis reveals a hierarchy of geometric phase transitions in the landscape that facilitates optimization dynamics. In particular, we reveal a geometric phase where the non-convex landscape's local minima are almost all global minima, which allows the network to find solutions significantly better than naive spectral methods. We also find the existence of a phase where we cannot adiabatically follow the global minima, which creates a finite energy gap between the network's solution and the optimal lowest energy. |
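The soft-spin picture replaces binary Ising spins with continuous amplitudes confined in a double-well potential. One common convention for such an energy (a generic form, not necessarily the exact one the authors analyze) is

```latex
E(x) \;=\; \sum_i \left( -\frac{x_i^2}{2} + \frac{x_i^4}{4} \right)
\;-\; \frac{c}{2} \sum_{i \neq j} J_{ij}\, x_i x_j ,
```

where each single-site term has minima near $x_i = \pm 1$, so that for stiff amplitude confinement the landscape reduces to the Ising energy with couplings $J_{ij}$, while the coupling strength $c$ controls the geometric phases of the landscape.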
Tuesday, March 15, 2022 3:24PM - 3:36PM |
K09.00003: From classification to models: When do SVMs discover physical features? Arabind Swain, Ilya M Nemenman The complexity of glasses makes it challenging to explain their dynamics. Machine learning (ML) has emerged in recent years as a promising tool for understanding glassy dynamics, predicting rearrangements of particles in the glass with a high degree of accuracy. It has also helped discover links between structural features of the glass and the rearrangement dynamics. The support vector machine (SVM) was one of the first ML methods to discover such a non-trivial relationship: the distance of a sample from the SVM's separating hyperplane was found to be linearly related to the Arrhenius energy governing the rearrangement process. Here we investigate under which conditions an SVM can discover such relevant physical quantities in glassy systems. We study a toy glass model with known energy, in which an SVM is trained to predict the system dynamics from a large array of structural features. We demonstrate analytically that the distance from the inferred separating hyperplane becomes linearly related to the Arrhenius energy under certain conditions, which we characterize. This is a step towards understanding when ML can discover new physics. |
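The central quantity here is the signed distance of a sample's feature vector from the SVM's separating hyperplane. For a hypothetical learned hyperplane (w, b) — illustrative values, not fitted to any glass data — it might be computed as:

```python
import numpy as np

# Hypothetical hyperplane w.x + b = 0 from a trained linear SVM.
w = np.array([2.0, -1.0])
b = 0.5

def hyperplane_distance(x, w, b):
    """Signed distance of feature vector x from the plane w.x + b = 0."""
    return (x @ w + b) / np.linalg.norm(w)

x = np.array([1.0, 1.0])          # structural features of one particle
d = hyperplane_distance(x, w, b)  # (2 - 1 + 0.5) / sqrt(5)
```

The abstract's claim is that this distance, for a trained SVM, is linearly related to the Arrhenius energy barrier of the rearrangement.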
Tuesday, March 15, 2022 3:36PM - 3:48PM |
K09.00004: Machine learning probing universality class of four models Lev Shchur, Evgeni Burovski, Vladislav Chertenkov We examine how the universality class of four models can be investigated using a deep learning approach. |
Tuesday, March 15, 2022 3:48PM - 4:00PM |
K09.00005: Inverse design of nucleation seeds Ella M King, Chrisy Xiyu Du, Michael P Brenner Homogeneous nucleation is a rare, slow process. As a result, most commercial applications favor heterogeneous nucleation, using portions of a crystal to seed crystal growth. Nearly all of our theoretical understanding of nucleation seeds is limited to spherical and near-spherical seed structures. However, flexible seeds and seeds with greater surface area may better enable crystal growth by providing more pathways to crystallization. Using a combination of forward simulation and inverse design, we investigate the optimal nucleation seed structure to promote crystal growth. A deeper understanding of heterogeneous nucleation will have applications ranging from improved growth of silicon wafers for electronics to alternative methods of seeding clouds in the atmosphere. |
Tuesday, March 15, 2022 4:00PM - 4:12PM |
K09.00006: Machine learning enabled large-scale quantum kinetic Monte Carlo simulations of the Falicov-Kimball model Sheng Zhang, Gia-Wei Chern, Puhan Zhang We show that the celebrated Falicov-Kimball model exhibits rich and intriguing phase-ordering dynamics. With the aid of modern machine learning methods, we demonstrate the first-ever large-scale kinetic Monte Carlo simulations of the Falicov-Kimball model. We uncover an unusual phase-separation scenario where domain coarsening occurs simultaneously at two different scales: the growth of checkerboard clusters at smaller length scales and the expansion of super-clusters, which are aggregates of checkerboard patterns of the same sign, at a larger scale. We show that the emergence of super-clusters is due to a hidden dynamical breaking of the sublattice symmetry. A self-trapping mechanism related to the super-clusters gives rise to the arrested growth of the checkerboard patterns and of the super-clusters themselves. Glassy behaviors similar to those reported in this work could be generic for other correlated electron systems. |
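The checkerboard order described here can be quantified by a staggered order parameter that weights the two sublattices with opposite signs. A minimal sketch (toy random occupations, not the actual simulation) is:

```python
import numpy as np

rng = np.random.default_rng(1)
L = 8
occ = rng.integers(0, 2, size=(L, L))   # heavy-particle occupations (0/1)

# Staggered weight: +1 on one sublattice, -1 on the other.
i, j = np.indices((L, L))
stagger = (-1) ** (i + j)

def checkerboard_order(occ):
    """m = (1/N) sum_r (-1)^(x+y) (2 n_r - 1); |m| = 1 for a perfect checkerboard."""
    return np.mean(stagger * (2 * occ - 1))

# One of the two symmetry-related perfect checkerboard states.
perfect = (stagger + 1) // 2
```

The sign of m distinguishes the two sublattice-symmetry-broken states; "super-clusters" in the abstract are regions where m keeps the same sign.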
Tuesday, March 15, 2022 4:12PM - 4:24PM |
K09.00007: Biased Monte Carlo sampling in RBMs Beatriz Seoane, Aurélien Decelle, Cyril Furtlehner, Nicolas Bereux Restricted Boltzmann machines (RBMs) are generative models capable of fitting the probability distributions of complex datasets. Thanks to their simple structure, they are particularly well suited for interpretability and pattern extraction, a feature especially appealing for scientific use. Yet, in practice, it is hard to extract good equilibrium models for structured datasets (the standard case for most biologically interesting data) because the Monte Carlo mixing times diverge. In this work, we show that this barrier can be surmounted using biased Monte Carlo methods, as is commonly done in statistical mechanics to reach equilibrium in the vicinity of first-order phase transitions. |
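The mixing bottleneck arises in the block Gibbs sampler used to draw from an RBM. To fix notation, a minimal (unbiased) Gibbs step for a toy binary RBM — hypothetical sizes and weights, not the authors' biased scheme — is:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny binary RBM: visible layer v (4 units), hidden layer h (3 units).
nv, nh = 4, 3
W = rng.normal(scale=0.1, size=(nv, nh))
a = np.zeros(nv)   # visible biases
b = np.zeros(nh)   # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v):
    """One block Gibbs step v -> h -> v', using the RBM conditionals."""
    h = (rng.random(nh) < sigmoid(v @ W + b)).astype(float)
    v_new = (rng.random(nv) < sigmoid(W @ h + a)).astype(float)
    return v_new

v = rng.integers(0, 2, size=nv).astype(float)
for _ in range(10):
    v = gibbs_step(v)
```

For structured data the chain above can stay trapped in one mode for exponentially long times; the biased methods in the abstract add a reweighted constraint (as in umbrella-sampling-style techniques) to cross the free-energy barrier.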
Tuesday, March 15, 2022 4:24PM - 4:36PM |
K09.00008: Nonequilibrium Monte Carlo for unfreezing variables near computational phase transitions Masoud Mohseni, Daniel K Eppens, Federico Ricci-Tersenghi, Johan Strumpfer, Alan Ho, Raffaele Marino, Vasil Denchev, Sergei V Isakov, Sergio Boixo, Hartmut Neven |
Tuesday, March 15, 2022 4:36PM - 4:48PM |
K09.00009: Gauge freedoms, symmetries, and the interpretability of sequence-function relationships Justin B Kinney, Anna Posfai, David M McCandlish Quantitative models of sequence-function relationships are ubiquitous in post-genomic biology, e.g., for describing the activities of gene regulatory sequences or for modeling the fitness landscapes of proteins. These models usually exhibit many gauge freedoms—directions in parameter space along which transformations do not affect model output. But in contrast to the central role that gauge freedoms play in theoretical physics, the origins, properties, and consequences of gauge freedoms in sequence-function relationships have received little attention. Here we study gauge freedoms in both linear and linear-nonlinear sequence-function relationships. In particular, we connect the gauge freedoms that arise in linear models with one-hot sequence features to character permutation symmetry, and discuss the importance of such symmetry for biological interpretability. We also propose specific gauge fixing strategies that are especially useful in the context of these linear models. Finally, we identify two distinct classes of gauge freedoms that arise in linear-nonlinear models which are not present in linear models. This work thus establishes mathematical and conceptual tools for better understanding how sequence encodes biological function. |
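A concrete instance of such a gauge freedom in a one-hot linear model: shifting all character parameters at a position by a constant and absorbing the shift into the intercept leaves the model output unchanged for every sequence. A sketch with hypothetical sizes and parameters (illustrative, not from the abstract):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-hot linear model for sequences of length L over A characters:
# f(s) = theta_0 + sum_{l,c} theta[l, c] * x[l, c], x the one-hot encoding of s.
L, A = 5, 4
theta0 = 0.3
theta = rng.normal(size=(L, A))

def one_hot(seq):
    x = np.zeros((L, A))
    x[np.arange(L), seq] = 1.0
    return x

def f(seq, theta0, theta):
    return theta0 + np.sum(theta * one_hot(seq))

# Gauge transformation: add a per-position constant to every character's
# parameter and subtract the total from the intercept.
shift = rng.normal(size=L)
theta_g = theta + shift[:, None]
theta0_g = theta0 - shift.sum()

seq = rng.integers(0, A, size=L)
```

Because exactly one character is active per position, each per-position shift is paid once and cancelled by the intercept, so f is invariant; gauge fixing amounts to picking one representative from each such orbit (e.g., zero-sum parameters per position).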
Tuesday, March 15, 2022 4:48PM - 5:00PM |
K09.00010: Exploring relative advantages of dual vs single dimensionality reduction Eslam Abdelaleem, K. Michael Martini, Ilya M Nemenman Current experimental advances in studying complex systems make it possible to record the activity of a large number of the system's constituents while simultaneously recording the resultant system behavior. Usually, the modeling process begins with dimensionality reduction. Here we explore dual dimensionality reduction, which aims to identify simultaneously which collective dynamics of the individual constituents are responsible for which reduced-dimensional response features. Using linear modeling, we show that the dual approach requires significantly fewer data points than the usual methods (i.e., reducing each set of features independently and then identifying relations between the reduced descriptions). However, increasing the number of recorded dimensions also adds noise to the measurements, quantified by the signal-to-noise ratio (SNR). We numerically investigate the effect of the SNR and the number of samples N on single versus dual dimensionality reduction, showing in which regimes and conditions dual reduction is better and when the two are equivalent. We then outline how we study these results analytically. |
Tuesday, March 15, 2022 5:00PM - 5:12PM |
K09.00011: An unsupervised neural network learns reproducible and interpretable representations of active matter systems Chih-Wei Joshua Liu, Nikta Fakhri, Junang Li, Michal Szurek Active matter systems are high-dimensional. Dimensionality reduction is key to efficient characterization of systems' dynamics, but existing approaches for dimensionality reduction, such as principal-component analysis and singular-value decomposition, can generate representations that are difficult to interpret. We previously presented an unsupervised neural network approach for representing high-dimensional systems. Building on this work, here we present a variational Bayesian method for dimensionality reduction in dynamics data. We show that our unsupervised algorithm learns reproducible and interpretable representations of the system and observe putative currents in the representation phase space. Our data-driven approach facilitates experimental characterization of high-dimensional dynamics, and may also enable irreversibility quantification in general active matter systems. |
Tuesday, March 15, 2022 5:12PM - 5:24PM |
K09.00012: Criticality in Deep Neural Networks using Jacobian(s) Darshil H Doshi, Andrey Gromov, Tianyu He Deep neural networks (DNNs) have proven extremely successful in a variety of pattern-recognition tasks. In contrast, the analytical understanding lags far behind. The simplifying limit of infinite width offers a way to analyze such networks: in this limit, DNNs become Gaussian processes, taking on a deterministic form. |
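The Jacobian diagnostic of criticality tracks how the norm of a network's input-output Jacobian scales with depth: it vanishes in the ordered phase and grows in the chaotic phase, staying O(1) only near the critical initialization. A toy numerical sketch for a random tanh network (illustrative widths, depths, and variances, not from the abstract):

```python
import numpy as np

rng = np.random.default_rng(0)

def jacobian_norm(sigma_w, depth=20, width=200):
    """Average squared singular value of the input-output Jacobian of a
    random tanh network with weight std sigma_w / sqrt(width)."""
    x = rng.normal(size=width)
    J = np.eye(width)
    for _ in range(depth):
        W = rng.normal(scale=sigma_w / np.sqrt(width), size=(width, width))
        h = W @ x
        # Chain rule through one layer: J <- diag(tanh'(h)) @ W @ J
        J = (W * (1 - np.tanh(h) ** 2)[:, None]) @ J
        x = np.tanh(h)
    return np.mean(J ** 2) * width   # Frobenius norm^2 / width

# Small sigma_w (ordered phase): gradients vanish exponentially with depth;
# large sigma_w (chaotic phase): the Jacobian norm is much larger.
```

Criticality corresponds to the weight variance at which the per-layer multiplier of this norm equals one, so signals and gradients propagate to arbitrary depth.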
Tuesday, March 15, 2022 5:24PM - 5:36PM |
K09.00013: Understanding Layer Normalization in Deep Neural Networks Tianyu He, Darshil H Doshi, Andrey Gromov Deep neural networks (DNNs) have proven very powerful in classification, language modeling, and computer vision problems. However, training DNNs is computationally expensive due to the huge number of parameters, and a good initialization can save enormous amounts of computational power. Layer normalization (LayerNorm) was introduced to make training faster and generalization easier. We show that the effect of LayerNorm can be studied quantitatively in the infinite-width limit by plotting phase diagrams of DNNs. We then show empirically that, in many cases, a DNN with LayerNorm layers initialized close to the phase boundary achieves the best performance in both training and validation. |
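For reference, the LayerNorm operation itself, in its standard form without the learned scale and shift parameters, normalizes each sample's activations across the layer:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize activations over the last axis to zero mean, unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0, 4.0]])   # one sample, four units
y = layer_norm(x)
```

Because the output statistics are pinned layer by layer, LayerNorm reshapes the ordered/chaotic phase diagram of the initialization, which is what the abstract studies in the infinite-width limit.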
Tuesday, March 15, 2022 5:36PM - 5:48PM |
K09.00014: Information Bottleneck for Data-driven Renormalization without Locality K. Michael Martini, Joseph L Natale, Ilya M Nemenman The renormalization group (RG) has been immensely successful in defining variables relevant for the description of macroscopic systems. It achieves this by successively coarse graining and removing the smallest length scales in the problem to capture the long wavelength behavior. RG has been generally used for systems where the interactions are local and highly symmetric. Here we introduce a method for coarse graining experimental data, while extracting features relevant to their macroscopic behavior, without explicit references to locality or symmetry. We use the value of the mutual information between the microscopic components of the system as a proxy for locality. Our approach starts by selecting the most "local" microscopic components and uses the information bottleneck method to coarse-grain them while preserving as much information about the next "scale" as possible. We analytically determine the optimal tradeoff parameter between compression and information retention by finding the leading order error in our estimate of the information. By repeatedly applying this procedure we can recover the renormalization group flow for coupling strengths and correlations for data taken from the 2D Ising system. Applications to neural recordings are forthcoming. |
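The mutual-information proxy for locality can be illustrated on a pair of discrete variables. A minimal estimator from a joint probability table (the generic textbook formula, not the authors' specific pipeline) is:

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits from a joint probability table p_xy (rows: x, cols: y)."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask])))

# Perfectly correlated binary spins share 1 bit; independent spins share 0.
perfectly_correlated = np.array([[0.5, 0.0], [0.0, 0.5]])
independent = np.full((2, 2), 0.25)
```

In the coarse-graining scheme above, pairs of components with the highest mutual information play the role that spatially adjacent sites play in a conventional real-space RG step.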