Bulletin of the American Physical Society
APS March Meeting 2023
Volume 68, Number 3
Las Vegas, Nevada (March 5-10)
Virtual (March 20-22); Time Zone: Pacific Time
Session B12: Statistical Physics Meets Machine Learning (Invited)
Sponsoring Units: GSNP
Chair: Yuhai Tu, IBM T. J. Watson Research Center
Room: 235
Monday, March 6, 2023, 11:30AM - 12:06PM
B12.00001: Understanding machine learning via solvable models Invited Speaker: Lenka Zdeborová The affinity between statistical physics and machine learning has a long history. I will describe the main lines of this long-lasting friendship in the context of current theoretical challenges and open questions about deep learning. Theoretical physics often proceeds by way of solvable synthetic models; I will describe the related line of work on solvable models of simple feed-forward neural networks. I will highlight a path forward to capture the subtle interplay between the structure of the data, the architecture of the network, and the learning algorithm.
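As a minimal sketch of the kind of solvable synthetic model the abstract refers to (the setup and all parameters below are illustrative, not the speaker's code), the classic teacher-student perceptron labels Gaussian data with a hidden "teacher" vector; a "student" trained on those labels has a generalization error with a closed form, eps = arccos(R)/pi, where R is the teacher-student overlap:

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher-student setup: synthetic data labeled by a hidden random "teacher".
d, n = 100, 500                       # input dimension, number of samples
w_teacher = rng.normal(size=d)        # hidden teacher weights
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = np.sign(X @ w_teacher)            # teacher labels

# Student: a perceptron trained by plain SGD on the hinge loss.
w = np.zeros(d)
lr = 0.5
for epoch in range(50):
    for i in rng.permutation(n):
        if y[i] * (X[i] @ w) < 1.0:   # margin violated
            w += lr * y[i] * X[i]     # hinge-loss SGD step

# For spherical Gaussian inputs the generalization error is exactly
# eps = arccos(R) / pi, with R the normalized teacher-student overlap.
R = (w @ w_teacher) / (np.linalg.norm(w) * np.linalg.norm(w_teacher))
print(f"teacher-student overlap R: {R:.3f}")
print(f"generalization error arccos(R)/pi: {np.arccos(R) / np.pi:.3f}")
```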
Monday, March 6, 2023, 12:06PM - 12:42PM
B12.00002: Modern Hopfield Networks in AI and Neurobiology Invited Speaker: Dmitry Krotov Modern Hopfield Networks, or Dense Associative Memories, are recurrent neural networks with fixed-point attractor states that are described by an energy function. In contrast to conventional Hopfield Networks, their modern versions have a very large memory storage capacity, which makes them appealing tools for many problems in machine learning, cognitive science, and neuroscience. In this talk I will introduce the intuition behind this class of models and their mathematical formulation, and will give examples of problems in AI that can be tackled using these new ideas. I will also explain how different individual models of this class (e.g. hierarchical memories, the attention mechanism in transformers, etc.) arise from a general mathematical formulation in terms of Lagrangian functions.
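To make the large storage capacity and the connection to attention concrete, here is a minimal sketch (illustrative, not the speaker's code) of one standard modern Hopfield update rule, state <- Xi^T softmax(beta * Xi state), which retrieves a stored pattern from a corrupted probe; note that 1000 patterns far exceeds the ~0.14*d capacity of a classical Hopfield network at d = 128:

```python
import numpy as np

rng = np.random.default_rng(1)

# Store far more random patterns than a classical Hopfield net could hold.
d, K = 128, 1000                               # pattern dimension, number of memories
patterns = np.sign(rng.normal(size=(K, d)))    # stored memories, rows of Xi

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def update(state, beta=8.0):
    """One step of the modern Hopfield update:
    state <- Xi^T softmax(beta * Xi state).
    For large beta the dynamics converge to the nearest stored pattern."""
    return patterns.T @ softmax(beta * (patterns @ state))

# Corrupt one memory and retrieve it.
probe = patterns[0].copy()
flip = rng.choice(d, size=d // 4, replace=False)
probe[flip] *= -1                              # flip 25% of the bits

state = probe.astype(float)
for _ in range(3):
    state = update(state)

recovered = np.sign(state)
print("bits matching pattern 0:", int((recovered == patterns[0]).sum()), "/", d)
```

The update has exactly the form of a softmax attention operation over the stored patterns, which is the bridge to transformers mentioned in the abstract.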
Monday, March 6, 2023, 12:42PM - 1:18PM
B12.00003: A Picture of the Prediction Space of Deep Networks Invited Speaker: Pratik Chaudhari There are two stark paradoxes in deep learning today. First, deep networks have many more parameters than training samples and can therefore overfit; yet these networks predict remarkably accurately, defying accepted statistical wisdom. Second, training deep networks is a high-dimensional, large-scale, non-convex optimization problem and should be prohibitively hard; yet training is tractable, even easy. This talk seeks to shed light on these paradoxes. It will use techniques from information geometry to study the prediction space of deep networks.
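As a hypothetical illustration of the prediction-space viewpoint (not the speaker's actual method; the toy models and probe data below are made up), one can represent a model by its per-sample class probabilities on a fixed probe set, a point on a product of probability simplices, and compare two models with an information-geometric distance such as the Bhattacharyya angle:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# A "model" lives in prediction space: its per-sample class probabilities
# on a fixed probe set, independent of how many weights it has.
n_probe, n_classes = 200, 10
logits_a = rng.normal(size=(n_probe, n_classes))                    # toy model A
logits_b = logits_a + 0.3 * rng.normal(size=(n_probe, n_classes))   # nearby model B

p, q = softmax(logits_a), softmax(logits_b)

# Bhattacharyya angle on the simplex, averaged over probe samples:
# d(p, q) = arccos( sum_k sqrt(p_k * q_k) ).
bc = np.sqrt(p * q).sum(axis=-1).clip(0.0, 1.0)
dist = np.arccos(bc).mean()
print(f"mean prediction-space distance between models: {dist:.4f}")
```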
Monday, March 6, 2023, 1:18PM - 1:54PM
B12.00004: Lessons from scale in large language models and quantitative reasoning Invited Speaker: Ethan Dyer Large language models trained on diverse data have shown impressive results on many tasks involving natural language, in many cases matching or exceeding human performance. Some measures of progress exhibit remarkably robust power-law improvement over many orders of magnitude in dataset, model, and compute scale, while other capabilities remain difficult to extrapolate. One domain that has traditionally been challenging for such models is multi-step quantitative reasoning in mathematics and science. I will discuss recent progress in understanding and extrapolating model capabilities with scale, and will present Minerva, a large language model designed to perform multi-step STEM problem solving.
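As a minimal sketch of how such power-law trends are typically quantified (the measurements below are synthetic, not Minerva's), a scaling law L(N) = a * N^(-alpha) is a straight line in log-log coordinates, so the exponent can be estimated by linear regression and then used to extrapolate:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic loss-vs-scale measurements following L(N) = a * N^(-alpha) + noise.
N = np.logspace(6, 11, num=12)                 # e.g. parameter counts, 1e6..1e11
true_a, true_alpha = 50.0, 0.076
loss = true_a * N**-true_alpha * np.exp(0.01 * rng.normal(size=N.size))

# A power law is linear in log-log space: log L = log a - alpha * log N,
# so fit a degree-1 polynomial to the log-transformed data.
slope, intercept = np.polyfit(np.log(N), np.log(loss), deg=1)
alpha_hat, a_hat = -slope, np.exp(intercept)
print(f"fitted exponent alpha = {alpha_hat:.3f} (true {true_alpha})")
print(f"extrapolated loss at N=1e12: {a_hat * 1e12**-alpha_hat:.3f}")
```

The caveat in the abstract is exactly that such extrapolation is reliable for some capabilities and fails for others, such as multi-step reasoning.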
Monday, March 6, 2023, 1:54PM - 2:30PM
B12.00005: Deep Learning Theory Beyond the Kernel Limit Invited Speaker: Cengiz Pehlevan Deep learning has emerged as a successful paradigm for solving challenging machine learning and computational problems across a variety of domains. However, theoretical understanding of the training and generalization of modern deep learning methods lags behind current practice. I will give an overview of our recent results in this domain, including a new theory that we derived by applying dynamical field theory to deep learning dynamics. This theory gives insight into internal representations learned by the network under different learning rules.
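For context on the "kernel limit" in the title: in the infinite-width limit, gradient-descent training of a network is governed by its neural tangent kernel, K(x, x') = grad_theta f(x) . grad_theta f(x'), which stays fixed during training, so the network learns no new internal representations; the dynamical-field-theory results go beyond this regime. Below is a minimal numpy sketch (illustrative, not the speaker's code) of the empirical NTK of a wide two-layer ReLU network at initialization:

```python
import numpy as np

rng = np.random.default_rng(4)

# Empirical NTK of a two-layer ReLU network, f(x) = a . relu(W x) / sqrt(m).
d, m = 5, 4096                         # input dim, hidden width (large -> kernel limit)
W = rng.normal(size=(m, d))            # first-layer weights
a = rng.choice([-1.0, 1.0], size=m)    # second-layer weights

def grads(x):
    """Gradient of f(x) with respect to all parameters (W and a)."""
    pre = W @ x
    act = np.maximum(pre, 0.0)
    dW = np.outer(a * (pre > 0), x) / np.sqrt(m)   # df/dW
    da = act / np.sqrt(m)                          # df/da
    return np.concatenate([dW.ravel(), da])

def ntk(x1, x2):
    return grads(x1) @ grads(x2)

# At large width the empirical NTK concentrates around its infinite-width
# value and gradient-descent training reduces to kernel regression with it.
X = [rng.normal(size=d) for _ in range(3)]
K = np.array([[ntk(x1, x2) for x2 in X] for x1 in X])
print(np.round(K, 3))
```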