Bulletin of the American Physical Society
APS March Meeting 2023
Volume 68, Number 3
Las Vegas, Nevada (March 5-10)
Virtual (March 20-22); Time Zone: Pacific Time
Session B12: Statistical Physics Meets Machine Learning (Invited)
Sponsoring Units: GSNP
Chair: Yuhai Tu, IBM T. J. Watson Research Center
Room: 235
Monday, March 6, 2023, 11:30AM - 12:06PM
B12.00001: Understanding machine learning via solvable models Invited Speaker: Lenka Zdeborová The affinity between statistical physics and machine learning has a long history. I will describe the main lines of this long-lasting friendship in the context of current theoretical challenges and open questions about deep learning. Theoretical physics often proceeds by way of solvable synthetic models; I will describe the related line of work on solvable models of simple feed-forward neural networks. I will highlight a path forward to capture the subtle interplay between the structure of the data, the architecture of the network, and the learning algorithm.
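As a minimal sketch of the kind of solvable synthetic model the abstract refers to (the setup and all parameters below are illustrative, not the speaker's code), the classic teacher-student perceptron labels Gaussian data with a hidden "teacher" vector; a "student" trained on those labels has a generalization error with a closed form, eps = arccos(R)/pi, where R is the teacher-student overlap:

```python
import numpy as np

rng = np.random.default_rng(0)

# Teacher-student setup: synthetic data labeled by a hidden random "teacher".
d, n = 100, 500                       # input dimension, number of samples
w_teacher = rng.normal(size=d)        # hidden teacher weights
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = np.sign(X @ w_teacher)            # teacher labels

# Student: a perceptron trained by plain SGD on the hinge loss.
w = np.zeros(d)
lr = 0.5
for epoch in range(50):
    for i in rng.permutation(n):
        if y[i] * (X[i] @ w) < 1.0:   # margin violated
            w += lr * y[i] * X[i]     # hinge-loss SGD step

# For spherical Gaussian inputs the generalization error is exactly
# eps = arccos(R) / pi, with R the normalized teacher-student overlap.
R = (w @ w_teacher) / (np.linalg.norm(w) * np.linalg.norm(w_teacher))
print(f"teacher-student overlap R: {R:.3f}")
print(f"generalization error arccos(R)/pi: {np.arccos(R) / np.pi:.3f}")
```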
Monday, March 6, 2023, 12:06PM - 12:42PM
B12.00002: Modern Hopfield Networks in AI and Neurobiology Invited Speaker: Dmitry Krotov Modern Hopfield Networks, or Dense Associative Memories, are recurrent neural networks with fixed-point attractor states that are described by an energy function. In contrast to conventional Hopfield Networks, their modern versions have a very large memory storage capacity, which makes them appealing tools for many problems in machine learning, cognitive science, and neuroscience. In this talk I will introduce the intuition behind this class of models and their mathematical formulation, and will give examples of problems in AI that can be tackled using these new ideas. I will also explain how different individual models of this class (e.g. hierarchical memories, the attention mechanism in transformers, etc.) arise from a general mathematical formulation in terms of Lagrangian functions.
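To make the large storage capacity and the connection to attention concrete, here is a minimal sketch (illustrative, not the speaker's code) of one standard modern Hopfield update rule, state <- Xi^T softmax(beta * Xi state), which retrieves a stored pattern from a corrupted probe; note that 1000 patterns far exceeds the ~0.14*d capacity of a classical Hopfield network at d = 128:

```python
import numpy as np

rng = np.random.default_rng(1)

# Store far more random patterns than a classical Hopfield net could hold.
d, K = 128, 1000                               # pattern dimension, number of memories
patterns = np.sign(rng.normal(size=(K, d)))    # stored memories, rows of Xi

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def update(state, beta=8.0):
    """One step of the modern Hopfield update:
    state <- Xi^T softmax(beta * Xi state).
    For large beta the dynamics converge to the nearest stored pattern."""
    return patterns.T @ softmax(beta * (patterns @ state))

# Corrupt one memory and retrieve it.
probe = patterns[0].copy()
flip = rng.choice(d, size=d // 4, replace=False)
probe[flip] *= -1                              # flip 25% of the bits

state = probe.astype(float)
for _ in range(3):
    state = update(state)

recovered = np.sign(state)
print("bits matching pattern 0:", int((recovered == patterns[0]).sum()), "/", d)
```

The update has exactly the form of a softmax attention operation over the stored patterns, which is the bridge to transformers mentioned in the abstract.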
Monday, March 6, 2023, 12:42PM - 1:18PM
B12.00003: A Picture of the Prediction Space of Deep Networks Invited Speaker: Pratik Chaudhari There are two stark paradoxes in deep learning today. First, deep networks have many more parameters than training samples and can therefore overfit; yet these networks predict remarkably accurately, defying accepted statistical wisdom. Second, training deep networks is a high-dimensional, large-scale, non-convex optimization problem and should be prohibitively hard; yet training is tractable, even easy. This talk seeks to shed light on these paradoxes. It will use techniques from information geometry to study the prediction space of deep networks.
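As a hypothetical illustration of the prediction-space viewpoint (not the speaker's actual method; the toy models and probe data below are made up), one can represent a model by its per-sample class probabilities on a fixed probe set, a point on a product of probability simplices, and compare two models with an information-geometric distance such as the Bhattacharyya angle:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# A "model" lives in prediction space: its per-sample class probabilities
# on a fixed probe set, independent of how many weights it has.
n_probe, n_classes = 200, 10
logits_a = rng.normal(size=(n_probe, n_classes))                    # toy model A
logits_b = logits_a + 0.3 * rng.normal(size=(n_probe, n_classes))   # nearby model B

p, q = softmax(logits_a), softmax(logits_b)

# Bhattacharyya angle on the simplex, averaged over probe samples:
# d(p, q) = arccos( sum_k sqrt(p_k * q_k) ).
bc = np.sqrt(p * q).sum(axis=-1).clip(0.0, 1.0)
dist = np.arccos(bc).mean()
print(f"mean prediction-space distance between models: {dist:.4f}")
```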
Monday, March 6, 2023, 1:18PM - 1:54PM
B12.00004: Lessons from scale in large language models and quantitative reasoning Invited Speaker: Ethan Dyer Large language models trained on diverse data have shown impressive results on many tasks involving natural language, in many cases matching or exceeding human performance. Some measures of progress exhibit remarkably robust power-law improvement over many orders of magnitude in dataset, model, and compute scale, while other capabilities remain difficult to extrapolate. One domain that has traditionally been challenging for such models is multi-step quantitative reasoning in mathematics and science. I will discuss recent progress in understanding and extrapolating model capabilities with scale, and will present Minerva, a large language model designed to perform multi-step STEM problem solving.
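As a minimal sketch of how such power-law trends are typically quantified (the measurements below are synthetic, not Minerva's), a scaling law L(N) = a * N^(-alpha) is a straight line in log-log coordinates, so the exponent can be estimated by linear regression and then used to extrapolate:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic loss-vs-scale measurements following L(N) = a * N^(-alpha) + noise.
N = np.logspace(6, 11, num=12)                 # e.g. parameter counts, 1e6..1e11
true_a, true_alpha = 50.0, 0.076
loss = true_a * N**-true_alpha * np.exp(0.01 * rng.normal(size=N.size))

# A power law is linear in log-log space: log L = log a - alpha * log N,
# so fit a degree-1 polynomial to the log-transformed data.
slope, intercept = np.polyfit(np.log(N), np.log(loss), deg=1)
alpha_hat, a_hat = -slope, np.exp(intercept)
print(f"fitted exponent alpha = {alpha_hat:.3f} (true {true_alpha})")
print(f"extrapolated loss at N=1e12: {a_hat * 1e12**-alpha_hat:.3f}")
```

The caveat in the abstract is exactly that such extrapolation is reliable for some capabilities and fails for others, such as multi-step reasoning.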
Monday, March 6, 2023, 1:54PM - 2:30PM
B12.00005: Deep Learning Theory Beyond the Kernel Limit Invited Speaker: Cengiz Pehlevan Deep learning has emerged as a successful paradigm for solving challenging machine learning and computational problems across a variety of domains. However, theoretical understanding of the training and generalization of modern deep learning methods lags behind current practice. I will give an overview of our recent results in this domain, including a new theory that we derived by applying dynamical field theory to deep learning dynamics. This theory gives insight into internal representations learned by the network under different learning rules.
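For context on the "kernel limit" in the title: in the infinite-width limit, gradient-descent training of a network is governed by its neural tangent kernel, K(x, x') = grad_theta f(x) . grad_theta f(x'), which stays fixed during training, so the network learns no new internal representations; the dynamical-field-theory results go beyond this regime. Below is a minimal numpy sketch (illustrative, not the speaker's code) of the empirical NTK of a wide two-layer ReLU network at initialization:

```python
import numpy as np

rng = np.random.default_rng(4)

# Empirical NTK of a two-layer ReLU network, f(x) = a . relu(W x) / sqrt(m).
d, m = 5, 4096                         # input dim, hidden width (large -> kernel limit)
W = rng.normal(size=(m, d))            # first-layer weights
a = rng.choice([-1.0, 1.0], size=m)    # second-layer weights

def grads(x):
    """Gradient of f(x) with respect to all parameters (W and a)."""
    pre = W @ x
    act = np.maximum(pre, 0.0)
    dW = np.outer(a * (pre > 0), x) / np.sqrt(m)   # df/dW
    da = act / np.sqrt(m)                          # df/da
    return np.concatenate([dW.ravel(), da])

def ntk(x1, x2):
    return grads(x1) @ grads(x2)

# At large width the empirical NTK concentrates around its infinite-width
# value and gradient-descent training reduces to kernel regression with it.
X = [rng.normal(size=d) for _ in range(3)]
K = np.array([[ntk(x1, x2) for x2 in X] for x1 in X])
print(np.round(K, 3))
```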