77th Annual Meeting of the Division of Fluid Dynamics
Sunday–Tuesday, November 24–26, 2024;
Salt Lake City, Utah
Session T15: Low-Order Modeling and Machine Learning in Fluid Dynamics: Methods V
4:45 PM–6:29 PM,
Monday, November 25, 2024
Room: 155 E
Chair: Haithem Taha, University of California, Irvine
Abstract: T15.00002: Intrinsic Instabilities and Generalization Challenges in Neural Partial Differential Equations*
4:58 PM–5:11 PM
Presenter:
Arvind T Mohan
(Los Alamos National Laboratory (LANL))
Authors:
Arvind T Mohan
(Los Alamos National Laboratory (LANL))
Ashesh K Chattopadhyay
(University of California, Santa Cruz)
Jonah M Miller
(Los Alamos National Laboratory)
NeuralPDEs have seen considerable success and interest because they embed neural networks directly inside physics PDEs. Like most successful ML models, they are trained on PDE simulations treated as “ground truth.” An implicit assumption is that this training data represents only the physics; in reality, mathematics dictates that simulations are merely numerical approximations of the true physics. Because NeuralPDEs tie networks intimately to the mathematically rigorous governing PDEs, there is also a widespread assumption that NeuralPDEs are more trustworthy and generalizable. In this work, we rigorously test these assumptions, using established ideas from computational physics and numerical analysis to verify whether NeuralPDEs predict accurate solutions for the right reasons. We posit that NeuralPDEs learn artifacts in the simulation training data that arise from the truncation error of the discretized spatial derivatives, i.e., the neglected higher-order terms of the Taylor series. Consequently, we find that NeuralPDE models are systematically biased, and that their apparent generalization often results from a fortuitous interplay between the numerical dissipation and truncation error of the training dataset and those of the NeuralPDE, an interplay that seldom occurs in practical applications. We support this hypothesis with theory, numerical experiments, and dynamical-systems analysis, and we show that the bias manifests aggressively even in simple systems such as the Burgers and KdV equations. We further demonstrate that an eigenanalysis of the learned network weights can indicate a priori whether a model will be unstable or inaccurate for out-of-distribution inputs. We also show evidence that, even when the training dataset is qualitatively and quantitatively accurate, intrinsic sample-to-sample differences in truncation error act as an “adversarial attack” that destroys generalization accuracy in NeuralPDEs despite excellent training accuracy. Finally, we discuss the implications of these findings for the reliability and robustness of NeuralPDEs and ML models in applications.
*Funded by LANL LDRD
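
As a concrete illustration of the truncation-error mechanism (a textbook modified-equation example, not necessarily the exact discretization used in this work), consider a first-order upwind approximation of the advection term in the inviscid Burgers equation for u > 0. A Taylor expansion of the stencil gives

    \frac{\partial u}{\partial t} + u\,\frac{u_i - u_{i-1}}{\Delta x}
      = \frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x}
        - \frac{u\,\Delta x}{2}\,\frac{\partial^2 u}{\partial x^2}
        + \mathcal{O}(\Delta x^{2}),

so the simulated “ground truth” effectively carries a grid-dependent artificial-diffusion term of size u Δx / 2, which a NeuralPDE trained on that data can absorb as if it were physics.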
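
A minimal sketch of the kind of a priori eigenanalysis mentioned above, assuming it amounts to inspecting the spectrum of a linearized operator built from the trained weights (the function name, the choice of operator, and the thresholds here are illustrative assumptions, not the paper's exact procedure):

    import numpy as np

    def spectral_diagnostic(W, dt=None):
        """Eigenanalysis of a learned (linearized) NeuralPDE operator.

        W  : (n, n) array, e.g. a linearization of the trained model around a
             reference state (hypothetical construction for illustration).
        dt : if None, W is treated as a discrete one-step map, so |lambda| > 1
             signals growth; otherwise W is treated as a continuous-time
             operator, so Re(lambda) > 0 signals growth.
        Returns the eigenvalues and a boolean instability flag.
        """
        eigvals = np.linalg.eigvals(W)
        if dt is None:
            unstable = bool(np.any(np.abs(eigvals) > 1.0 + 1e-8))
        else:
            unstable = bool(np.any(eigvals.real > 1e-8))
        return eigvals, unstable

    # Stand-in for a trained weight matrix: a random operator scaled to be
    # marginally unstable, so the diagnostic flags it.
    rng = np.random.default_rng(0)
    W = 1.05 * rng.standard_normal((64, 64)) / np.sqrt(64)
    vals, unstable = spectral_diagnostic(W)
    print(f"spectral radius = {np.max(np.abs(vals)):.3f}, flagged unstable = {unstable}")

The design intuition is that eigenvalues outside the stability region of the learned update map indicate modes that out-of-distribution inputs can excite, before any rollout is ever run.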