Applied Math Colloquium - Nathan Urban

Nathan Urban, Los Alamos National Lab

Towards the convergence of scientific simulations and data science

Numerical simulations are the gold standard for predictive physical science. However, simulations are only as good as the physical assumptions and computational resources put into them. This results in a variety of possible model forms, each making its own set of physical or numerical approximations, leading to model "structural" uncertainty in predictions. In contrast to numerical simulations, purely data-driven methods such as deep learning have emerged from traditional computer science applications such as image and speech processing, and are increasingly being applied to problems in physical science. However, they have well-known limitations in their ability to extrapolate to unseen scenarios outside their training sets, and to honor constraints and correlations in the data that may be demanded by physical laws.

As an alternative, I discuss some new applications of scientific machine learning (SciML) involving hybrid numerical/data-driven models, in which components of a physical simulation are replaced by statistical machine learning components, with applications to climate science and hydrodynamics. These machine learning components, such as deep neural networks and sparse Gaussian processes, can be trained offline on reference data generated from simulations. Alternatively, the emerging field of differentiable programming (∂P) can be employed to train hybrid models online through gradient descent optimization, using an end-to-end hybrid of neural backpropagation and differential equation adjoint modeling to calculate the necessary derivative information. For model-structure uncertainty quantification, I present an approach to constructing computationally efficient reduced-order models using black-box system identification and adjoint construction techniques.
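As a rough illustration of the online training idea described above (not the speaker's actual method), the sketch below fits one unknown term of a toy hybrid model by gradient descent, with the gradient obtained from a hand-written discrete adjoint of a forward Euler solver. Everything in it is an assumption made for illustration: the scalar ODE dx/dt = -A*x + theta*x**3, the known rate A, the one-parameter "learned" closure theta*x**3 standing in for a neural network, and the synthetic reference trajectory. A differentiable programming framework would generate the backward sweep automatically; it is written out here only to show what derivative information is being computed.

```python
# Toy hybrid model (all names and values are hypothetical):
#   dx/dt = -A*x + theta*x**3
# The -A*x term plays the role of the known physics; theta*x**3 plays the
# role of a learned component with a single trainable parameter theta.
A = 0.5        # known linear decay rate (assumed)
DT = 0.01      # forward Euler time step
STEPS = 200    # number of time steps


def simulate(theta, x0=1.0):
    """Integrate the hybrid model with forward Euler; return the trajectory."""
    xs = [x0]
    for _ in range(STEPS):
        x = xs[-1]
        xs.append(x + DT * (-A * x + theta * x**3))
    return xs


def loss_and_grad(theta, ys):
    """Mean squared mismatch to ys and its derivative with respect to theta.

    The derivative comes from a reverse (adjoint) sweep over the Euler steps,
    i.e. backpropagation through the time integrator.
    """
    xs = simulate(theta, ys[0])
    loss = sum((x - y) ** 2 for x, y in zip(xs[1:], ys[1:])) / STEPS

    lam = 0.0   # adjoint variable: dLoss/dx_n, accumulated backwards in time
    grad = 0.0  # dLoss/dtheta
    for n in range(STEPS, 0, -1):
        lam += 2.0 * (xs[n] - ys[n]) / STEPS    # direct loss term at step n
        x_prev = xs[n - 1]
        grad += lam * DT * x_prev**3            # d x_n / d theta
        lam *= 1.0 + DT * (-A + 3.0 * theta * x_prev**2)  # d x_n / d x_{n-1}
    return loss, grad


def train(ys, theta=0.0, lr=2.0, iters=500):
    """Plain gradient descent on theta using the adjoint gradient."""
    for _ in range(iters):
        _, g = loss_and_grad(theta, ys)
        theta -= lr * g
    return theta


# Synthetic reference data from the same model with a "true" closure value.
THETA_TRUE = -0.8
reference = simulate(THETA_TRUE)
theta_hat = train(reference)
print("recovered theta:", theta_hat)
```

Because the reference trajectory here is produced by the same discretized model, the loss has an exact minimum at THETA_TRUE; with real simulation or observational data the fit would only be approximate, and the step size and iteration count would need tuning.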

I conclude by briefly discussing the outlook for two other areas related to the convergence of simulation modeling and data science. One is "in-situ inference": embedding scalable parallel Bayesian spatiotemporal inference algorithms online into exascale simulations so large that the output they generate cannot be saved to disk. The other is information fusion: combining the predictions of heterogeneous collections of models and data in a form of Bayesian network or graphical model containing machine learning components trained on different data sources, for end-to-end uncertainty quantification and probabilistic prediction.

Friday, November 15, 2019 at 3:00pm to 4:00pm

Engineering Center, ECCR 245
1111 Engineering Drive, Boulder, CO 80309

Event Type

Colloquium/Seminar

Interests

Science & Technology, Research & Innovation

Audience

Faculty, Students, General Public, Postdoc

College, School & Unit

Engineering & Applied Science

Tags

colloquium

Group
Applied Mathematics