Alec Dunton Dissertation Defense

Alec Dunton, Department of Applied Mathematics, University of Colorado Boulder

Matrix Methods for Data Compression in Large-Scale Applications

 

Modern scientific applications generate and require more data every year, far outpacing storage capabilities. This growing disparity has inspired work in lossless and lossy data compression, which seeks to alleviate the overwhelming surge in big data. Lossless compression approaches provide an exact reconstruction of the original data, at the cost of a lower compression factor. Lossy compression approaches, on the other hand, achieve larger compression factors than lossless methods at the cost of error in reconstruction.

In the interest of reducing the size of data generated in scientific applications, this thesis proposes low-rank approximation-based lossy compression algorithms for reducing the dimensionality of data matrices. Several pass-efficient, memory-lean, and fast low-rank approximation methods are proposed for temporal compression of scientific data. These approaches are shown to compress matrices arising in various scientific applications. These low-rank methods are particularly successful in compressing scientific data matrices when a significant fraction of the variance in the data can be captured by a low-dimensional linear subspace; such structure typically arises in diffusion-dominated problems such as low Reynolds number flow simulations.
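
To illustrate the general idea of low-rank lossy compression described above, the following is a minimal NumPy sketch of a generic two-pass randomized SVD. It is not the algorithm proposed in the thesis. The function names, the `oversample` parameter, and the convention that columns hold time snapshots are all assumptions made for this illustration.

```python
import numpy as np


def compress_low_rank(A, rank, oversample=10, seed=0):
    """Return rank-`rank` factors (U, s, Vt) of A via a randomized SVD.

    Generic textbook sketch; the name and parameters are hypothetical.
    A is assumed to be an m x n data matrix whose columns are snapshots.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, m, n)

    # Pass 1: sample the column space of A with a Gaussian test matrix.
    omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ omega)  # m x k, orthonormal columns

    # Pass 2: project A onto that basis and factor the small k x n matrix.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small

    # Keep only the leading `rank` components.
    return U[:, :rank], s[:rank], Vt[:rank]


def reconstruct(U, s, Vt):
    """Rebuild the (lossy) approximation of A from its factors."""
    return (U * s) @ Vt


def compression_factor(shape, rank):
    """Ratio of stored entries: full matrix versus rank-`rank` factors."""
    m, n = shape
    return (m * n) / (rank * (m + n + 1))
```

Storing the three factors requires rank × (m + n + 1) numbers instead of m × n, so the compression factor grows as the rank needed for a given accuracy shrinks. That is why data with most of its variance on a low-dimensional linear subspace compresses well under this kind of method.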

On the other hand, in advection- and convection-dominated problems, low-rank matrix methods can perform quite poorly. Recent work in deep learning has demonstrated that a class of neural networks called autoencoders can break through this limitation of linear dimensionality reduction methods. Instead of identifying low-dimensional linear subspaces, autoencoders learn nonlinear manifolds which can approximate a data matrix using far fewer latent dimensions. Generalizing the linear subspace-based approaches developed in the previous chapters of this thesis, the final work presented provides an online algorithm for embedding and reconstructing large-scale data matrices on nonlinear manifolds using autoencoders.

Dial-In Information

Zoom ID: 361 981 5494

Zoom Password: A8742nJAD&

Wednesday, June 30, 2021 at 9:00am to 11:00am

Virtual Event

Event Type

Colloquium/Seminar

Interests

Science & Technology, Research & Innovation

Audience

Faculty, Students, Graduate Students, Postdoc

College, School & Unit

Engineering & Applied Science

Group

Applied Mathematics