Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

From MaRDI portal
Publication:92247

DOI: 10.1137/19M129526X | zbMATH Open: 1461.65067 | arXiv: 1706.02205 | OpenAlex: W3156927273 | MaRDI QID: Q92247 | FDO: Q92247


Authors: Florian Schäfer, T. J. Sullivan, Houman Owhadi


Publication date: 7 June 2017

Published in: Multiscale Modeling & Simulation

Abstract: Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \le i \le N} \subset \mathbb{R}^d$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously-distributed sampling points, we show how to identify a subset $S \subset \{1, \dots, N\}^2$, with $\#S = O(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in incomplete Cholesky factorisation of the sparse matrix $\Theta_{ij} 1_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorisation can provably be obtained in complexity $O(N \log(N) \log^d(N/\epsilon))$ in space and $O(N \log^2(N) \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, this factorisation straightforwardly provides an approximate sparse PCA with an optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorisation, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorisation we also obtain a solver for elliptic PDEs with complexity $O(N \log^d(N/\epsilon))$ in space and $O(N \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators.


Full work available at URL: https://arxiv.org/abs/1706.02205





Cited In (34)


Summary: This article provides an overview of numerical methods for approximating and solving problems related to Gaussian processes, kernel matrices, and elliptic PDEs. It discusses various approaches for efficiency and accuracy, including sparse linear solvers (e.g., multigrid, sparse Cholesky factorization) and low-rank approximations (Nyström, rank-revealing Cholesky factorization). Hierarchical methods (hierarchical matrices, HODLR, HSS matrices) and wavelet-based techniques are also highlighted. A notable contribution is a modified Cholesky factorization for kernel matrices arising from the Green's functions of elliptic PDEs, aiming for scalability and accuracy. Overall, the article showcases a range of solutions for computational challenges in these domains.

Summary_simple: Scientists use various methods to solve complex math problems related to Gaussian processes and partial differential equations (PDEs). These problems often involve large amounts of data, making them hard to compute. To tackle this, researchers employ techniques like sparse linear solvers, low-rank approximations, hierarchical methods, and wavelet-based approaches. A new method, modifying an existing math tool called Cholesky factorization, shows promise in solving certain PDEs efficiently without needing overly complex structures. This variety of approaches helps scientists find the best way to solve their specific computational challenges.


