Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

From MaRDI portal




Abstract: Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \leq i \leq N} \subset \mathbb{R}^d$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously-distributed sampling points, we show how to identify a subset $S \subset \{1, \dots, N\}^2$, with $\#S = O(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in incomplete Cholesky factorization of the sparse matrix $\Theta_{ij} 1_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorization can provably be obtained in complexity $O(N \log(N) \log^d(N/\epsilon))$ in space and $O(N \log^2(N) \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDEs with complexity $O(N \log^d(N/\epsilon))$ in space and $O(N \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators.
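The pipeline described in the abstract (coarse-to-fine ordering, distance-based sparsity set $S$, zero fill-in incomplete Cholesky) can be sketched in a few dozen lines. The following is a minimal toy illustration, not the authors' code: the 1D exponential kernel, the parameter `rho`, and all variable names are our own choices, and the greedy maximin ordering is a simple $O(N^2)$ stand-in for the paper's more efficient construction.

```python
import numpy as np

# Toy sketch (assumptions: 1D points, exponential kernel, rho = 4.0).
rng = np.random.default_rng(0)
N = 200
x = np.sort(rng.uniform(0.0, 1.0, N))      # sample locations in [0, 1]
dist = np.abs(x[:, None] - x[None, :])     # pairwise distances
theta = np.exp(-dist / 0.2)                # exponential (Matern-1/2) kernel

# Maximin (coarse-to-fine) ordering: greedily pick the point farthest
# from those already chosen; l[k] records its distance at selection time.
order = [0]
d = dist[0].copy()
l = np.empty(N)
l[0] = np.inf
for k in range(1, N):
    nxt = int(np.argmax(d))
    order.append(nxt)
    l[k] = d[nxt]
    d = np.minimum(d, dist[nxt])
theta_p = theta[np.ix_(order, order)]      # kernel matrix in maximin order
xp = x[order]

# Sparsity set S: keep (i, j), i >= j, whenever |x_i - x_j| <= rho * l[j].
rho = 4.0
S = [set() for _ in range(N)]
for i in range(N):
    for j in range(i + 1):
        if abs(xp[i] - xp[j]) <= rho * min(l[j], 1.0):
            S[i].add(j)

# Zero fill-in incomplete Cholesky: only entries inside S are computed.
L = np.zeros((N, N))
for j in range(N):
    s = theta_p[j, j] - sum(L[j, k] ** 2 for k in S[j] if k < j)
    L[j, j] = np.sqrt(s)
    for i in range(j + 1, N):
        if j in S[i]:
            acc = sum(L[i, k] * L[j, k] for k in S[i] & S[j] if k < j)
            L[i, j] = (theta_p[i, j] - acc) / L[j, j]

err = np.linalg.norm(theta_p - L @ L.T, 2) / np.linalg.norm(theta_p, 2)
print(f"relative operator-norm error: {err:.2e}")
```

Because `L` inherits the sparsity of $S$, storage and work scale with $\#S$ rather than $N^2$; the dense loops above are only for readability, and a sparse data structure would be used in practice.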



Cited in (34)



Summary This article provides an overview of numerical methods for approximating and solving problems related to Gaussian processes, kernel matrices, and elliptic PDEs. It discusses various approaches for efficiency and accuracy, including sparse linear solvers (e.g., multigrid, sparse Cholesky factorization) and low-rank approximations (Nyström, rank-revealing Cholesky factorization). Hierarchical methods (hierarchical matrices, HODLR, HSS matrices) and wavelet-based techniques are also highlighted. A notable contribution is a modified Cholesky factorization for kernel matrices arising from Green's functions of elliptic PDEs, aimed at combining scalability with accuracy. Overall, the article showcases a range of solutions for computational challenges in these domains.

Summary_simple Scientists use various methods to solve complex math problems related to Gaussian processes and partial differential equations (PDEs). These problems often involve large amounts of data, making them hard to compute. To tackle this, researchers employ techniques like sparse linear solvers, low-rank approximations, hierarchical methods, and wavelet-based approaches. A new method, modifying an existing math tool called Cholesky factorization, shows promise in solving certain PDEs efficiently without needing overly complex structures. This variety of approaches helps scientists find the best way to solve their specific computational challenges.


This page was built for publication: Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

MaRDI item Q92247