Functional principal subspace sampling for large scale functional data analysis
From MaRDI portal
Abstract: Functional data analysis (FDA) methods have computational and theoretical appeal for some high dimensional data, but lack scalability to modern large-sample datasets. To tackle this challenge, we develop randomized algorithms for two important FDA methods: functional principal component analysis (FPCA) and functional linear regression (FLR) with scalar response. The two methods are connected, as both rely on accurate estimation of the functional principal subspace. The proposed algorithms draw subsamples from the large dataset at hand and apply FPCA or FLR to the subsamples to reduce the computational cost. To effectively preserve subspace information in the subsamples, we propose a functional principal subspace sampling probability, which removes the eigenvalue scale effect inside the functional principal subspace and properly weights the residual. Based on operator perturbation analysis, we show that the proposed probability gives precise control over the first-order error of the subspace projection operator and can be interpreted as importance sampling for functional subspace estimation. Moreover, concentration bounds for the proposed algorithms are established to reflect the low intrinsic dimension of functional data in an infinite dimensional space. The effectiveness of the proposed algorithms is demonstrated on synthetic and real datasets.
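The abstract's subsample-then-estimate idea can be illustrated with a short sketch. This is not the paper's exact functional principal subspace probability; it is a simplified stand-in that mimics the two ingredients the abstract names: whitening (eigenvalue-scale removal) inside a pilot principal subspace, plus a residual term, followed by importance-weighted subsampled PCA on discretized curves. All function and variable names here are illustrative.

```python
import numpy as np

def subsampled_pca_subspace(X, k, m, rng=None):
    """Estimate the top-k principal subspace of the rows of X from a
    weighted subsample of size m.

    Illustrative sketch only: the sampling score below (whitened
    within-subspace norm + residual norm) is a simplified proxy for the
    functional principal subspace sampling probability in the paper.

    X : (n, p) array of centered, discretized functional observations.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    # Pilot estimate of the top-k subspace from a small uniform subsample.
    pilot = X[rng.choice(n, size=min(n, 10 * k), replace=False)]
    _, s, Vt = np.linalg.svd(pilot, full_matrices=False)
    V = Vt[:k].T                        # (p, k) pilot basis

    # Whitened within-subspace coordinates (removes the eigenvalue
    # scale inside the pilot subspace) and the residual component.
    coords = (X @ V) / s[:k]
    resid = X - (X @ V) @ V.T

    # Sampling probabilities: blend of whitened subspace norm and
    # residual norm (a simplified importance-sampling score).
    scores = np.linalg.norm(coords, axis=1) + np.linalg.norm(resid, axis=1)
    probs = scores / scores.sum()

    # Draw the subsample and reweight by 1/sqrt(m * p_i) so the
    # subsampled Gram matrix is unbiased for the full one.
    idx = rng.choice(n, size=m, replace=True, p=probs)
    Xs = X[idx] / np.sqrt(m * probs[idx, None])

    # Top-k principal subspace of the reweighted subsample.
    _, _, Vt_s = np.linalg.svd(Xs, full_matrices=False)
    return Vt_s[:k].T
```

On low-rank data the subsampled basis spans nearly the same subspace as a full-data PCA while only decomposing the m-row subsample, which is the computational saving the abstract describes.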
Recommendations
- Multi-dimensional functional principal component analysis
- Analysing large datasets of functional data: a survey sampling point of view
- Sampled forms of functional PCA in reproducing kernel Hilbert spaces
- Functional principal components analysis via penalized rank one approximation
- An algorithm for the principal component analysis of large data sets
Cites work
- A reproducing kernel Hilbert space approach to functional linear regression
- A statistical perspective on algorithmic leveraging
- A statistical perspective on randomized sketching for ordinary least-squares
- An introduction to matrix concentration inequalities
- Analysing large datasets of functional data: a survey sampling point of view
- Asymptotics and concentration bounds for bilinear forms of spectral projectors of sample covariance
- CLT in functional linear regression models
- Confidence bands for Horvitz-Thompson estimators using sampled noisy functional data
- Defining probability density for a distribution of random functions
- Estimation in functional linear quantile regression
- Estimation of the Mean of Functional Time Series and a Two-Sample Problem
- Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication
- Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
- Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition
- Fast approximation of matrix coherence and statistical leverage
- Faster least squares approximation
- Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions
- Functional Data Analysis for Sparse Longitudinal Data
- Functional linear regression analysis for longitudinal data
- Horvitz-Thompson estimators for functional data: asymptotic confidence bands and optimal allocation for stratified sampling
- Inference for functional data with applications
- Introduction to Functional Data Analysis
- Kernel ridge vs. principal component regression: minimax bounds and the qualification of regularization operators
- Lectures on randomized numerical linear algebra
- Methodology and convergence rates for functional linear regression
- More efficient estimation for logistic regression with optimal subsamples
- Newton Sketch: A Near Linear-Time Optimization Algorithm with Linear-Quadratic Convergence
- OSUMC
- On some extensions of Bernstein's inequality for self-adjoint operators
- Optimal Sampling for Generalized Linear Models Under Measurement Constraints
- Optimal subsampling for large sample logistic regression
- Principal component models for sparse functional data
- Properties of design-based functional principal components analysis
- Randomized Sketches of Convex Programs With Sharp Guarantees
- Randomized sketches for kernels: fast and optimal nonparametric regression
- Rotation sampling for functional data
- Sketched ridge regression: optimization perspective, statistical perspective, and model averaging
- Sketching as a tool for numerical linear algebra
- Sketching for principal component regression
- Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods
- Theoretical foundations of functional data analysis, with an introduction to linear operators
- Uniform convergence and asymptotic confidence bands for model-assisted estimators of the mean of sampled functional data
Cited in (3)
This page was built for publication: Functional principal subspace sampling for large scale functional data analysis