Torus principal component analysis with applications to RNA structure
From MaRDI portal
Abstract: Several cutting-edge applications need PCA methods for data on tori, and we propose a novel torus-PCA method with important properties that can be generally applied. There are two existing general methods: tangent space PCA and geodesic PCA. However, unlike tangent space PCA, our torus-PCA honors the cyclic topology of the data space whereas, unlike geodesic PCA, our torus-PCA produces a variety of non-winding, non-dense descriptors. This is achieved by deforming tori into spheres and then using a variant of the recently developed principal nested spheres analysis. This analysis involves a step of small sphere fitting, and we provide an improved test to avoid overfitting. However, deforming tori into spheres creates singularities. We introduce a data-adaptive pre-clustering technique to keep the singularities away from the data. For the frequently encountered case that the residual variance around the PCA main component is small, we use a post-mode hunting technique for more fine-grained clustering. Thus, in general, there are three successive interrelated key steps of torus-PCA in practice: pre-clustering, deformation, and post-mode hunting. We illustrate our method with two recently studied RNA structure (tori) data sets. The first is a small RNA data set that is established as the benchmark for PCA, and we validate our method on it. The second is a large RNA data set (containing the small RNA data set), for which we show that our method provides interpretable principal components and gives further insight into its structure.
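The deformation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a standard hyperspherical-coordinate construction in which all but the last torus angle are halved to act as polar angles; the paper's exact deformation, its data-adaptive ordering of angles, and its choice of cut points may differ. The function name `torus_to_sphere` is hypothetical and used only for illustration.

```python
import math


def torus_to_sphere(angles):
    """Map D torus angles (radians) to a unit vector on the D-sphere in R^(D+1).

    Assumption (not taken from the paper): the first D-1 angles are halved so
    that they behave like polar angles in [0, pi); the last angle is kept as a
    full azimuth in [0, 2*pi).
    """
    if not angles:
        raise ValueError("need at least one angle")
    # Wrap every angle into [0, 2*pi) so the cyclic topology is respected.
    wrapped = [a % (2 * math.pi) for a in angles]
    phis = [a / 2 for a in wrapped[:-1]] + [wrapped[-1]]

    coords = []
    sin_prod = 1.0  # running product sin(phi_1) * ... * sin(phi_k)
    for phi in phis:
        coords.append(sin_prod * math.cos(phi))
        sin_prod *= math.sin(phi)
    coords.append(sin_prod)
    return coords


# A generic point lands on the unit sphere.
point = torus_to_sphere([1.0, 2.0, 3.0])
norm_sq = sum(c * c for c in point)

# Singularity: when the first angle is 0, the remaining angles no longer matter,
# so distinct torus points collapse to the same point on the sphere.
collapsed_a = torus_to_sphere([0.0, 1.0, 2.0])
collapsed_b = torus_to_sphere([0.0, 2.5, 0.3])
```

Under this assumed construction, the collapse of `collapsed_a` and `collapsed_b` onto one point shows why the abstract speaks of keeping the singularities away from the data: information in the remaining angles is lost wherever a halved angle sits at the cut.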
Cites work
- scientific article; zbMATH DE number 5668397
- scientific article; zbMATH DE number 3673370
- Analysis of principal nested spheres
- Bayesian alignment using hierarchical models, with applications in protein bioinformatics
- Functional and shape data analysis
- Generalized Procrustes analysis
- GeoPCA
- Horizontal dimensionality reduction and iterated frame bundle development
- Intrinsic means on the circle: uniqueness, locus and asymptotics
- Multiscale inference about a density
- Multiscale methods for shape constraints in deconvolution: confidence statements for qualitative features
- Principal arc analysis on direct product manifolds
- Principal component analysis for Riemannian manifolds, with an application to triangular shape spaces
- Statistical shape analysis. With applications in R
- The circular SiZer, inferred persistence of shape parameters and application to early stem cell differentiation
Cited in (23)
- Finite Mixtures of Multivariate Wrapped Normal Distributions for Model Based Clustering of p-Torus Data
- Comments on: ``Recent advances in directional statistics''
- Rejoinder on: ``Recent advances in directional statistics''
- An infinitesimal probabilistic model for principal component analysis of manifold valued data
- Estimation of parameters in multivariate wrapped models for data on a p-torus
- Recent advances in directional statistics
- Statistics for data with geometric structure. Abstracts from the workshop held January 21--27, 2018
- Principal component analysis and clustering on manifolds
- Statistical methods generalizing principal component analysis to non-Euclidean spaces
- Scaled Torus Principal Component Analysis
- Rejoinder: Fitting a folded normal distribution without EM
- Random Fixed Boundary Flows
- Kurtosis test of modality for rotationally symmetric distributions on hyperspheres
- Density estimation for toroidal data using semiparametric mixtures
- Principal boundary on Riemannian manifolds
- Tukey’s Depth for Object Data
- Toroidal PCA via density ridges
- Dihedral angles principal geodesic analysis using nonlinear statistics
- Applying backward nested subspace inference to tori and polyspheres
- Clustering on the torus by conformal prediction
- Response to `Fitting a folded normal distribution without EM'
- Functional random effects modeling of brain shape and connectivity
- Data analysis on nonstandard spaces