Large-scale inference of correlation among mixed-type biological traits with phylogenetic multivariate probit models
Publication:2233162
Abstract: Inferring concerted changes among biological traits along an evolutionary history remains an important yet challenging problem. Besides adjusting for spurious correlation induced by the shared history, the task also requires sufficient flexibility and computational efficiency to incorporate multiple continuous and discrete traits as data size increases. To accomplish this, we jointly model mixed-type traits by assuming latent parameters for binary outcome dimensions at the tips of an unknown tree informed by molecular sequences. This gives rise to a phylogenetic multivariate probit model. With large sample sizes, posterior computation under this model is problematic, as it requires repeated sampling from a high-dimensional truncated normal distribution. Current best practices employ multiple-try rejection sampling that suffers from slow mixing and a computational cost that scales quadratically in sample size. We develop a new inference approach that exploits 1) the bouncy particle sampler (BPS) based on piecewise deterministic Markov processes to simultaneously sample all truncated normal dimensions, and 2) novel dynamic programming that reduces the cost of likelihood and gradient evaluations for BPS to linear in sample size. In an application with 535 HIV viruses and 24 traits that necessitates sampling from a 12,840-dimensional truncated normal, our method makes it possible to estimate the across-trait correlation and detect factors that affect the pathogen's capacity to cause disease. This inference framework is also applicable to a broader class of covariance structures beyond comparative biology.
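The bouncy particle sampler named in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes an identity covariance (a standard normal) restricted to an orthant chosen by a hypothetical `signs` vector, which mimics the sign constraints that binary outcomes place on latent variables in a probit model. The function name, the refreshment rate, and the time-averaging estimator are illustrative choices, and the phylogenetic covariance and the linear-time dynamic programming are not reproduced.

```python
import math
import random


def bps_truncated_std_normal(signs, total_time, refresh_rate=1.0, seed=0):
    """Time-averaged mean of N(0, I) restricted to {x : signs[i] * x[i] >= 0}.

    Toy bouncy particle sampler; identity covariance is an assumption of this
    sketch, so the potential is U(x) = |x|^2 / 2 and its gradient is x.
    """
    rng = random.Random(seed)
    d = len(signs)
    x = [0.5 * s for s in signs]                 # start inside the orthant
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]  # Gaussian velocity
    integral = [0.0] * d                         # running integral of x(t) dt
    remaining = total_time
    while remaining > 0.0:
        a = sum(vi * xi for vi, xi in zip(v, x))  # <v, grad U(x)>
        b = sum(vi * vi for vi in v)              # |v|^2
        # Bounce time: invert the integrated rate max(0, a + b*s) exactly.
        e = rng.expovariate(1.0)
        t_bounce = (-a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * b * e)) / b
        # Velocity refreshment time (needed for ergodicity).
        t_refresh = rng.expovariate(refresh_rate)
        # First time any coordinate reaches the truncation boundary.
        t_wall, i_wall = math.inf, -1
        for i in range(d):
            if signs[i] * v[i] < 0.0:
                t_i = -x[i] / v[i]
                if t_i < t_wall:
                    t_wall, i_wall = t_i, i
        tau = min(t_bounce, t_refresh, t_wall, remaining)
        # Move along the straight line and accumulate the path integral.
        for i in range(d):
            integral[i] += x[i] * tau + 0.5 * v[i] * tau * tau
            x[i] += v[i] * tau
        remaining -= tau
        if tau == t_wall:
            # Reflect off the boundary: flip the normal velocity component.
            x[i_wall] = 0.0
            v[i_wall] = -v[i_wall]
        elif tau == t_bounce:
            # Reflect the velocity against the gradient at the new position.
            g2 = sum(xi * xi for xi in x)
            c = 2.0 * sum(vi * xi for vi, xi in zip(v, x)) / g2
            v = [vi - c * xi for vi, xi in zip(v, x)]
        elif tau == t_refresh:
            v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return [s / total_time for s in integral]


if __name__ == "__main__":
    print(bps_truncated_std_normal([1, -1, 1], total_time=20000.0, seed=1))
```

Under the identity-covariance assumption each coordinate is an independent half-normal, so the time-averaged means should approach sign times sqrt(2/pi), roughly 0.798 in magnitude, which gives a simple sanity check. All coordinates move at once between events, which is the sense in which the sampler updates every truncated dimension simultaneously.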
Recommendations
- Efficient Bayesian inference of general Gaussian models on large phylogenetic trees
- Inferring Phenotypic Trait Evolution on Large Trees With Many Incomplete Measurements
- Assessing phenotypic correlation through the multivariate phylogenetic latent liability model
- Fast likelihood calculation for multivariate Gaussian phylogenetic models with shifts
- Bayesian Phylogenetic Inference via Markov Chain Monte Carlo Methods
Cites work
- scientific article; zbMATH DE number 720679
- A case study competition among methods for analyzing large spatial data
- Analysis of multivariate probit models
- Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion)
- Assessing phenotypic correlation through the multivariate phylogenetic latent liability model
- Bayesian Gaussian Copula Factor Models for Mixed Data
- Bayesian data analysis.
- Bayesian graphical Lasso models and efficient posterior computation
- Equation of state calculations by fast computing machines
- Extending ordinal regression with a latent zero-augmented beta distribution
- Generating random correlation matrices based on vines and extended onion method
- Inference from iterative simulation using multiple sequences
- Large-scale inference of correlation among mixed-type biological traits with phylogenetic multivariate probit models
- Limit theorems for the zig-zag process
- MCMC using Hamiltonian dynamics
- Multilevel latent Gaussian process model for mixed discrete and continuous multivariate response data
- Multivariate stochastic process models for correlated responses of mixed type
- Piecewise deterministic Markov processes for scalable Monte Carlo on restricted domains
- Simple marginally noninformative prior distributions for covariance matrices
- Sparse Bayesian infinite factor models
- The Bouncy Particle Sampler: A Non-Reversible Rejection-Free Markov Chain Monte Carlo Method
- The coalescent
- The no-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo
Cited in (11)
- Implicit Copula Variational Inference
- Efficient Bayesian inference of general Gaussian models on large phylogenetic trees
- Lagged couplings diagnose Markov chain Monte Carlo phylogenetic inference
- Statistical challenges in tracking the evolution of SARS-CoV-2
- Sampling constrained continuous probability distributions: a review
- Large-scale inference of correlation among mixed-type biological traits with phylogenetic multivariate probit models
- Inferring Phenotypic Trait Evolution on Large Trees With Many Incomplete Measurements
- Posterior computation with the Gibbs zig-zag sampler
- Bayesian Inference on High-Dimensional Multivariate Binary Responses
- Assessing phenotypic correlation through the multivariate phylogenetic latent liability model
- Bayesian Conjugacy in Probit, Tobit, Multinomial Probit and Extensions: A Review and New Results