Identifiability and inference of non-parametric rates-across-sites models on large-scale phylogenies
Abstract: Mutation rate variation across loci is well known to cause difficulties, notably identifiability issues, in the reconstruction of evolutionary trees from molecular sequences. Here we introduce a new approach for estimating general rates-across-sites models. Our results imply, in particular, that large phylogenies are typically identifiable under rate variation. We also derive sequence-length requirements for high-probability reconstruction. Our main contribution is a novel algorithm that clusters sites according to their mutation rate. Following this site clustering step, standard reconstruction techniques can be used to recover the phylogeny. Our results rely on a basic insight: that, for large trees, certain site statistics experience concentration-of-measure phenomena.
Recommendations
- Phylogenetic mixtures: concentration of measure in the large-tree limit
- Identifiability of a Markovian model of molecular evolution with gamma-distributed rates
- What can and what cannot be inferred from pairwise sequence comparisons?
- A phase transition for a random cluster model on phylogenetic trees
- Identifiability of large phylogenetic mixture models
Cites work
- scientific article; zbMATH DE number 2079368
- scientific article; zbMATH DE number 1865935
- scientific article; zbMATH DE number 765034
- scientific article; zbMATH DE number 819814
- A basic limitation on inferring phylogenies by pairwise sequence comparisons
- A few logs suffice to build (almost) all trees (I)
- Broadcasting on trees and the Ising model
- Finding a maximum likelihood tree is hard
- Full reconstruction of Markov models on evolutionary trees: identifiability and consistency
- Identifiability of a Markovian model of molecular evolution with gamma-distributed rates
- Identifiability of large phylogenetic mixture models
- Letter to editor: Rate-variation need not defeat phylogenetic inference through pairwise sequence comparisons
- Mixed-up trees: the structure of phylogenetic mixtures
- On the variational distance of two trees
- Phylogenetic mixtures: concentration of measure in the large-tree limit
- Phylogenies without branch bounds: contracting the short, pruning the deep
Cited in (10)
- Analytic solutions for three taxon ML trees with variable rates across sites
- Phylogenetic mixtures: concentration of measure in the large-tree limit
- Reversible polymorphism-aware phylogenetic models and their application to tree inference
- Counting mutations by parsimony and estimation of mutation rate variation across nucleotide sites -- a simulation study
- Identifiability and inference of phylogenetic birth-death models
- Consistency and identifiability of the polymorphism-aware phylogenetic models
- Phase transition in the sample complexity of likelihood-based phylogeny inference
- Full reconstruction of non-stationary strand-symmetric models on rooted phylogenies
- Identifiability of a Markovian model of molecular evolution with gamma-distributed rates
- Continuous and tractable models for the variation of evolutionary rates