Identifiability and inference of non-parametric rates-across-sites models on large-scale phylogenies

Publication:376326

DOI: 10.1007/S00285-012-0571-4
zbMATH Open: 1276.92082
arXiv: 1108.0129
OpenAlex: W2963738166
Wikidata: Q51335274
Scholia: Q51335274
MaRDI QID: Q376326
FDO: Q376326


Authors: Elchanan Mossel, Sebastien Roch


Publication date: 4 November 2013

Published in: Journal of Mathematical Biology

Abstract: Mutation rate variation across loci is well known to cause difficulties, notably identifiability issues, in the reconstruction of evolutionary trees from molecular sequences. Here we introduce a new approach for estimating general rates-across-sites models. Our results imply, in particular, that large phylogenies are typically identifiable under rate variation. We also derive sequence-length requirements for high-probability reconstruction. Our main contribution is a novel algorithm that clusters sites according to their mutation rate. Following this site clustering step, standard reconstruction techniques can be used to recover the phylogeny. Our results rely on a basic insight: that, for large trees, certain site statistics experience concentration-of-measure phenomena.


Full work available at URL: https://arxiv.org/abs/1108.0129





