Identifying the rooted species tree from the distribution of unrooted gene trees under the coalescent
Publication:658995
Abstract: Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals -- each with many genes -- splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are 4 species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.
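The 4-species claim in the abstract can be illustrated numerically. The sketch below is not taken from this record; it assumes the standard quartet formula under the multispecies coalescent, in which the unrooted gene tree that matches the unrooted species tree ab|cd has probability 1 - (2/3)e^{-x} and each of the other two has probability (1/3)e^{-x}, where x is the length, in coalescent units, of the internal edge of the unrooted species tree. The function name `unrooted_quartet_probs` and the example branch lengths are hypothetical.

```python
import math


def unrooted_quartet_probs(internal_length):
    """Probabilities of the three unrooted gene tree topologies on taxa a, b, c, d.

    Assumes one gene sampled per species and an unrooted species tree ab|cd
    whose internal edge has length `internal_length` in coalescent units.
    """
    discordant = math.exp(-internal_length) / 3.0
    return {
        "ab|cd": 1.0 - 2.0 * discordant,  # matches the unrooted species tree
        "ac|bd": discordant,
        "ad|bc": discordant,
    }


# Rooted caterpillar (((a,b):1.0,c):y,d): only the edge above the cherry (a,b)
# is internal in the unrooted tree, so y does not enter the formula.
caterpillar = unrooted_quartet_probs(1.0)

# Rooted balanced tree ((a,b):0.4,(c,d):0.6): the two edges meeting at the root
# merge into one internal edge of length 0.4 + 0.6 once the root is removed.
balanced = unrooted_quartet_probs(0.4 + 0.6)

for topology in caterpillar:
    print(topology, caterpillar[topology], balanced[topology])
```

Under these assumptions the two differently rooted species trees give the same distribution, so the root cannot be located, and only the combined internal length is recoverable. This is one way to read "some, but not all" of the branch length information being identifiable.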
Recommendations
- Identifiability of the unrooted species tree topology under the coalescent model with time-reversible substitution processes, site-specific rate variation, and invariable sites
- Identifiability and reconstructibility of species phylogenies under a modified coalescent
- Split probabilities and species tree inference under the multispecies coalescent model
- Determining species tree topologies from clade probabilities under the coalescent
- Maximum tree: a consistent estimator of the species tree
Cites work
- Identifying evolutionary trees and substitution parameters for the general Markov model with invariable sites
- Invariants of phylogenies in a simple case with discrete states
- Line-of-descent and genealogical processes, and their applications in population genetics models
- Maximum tree: a consistent estimator of the species tree
- Phylogenetic invariants for the general Markov model of sequence mutation
- Reconstructing the shape of a tree from observed dissimilarity data
- The complexity of reconstructing trees from qualitative characters and subtrees
- The probability of topological concordance of gene trees and species trees
Cited in (24)
- Identifiability of speciation times under the multispecies coalescent
- On the unranked topology of maximally probable ranked gene tree topologies
- Rates of convergence in the two-island and isolation-with-migration models
- Identifiability of the unrooted species tree topology under the coalescent model with time-reversible substitution processes, site-specific rate variation, and invariable sites
- Classes of explicit phylogenetic networks and their biological and mathematical significance
- On the number of non-equivalent ancestral configurations for matching gene trees and species trees
- The probability distribution of ranked gene trees on a species tree
- Linearization of the Kingman coalescent
- Identifiability and reconstructibility of species phylogenies under a modified coalescent
- Identifying circular orders for blobs in phylogenetic networks
- Distance-based species tree estimation under the coalescent: information-theoretic trade-off between number of loci and sequence length
- Inferring metric trees from weighted quartets via an intertaxon distance
- Phylogenetic network-assisted rooting of unrooted gene trees
- A stochastic Farris transform for genetic data under the multispecies coalescent with applications to data requirements
- Statistically consistent rooting of species trees under the multispecies coalescent model
- Polynomial-Time Statistical Estimation of Species Trees Under Gene Duplication and Loss
- Computing the probability of gene trees concordant with the species tree in the multispecies coalescent
- Hypothesis testing near singularities and boundaries
- The tree of blobs of a species network: identifiability under the coalescent
- Determining species tree topologies from clade probabilities under the coalescent
- Species tree estimation under joint modeling of coalescence and duplication: sample complexity of quartet methods
- Identifying species network features from gene tree quartets under the coalescent model
- Split probabilities and species tree inference under the multispecies coalescent model
- Anomalous networks under the multispecies coalescent: theory and prevalence