Choosing the tree which actually best explains the data: another look at the bootstrap in phylogenetic reconstruction. (Q1566634)

scientific article
Language: English
Label: Choosing the tree which actually best explains the data: another look at the bootstrap in phylogenetic reconstruction.
Description: scientific article

    Statements

    Choosing the tree which actually best explains the data: another look at the bootstrap in phylogenetic reconstruction. (English)
    4 June 2000
    We consider the problem of phylogenetic reconstruction, which consists of estimating the evolutionary history of a set of species. This unknown history is modelled as a tree and estimated from nucleotide sequences taken from the species' genomes. The primary goal of the estimation is to produce a tree that is structurally as close as possible to the true tree. However, most phylogenetic tree-building methods rely on optimization criteria that lead to inferring fully resolved trees, i.e., models of maximal complexity. Such trees therefore usually contain some wrong edges that are too specific to the data, i.e., that result from overfitting. We first introduce a structural goodness-of-fit criterion based on quartets of species. We then describe a tree-building method that infers a fully resolved tree by optimizing this criterion. We present two descending approaches for removing unreliable edges from this tree. The first relies on the bootstrap process as introduced in the phylogenetic field by Felsenstein. The second is original in this context but analogous to usual methods in model calibration. Simulations show the efficiency of both approaches, in that the structural distance between the true tree and the estimated tree is significantly reduced.
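    To make the quartet and bootstrap ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' criterion or tree-building method: it only assumes an uncorrected p-distance between aligned sequences, picks a quartet's split using the four-point condition (the pairing with the smallest sum of within-pair distances), and estimates how often that split recurs when alignment columns are resampled with replacement. All function names (`p_distance`, `quartet_topology`, `bootstrap_columns`, `quartet_support`) and the toy alignment are hypothetical.

```python
import random


def p_distance(s, t):
    """Fraction of aligned sites at which two sequences differ (assumed distance)."""
    return sum(a != b for a, b in zip(s, t)) / len(s)


def quartet_topology(seqs, a, b, c, d):
    """Pick the split whose within-pair distance sum is smallest.

    Under the four-point condition, the two larger sums are equal for a
    tree metric and the smallest sum identifies the quartet's split.
    """
    def dist(x, y):
        return p_distance(seqs[x], seqs[y])

    sums = {
        ((a, b), (c, d)): dist(a, b) + dist(c, d),
        ((a, c), (b, d)): dist(a, c) + dist(b, d),
        ((a, d), (b, c)): dist(a, d) + dist(b, c),
    }
    return min(sums, key=sums.get)


def bootstrap_columns(seqs, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n = len(next(iter(seqs.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(s[i] for i in cols) for name, s in seqs.items()}


def quartet_support(seqs, quartet, replicates=200, seed=0):
    """Share of bootstrap replicates that recover the split seen on the full data."""
    rng = random.Random(seed)
    reference = quartet_topology(seqs, *quartet)
    hits = sum(
        quartet_topology(bootstrap_columns(seqs, rng), *quartet) == reference
        for _ in range(replicates)
    )
    return reference, hits / replicates


# Toy alignment (hypothetical data): A and B are identical, as are C and D.
toy = {"A": "AAAAAAAA", "B": "AAAAAAAA", "C": "CCCCCCCC", "D": "CCCCCCCC"}
split, support = quartet_support(toy, ("A", "B", "C", "D"), replicates=50)
```

    In a pruning scheme of the kind the abstract describes, a split with low support would be a candidate for removal; the threshold and the way quartets map to tree edges are left open here.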
    Phylogeny reconstruction
    Tree estimation
    Overfitting
    Complexity
    Goodness-of-fit
    Bootstrap resampling methods
    Cross validation
    Structural distance
    Four-point condition
    Quartets

    Identifiers