Choosing the tree which actually best explains the data: another look at the bootstrap in phylogenetic reconstruction. (Q1566634)
From MaRDI portal
Latest revision as of 19:11, 4 August 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Choosing the tree which actually best explains the data: another look at the bootstrap in phylogenetic reconstruction. | scientific article | |
Statements
Choosing the tree which actually best explains the data: another look at the bootstrap in phylogenetic reconstruction. (English)
4 June 2000
We consider the problem of phylogenetic reconstruction, which consists of estimating the evolutionary history of a set of species. This unknown history is modelled as a tree and estimated from nucleotide sequences taken from the species' genomes. The first goal of the estimation is to produce a tree which is structurally as close as possible to the true tree. However, most phylogenetic tree-building methods rely on optimization criteria which lead to inferring fully resolved trees, i.e., models of maximal complexity. Such trees therefore usually contain some wrong edges that are too specific to the data, i.e., that result from overfitting. We first introduce a structural goodness-of-fit criterion based on quartets of species. Then we describe a tree-building method that infers a fully resolved tree by optimizing this criterion. We present two descending approaches to remove unreliable edges from this tree. The first one relies on the bootstrap process as introduced in the phylogenetic field by Felsenstein. The second one is original in this context but analogous to usual methods in model calibration. Simulations show the effectiveness of both approaches, in that the structural distance between the true tree and the estimated tree is significantly reduced.
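The two ingredients named in the abstract, quartets and Felsenstein's bootstrap, can be illustrated with a minimal sketch. It is not the authors' criterion or tree-building method: it only resolves a single quartet by the four-point condition and measures how often bootstrap replicates (alignment columns resampled with replacement) recover the same resolution. The function names, the use of the uncorrected p-distance, and the toy alignment are all assumptions made for illustration.

```python
import random

def p_distance(s, t):
    """Proportion of aligned sites at which two sequences differ (assumed distance)."""
    return sum(a != b for a, b in zip(s, t)) / len(s)

def quartet_topology(dist, a, b, c, d):
    """Resolve the quartet {a, b, c, d} by the four-point condition:
    return the pairing whose two within-pair distances have the smallest sum."""
    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(pairings, key=lambda p: dist(*p[0]) + dist(*p[1]))

def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(alignment[name][i] for i in cols) for name in names}

def quartet_support(alignment, quartet, n_replicates=100, seed=0):
    """Return the quartet resolution on the original data and the fraction
    of bootstrap replicates that reproduce it."""
    rng = random.Random(seed)

    def topo(aln):
        return quartet_topology(lambda x, y: p_distance(aln[x], aln[y]), *quartet)

    reference = topo(alignment)
    hits = sum(
        topo(bootstrap_alignment(alignment, rng)) == reference
        for _ in range(n_replicates)
    )
    return reference, hits / n_replicates

if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the article.
    toy = {
        "A": "ACGTACGTAC",
        "B": "ACGTACGTAA",
        "C": "TTGTACCTAC",
        "D": "TTGTACCTCC",
    }
    print(quartet_support(toy, ("A", "B", "C", "D")))
```

A resolution with low support in such a scheme is the kind of unreliable structure that an edge-removal step would aim to collapse; the threshold to use is a modelling choice that this sketch leaves open.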
Phylogeny reconstruction
Tree estimation
Overfitting
Complexity
Goodness-of-fit
Bootstrap resampling methods
Cross validation
Structural distance
Four-point condition
Quartets