On the complexity of computing MP distance between binary phylogenetic trees (Q1682616)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | On the complexity of computing MP distance between binary phylogenetic trees |
scientific article |
Statements
On the complexity of computing MP distance between binary phylogenetic trees (English)
0 references
30 November 2017
0 references
Distance based measures are used to quantify the dissimilarity of two trees in Phylogenetics. Maximum Parsimony distance (MP-distance) is a recent addition to those measures. The MP-distance requires the search for a character which has a low parsimony score on one of the trees involved and a high score on the other one. In this article, authors show that the computation of the MP-distance on two Phylogenetic binary trees is NP-hard. A rooted phylogenetic \(X\)-tree is a rooted tree \(T=(V(T),E(T))\) on a leaf set \(X=\{1,2,\dots,n\}\subset V(T)\). A character \(f\) is a function \(f:X\to \mathcal{C}\), where \(\mathcal{C}\) is a set \(\{c_1,c_2,c_3,\ldots, c_k\}\) of \(k\) character states, \(k\) being a positive integer. If \(|f(X)|=r\), then \(f\) is said to be an \(r\)-state character. An extension of a character \(f\) to \(V(T)\) is a map \(g:V(T)\to \mathcal{C}\) such that \(g(i)=f(i)\) for all \(i\in X\). The parsimony score or parsimony length of a character \(f\) on \(T\), denoted by \(l_f(T)\), is obtained by minimizing \(l_g(T)\) over all possible extensions \(g\) of \(f\). The maximum parsimony distance \(d_{MP}\) of two phylogenetic trees \(T_1\) and \(T_2\) on the same set \(X\) is defined as \(d_{MP}(T_1,T_2)=\max_f\{|l_f(T_1)-l_f(T_2)|\}\), where this maximum is taken over all characters \(f\) on \(X\). With the help of an interesting symmetry breaking construction, a couple of lemmas and an observation, authors establish that the computation of \(d_{MP}(T_1,T_2)\) on binary trees is NP-hard. The proof of this result is pretty lengthy, logical and intuitive and progress by constructing will construct two trees \(T_E\) and \(T_V\) such that, for a certain integer \(P\), \(d_{MP}(T_E,T_V)=P\) if and only if \(\chi^\prime(G)=3\), and the authors establish many interesting properties on the phylogenetic trees \(T_E\) and \(T_V\). Following this result, the authors also prove that the computation of \(d^2_{MP}(T_1,T_2)\) is also NP-hard in a similar way and also show that computation of \(d^2_{MP}(T_1,T_2)\) is APX-hard as well. In the final section, the authors also present an integer linear programming formulation for binary instances. The article is well-written, innovative and interesting. As mentioned earlier, the content of this paper is logical, intuitive and presented sequentially in brilliant manner. The authors deserve much appreciation for the preparation of such a marvellous paper.
0 references
maximum parsimony distance
0 references
phylogenetics
0 references
binary trees
0 references
NP-hardness
0 references