UPGMA and the normalized equidistant minimum evolution problem
From MaRDI portal
Publication:1704587
DOI10.1016/J.TCS.2018.01.022zbMATH Open1486.62188arXiv1704.00497OpenAlexW2605313896MaRDI QIDQ1704587FDOQ1704587
Authors: Andreas Spillner, Taoyang Wu, Vincent Moulton
Publication date: 12 March 2018
Published in: Theoretical Computer Science (Search for Journal in Brave)
Abstract: UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a widely used clustering method. Here we show that UPGMA is a greedy heuristic for the normalized equidistant minimum evolution (NEME) problem, that is, finding a rooted tree that minimizes the minimum evolution score relative to the dissimilarity matrix among all rooted trees with the same leaf-set in which all leaves have the same distance to the root. We prove that the NEME problem is NP-hard. In addition, we present some heuristic and approximation algorithms for solving the NEME problem, including a polynomial time algorithm that yields a binary, rooted tree whose NEME score is within O(log^2 n) of the optimum. We expect that these results to eventually provide further insights into the behavior of the UPGMA algorithm.
Full work available at URL: https://arxiv.org/abs/1704.00497
Recommendations
- An improvement for unweighted pair group method with arithmetic mean and its application
- Optimal implementations of UPGMA and other common clustering algorithms
- Polyhedral combinatorics of UPGMA cones
- Approximating the balanced minimum evolution problem
- The minimum evolution problem: Overview and classification
Classification and discrimination; cluster analysis (statistical aspects) (62H30) Problems related to evolution (92D15) Analysis of algorithms (68W40)
Cites Work
- Title not available (Why is that?)
- Title not available (Why is that?)
- Computational complexity of inferring phylogenies from dissimilarity matrices
- Cyclic permutations and evolutionary trees
- A robust model for finding optimal evolutionary tree
- On the Approximability of Numerical Taxonomy (Fitting Distances by Tree Metrics)
- A tight bound on approximating arbitrary metrics by tree metrics
- The performance of neighbor-joining methods of phylogenetic reconstruction
- Performance analysis of hierarchical clustering algorithms
- Robustness of phylogenetic inference based on minimum evolution
- Approximating the balanced minimum evolution problem
- \(O(\sqrt{\log n})\) approximation to sparsest cut in \(\tilde{O}(n^2)\) time
- Consistent formulas for estimating the total lengths of trees
- The minimum evolution problem: Overview and classification
- Polyhedral combinatorics of UPGMA cones
Cited In (4)
This page was built for publication: UPGMA and the normalized equidistant minimum evolution problem
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q1704587)