The longest common subsequence problem revisited (Q1098310)
1987
This paper re-examines, in a unified framework, two classic approaches to the problem of finding a longest common subsequence (LCS) of two strings, and proposes faster implementations of both. Let \(\ell\) be the length of an LCS of two strings of lengths m and \(n\geq m\), respectively, and let s be the alphabet size. The first revised strategy follows the paradigm of an earlier \(O(\ell n)\)-time algorithm by Hirschberg. The new version can be implemented in time \(O(\ell m\cdot \min \{\log s, \log m, \log(2n/m)\})\), which is advantageous when the input strings differ considerably in length (a looser bound for both versions is \(O(mn)\)). The second strategy improves on the Hunt-Szymanski algorithm. The latter takes time \(O((r+n)\log n)\), where \(r\leq mn\) is the total number of matches between the two input strings. This performance is quite good \((O(n \log n))\) when \(r\sim n\), but it degrades to \(\Theta(mn \log n)\) in the worst case. The variation presented here, by contrast, is never worse than \(O(mn)\). The exact time bound derived for this second algorithm is \(O(m \log n+d \log (2mn/d))\), where \(d\leq r\) is the number of dominant matches (elsewhere referred to as minimal candidates) between the two strings. Both algorithms require an \(O(n \log s)\) preprocessing that is nearly standard for the LCS problem, and both make use of simple and handy auxiliary data structures.
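The Hunt-Szymanski paradigm discussed above can be illustrated by a minimal Python sketch: enumerate the match points between the two strings row by row and maintain a sorted threshold list whose k-th entry is the smallest column at which a common subsequence of length k+1 can end. This sketch only conveys the baseline strategy; it does not include the paper's refinements based on dominant matches and efficient merging of linear lists, and the function name `lcs_length` is my own.

```python
from bisect import bisect_left
from collections import defaultdict

def lcs_length(a, b):
    """Length of a longest common subsequence of a and b,
    in the Hunt-Szymanski style."""
    # Preprocessing: for each symbol, record its positions in b in
    # decreasing order, so the matches within one row of a are
    # scanned right to left.
    occ = defaultdict(list)
    for j in range(len(b) - 1, -1, -1):
        occ[b[j]].append(j)
    # thresh[k] = smallest column j of b such that some common
    # subsequence of length k+1 ends at b[j]; thresh stays sorted,
    # so each match is placed by binary search.
    thresh = []
    for ch in a:
        for j in occ[ch]:  # right-to-left keeps b[j] from being reused twice in one row
            k = bisect_left(thresh, j)
            if k == len(thresh):
                thresh.append(j)
            else:
                thresh[k] = j
    return len(thresh)
```

Each of the r matches costs one binary search over a list of length at most m, which recovers the \(O((r+n)\log n)\)-flavoured behaviour of the baseline; the paper's contribution is to restrict the work to the d dominant matches.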
efficient merging of linear lists
longest common subsequence