On the parameterized intractability of motif search problems (Q858110)

Searching common motifs is a central problem of consensus analysis based on strings with, in particular, applications in computational biology. The Closest Substring problem and the Consensus Patterns problem play an important role in this context. Let \(d_H(s,s')\) denote the Hamming distance between strings \(s\) and \(s'\). Then these two decision problems are defined as follows: Closest Substring: Input. \(k\) strings \(s_1, s_2, \dots, s_k\) over alphabet \(\Sigma\) and non-negative integers \(d\) and \(L\). Question. Is there a string \(s\) of length \(L\) and, for all \(i = 1, \dots, k\), a length-\(L\) substring \(s'_i\) of \(s_i\) such that \(d_H(s,s'_i) \leq d\)? Consensus Patterns: Input. \(k\) strings \(s_1, s_2, \dots, s_k\) over alphabet \(\Sigma\) and non-negative integers \(d\) and \(L\). Question. Is there a string \(s\) of length \(L\) and, for all \(i = 1, \dots, k\), a length-\(L\) substring \(s'_i\) of \(s_i\) such that \(\sum_{i=1}^k d_H(s,s'_i) \leq d\)? In this paper the parameterized complexity of the above two problems is analysed. The main results are the following: A new W[1]-hardness proof both of the Closest Substring and the Consensus Patterns problem are given for various parameters (in particular, \((L,d,k)\)) in case of an unbounded alphabet. It is shown that the W[1]-hardness carries over to the case of a binary alphabet for the parameter \(k\). At the heart of the paper is a sophisticated parameterized \(m\)-reduction from the W[1]-hard Clique problem to the Closest Substring problem with respect to the aggregate parameter \((L,d,k)\) in case of an unbounded alphabet size. It is then shown how this reduction can be modified so as to apply to the parameter \(k\) in case of a binary alphabet. Moreover, also the required modifications in order to extend these W[1]-hardness results to the Consensus Patterns problem are shown. Despite the technical difficulty, this paper maintains a high level of readability. In particular, two detailed examples for the Closest Substring problem (one in case of an unbounded alphabet and one in case of a binary alphabet) greatly help the reader. An overview of already known complexity results in this context, of the results proven in this paper, and of open problems, provides the reader with a full picture of this area. In summary, the paper is a pleasant read. It is interesting both to the parameterized complexity expert (who will appreciate the clear presentation of a sophisticated parameterized problem reduction) and to the novice in this field (who will get an idea of the aesthetics of parameterized complexity analyses). The only criticism with this paper is the lack of motivation for dealing with these problems. A reader who is new to the field of computational biology will enjoy the problem reductions but he/she will not understand what these problems are good for.

0 references

reviewed by

Reinhard Pichler

0 references

zbMATH Keywords

computational complexity

0 references

parameterized complexity

0 references

parameterized intractability

0 references

computational biology