A linear algorithm for string reconstruction in the reverse complement equivalence model (Q450544)

From MaRDI portal
scientific article
Language Label Description Also known as
English
A linear algorithm for string reconstruction in the reverse complement equivalence model
scientific article

    Statements

    A linear algorithm for string reconstruction in the reverse complement equivalence model (English)
    0 references
    0 references
    0 references
    0 references
    13 September 2012
    0 references
    Let \(s=s_1 s_2 \cdots s_n\) be a given string of length \(n\) over a paired alphabet \(\Sigma\), where each letter \(a\) in \(\Sigma\) has a complement \(\bar{a}\) in \(\Sigma\). The reverse-complement of \(s\) is just \(\tilde{s}=\bar{s_n} \cdots \bar{s_2}\bar{s_1}\). Define a string \(t\) to be a subsequence of \(s\) if there exist \(1 \leq i_1 < i_2 < \cdots < i_m \leq n\) such that \(t = s_{i_1}s_{i_2}\cdots s_{i_m}\), and define \(t\) to be an RC-subsequence of \(s\) if \(t\) or \(\tilde{t}\) is a subsequence of \(s\). A result of Erdős et al. shows that two strings \(s, s'\) of length \(n\) are RC-equivalent, i.e., \(s=s'\) or \(s = \tilde{s'}\), if and only if their RC-subsequences coincide up to length \(\lceil \frac{2}{3}(n+1)\rceil\) [\textit{P.~L. Erdős} et al., Ann. Comb. 10, No. 4, 415--430 (2006; Zbl 1115.68122)]. In this very interesting paper, the authors consider the problem of reconstructing \(s\) using queries of the type ``Is \(t\) an RC-subsequence of \(s\)?'', where the length of \(t\) is at most \(\lceil \frac{2}{3}(n+1)\rceil\). They give a lower bound of \(\Omega(n \log |\Sigma|)\) on the number of queries necessary, where \(\log |\Sigma|\) is the logarithm of \(|\Sigma|\) to base 2. In the case of binary alphabets, they describe a reconstruction algorithm which uses \(O(n)\) many queries, while in the case of arbitrary alphabets, the algorithm uses \(O(|\Sigma|n)\) queries.
    0 references
    0 references
    string reconstruction
    0 references
    reverse complement
    0 references
    string algorithms
    0 references
    subsequences
    0 references
    subwords
    0 references
    combinatorics on words
    0 references
    0 references