Optimal canonization of all substrings of a string (Q1183603)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Optimal canonization of all substrings of a string
scientific article

    Statements

    Optimal canonization of all substrings of a string (English)
    0 references
    28 June 1992
    0 references
    Let \(\Sigma\) be a finite alphabet with total order \(<\). Extend \(<\) to \(\Sigma^ +\) lexicographically by letting \(x<y\) iff for some \(r,s,t,\in\Sigma^*\) and \(a,b\in\Sigma\), \(x=ras\), \(y=rbt\), and \(a<b\). Let \(x=s_ 1s_ 2\dots s_ n\in\Sigma^ +\). The \(i\)th rotation of \(x\) (\(1\leq i\leq n\)) is the string \(w=s_ i\dots s_ ns_ 1\dots s_{i- 1}\). The lexicographically smallest such rotation is called canonical. The string \(x\in\Sigma^ +\) is called a Lyndon word if \(x\) is smaller than any of its proper suffixes. A Lyndon decomposition of \(x\in\Sigma^ +\) is the sequence \((w_ 1\dots w_ k)\) of Lyndon words such that \(x=w_ 1\dots w_ k\) and \(w_ k\leq w_{k-1}\leq\dots\leq w_ 1\). The authors study the relationship between the Lyndon decomposition of \(x\) and its canonical rotation \(w\). Specifically, they characterize the Lyndon factor of \(x\) with which \(w\) must start. As an application, the authors develop faster on-line algorithms for finding the canonical rotations of a string. Also, fast incremental algorithms are presented which compute, in linear time, the canonical rotation of all prefixes of \(x\).
    0 references
    0 references
    Lyndon word
    0 references
    Lyndon decomposition
    0 references
    rotation
    0 references
    0 references
    0 references

    Identifiers