A linearly computable measure of string complexity (Q441859)

From MaRDI portal
scientific article
Language Label Description Also known as
English
A linearly computable measure of string complexity
scientific article

    Statements

    A linearly computable measure of string complexity (English)
    0 references
    0 references
    0 references
    8 August 2012
    0 references
    In this very interesting paper, the authors introduce the notion of the \(I\)-complexity of a given string, which counts the number of distinct substrings in it, and investigate its properties. They prove that this measure of string complexity is computable in linear time and space from the well-known longest common prefix array (LCP array), the companion data structure of the suffix array. They also introduce the repetitions array, which is a permutation of the LCP array, and develop its properties. It is shown that the number of strings with \(I\)-complexity up to a given value is bounded and a large family of strings with almost maximal complexity is presented. The authors prove that most strings have high \(I\)-complexity. Their measure of complexity is compared with the Lempel-Ziv complexity, their main differences are discussed, and it is shown that in spite of their very distinct behavior on infinitely many strings, their values are close. Finally, the authors discuss a number of open problems and ongoing research on \(I\)-complexity.
    0 references
    0 references
    string complexity
    0 references
    description complexity
    0 references
    Lempel-Ziv complexity
    0 references
    0 references
    0 references