A linearly computable measure of string complexity (Q441859)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | A linearly computable measure of string complexity |
scientific article |
Statements
A linearly computable measure of string complexity (English)
0 references
8 August 2012
0 references
In this very interesting paper, the authors introduce the notion of the \(I\)-complexity of a given string, which counts the number of distinct substrings in it, and investigate its properties. They prove that this measure of string complexity is computable in linear time and space from the well-known longest common prefix array (LCP array), the companion data structure of the suffix array. They also introduce the repetitions array, which is a permutation of the LCP array, and develop its properties. It is shown that the number of strings with \(I\)-complexity up to a given value is bounded and a large family of strings with almost maximal complexity is presented. The authors prove that most strings have high \(I\)-complexity. Their measure of complexity is compared with the Lempel-Ziv complexity, their main differences are discussed, and it is shown that in spite of their very distinct behavior on infinitely many strings, their values are close. Finally, the authors discuss a number of open problems and ongoing research on \(I\)-complexity.
0 references
string complexity
0 references
description complexity
0 references
Lempel-Ziv complexity
0 references