A linear algorithm for the random sampling from regular languages (Q2428675)

The article discusses selecting a word of a given length out of all words in a given regular language uniformly at random. This problem is also known as the random sampling problem for regular languages and has received some attention already. The previously fastest algorithm for this problem is presented in [\textit{A. Denise} and \textit{P. Zimmermann}, Theor. Comput. Sci. 218, No. 2, 233--248 (1999; Zbl 0933.68154)] and uses a floating-point representation together with the standard recursive generation algorithm [\textit{A. Nijenhuis} and \textit{H. S. Wilf}, Combinatorial algorithms. New York etc.: Academic Press (1975; Zbl 0343.68004)] which counts the number of words of a given length. The current contribution applies the same floating-point representation to a divide-and-conquer base scheme to obtain expected linear time complexity in the given length~\(n\), which improves the previously best expected run-time of~\(O(n \log n)\). In addition, the space complexity is improved from~\(O(n \log n)\) for preprocessing and worst-case~\(O(n^2)\) at run-time to \(O((\log n)^2)\) for preprocessing and worst-case~\(O(n)\) at run-time. Thus, the first linear algorithm for the random sampling problem from regular languages is obtained. The paper is well written and very understandable. All notions are carefully explained and the text can be understood even by non-experts and even without reference to the previous literature. The theoretical development is supplemented by a careful discussion of implementation issues and presents experimental results that support the main claim of superiority over the existing solutions.

0 references

reviewed by

Andreas Maletti

0 references

zbMATH Keywords

combinatorial algorithm

0 references

random sampling

0 references

regular language

0 references

divide-and-conquer