Checking determinism of regular expressions with counting (Q2343138)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Checking determinism of regular expressions with counting
scientific article

    Statements

    Checking determinism of regular expressions with counting (English)
    0 references
    0 references
    0 references
    4 May 2015
    0 references
    The paper is concerned with improving the worst-case run time of regular expression determinism testing algorithms. In particular, improvements are made with regards to regular expression counting operators. Complexity of \textit{matching} problems is not considered. Informally, a regular expression is weakly deterministic if, when matching a word against the expression, a symbol can be matched to only one position in the expression without looking ahead. A regular expression is strongly deterministic if it is weakly deterministic and if the operators appearing in the expression can only be used in a unique way when matching the expression with a word. Based on earlier works [\textit{B. Groz}, \textit{S. Maneth} and \textit{S. Staworko}, ``Deterministic regular expressions in linear time'', in: Proceedings of the 31st symposium on principles of database systems, PODS '12. New York, NY: ACM. 49--60 (2012; \url{doi:10.1145/2213556.2213566}); \textit{P. Kilpeläinen}, ``Checking determinism of XML schema content models in optimal time'', Inf. Syst. 36, No. 3, 596--617 (2009; \url{doi:10.1016/j.is.2010.10.001})], the authors generalise linear-time determinism testing algorithms established previously for standard regular expressions onto regular expressions with counting. The paper culminates in presenting an \(O(|\Sigma_E| |E|)\)-time algorithm for checking weak determinism of regular expressions with counting, and in presenting an \(O(|E|)\)-time algorithm for checking \textit{strong} determinism of regular expressions in that same class. The results will find applications in the analysis of XML content models (weakly deterministic as per a W3C requirement) or queries over XML streams where strongly deterministic regular expressions are naturally found.
    0 references
    weakly deterministic regular expressions
    0 references
    strongly deterministic regular expressions
    0 references
    regular expressions with counting
    0 references
    determinism testing
    0 references
    star normal form
    0 references
    Glushkov automata
    0 references
    first and followlast sets
    0 references

    Identifiers