Prefix-free languages: left and right quotient and reversal (Q896681)

This paper discusses the state complexity of the operations ``left quotient'', ``right quotient'', and ``reversal'' for prefix-free regular languages. The state complexity of a language \(L\) is the number of states of the minimal deterministic finite-state automaton accepting \(L\). The state complexity of an operation \(f\) is the worst-case state complexity of the output language \(f(L)\) given the state complexity of the input language \(L\). In other words, given the worst-case input language of state complexity \(n\), what is the state complexity of \(f(L)\) as a function of \(n\). A language is regular if its state complexity is finite (i.e., there is a finite-state automaton accepting it). One of the operations discussed in this paper is the reversal of \(L\), which simply reverses all words in \(L\) and is known to preserve regular languages. The two other operations are actually operations that use two languages \(V\) and \(W\). The left quotient \(V \setminus W = \{w \mid \exists v \in V : vw \in W\}\) and the right quotient \(W / V = \{w \mid \exists v \in V : wv \in W\}\) are binary operations on languages that also preserve regularity. Finally, a language \(L\) is prefix-free if \(v\) is not a prefix of \(w\) for all different words \(v, w \in L\). Given that the alphabet has at least \(n-1\) symbols, where \(n\) is the state complexity of the prefix-free \(W\), the state complexity of the left quotient is \(2^{n-1}\). It is also demonstrated that this size cannot be realized by automata over alphabets of smaller size. The bound for \(n-2\) symbols is \(2^{n-1} - 1\). For alphabet sizes independent of \(n\), a multitude of results is obtained, but some of those remain exponential in \(n\). The right quotient is significantly easier for prefix-free languages and the state complexity is only \(n-1\), which is also proven to be tight in alphabets with at least 2 symbols. Finally, the reversal again introduces an exponential state complexity similar to the one of general reversal (not restricted to prefix-free languages). The bound \(2^{n-2}+1\) is proven to be tight for alphabets with at least 3 symbols. For alphabets with 2 symbols two lower bounds are provided and it is conjectured that these bounds are tight provided that \(n \geq 12\). Overall, the paper is well written and can be understood by any reader with some background in automata theory. The main tools used in the constructions and proofs are the determinization and minimization constructions for finite-state automata, which are explained in any good textbook on automata theory. Illustrations are provided for all constructed automata and an empirical evaluation was used to support the conjecture.

0 references

reviewed by

Andreas Maletti

0 references

zbMATH Keywords

state complexity

0 references

determinization

0 references

minimization

0 references

left quotient