From regular expressions to deterministic automata (Q580983)

From MaRDI portal
scientific article
Language Label Description Also known as
English
From regular expressions to deterministic automata
scientific article

    Statements

    From regular expressions to deterministic automata (English)
    0 references
    0 references
    0 references
    1986
    0 references
    The main theorem allows an elegant algorithm to be refined into an efficient one. The elegant algorithm for constructing a finite automaton from a regular expression is based on `derivatives of ' regular expressions; the efficient algorithm is based on `marking of' regular expressions. Derivatives of regular expressions correspond to state transitions in finite automata. When a finite automaton makes a transition under input symbol a, a leading a is stripped from the remaining input. Correspondingly, if the input string is generated by a regular expression E, then the derivative of E by a generates the remaining input after a leading a is stripped. \textit{J. A. Brzozowski} [J. Assoc. Comput. Mach. 11, 481-494 (1964; Zbl 0225.94044)] used derivatives to construct finite automata; the state for expression E has a transition under a to the state for the derivative of E by a. This approach extends to regular expressions with new operators, including intersection and complement; however, explicit computation of derivatives can be expensive. Marking of regular expressions yields an expression with distinct input symbols. Following \textit{R. McNaughton} and \textit{H. Yamada} [IRE Trans. Electron. Comput. EC-9, 38-47 (1960; Zbl 0156.255)], we attach subscripts to each input symbol in an expression; \((ab+b)^*ba\) becomes \((a_ 1b_ 2+b_ 3)^*b_ 4a_ 5\). Conceptually, the efficient algorithm constructs an automaton for the marked expression. The marks on the transitions are then erased, resulting in a nondeterministic automaton for the original unmarked expression. This approach works for the usual operations of union, concatenation, and iteration; however, intersection and complement cannot be handled because marking and unmarking do not preserve the languages generated by regular expressions with these operators.
    0 references
    0 references
    algorithm for constructing a finite automaton from a regular expression
    0 references
    efficient algorithm
    0 references
    Derivatives of regular expressions
    0 references
    marked expression
    0 references
    0 references