Hyper-optimization for deterministic tree automata (Q2344747)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Hyper-optimization for deterministic tree automata
scientific article

    Statements

    Hyper-optimization for deterministic tree automata (English)
    0 references
    0 references
    18 May 2015
    0 references
    For a (total bottom-up finite-state) deterministic tree automaton (DTA) \(M=\langle Q,\Sigma,\delta,F\rangle\) (with the set of states \(Q\), the ranked alphabet \(\Sigma\), the transition mapping \(\delta\), and the set of final states \(F\)) let \(L(M)^q=\{t\in T_\Sigma\mid \delta(t)=q\}\) for any \(q\in Q\); then the tree language recognized by \(M\) will be \(L(M)=\bigcup_{q\in F}L(M)^q\). A state \(q\in Q\) is called kernel when \(L(M)^q\) is infinite; otherwise it is called preamble. For any \(p,q\in Q\) let \(L(M)^q_p=\{c\in C_\Sigma\mid \delta(c[p])=q\}\) be the set of all (stateless) contexts that take \(p\) into \(q\); and let \(L(M)_p=\bigcup_{q\in F}L(M)^q_p\). For DTAs \(M=\langle Q,\Sigma,\delta,F\rangle\) and \(N=\langle P,\Sigma,\mu,G\rangle\), let \(E_{q,p}=L(M)_q\,\triangle\,L(N)_p\) be the symmetric difference of context sets recognized by \(q\in Q\) and \(p\in P\). The states \(p\), \(q\) are called almost equivalent when \(E_{q,p}\) is finite; they are called equivalent when \(E_{q,p}=\emptyset\). A DTA is minimal when it has no different but equivalent pairs of states. A DTA is called hyper-minimal when it has the minimal number of states among its almost equivalent DTAs. In the first theorem of the paper, it is shown that ``two almost equivalent hyper-minimal DTAs have DTA-isomorphic kernels and transition-isomorphic preambles''. This theorem ``states that two almost equivalent, hyper-minimal DTAs are indeed very similar. They have a bijection between their states that preserves the distinction between preamble and kernel states. Moreover, via this bijection the two DTAs behave equally (besides acceptance) on the preamble states and isomorphically on the kernel states. Thus, two such DTAs can only differ in two aspects \dots'' \qquad ``1. the finality (i.e., whether the state is final or not) of preamble states, and \qquad ``2. transitions from exclusively preamble states to a kernel state.'' As a corollary it is shown how ``to compute the number of almost equivalent, hyper-minimal DTAs for the given minimal DTA''. The second theorem takes a DTA \(M=\langle Q,\Sigma,\delta,F\rangle\) and a hyper-minimal DTA \(N=\langle P,\Sigma,\mu,G\rangle\) almost equivalent to \(M\), and computes the number of the elements of the set \(E_p=L(N)^p\cap E\), for a preamble state \(p\in P\), where \(E=L(M)\,\triangle\,L(N)\) is the set of tree errors. Finally, the last (and the main) theorem of the paper proves that for every hyper-minimal DTA \(N\) that is almost equivalent to \(M\) one can determine \(|E|\) in time \({\mathcal O}(|M|\cdot |Q|)\), where \(|M|=\sum_{k>0}k\cdot|\Sigma_k|\cdot |Q|^k\) is the size of \(M\) (\(\Sigma_k\) is the set of symbols of \(\Sigma\) with rank \(k\)). It is also noted that given a DTA \(M\) one can compute a hyper-minimal DTA \(N\) with the minimal \(|E|\) in time \({\mathcal O}(|M|\cdot |Q|)\).
    0 references
    0 references
    hyper-minimization
    0 references
    deterministic tree automaton
    0 references
    lossy compression
    0 references
    error analysis
    0 references
    error optimization
    0 references

    Identifiers