Hyper-optimization for deterministic tree automata (Q2344747)

For a (total bottom-up finite-state) deterministic tree automaton (DTA) \(M=\langle Q,\Sigma,\delta,F\rangle\) (with the set of states \(Q\), the ranked alphabet \(\Sigma\), the transition mapping \(\delta\), and the set of final states \(F\)) let \(L(M)^q=\{t\in T_\Sigma\mid \delta(t)=q\}\) for any \(q\in Q\); then the tree language recognized by \(M\) will be \(L(M)=\bigcup_{q\in F}L(M)^q\). A state \(q\in Q\) is called kernel when \(L(M)^q\) is infinite; otherwise it is called preamble. For any \(p,q\in Q\) let \(L(M)^q_p=\{c\in C_\Sigma\mid \delta(c[p])=q\}\) be the set of all (stateless) contexts that take \(p\) into \(q\); and let \(L(M)_p=\bigcup_{q\in F}L(M)^q_p\). For DTAs \(M=\langle Q,\Sigma,\delta,F\rangle\) and \(N=\langle P,\Sigma,\mu,G\rangle\), let \(E_{q,p}=L(M)_q\,\triangle\,L(N)_p\) be the symmetric difference of context sets recognized by \(q\in Q\) and \(p\in P\). The states \(p\), \(q\) are called almost equivalent when \(E_{q,p}\) is finite; they are called equivalent when \(E_{q,p}=\emptyset\). A DTA is minimal when it has no different but equivalent pairs of states. A DTA is called hyper-minimal when it has the minimal number of states among its almost equivalent DTAs. In the first theorem of the paper, it is shown that ``two almost equivalent hyper-minimal DTAs have DTA-isomorphic kernels and transition-isomorphic preambles''. This theorem ``states that two almost equivalent, hyper-minimal DTAs are indeed very similar. They have a bijection between their states that preserves the distinction between preamble and kernel states. Moreover, via this bijection the two DTAs behave equally (besides acceptance) on the preamble states and isomorphically on the kernel states. Thus, two such DTAs can only differ in two aspects \dots'' \qquad ``1. the finality (i.e., whether the state is final or not) of preamble states, and \qquad ``2. transitions from exclusively preamble states to a kernel state.'' As a corollary it is shown how ``to compute the number of almost equivalent, hyper-minimal DTAs for the given minimal DTA''. The second theorem takes a DTA \(M=\langle Q,\Sigma,\delta,F\rangle\) and a hyper-minimal DTA \(N=\langle P,\Sigma,\mu,G\rangle\) almost equivalent to \(M\), and computes the number of the elements of the set \(E_p=L(N)^p\cap E\), for a preamble state \(p\in P\), where \(E=L(M)\,\triangle\,L(N)\) is the set of tree errors. Finally, the last (and the main) theorem of the paper proves that for every hyper-minimal DTA \(N\) that is almost equivalent to \(M\) one can determine \(|E|\) in time \({\mathcal O}(|M|\cdot |Q|)\), where \(|M|=\sum_{k>0}k\cdot|\Sigma_k|\cdot |Q|^k\) is the size of \(M\) (\(\Sigma_k\) is the set of symbols of \(\Sigma\) with rank \(k\)). It is also noted that given a DTA \(M\) one can compute a hyper-minimal DTA \(N\) with the minimal \(|E|\) in time \({\mathcal O}(|M|\cdot |Q|)\).

0 references

reviewed by

Saeed Salehi

0 references

zbMATH Keywords

hyper-minimization

0 references

deterministic tree automaton

0 references

lossy compression

0 references

error analysis