Linear-time computation of DAWGs, symmetric indexing structures, and MAWs for integer alphabets
From MaRDI portal
Publication:6093582
Abstract: The directed acyclic word graph (DAWG) of a string of length is the smallest (partial) DFA which recognizes all suffixes of with only nodes and edges. In this paper, we show how to construct the DAWG for the input string from the suffix tree for , in time for integer alphabets of polynomial size in . In so doing, we first describe a folklore algorithm which, given the suffix tree for , constructs the DAWG for the reversed string of in time. Then, we present our algorithm that builds the DAWG for in time for integer alphabets, from the suffix tree for . We also show that a straightforward modification to our DAWG construction algorithm leads to the first -time algorithm for constructing the affix tree of a given string over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. We then discuss how our constructions can lead to linear-time algorithms for building other text indexing structures, such as linear-size suffix tries and symmetric CDAWGs in linear time in the case of integer alphabets. As a further application to our -time DAWG construction algorithm, we show that the set of all minimal absent words (MAWs) of can be computed in optimal, input- and output-sensitive time and working space for integer alphabets.
Recommendations
Cites work
- scientific article; zbMATH DE number 3913713 (Why is no real title available?)
- scientific article; zbMATH DE number 7559178 (Why is no real title available?)
- A Space-Economical Suffix Tree Construction Algorithm
- Alignment-free sequence comparison using absent words
- Alphabet-dependent string searching with wexponential search trees
- Automata and forbidden words
- Complete inverted files for efficient text retrieval and analysis
- Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
- Compressed suffix trees with full functionality
- Computing DAWGs and minimal absent words in linear time for integer alphabets
- Efficient Computation of Substring Equivalence Classes with Suffix Arrays
- Efficient computation of shortest absent words in a genomic sequence
- Linear bidirectional on-line construction of affix trees
- Linear-size suffix tries
- New text indexing functionalities of the compressed suffix arrays
- On extended special factors of a word
- On the sorting-complexity of suffix tree construction
- On-line construction of compact directed acyclic word graphs
- Online algorithms for constructing linear-size suffix trie
- Suffix Arrays: A New Method for On-Line String Searches
- Suffix trees, DAWGs and CDAWGs for forward and backward tries
- The smallest automaton recognizing the subwords of a text
- Transducers and repetitions
- Versatile succinct representations of the bidirectional Burrows-Wheeler transform
- Words and forbidden factors
Cited in
(8)- scientific article; zbMATH DE number 1962785 (Why is no real title available?)
- scientific article; zbMATH DE number 1929950 (Why is no real title available?)
- Truncated DAWGs and their application to minimal absent word problem
- Computing DAWGs and minimal absent words in linear time for integer alphabets
- Elastic founder graphs improved and enhanced
- Linear-size suffix tries and linear-size CDAWGs simplified and improved
- scientific article; zbMATH DE number 2087051 (Why is no real title available?)
- Linear-time computation of generalized minimal absent words for multiple strings
This page was built for publication: Linear-time computation of DAWGs, symmetric indexing structures, and MAWs for integer alphabets
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6093582)