Linear-time computation of DAWGs, symmetric indexing structures, and MAWs for integer alphabets
From MaRDI portal
Publication:6093582
DOI10.1016/J.TCS.2023.114093zbMATH Open1520.68229arXiv2307.01428OpenAlexW4385360496MaRDI QIDQ6093582FDOQ6093582
Authors: Yuta Fujishige, Yuki Tsujimaru, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda
Publication date: 7 September 2023
Published in: Theoretical Computer Science (Search for Journal in Brave)
Abstract: The directed acyclic word graph (DAWG) of a string of length is the smallest (partial) DFA which recognizes all suffixes of with only nodes and edges. In this paper, we show how to construct the DAWG for the input string from the suffix tree for , in time for integer alphabets of polynomial size in . In so doing, we first describe a folklore algorithm which, given the suffix tree for , constructs the DAWG for the reversed string of in time. Then, we present our algorithm that builds the DAWG for in time for integer alphabets, from the suffix tree for . We also show that a straightforward modification to our DAWG construction algorithm leads to the first -time algorithm for constructing the affix tree of a given string over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. We then discuss how our constructions can lead to linear-time algorithms for building other text indexing structures, such as linear-size suffix tries and symmetric CDAWGs in linear time in the case of integer alphabets. As a further application to our -time DAWG construction algorithm, we show that the set of all minimal absent words (MAWs) of can be computed in optimal, input- and output-sensitive time and working space for integer alphabets.
Full work available at URL: https://arxiv.org/abs/2307.01428
Recommendations
Cites Work
- Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
- Compressed suffix trees with full functionality
- Suffix Arrays: A New Method for On-Line String Searches
- A Space-Economical Suffix Tree Construction Algorithm
- The smallest automaton recognizing the subwords of a text
- Words and forbidden factors
- On-line construction of compact directed acyclic word graphs
- Versatile succinct representations of the bidirectional Burrows-Wheeler transform
- Automata and forbidden words
- Linear-size suffix tries
- Title not available (Why is that?)
- Transducers and repetitions
- New text indexing functionalities of the compressed suffix arrays
- Linear bidirectional on-line construction of affix trees
- Complete inverted files for efficient text retrieval and analysis
- On the sorting-complexity of suffix tree construction
- Efficient computation of shortest absent words in a genomic sequence
- Efficient Computation of Substring Equivalence Classes with Suffix Arrays
- Alphabet-dependent string searching with wexponential search trees
- Alignment-free sequence comparison using absent words
- On extended special factors of a word
- Suffix trees, DAWGs and CDAWGs for forward and backward tries
- Title not available (Why is that?)
- Computing DAWGs and minimal absent words in linear time for integer alphabets
- Online algorithms for constructing linear-size suffix trie
Cited In (8)
- Title not available (Why is that?)
- Title not available (Why is that?)
- Truncated DAWGs and their application to minimal absent word problem
- Computing DAWGs and minimal absent words in linear time for integer alphabets
- Elastic founder graphs improved and enhanced
- Linear-size suffix tries and linear-size CDAWGs simplified and improved
- Title not available (Why is that?)
- Linear-time computation of generalized minimal absent words for multiple strings
This page was built for publication: Linear-time computation of DAWGs, symmetric indexing structures, and MAWs for integer alphabets
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6093582)