Succinct Dictionary Matching with No Slowdown
From MaRDI portal
Publication:3575239
DOI10.1007/978-3-642-13509-5_9zbMATH Open1286.68521arXiv1001.2860OpenAlexW3101778446MaRDI QIDQ3575239FDOQ3575239
Publication date: 26 July 2010
Published in: Combinatorial Pattern Matching (Search for Journal in Brave)
Abstract: The problem of dictionary matching is a classical problem in string matching: given a set S of d strings of total length n characters over an (not necessarily constant) alphabet of size sigma, build a data structure so that we can match in a any text T all occurrences of strings belonging to S. The classical solution for this problem is the Aho-Corasick automaton which finds all occ occurrences in a text T in time O(|T| + occ) using a data structure that occupies O(m log m) bits of space where m <= n + 1 is the number of states in the automaton. In this paper we show that the Aho-Corasick automaton can be represented in just m(log sigma + O(1)) + O(d log(n/d)) bits of space while still maintaining the ability to answer to queries in O(|T| + occ) time. To the best of our knowledge, the currently fastest succinct data structure for the dictionary matching problem uses space O(n log sigma) while answering queries in O(|T|log log n + occ) time. In this paper we also show how the space occupancy can be reduced to m(H0 + O(1)) + O(d log(n/d)) where H0 is the empirical entropy of the characters appearing in the trie representation of the set S, provided that sigma < m^epsilon for any constant 0 < epsilon < 1. The query time remains unchanged.
Full work available at URL: https://arxiv.org/abs/1001.2860
Recommendations
- Succinct 2D dictionary matching with no slowdown
- An improved query time for succinct dynamic dictionary matching
- Fast approximate dictionary matching
- Fast approximate matching of words against a dictionary
- Faster compressed dictionary matching
- Succinct 2D dictionary matching
- Compressed matching in dictionaries
Cited In (24)
- Compressed automata for dictionary matching
- Title not available (Why is that?)
- Faster Lightweight Lempel-Ziv Parsing
- A framework for designing space-efficient dictionaries for parameterized and order-preserving matching
- Streaming dictionary matching with mismatches
- Alphabet-Independent and Scaled Dictionary Matching
- Compressing dictionary matching index via sparsification technique
- Succinct 2D dictionary matching
- Fast approximate matching of words against a dictionary
- Searching and Indexing Circular Patterns
- Compressed indexes for text with wildcards
- On the complexity of recognizing Wheeler graphs
- Dictionary Matching with Uneven Gaps
- Worst-case efficient single and multiple string matching on packed texts in the word-RAM model
- On the Hardness and Inapproximability of Recognizing Wheeler Graphs
- Fast circular dictionary-matching algorithm
- Succincter Text Indexing with Wildcards
- Worst Case Efficient Single and Multiple String Matching in the RAM Model
- Compressed Multiple Pattern Matching
- Streaming Dictionary Matching with Mismatches
- Dictionary matching and indexing with errors and don't cares
- Compressed text indexing with wildcards
- A grouping approach for succinct dynamic dictionary matching
- Dictionary matching with a bounded gap in pattern or in text
This page was built for publication: Succinct Dictionary Matching with No Slowdown
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q3575239)