Compressed indexing with signature grammars
From MaRDI portal
Publication:2294697
DOI10.1007/978-3-319-77404-6_25zbMATH Open1485.68081arXiv1711.08217OpenAlexW2964082949MaRDI QIDQ2294697FDOQ2294697
Authors: Anders Roy Christiansen, Mikko Berggren Ettienne
Publication date: 12 February 2020
Abstract: The compressed indexing problem is to preprocess a string of length into a compressed representation that supports pattern matching queries. That is, given a string of length report all occurrences of in . We present a data structure that supports pattern matching queries in time using space where is the size of the LZ77 parse of and is an arbitrarily small constant, when the alphabet is small or for any constant . We also present two data structures for the general case; one where the space is increased by , and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if occurs in in time using space. Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well known techniques relating pattern matching to 2D-range reporting.
Full work available at URL: https://arxiv.org/abs/1711.08217
Recommendations
- Time-space trade-offs for Lempel-Ziv compressed indexing
- Time-space trade-offs for Lempel-Ziv compressed indexing
- Indexing compressed text
- Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract)
- Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
Data structures (68P05) Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science) (68P30) Grammars and rewriting systems (68Q42) Algorithms on strings (68W32)
Cited In (6)
- Document listing on repetitive collections with guaranteed performance
- Grammar-compressed indexes with logarithmic search time
- A compressed dynamic self-index for highly repetitive text collections
- Top tree compression of tries
- Rpair: rescaling RePair with Rsync
- Dynamic index and LZ factorization in compressed space
This page was built for publication: Compressed indexing with signature grammars
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2294697)