Compressed indexing with signature grammars

From MaRDI portal
Publication:2294697

DOI10.1007/978-3-319-77404-6_25zbMATH Open1485.68081arXiv1711.08217OpenAlexW2964082949MaRDI QIDQ2294697FDOQ2294697


Authors: Anders Roy Christiansen, Mikko Berggren Ettienne Edit this on Wikidata


Publication date: 12 February 2020

Abstract: The compressed indexing problem is to preprocess a string S of length n into a compressed representation that supports pattern matching queries. That is, given a string P of length m report all occurrences of P in S. We present a data structure that supports pattern matching queries in O(m+occ(lglgn+lgepsilonz)) time using O(zlg(n/z)) space where z is the size of the LZ77 parse of S and epsilon>0 is an arbitrarily small constant, when the alphabet is small or z=O(n1delta) for any constant delta>0. We also present two data structures for the general case; one where the space is increased by O(zlglgz), and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if P occurs in S in O(m) time using O(zlg(n/z)) space. Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well known techniques relating pattern matching to 2D-range reporting.


Full work available at URL: https://arxiv.org/abs/1711.08217




Recommendations




Cited In (6)





This page was built for publication: Compressed indexing with signature grammars

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2294697)