Small-space LCE data structure with constant-time queries

From MaRDI portal
Publication:5111224

DOI10.4230/LIPICS.MFCS.2017.10zbMATH Open1441.68026arXiv1702.07458OpenAlexW2773679974MaRDI QIDQ5111224FDOQ5111224

Hideo Bannai, Takaaki Nishimoto, Yuka Tanimura, Shunsuke Inenaga, Masayuki Takeda

Publication date: 26 May 2020

Abstract: The emph{longest common extension} (emph{LCE}) problem is to preprocess a given string w of length n so that the length of the longest common prefix between suffixes of w that start at any two given positions is answered quickly. In this paper, we present a data structure of O(zau2+fracnau) words of space which answers LCE queries in O(1) time and can be built in O(nlogsigma) time, where 1leqauleqsqrtn is a parameter, z is the size of the Lempel-Ziv 77 factorization of w and sigma is the alphabet size. This is an emph{encoding} data structure, i.e., it does not access the input string w when answering queries and thus w can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following: - For highly repetitive strings where the zau2 term is dominated by fracnau, we obtain a emph{constant-time and sub-linear space} LCE query data structure. - Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we still can obtain a emph{constant-time and sub-linear space} LCE data structure for suitable au and for sigmaleq2o(logn). - The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] can be "surpassed" in some cases with our LCE data structure.


Full work available at URL: https://arxiv.org/abs/1702.07458




Recommendations




Cites Work


Cited In (3)





This page was built for publication: Small-space LCE data structure with constant-time queries

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5111224)