Small-space LCE data structure with constant-time queries
From MaRDI portal
Publication:5111224
DOI10.4230/LIPICS.MFCS.2017.10zbMATH Open1441.68026OpenAlexW2773679974MaRDI QIDQ5111224FDOQ5111224
Authors: Yuka Tanimura, Takaaki Nishimoto, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda
Publication date: 26 May 2020
Abstract: The emph{longest common extension} (emph{LCE}) problem is to preprocess a given string of length so that the length of the longest common prefix between suffixes of that start at any two given positions is answered quickly. In this paper, we present a data structure of words of space which answers LCE queries in time and can be built in time, where is a parameter, is the size of the Lempel-Ziv 77 factorization of and is the alphabet size. This is an emph{encoding} data structure, i.e., it does not access the input string when answering queries and thus can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following: - For highly repetitive strings where the term is dominated by , we obtain a emph{constant-time and sub-linear space} LCE query data structure. - Even when the input string is not well compressible via Lempel-Ziv 77 factorization, we still can obtain a emph{constant-time and sub-linear space} LCE data structure for suitable and for . - The time-space trade-off lower bounds for the LCE problem by Bille et al. [J. Discrete Algorithms, 25:42-50, 2014] and by Kosolobov [CoRR, abs/1611.02891, 2016] can be "surpassed" in some cases with our LCE data structure.
Full work available at URL: https://arxiv.org/abs/1702.07458
Recommendations
Data structures (68P05) Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science) (68P30)
Cites Work
- Title not available (Why is that?)
- Title not available (Why is that?)
- Title not available (Why is that?)
- Title not available (Why is that?)
- Title not available (Why is that?)
- Title not available (Why is that?)
- A new characterization of maximal repetitions by Lyndon trees
- A universal algorithm for sequential data compression
- Algorithms on Strings, Trees and Sequences
- Application of Lempel-Ziv factorization to the approximation of grammar-based compression.
- Approximating LZ77 via Small-Space Multiple-Pattern Matching
- Computing all distinct squares in linear time for integer alphabets
- Computing longest single-arm-gapped palindromes in a string
- Construction of a de Bruijn graph for assembly from a truncated suffix tree
- Deterministic sub-linear space LCE data structures with efficient construction
- Efficient string matching with k mismatches
- Efficiently finding all maximal \(\alpha\)-gapped repeats
- Fast Algorithms for Finding Nearest Common Ancestors
- Fast lightweight suffix array construction and checking
- Faster longest common extension queries in strings over general alphabets
- Finding level-ancestors in trees
- Finding maximal 2-dimensional palindromes
- Finger Search in Grammar-Compressed Strings
- Fully dynamic data structure for LCE queries in compressed space
- Incremental String Comparison
- Linear time algorithms for finding and representing all the tandem repeats in a string
- Linear work suffix array construction
- Longest common extensions in sublinear space
- Longest common extensions with recompression
- Optimal bounds for computing \(\alpha\)-gapped repeats
- Searching for gapped palindromes
- Space-Time Tradeoffs for Longest-Common-Prefix Array Computation
- Space-efficient representation of truncated suffix trees, with applications to Markov order estimation
- Suffix Arrays: A New Method for On-Line String Searches
- Surpassing the information theoretic bound with fusion trees
- The level ancestor problem simplified
- Tight lower bounds for the longest common extension problem
- Time-space trade-offs for longest common extensions
- Truncated suffix trees and their application to data compression.
Cited In (9)
- Deterministic sub-linear space LCE data structures with efficient construction
- Tight lower bounds for the longest common extension problem
- Fully dynamic data structure for LCE queries in compressed space
- Longest common extensions in sublinear space
- Internal shortest absent word queries in constant time and linear space
- Longest common extensions with recompression
- A compressed dynamic self-index for highly repetitive text collections
- Time-space trade-offs for longest common extensions
- Title not available (Why is that?)
This page was built for publication: Small-space LCE data structure with constant-time queries
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q5111224)