Random access in persistent strings and segment selection
From MaRDI portal
Abstract: We consider compact representations of collections of similar strings that support random access queries. The collection of strings is given by a rooted tree where edges are labeled by an edit operation (inserting, deleting, or replacing a character) and a node represents the string obtained by applying the sequence of edit operations on the path from the root to the node. The goal is to compactly represent the entire collection while supporting fast random access to any part of a string in the collection. This problem captures natural scenarios such as representing the past history of an edited document or representing highly-repetitive collections. Given a tree with nodes, we show how to represent the corresponding collection in space and query time. This improves the previous time-space trade-offs for the problem. Additionally, we show a lower bound proving that the query time is optimal for any solution using near-linear space. To achieve our bounds for random access in persistent strings we show how to reduce the problem to the following natural geometric selection problem on line segments. Consider a set of horizontal line segments in the plane. Given parameters and , a segment selection query returns the th smallest segment (the segment with the th smallest -coordinate) among the segments crossing the vertical line through -coordinate . The segment selection problem is to preprocess a set of horizontal line segments into a compact data structure that supports fast segment selection queries. We present a solution that uses space and support segment selection queries in time, where is the number of segments. Furthermore, we prove that that this query time is also optimal for any solution using near-linear space.
Cites work
- scientific article; zbMATH DE number 4211552 (Why is no real title available?)
- scientific article; zbMATH DE number 140457 (Why is no real title available?)
- scientific article; zbMATH DE number 140460 (Why is no real title available?)
- scientific article; zbMATH DE number 2079421 (Why is no real title available?)
- scientific article; zbMATH DE number 1830754 (Why is no real title available?)
- A dynamic stabbing-max data structure with sub-logarithmic query time
- A faster grammar-based self-index
- A simple storage scheme for strings achieving entropy bounds
- A universal algorithm for sequential data compression
- Access, rank, and select in grammar-compressed strings
- An optimal dynamic data structure for stabbing-semigroup queries
- Application of Lempel-Ziv factorization to the approximation of grammar-based compression.
- Approximate pattern matching in LZ77-compressed texts
- At the roots of dictionary compression: string attractors
- Balancing Straight-line Programs
- Compressed Data Structures for Dynamic Sequences
- Compressed representations of sequences and full-text indexes
- Data compression via textual substitution
- Data structure lower bounds on random access to grammar-compressed strings
- Document listing on repetitive collections
- Document listing on repetitive collections with guaranteed performance
- Dynamic Compressed Strings with Random Access
- Dynamic planar orthogonal point location in sublogarithmic time
- Dynamic relative compression, dynamic partial sums, and substring concatenation
- Efficient fully-compressed sequence representations
- Fast relative Lempel-Ziv self-index for similar sequences
- Filtering Search: A New Approach to Query-Answering
- Indexing highly repetitive collections
- LZ77-based self-indexing with faster pattern matching
- Logarithmic Lower Bounds in the Cell-Probe Model
- Making data structures persistent
- On the Redundancy of Succinct Data Structures
- Optimal lower and upper bounds for representing sequences
- Persistent predecessor search and orthogonal point location on the word RAM
- Random access in persistent strings and segment selection
- Random access to grammar-compressed strings and trees
- Range selection and median: tight cell probe lower bounds and adaptive data structures
- Rank/select operations on large alphabets
- Relative Lempel-Ziv compression of genomes for large-scale storage and retrieval
- Squeezing succinct data structures into entropy bounds
- Succinct data structures for searchable partial sums with optimal worst-case performance
- Succinct indexes for strings, binary relations and multi-labeled trees
- Succinct partial sums and Fenwick trees
- Surpassing the information theoretic bound with fusion trees
- The Smallest Grammar Problem
- The macro model for data compression (extended abstract)
- Time-space trade-offs for Lempel-Ziv compressed indexing
- Trans-dichotomous algorithms for minimum spanning trees and shortest paths
- Transdichotomous Results in Computational Geometry, I: Point Location in Sublogarithmic Time
- Tree compression with top trees
- Two-Dimensional and Three-Dimensional Point Location in Rectangular Subdivisions
Cited in
(3)
This page was built for publication: Random access in persistent strings and segment selection
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6174650)