On compressing and indexing repetitive sequences
From MaRDI portal
Publication:390894
DOI10.1016/j.tcs.2012.02.006zbMath1292.68061OpenAlexW2101881908MaRDI QIDQ390894
Gonzalo Navarro, Sebastian Kreft
Publication date: 9 January 2014
Published in: Theoretical Computer Science (Search for Journal in Brave)
Full work available at URL: https://doi.org/10.1016/j.tcs.2012.02.006
Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science) (68P30) Algorithms on strings (68W32)
Related Items
Block graphs in practice, Comparison of LZ77-type parsings, FM-index of alignment: a compressed index for similar strings, A separation between RLSLPs and LZ77, Document listing on repetitive collections with guaranteed performance, Unnamed Item, An LMS-based grammar self-index with local consistency properties, On the approximation ratio of LZ-end to LZ77, Grammar compressed sequences with rank/select support, Grammar-compressed indexes with logarithmic search time, Composite Repetition-Aware Data Structures, Compact binary relation representations with rich functionality, Faster repetition-aware compressed suffix trees based on block trees, String Indexing with Compressed Patterns, Hybrid indexes for repetitive datasets, Near-optimal search time in \(\delta \)-optimal space, and vice versa, Time-space trade-offs for Lempel-Ziv compressed indexing, Computational graph pangenomics: a tutorial on data structures and their applications, Near-optimal search time in \(\delta \)-optimal space, Sensitivity of string compressors and repetitiveness measures, FM-index of alignment with gaps, Block trees, Universal compressed text indexing, On Two LZ78-style Grammars: Compression Bounds and Compressed-Space Computation, A Self-index on Block Trees, Linear-Size CDAWG: New Repetition-Aware Indexing and Grammar Compression, Flexible indexing of repetitive collections, Lempel-Ziv compressed structures for document retrieval, LZ77 computation based on the run-length encoded BWT, A new class of searchable and provably highly compressible string transformations, A compressed dynamic self-index for highly repetitive text collections, A Space-Optimal Grammar Compression., Bicriteria Data Compression, LZ-End Parsing in Linear Time, Orthogonal Range Searching for Text Indexing, Lazy Lempel-Ziv Factorization Algorithms, Faster Compressed Suffix Trees for Repetitive Collections
Uses Software
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- An implicit data structure supporting insertion, deletion, and search in \(O(\log ^ 2\,n)\) time
- Representing trees of higher degree
- Large alphabets and incompressibility
- Indexing text using the Ziv--Lempel trie
- Lempel-Ziv factorization using less time \& space
- Faster entropy-bounded compressed suffix trees
- Application of Lempel-Ziv factorization to the approximation of grammar-based compression.
- Log-logarithmic worst-case range queries are possible in space theta(N)
- Stronger Lempel-Ziv based compressed text indexing
- Rank and select revisited and extended
- Compressed representations of sequences and full-text indexes
- Self-indexing Based on LZ77
- Suffix Arrays: A New Method for On-Line String Searches
- Relative Lempel-Ziv Compression of Genomes for Large-Scale Storage and Retrieval
- Self-indexed Text Compression Using Straight-Line Programs
- Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract)
- An analysis of the Burrows—Wheeler transform
- Self-Indexed Grammar-Based Compression
- An Online Algorithm for Finding the Longest Previous Factors
- Indexing compressed text
- Optimal Succinctness for Range Minimum Queries
- Squeezing succinct data structures into entropy bounds
- A New Succinct Representation of RMQ-Information and Improvements in the Enhanced Suffix Array
- Breaking a Time-and-Space Barrier in Constructing Full-Text Indices
- Upper bounds on the probability of sequences emitted by finite-state sources and on the redundancy of the Lempel-Ziv algorithm
- On the Complexity of Finite Sequences
- A universal algorithm for sequential data compression
- Compression of individual sequences via variable-rate coding
- New text indexing functionalities of the compressed suffix arrays
- Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets
- Compression of Low Entropy Strings with Lempel--Ziv Algorithms
- Reducing the Space Requirement of LZ-Index
- Implementing the LZ-index
- Fully-Compressed Suffix Trees
- Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
- Run-Length Compressed Indexes Are Superior for Highly Repetitive Sequence Collections