Compressed representations of sequences and full-text indexes

DOI10.1145/1240233.1240243zbMath1321.68263OpenAlexW2107082304MaRDI QIDQ2944557

Veli Mäkinen, Paolo Ferragina, Giovanni Manzini, Gonzalo Navarro

Publication date: 2 September 2015

Published in: ACM Transactions on Algorithms (Search for Journal in Brave)

Full work available at URL: https://doi.org/10.1145/1240233.1240243

zbMATH Keywords

entropy Burrows-Wheeler transform text compression text indexing rank and select wavelet tree compression boosting

Mathematics Subject Classification ID

Database theory (68P15) Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science) (68P30) Data structures (68P05) Algorithms on strings (68W32)

Related Items

Compressed string dictionary search with edit distance one, Top-$k$ term-proximity in succinct space, Succinct indices for path minimum, with applications, Engineering a lightweight external memory suffix array construction algorithm, Faster average case low memory semi-external construction of the Burrows-Wheeler transform, Automata evaluation and text search protocols with simulation-based security, Fast construction of wavelet trees, Approximate string matching using a bidirectional index, Approximate search of short patterns with high error rates using the $01^\ast 0$ lossless seeds, Lyndon array construction during Burrows-Wheeler inversion, On position restricted substring searching in succinct space, Large alphabets and incompressibility, The myriad virtues of wavelet trees, Grammar compressed sequences with rank/select support, Approximate string matching with compressed indexes, Towards efficient positional inverted index, Grammar-compressed indexes with logarithmic search time, Practical space-efficient index for structural pattern matching, Wheeler graphs: a framework for BWT-based data structures, Compressed property suffix trees, Compact binary relation representations with rich functionality, Space efficient data structures for dynamic orthogonal range counting, Compressed indexes for text with wildcards, Colored range queries and document retrieval, On compressing and indexing repetitive sequences, Faster repetition-aware compressed suffix trees based on block trees, On compressing permutations and adaptive sorting, Top-$k$ document retrieval in optimal space, On-line construction of position heaps, Multi-pattern matching with bidirectional indexes, Cross-document pattern matching, Improved data structures for the orthogonal range successor problem, Algorithms to compute the Burrows-Wheeler similarity distribution, Bounds from a card trick, On the number of elements to reorder when updating a suffix array, Time-space trade-offs for Lempel-Ziv compressed indexing, Finding range minima in the middle: approximations and applications, Bidirectional search in a string with wavelet trees and bidirectional matching statistics, Approximate all-pairs suffix/prefix overlaps, New algorithms on wavelet trees and applications to information retrieval, Stronger Lempel-Ziv based compressed text indexing, Lightweight data indexing and compression in external memory, Using compressed suffix-arrays for a compact representation of temporal-graphs, Compressed text indexing with wildcards, Wavelet trees for all, On the combinatorics of suffix arrays, Fast relative Lempel-Ziv self-index for similar sequences, Faster semi-external suffix sorting, A framework for designing space-efficient dictionaries for parameterized and order-preserving matching, Efficient fully-compressed sequence representations, Counting suffix arrays and strings, Practical compressed suffix trees, Fixed block compression boosting in FM-indexes: theory and practice, Tight and simple web graph compression for forward and reverse neighbor queries, From first principles to the Burrows and Wheeler transform and beyond, via combinatorial optimization, Fast BWT in small space by blockwise suffix sorting, Rank and select revisited and extended, The affix array data structure and its applications to RNA secondary structure analysis, Fast compressed self-indexes with deterministic linear-time construction, Tighter bounds for the sum of irreducible LCP values, LRM-trees: compressed indices, adaptive sorting, and compressed permutations, On approximate jumbled pattern matching in strings, Combined data structure for previous- and next-smaller-values, A quick tour on suffix arrays and compressed suffix arrays, Dynamic extended suffix arrays, Space-efficient construction of Lempel-Ziv compressed text indexes, Optimal prefix and suffix queries on texts, New space/time tradeoffs for top-$k$ document retrieval on sequences, Distribution-aware compressed full-text indexes, Move-to-front, distance coding, and inversion frequencies revisited, Lempel-Ziv compressed structures for document retrieval, On compact representations of all-pairs-shortest-path-distance matrices, Succinct data structures for searchable partial sums with optimal worst-case performance, Ranked document retrieval for multiple patterns, Wee LCP, Lempel-Ziv-78 compressed string dictionaries, Range majorities and minorities in arrays, On prefix normal words and prefix normal forms, Fully Functional Static and Dynamic Succinct Trees, A four-stage algorithm for updating a Burrows-Wheeler transform, Dynamic rank/select structures with applications to run-length encoded texts, Rank/select on dynamic compressed sequences and applications, The exact multiple pattern matching problem solved by a reference tree approach, Lightweight merging of compressed indices based on BWT variants, Computationally secure pattern matching in the presence of malicious adversaries, Locally Compressed Suffix Arrays, General Document Retrieval in Compact Space, Top tree compression of tries, Faster entropy-bounded compressed suffix trees, Improved parallel construction of wavelet trees and rank/select structures, Random Access to High-Order Entropy Compressed Text, Succinct and Implicit Data Structures for Computational Geometry, Sliding suffix tree, Improved and extended locating functionality on compressed suffix arrays, Bottom-$k$ document retrieval, Geometric BWT: compressed text indexing via sparse suffixes and range searching, Faster online computation of the succinct longest previous factor array, Space-efficient substring occurrence estimation, Space efficient merging of de Bruijn graphs and Wheeler graphs, Succinct dynamic cardinal trees, Computing q-Gram Non-overlapping Frequencies on SLP Compressed Texts, Access, Rank, and Select in Grammar-compressed Strings, Compressed Data Structures for Dynamic Sequences, Unnamed Item, Bidirectional Variable-Order de Bruijn Graphs, Self-indexed Text Compression Using Straight-Line Programs, Succinct Orthogonal Range Search Structures on a Grid with Applications to Text Indexing, Space-Efficient Frameworks for Top- k String Retrieval, Efficient Data Structures for the Orthogonal Range Successor Problem, Succinct Non-overlapping Indexing, Lossless Seeds for Searching Short Patterns with High Error Rates, Compact representations of spatial hierarchical structures with support for topological queries, A new class of string transformations for compressed text indexing, String Indexing with Compressed Patterns, Random access in persistent strings and segment selection, Time-Optimal Top-$k$ Document Retrieval, The longest common substring problem, Unnamed Item, SpliceTAPyR — An Efficient Method for Transcriptome Alignment, An Online Algorithm for Finding the Longest Previous Factors, Better External Memory LCP Array Construction, Algorithms for Indexing Highly Similar DNA Sequences, Full-Text Indexes for High-Throughput Sequencing, Lightweight BWT and LCP Merging via the Gap Algorithm, Linear-Size CDAWG: New Repetition-Aware Indexing and Grammar Compression, Succinct Dynamic Cardinal Trees with Constant Time Operations for Small Alphabet, Succincter Text Indexing with Wildcards, Self-indexing Based on LZ77, Counting Colours in Compressed Strings, LRM-Trees: Compressed Indices, Adaptive Sorting, and Compressed Permutations, Unnamed Item, Unnamed Item, Unnamed Item, Space efficient algorithms for the Burrows-Wheeler backtransformation, On the Value of Multiple Read/Write Streams for Data Compression, Permuted Longest-Common-Prefix Array, Compressed Multiple Pattern Matching, A new class of searchable and provably highly compressible string transformations, Indexing the bijective BWT, LZ-End Parsing in Linear Time, Orthogonal Range Searching for Text Indexing, Unnamed Item, Structural Pattern Matching - Succinctly., Lazy Lempel-Ziv Factorization Algorithms, LCP Array Construction in External Memory, Faster Compressed Suffix Trees for Repetitive Collections, Fast Compressed Self-Indexes with Deterministic Linear-Time Construction, Practical Compact Indexes for Top-kDocument Retrieval, Space-Efficient Parallel Construction of Succinct Representations of Suffix Tree Topologies