Note on the greedy parsing optimality for dictionary-based text compression
From MaRDI portal
Abstract: LZ77-based compression schemes compress the input text by replacing factors in the text with an encoded reference to a previous occurrence formed by the couple (length, offset). For a given factor, the smallest is the offset, the smallest is the resulting compression ratio. This is optimally achieved by using the rightmost occurrence of a factor in the previous text. Given a cost function, for instance the minimum number of bits used to represent an integer, we define the Rightmost Equal-Cost Position (REP) problem as the problem of finding one of the occurrences of a factor which cost is equal to the cost of the rightmost one. We present the Multi-Layer Suffix Tree data structure that, for a text of length n, at any time i, it provides REP(LPF) in constant time, where LPF is the longest previous factor, i.e. the greedy phrase, a reference to the list of REP({set of prefixes of LPF}) in constant time and REP(p) in time O(|p| log log n) for any given pattern p.
Recommendations
- On parsing optimality for dictionary-based text compression -- the \texttt{Zip} case
- scientific article; zbMATH DE number 1305530
- The effect of flexible parsing for dynamic dictionary-based data compression
- The relationship between greedy parsing and symbolwise text compression
- Relations between greedy and bit-optimal LZ77 encodings
Cites work
- scientific article; zbMATH DE number 3919857 (Why is no real title available?)
- scientific article; zbMATH DE number 1305530 (Why is no real title available?)
- A universal algorithm for sequential data compression
- Algorithms on Strings
- Algorithms on Strings, Trees and Sequences
- An algorithm for optimal prefix parsing of a noiseless and memoryless channel
- Compression of individual sequences via variable-rate coding
- Computing longest previous factor in linear time and applications
- Data compression via textual substitution
- Data compression. The complete reference.
- Dictionary-symbolwise flexible parsing
- Dictionary-symbolwise flexible parsing
- Efficient algorithms for three variants of the LPF table
- LPF computation revisited
- Lempel-Ziv factorization using less time \& space
- On parsing optimality for dictionary-based text compression -- the \texttt{Zip} case
- On the bit-complexity of Lempel-Ziv compression
- Online timestamped text indexing
- Replacing suffix trees with enhanced suffix arrays
Cited in
(9)- On parsing optimality for dictionary-based text compression -- the \texttt{Zip} case
- The greedy approach to dictionary-based static text compression on a distributed system
- Compression estimates for a class of dictionary based compressors
- On optimal parsing for LZ78-like compressors
- The effect of flexible parsing for dynamic dictionary-based data compression
- Greedy versus optimal analysis of bounded size dictionary compression and on-the-fly distributed computing
- Relations between greedy and bit-optimal LZ77 encodings
- Quasi-distinct Parsing and Optimal Compression Methods
- The relationship between greedy parsing and symbolwise text compression
This page was built for publication: Note on the greedy parsing optimality for dictionary-based text compression
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2437746)