Source coding, large deviations, and approximate pattern matching
From MaRDI portal
Publication:4674528
Large deviations (60F10) Pattern recognition, speech recognition (68T10) Source coding (94A29) Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science) (68P30) Rate-distortion theory in information and communication theory (94A34)
Abstract: We present a development of parts of rate-distortion theory and pattern- matching algorithms for lossy data compression, centered around a lossy version of the Asymptotic Equipartition Property (AEP). This treatment closely parallels the corresponding development in lossless compression, a point of view that was advanced in an important paper of Wyner and Ziv in 1989. In the lossless case we review how the AEP underlies the analysis of the Lempel-Ziv algorithm by viewing it as a random code and reducing it to the idealized Shannon code. This also provides information about the redundancy of the Lempel-Ziv algorithm and about the asymptotic behavior of several relevant quantities. In the lossy case we give various versions of the statement of the generalized AEP and we outline the general methodology of its proof via large deviations. Its relationship with Barron's generalized AEP is also discussed. The lossy AEP is applied to: (i) prove strengthened versions of Shannon's source coding theorem and universal coding theorems; (ii) characterize the performance of mismatched codebooks; (iii) analyze the performance of pattern- matching algorithms for lossy compression; (iv) determine the first order asymptotics of waiting times (with distortion) between stationary processes; (v) characterize the best achievable rate of weighted codebooks as an optimal sphere-covering exponent. We then present a refinement to the lossy AEP and use it to: (i) prove second order coding theorems; (ii) characterize which sources are easier to compress; (iii) determine the second order asymptotics of waiting times; (iv) determine the precise asymptotic behavior of longest match-lengths. Extensions to random fields are also given.
Recommendations
- A suboptimal lossy data compression based on approximate pattern matching
- On the performance of data compression algorithms based upon string matching
- Pattern matching and lossy data compression on random fields
- The Generalized Asymptotic Equipartition Property: Necessary and Sufficient Conditions
- An implementable lossy version of the Lempel-Ziv algorithm. I. Optimality for memoryless sources
Cited in
(13)- Probabilities of randomly centered small balls and quantization in Banach spaces
- Natural type selection in adaptive lossy compression
- On approximate pattern matching for a class of Gibbs random fields
- Cramér's theorem is atypical
- Finite Blocklength Lossy Source Coding for Discrete Memoryless Sources
- Random databases with approximate record matching
- Asymptotic behavior of the distortion-rate function for Gaussian processes in Banach spaces
- Lossy asymptotic equipartition property for hierarchical data structures
- Complexity-compression tradeoffs in lossy compression via efficient random codebooks and databases
- On the MDL principle for i.i.d. sources with large alphabets
- Large Alphabet Source Coding Using Independent Component Analysis
- Pattern matching and lossy data compression on random fields
- The Generalized Asymptotic Equipartition Property: Necessary and Sufficient Conditions
This page was built for publication: Source coding, large deviations, and approximate pattern matching
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4674528)