Proofs of conservation inequalities for Levin's notion of mutual information of 1974 (Q2219055)

scientific article

    Statements

    Proofs of conservation inequalities for Levin's notion of mutual information of 1974 (English)
    19 January 2021
    In a computer, every object is represented by a binary sequence, i.e., by a string of 0s and 1s. A natural measure of complexity of a binary sequence is its prefix Kolmogorov complexity \(K(x)\), i.e., the length of the shortest program (in a fixed programming language) that computes either this sequence \(x\) or a binary sequence starting with \(x\). This notion makes it possible to define, in a natural way, the amount of information \(I(x:y)\) that \(x\) contains about \(y\). Namely, if \(x\) and \(y\) are completely independent, then the only way to compute both is to compute each of them, so we have \(K(xy)\approx K(x)+K(y)\). On the other hand, if \(y\) can be easily computed from \(x\), then to compute \(xy\) we only need to compute \(x\) ``from scratch''; after that, \(y\) is obtained easily. In this case, \(K(xy)\ll K(x)+K(y)\). The difference \(K(x)+K(y)-K(xy)\) describes how many bits we save in computing \(y\) when we already know \(x\), and it is thus a natural measure of how much information about \(y\) is contained in \(x\).

    For infinite binary sequences \(\alpha\) and \(\beta\), the information \(I(\alpha:\beta)\) can be defined by considering the values \(I(\alpha_{1:n}:\beta_{1:m})\) corresponding to finite prefixes of these infinite sequences. This definition was introduced in 1974 by L. Levin, who formulated, without proofs, several theorems about this notion; for one of these theorems, a proof was never published. The main part of the present paper is devoted to proving this result; several auxiliary new results are proved as well.

    This main result is related to the following fact: for a binary sequence \(\omega\) that is random in the usual sense (all digits are independent, and 0s and 1s are equally probable), the probability that \(\omega\) contains at least \(b\) bits of information about a given sequence \(\alpha\) (e.g., the sequence of truth values of all mathematical statements) decreases exponentially with \(b\). Specifically, one can prove that \(\operatorname{Prob}(I(\omega:\alpha)\ge b)\le c\cdot 2^{-b}\) for some constant \(c\). The same inequality holds for any computable probability measure.

    Levin's theorem describes what happens if we consider probability measures \(P\) that are computable with respect to some not-necessarily-computable sequence \(\rho\), e.g., the sequence of all observational results. In this case, it may happen that \(\rho\) itself contains some information about \(\alpha\). However, we expect that adding a sequence \(\omega\) which is random with respect to a \(\rho\)-computable probability measure \(P\) will not increase this amount of information, i.e., that \(\operatorname{Prob}_P(I(\rho\omega:\alpha)-I(\rho:\alpha)\ge b)\le c\cdot 2^{-b}\). This is exactly Levin's theorem, whose proof is presented in this paper.
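    Since prefix Kolmogorov complexity is uncomputable, \(I(x:y)\) cannot be evaluated exactly by any program. Purely as an illustration of the definition above (this is not a method from the paper under review), one can replace \(K\) by a computable upper bound, namely the length of a compressed encoding. The following Python sketch, with hypothetical helper names approx_K and approx_I, uses zlib in this role to show how the difference \(K(x)+K(y)-K(xy)\) separates independent data from dependent data.

        import os
        import zlib

        def approx_K(data: bytes) -> int:
            # Computable stand-in for K(data): length of a zlib-compressed
            # encoding, which is only an upper bound on the true complexity.
            return len(zlib.compress(data, 9))

        def approx_I(x: bytes, y: bytes) -> int:
            # Proxy for I(x:y) = K(x) + K(y) - K(xy): how much shorter the
            # pair is to describe than the two parts taken separately.
            return approx_K(x) + approx_K(y) - approx_K(x + y)

        x = os.urandom(1000)   # essentially incompressible "random" block
        y = os.urandom(1000)   # generated independently of x
        z = x[:500] * 2        # built from x, hence highly dependent on it

        print(approx_I(x, y))  # small (roughly the compressor's overhead)
        print(approx_I(x, z))  # large: knowing x makes z cheap to describe

    Only the qualitative contrast between the two outputs is meaningful here; the absolute numbers mostly reflect the compressor's framing overhead rather than true complexities.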
    mutual information
    conservation inequalities
    Kolmogorov complexity
