Bernstein polynomials and learning theory (Q596812)
From MaRDI portal
scientific article
Language | Label | Description | Also known as
---|---|---|---
English | Bernstein polynomials and learning theory | scientific article |
Statements
Bernstein polynomials and learning theory (English)
10 August 2004
Let \(f(x)= -x\log x- (1- x)\log(1 - x)\) for \(0\leq x\leq 1\); it is referred to as the entropy function. Let \(B_n[f]\) denote the \(n\)th Bernstein polynomial of \(f\). One half of the paper establishes the estimates \[ f(x)- B_n[f](x)\geq {1\over 2n}+ {1\over 20n^2 x(1-x)}- {1\over 12n^2}\quad\text{for }{15\over n}\leq x\leq 1-{15\over n}, \] \[ \lim_{n\to\infty}\, n\{f- B_n[f]\}(z/n)= -z\log z+ ze^{-z} \sum^\infty_{k=1} {z^k\over k!}\log(k+ 1)\quad\text{for } 0\leq z\leq 15. \] These are then used to establish the optimality of an encoding rule for the following problem. Let \(A_0,\dots, A_m\) be an alphabet of \(m+ 1\) letters which are to be encoded with code length \(\log 1/q_i\) for \(A_i\), where \(\sum q_i= 1\). The expected code length is then \(\sum p_i\log 1/q_i\), where \(A_i\) occurs with the unknown probability \(p_i\); it would be minimised by the choice \(q_i= p_i\). The difference \(\sum p_i\log(p_i/q_i)\) between the expected code length and this minimum is the redundancy. A sample of \(n\) letters is available, and each \(q_i\) is to be chosen, according to the number \(k_i\) of times that \(A_i\) occurs in the sample, by a rule \(q_i= Q(k_i)\). In the case \(m= 1\) a rule \(Q\) is found such that, if \(F^*_n(p)\) is the expected redundancy, where \(p= p_1\), then \[ \lim_{n\to\infty}\,\sup_{0\leq p\leq 1}\, nF^*_n(p)= \textstyle{{1\over 2}}, \] which by a result of \textit{T. M. Cover} [IEEE Trans. Inf. Theory 18, 216--217 (1972; Zbl 0231.94001)] is optimal. The general case \(m> 1\) is reduced to the case \(m= 1\). The calculations involved are lengthy, intricate and delicate. The paper is a sequel to one by the same authors and others [How to achieve minimax expected Kullback-Leibler distance from an unknown finite distribution, Berlin: Springer (2002; Zbl 1024.68047)]. It improves a result of \textit{R. E. Krichevskiy} [IEEE Trans. Inf. Theory 44, No. 1, 296--303 (1998; Zbl 1053.94534)] and makes reference to \textit{J. Forster} [J. Comput. Syst. Sci.
65, No. 4, 612--625 (2002; Zbl 1058.68058)].
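The entropy function and its Bernstein polynomials are straightforward to evaluate numerically, so the first estimate can be checked directly. The following Python sketch (the function names `entropy` and `bernstein` are mine, not from the paper) computes \(B_n[f](x)\) from the definition and compares \(f(x)-B_n[f](x)\) with the reviewed lower bound at one interior point:

```python
from math import comb, log

def entropy(x):
    """Entropy function f(x) = -x log x - (1-x) log(1-x), natural logarithm."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log(x) - (1 - x) * log(1 - x)

def bernstein(f, n, x):
    """n-th Bernstein polynomial: B_n[f](x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

# Check the reviewed lower bound at one point with 15/n <= x <= 1 - 15/n.
n, x = 200, 0.3
gap = entropy(x) - bernstein(entropy, n, x)
lower = 1 / (2 * n) + 1 / (20 * n**2 * x * (1 - x)) - 1 / (12 * n**2)
print(gap >= lower, gap)
```

Since \(f\) is concave, the gap is nonnegative for every \(n\) and \(x\); the content of the estimate is the sharper \(1/(2n)\)-order lower bound, which the numerical value reflects.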
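In the case \(m=1\) the expected redundancy \(F^*_n(p)\) of any rule \(q=Q(k)\) is a finite sum over the binomial distribution of \(k\), so it can be evaluated exactly for a given rule. The sketch below is only an illustration with the classical add-\(\tfrac12\) (Krichevsky-Trofimov) rule, not the paper's optimal rule, which is not reproduced in the review:

```python
from math import comb, log

def expected_redundancy(n, p, Q):
    """F*_n(p) = E[ p log(p/q) + (1-p) log((1-p)/(1-q)) ] with q = Q(k)
    and k ~ Binomial(n, p): the expected Kullback-Leibler redundancy."""
    total = 0.0
    for k in range(n + 1):
        q = Q(k)
        kl = (p * log(p / q) if p > 0 else 0.0) \
           + ((1 - p) * log((1 - p) / (1 - q)) if p < 1 else 0.0)
        total += comb(n, k) * p**k * (1 - p)**(n - k) * kl
    return total

# Add-1/2 (Krichevsky-Trofimov) rule -- an illustration only; the paper
# constructs a better rule achieving lim_n sup_p n F*_n(p) = 1/2.
n = 100
Q = lambda k: (k + 0.5) / (n + 1)
worst = max(expected_redundancy(n, i / 200, Q) for i in range(201))
print("sup over grid of n*F*_n(p):", n * worst)
```

For interior \(p\) the quantity \(nF^*_n(p)\) is close to \(\tfrac12\) already for this simple rule; the difficulty addressed by the paper is controlling the supremum over all of \(0\leq p\leq 1\), including the boundary.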
Bernstein polynomials
Entropy function
Learning theory
Optimal encoding