Bernstein polynomials and learning theory (Q596812)

From MaRDI portal

scientific article

Language: English
Label: Bernstein polynomials and learning theory
Description: scientific article

    Statements

    Bernstein polynomials and learning theory (English)
    Publication date: 10 August 2004
    Let \(f(x)= -x\log x- (1- x)\log(1 - x)\) for \(0\leq x\leq 1\); it is referred to as the entropy function. Let \(B_n[f]\) denote the \(n\)th Bernstein polynomial of \(f\). One half of the paper establishes the estimates
\[ f(x)- B_n[f](x)\geq {1\over 2n}+ {1\over 20n^2 x(1-x)}- {1\over 12n^2}\quad\text{for }{15\over n}\leq x\leq 1-{15\over n}, \]
\[ \lim_{n\to\infty}\, n\{f- B_n[f]\}(z/n)= -z\log z+ ze^{-z} \sum^\infty_{k=1} {z^k\over k!}\log(k+ 1)\quad\text{for }0\leq z\leq 15. \]

    These estimates are then used to establish the optimality of an encoding rule for the following problem. Let \(A_0,\dots, A_m\) be an alphabet of \(m+ 1\) letters which are to be encoded with code length \(\log 1/q_i\) for \(A_i\), where \(\sum q_i= 1\). The expected code length is then \(\sum p_i\log 1/q_i\), where \(A_i\) occurs with the unknown probability \(p_i\); it would be minimised by \(q_i= p_i\). The difference \(\sum p_i\log(p_i/q_i)\) between the expected code length and this minimum is the redundancy. A sample of letters of size \(n\) is available, and each \(q_i\) is to be chosen, according to the number \(k_i\) of times that \(A_i\) occurs in the sample, by a rule \(q_i= Q(k_i)\). In the case \(m= 1\) a rule \(Q\) is found such that, if \(F^*_n(p)\) is the expected redundancy, where \(p= p_1\), then
\[ \lim_{n\to\infty}\,\sup_{0\leq p\leq 1}\, nF^*_n(p)= {1\over 2}, \]
which by a result of \textit{T. M. Cover} [IEEE Trans. Inf. Theory 18, 216--217 (1972; Zbl 0231.94001)] is optimal. The general case \(m> 1\) is reduced to the case \(m= 1\). The calculations involved are lengthy, intricate and delicate.

    The paper is a sequel to a paper by the same authors and others [How to achieve minimax expected Kullback-Leibler distance from an unknown finite distribution, Berlin: Springer (2002; Zbl 1024.68047)]; it improves a result of \textit{R. E. Krichevskiy} [IEEE Trans. Inf. Theory 44, No. 1, 296--303 (1998; Zbl 1053.94534)] and makes reference to \textit{J. Forster} [J. Comput. Syst. Sci. 65, No. 4, 612--625 (2002; Zbl 1058.68058)].
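    To make the first estimate concrete, recall \(B_n[f](x)= \sum^n_{k=0} f(k/n){n\choose k} x^k(1-x)^{n-k}\). The following minimal Python sketch (not from the paper; the choice \(n=100\) and the sample points are arbitrary) evaluates the gap \(f-B_n[f]\) alongside the stated lower bound on \({15\over n}\leq x\leq 1-{15\over n}\).

```python
import math

def entropy(x):
    """Binary entropy f(x) = -x log x - (1-x) log(1-x), with 0 log 0 = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def bernstein(f, n, x):
    """n-th Bernstein polynomial of f evaluated at x."""
    return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

n = 100  # arbitrary; the interval [15/n, 1 - 15/n] requires n > 30
for x in [15 / n, 0.25, 0.5, 0.75, 1 - 15 / n]:
    gap = entropy(x) - bernstein(entropy, n, x)
    bound = 1 / (2 * n) + 1 / (20 * n**2 * x * (1 - x)) - 1 / (12 * n**2)
    print(f"x={x:.3f}  f - B_n[f] = {gap:.6f}  lower bound = {bound:.6f}")
```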
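    The coding half can be illustrated numerically as well. The authors' rule \(Q\) is not reproduced in this review, so the sketch below substitutes the classical add-\({1\over 2}\) (Krichevsky-Trofimov) rule \(Q(k)= (k+{1\over 2})/(n+1)\) as a hypothetical stand-in; it computes the expected redundancy \(F^*_n(p)\) for the case \(m=1\), where the count \(k_1\) is binomial, and prints \(n\sup_p F^*_n(p)\), which should come out close to the minimax value \({1\over 2}\).

```python
import math

def kl_bernoulli(p, q):
    """KL divergence D(p || q) between Bernoulli(p) and Bernoulli(q), in nats."""
    d = 0.0
    if p > 0:
        d += p * math.log(p / q)
    if p < 1:
        d += (1 - p) * math.log((1 - p) / (1 - q))
    return d

def expected_redundancy(n, p, rule):
    """F*_n(p) = E[D(p || rule(K))] with K ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               * kl_bernoulli(p, rule(k))
               for k in range(n + 1))

n = 200
kt = lambda k: (k + 0.5) / (n + 1)  # add-1/2 rule; note Q(k) + Q(n-k) = 1
worst = max(expected_redundancy(n, p, kt) for p in [i / 1000 for i in range(1001)])
print(f"n * sup_p F_n(p) ~ {n * worst:.4f}")
```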
    Bernstein polynomials
    Entropy function
    Learning theory
    Optimal encoding
