No fast exponential deviation inequalities for the progressive mixture rule

Publication: 6479040

arXiv: math/0703848
MaRDI QID: Q6479040

Jean-Yves Audibert

Publication date: 28 March 2007

Abstract: We consider the learning task of predicting as well as the best function in a finite reference set G, up to the smallest possible additive term. If R(g) denotes the generalization error of a prediction function g, then under reasonable assumptions on the loss function (typically satisfied by the least squares loss when the output is bounded), the progressive mixture rule g_n is known to satisfy E R(g_n) < min_{g in G} R(g) + C (log|G|)/n, where n denotes the size of the training set, E denotes the expectation with respect to the training set distribution, and C denotes a positive constant. This work mainly shows that for any training set size n there exist a > 0, a reference set G, and a probability distribution generating the data such that, with probability at least a, R(g_n) > min_{g in G} R(g) + c sqrt{[log(|G|/a)]/n}, where c is a positive constant. In other words, surprisingly, for an appropriate reference set G, the deviation convergence rate of the progressive mixture rule is only of order 1/sqrt{n}, while its expectation convergence rate is of order 1/n. The same conclusion holds for the progressive indirect mixture rule. This work also emphasizes the suboptimality of algorithms based on penalized empirical risk minimization over G.
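The rule itself is not restated on this page. As a rough illustration only, below is a minimal Python sketch assuming the standard form of the progressive mixture rule under squared loss: a uniform average, over i = 0, ..., n, of the Gibbs-mixture predictors built from the first i training examples. The inverse-temperature parameter beta, the function name, and the usage example are illustrative assumptions, not taken from the paper.

import numpy as np

def progressive_mixture_predict(X, Y, G, x_new, beta=1.0):
    # Sketch (assumed form) of the progressive mixture rule under squared loss:
    # average, over i = 0..n, of the Gibbs mixture over the reference set G
    # built from the first i examples, evaluated at the new input x_new.
    n = len(X)
    cum_loss = np.zeros(len(G))  # cumulative squared loss of each g in G
    preds = []
    for i in range(n + 1):
        # Gibbs weights on G from the first i examples (uniform when i = 0)
        w = np.exp(-beta * (cum_loss - cum_loss.min()))
        w /= w.sum()
        # prediction of the Gibbs mixture at x_new
        preds.append(sum(wk * g(x_new) for wk, g in zip(w, G)))
        if i < n:
            # account for the (i+1)-th example in the cumulative losses
            cum_loss += np.array([(g(X[i]) - Y[i]) ** 2 for g in G])
    # progressive mixture prediction: uniform average of the n+1 Gibbs predictors
    return float(np.mean(preds))

# Illustrative usage with two constant reference predictors:
# G = [lambda x: 0.0, lambda x: 1.0]
# progressive_mixture_predict([0.1, 0.2, 0.3], [1.0, 0.9, 1.1], G, x_new=0.25)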
