Optimal learning rates for distribution regression (Q2283125)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Optimal learning rates for distribution regression | scientific article | |
Statements
Optimal learning rates for distribution regression (English)
30 December 2019
A learning algorithm is studied for distribution regression with regularized least squares (RLS). The algorithm involves two stages of sampling and aims at regressing from probability distributions to real-valued outputs. The first-stage sample consists of (unknown) probability distributions \(x_i, i=1,\ldots,l,\) and the second-stage sample consists of the data \(D=\{x_{ij}, j=1,\ldots,N, y_i; i=1,\ldots,l\},\) where the \(x_{ij}\) are drawn independently from \(x_i\) and \(y_i\) is the corresponding label from the output space. The distributions are embedded into a reproducing kernel Hilbert space, and the sample \(D\) is used to form the regressor from the empirical version of each distribution \(x_i\) via mean embedding. The error bounds obtained in the \(L^2\)-norm show that the regressor is a good approximation of the regression function. By taking \(\lambda=l^{-a}\) and \(N=l^b,\) a learning rate is derived that is minimax optimal for the distribution regression algorithm; here \(\lambda\) is the regularization parameter of the RLS scheme, and \(a, b\) are positive constants depending on the model assumptions. The achieved rate improves the one from \textit{Z. Szabó} et al. [J. Mach. Learn. Res. 17, Paper No. 152, 40 p. (2016; Zbl 1392.62124)] by removing a \(\log l\) factor.
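The two-stage scheme can be illustrated with a minimal numerical sketch (not the authors' code): synthetic 1-D Gaussian distributions play the role of the \(x_i\), a Gaussian kernel gives the mean embeddings, and the second-level kernel matrix of inner products between empirical embeddings is plugged into RLS. The regression function, kernel bandwidth, and the exponents \(a, b\) below are illustrative choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_kernel(A, B, sigma=1.0):
    # Gaussian kernel matrix between point sets A (n x d) and B (m x d)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# First stage: l unknown distributions (here 1-D Gaussians with mean m_i);
# the label y_i = sin(m_i) is a function of the distribution (illustrative choice).
l, a_exp, b_exp = 60, 1.0, 1.0     # exponents a, b chosen only for illustration
N = int(l ** b_exp)                # second-stage sample size N = l^b
lam = l ** (-a_exp)                # regularization parameter lambda = l^{-a}

means = rng.uniform(-2, 2, size=l)
samples = [rng.normal(m, 0.3, size=(N, 1)) for m in means]  # x_{ij} drawn from x_i
y = np.sin(means)

# Second-level kernel via empirical mean embeddings:
# <mu_i, mu_k> = (1/N^2) * sum_{j,j'} k(x_{ij}, x_{kj'})
K = np.array([[gauss_kernel(samples[i], samples[k]).mean()
               for k in range(l)] for i in range(l)])

# Regularized least squares over the embedded distributions
alpha = np.linalg.solve(K + l * lam * np.eye(l), y)

def predict(new_sample):
    # regress a new distribution (given by its second-stage sample) to a real output
    k_vec = np.array([gauss_kernel(new_sample, s).mean() for s in samples])
    return k_vec @ alpha

pred = predict(rng.normal(1.0, 0.3, size=(N, 1)))
```

The second-level Gram matrix `K` only ever sees the samples `x_{ij}`, never the distributions themselves, which is exactly the two-stage structure analyzed in the paper.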
distribution regression
reproducing kernel Hilbert space
mean embedding
integral operator
optimal learning rate