Optimal learning rates for distribution regression (Q2283125)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Optimal learning rates for distribution regression | scientific article | |
Statements
Optimal learning rates for distribution regression (English)
30 December 2019
A learning algorithm is studied for distribution regression with regularized least squares (RLS). The algorithm involves two stages of sampling and aims at regressing from probability distributions to real-valued outputs. The first-stage sample consists of (unknown) probability distributions \(x_i, i=1,\ldots,l,\) and the second-stage sample consists of the data \(D=\{x_{ij}, j=1,\ldots,N, y_i; i=1,\ldots,l\},\) where the \(x_{ij}\) are drawn independently from \(x_i\) and \(y_i\) is the corresponding label from the output space. The distributions are embedded into a reproducing kernel Hilbert space, and the sample \(D\) is used to form the regressor from the empirical version of each distribution \(x_i\) via mean embedding. The error bounds obtained in the \(L^2\)-norm show that the regressor is a good approximation of the regression function. By taking \(\lambda=l^{-a}\) and \(N=l^b,\) a learning rate is derived that is minimax optimal for the distribution regression algorithm; here \(\lambda\) is the regularization parameter of the RLS scheme, and \(a, b\) are positive constants depending on the model assumptions. The achieved rate improves the one from \textit{Z. Szabó} et al. [J. Mach. Learn. Res. 17, Paper No. 152, 40 p. (2016; Zbl 1392.62124)] by removing a \(\log l\) factor.
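The two-stage scheme can be illustrated with a minimal numerical sketch (not the authors' code): synthetic 1-D Gaussian distributions play the role of the \(x_i\), a Gaussian kernel gives the mean embeddings, and the second-level kernel matrix of inner products between empirical embeddings is plugged into RLS. The regression function, kernel bandwidth, and the exponents \(a, b\) below are illustrative choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_kernel(A, B, sigma=1.0):
    # Gaussian kernel matrix between point sets A (n x d) and B (m x d)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# First stage: l unknown distributions (here 1-D Gaussians with mean m_i);
# the label y_i = sin(m_i) is a function of the distribution (illustrative choice).
l, a_exp, b_exp = 60, 1.0, 1.0     # exponents a, b chosen only for illustration
N = int(l ** b_exp)                # second-stage sample size N = l^b
lam = l ** (-a_exp)                # regularization parameter lambda = l^{-a}

means = rng.uniform(-2, 2, size=l)
samples = [rng.normal(m, 0.3, size=(N, 1)) for m in means]  # x_{ij} drawn from x_i
y = np.sin(means)

# Second-level kernel via empirical mean embeddings:
# <mu_i, mu_k> = (1/N^2) * sum_{j,j'} k(x_{ij}, x_{kj'})
K = np.array([[gauss_kernel(samples[i], samples[k]).mean()
               for k in range(l)] for i in range(l)])

# Regularized least squares over the embedded distributions
alpha = np.linalg.solve(K + l * lam * np.eye(l), y)

def predict(new_sample):
    # regress a new distribution (given by its second-stage sample) to a real output
    k_vec = np.array([gauss_kernel(new_sample, s).mean() for s in samples])
    return k_vec @ alpha

pred = predict(rng.normal(1.0, 0.3, size=(N, 1)))
```

The second-level Gram matrix `K` only ever sees the samples `x_{ij}`, never the distributions themselves, which is exactly the two-stage structure analyzed in the paper.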
distribution regression
reproducing kernel Hilbert space
mean embedding
integral operator
optimal learning rate