Smaller generalization error derived for a deep residual neural network compared with shallow networks
From MaRDI portal
Publication:6190811
Abstract: Estimates of the generalization error are proved for a residual neural network with random Fourier features layers. An optimal distribution for the frequencies of the random Fourier features is derived, based on the corresponding generalization error for the approximation of the function values. The generalization error turns out to be smaller than the estimate of the generalization error for random Fourier features with one hidden layer and the same total number of nodes, in the case where the norm of the function is much smaller than the norm of its Fourier transform. This understanding of an optimal distribution for random features is used to construct a new training method for a deep residual network. Promising performance of the proposed new algorithm is demonstrated in computational experiments.
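For context, the shallow baseline discussed in the abstract can be illustrated with a minimal sketch: a one-hidden-layer random Fourier features approximation in which the frequencies are drawn from a fixed Gaussian distribution and the amplitudes are fitted by least squares. This is an illustrative example only, not the paper's algorithm; the target function, frequency distribution, and all parameter values below are hypothetical choices.

```python
import numpy as np

# Illustrative sketch (not the paper's method): approximate a target
# function with K random Fourier features e^{i * omega_k * x}, where the
# frequencies omega_k are sampled from a fixed Gaussian distribution and
# the complex amplitudes b_k are obtained by linear least squares.
rng = np.random.default_rng(0)

K = 64                        # number of random features (hidden nodes)
x = np.linspace(-3.0, 3.0, 200)   # 1-D training inputs
y = np.exp(-x**2)             # hypothetical smooth target function

omega = rng.normal(scale=2.0, size=K)      # random frequencies (assumed distribution)
Z = np.exp(1j * np.outer(x, omega))        # feature matrix, shape (200, K)
b, *_ = np.linalg.lstsq(Z, y, rcond=None)  # fit complex amplitudes
y_hat = (Z @ b).real                       # real part of the reconstruction

mse = float(np.mean((y_hat - y) ** 2))     # training error of the fit
```

The paper's point of comparison is that distributing the same total number of nodes over several residual layers, with frequencies drawn from an optimized distribution, can yield a smaller generalization error than this single-layer construction.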
Recommendations
- Rademacher complexity and the generalization error of residual networks
- Deep limits of residual neural networks
- An analysis of training and generalization errors in shallow and deep networks
- The generalization error of the minimum-norm solutions for over-parameterized neural networks
- Convergence analysis of deep residual networks