Smaller generalization error derived for a deep residual neural network compared with shallow networks
From MaRDI portal
Publication:6190811
DOI: 10.1093/IMANUM/DRAC049
arXiv: 2010.01887
OpenAlex: W3153985548
MaRDI QID: Q6190811
FDO: Q6190811
Mattias Sandberg, Anders Szepessy, Jonas Kiessling, Aku Kammonen, R. Tempone, Petr Plecháč
Publication date: 6 February 2024
Published in: IMA Journal of Numerical Analysis
Abstract: Estimates of the generalization error are proved for a residual neural network with random Fourier feature layers. An optimal distribution for the frequencies of the random Fourier features is derived; the derivation is based on the corresponding generalization error for the approximation of the function values. This generalization error turns out to be smaller than the estimate of the generalization error for random Fourier features with one hidden layer and the same total number of nodes, in the case where the norm of the function is much smaller than the norm of its Fourier transform. This understanding of an optimal distribution for random features is used to construct a new training method for a deep residual network. Promising performance of the proposed algorithm is demonstrated in computational experiments.
Full work available at URL: https://arxiv.org/abs/2010.01887
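As background for the comparison made in the abstract, the shallow baseline is a one-hidden-layer random Fourier feature model: the target is approximated by a linear combination of complex exponentials with randomly sampled frequencies, and only the amplitudes are fitted. A minimal sketch (not the authors' code; the Gaussian target, the normal frequency distribution, and all sizes are illustrative assumptions):

```python
# Illustrative one-hidden-layer random Fourier feature fit:
#   f(x) ~= Re sum_k beta_k * exp(i * w_k * x)
# Frequencies w_k are sampled randomly (normal distribution assumed here);
# the complex amplitudes beta_k are then found by least squares.
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.exp(-x**2)  # hypothetical smooth target function

K = 64                        # total number of random feature nodes
x = np.linspace(-4.0, 4.0, 400)   # training points
w = rng.normal(0.0, 1.0, K)       # random frequencies (assumed distribution)
Phi = np.exp(1j * np.outer(x, w))             # feature matrix, shape (400, K)
beta, *_ = np.linalg.lstsq(Phi, target(x), rcond=None)  # complex amplitudes
approx = (Phi @ beta).real

mse = np.mean((approx - target(x)) ** 2)
print(f"training MSE with K={K} features: {mse:.2e}")
```

The paper's estimate for this shallow model scales with the total number of nodes; the deep residual construction below the same total node budget differently across layers, which is where the improved bound applies.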
supervised learning; error estimates; residual network; deep random feature networks; layer-by-layer algorithm
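The "layer-by-layer algorithm" keyword refers to training a residual network one layer at a time, where each new random Fourier feature layer is fitted to the residual left by the previous layers. A minimal sketch of that idea under the same illustrative assumptions as above (target, frequency distribution, and sizes are not from the paper):

```python
# Hypothetical layer-by-layer residual fit: each layer contributes a new
# random Fourier feature correction, fitted by least squares to the
# residual f - z_{l-1}, so that z_l = z_{l-1} + correction.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-4.0, 4.0, 400)
f = np.exp(-x**2)               # hypothetical target function
z = np.zeros_like(x)            # running approximation z_0 = 0

L, K = 4, 16                    # L layers, K nodes each (64 nodes total,
                                # matching the shallow budget for comparison)
for _ in range(L):
    w = rng.normal(0.0, 1.0, K)               # fresh random frequencies
    Phi = np.exp(1j * np.outer(x, w))         # this layer's feature matrix
    beta, *_ = np.linalg.lstsq(Phi, f - z, rcond=None)  # fit the residual
    z = z + (Phi @ beta).real                 # residual update

final_mse = np.mean((z - f) ** 2)
print(f"residual MSE after {L} layers: {final_mse:.2e}")
```

The paper's contribution is the analysis of which frequency distribution is optimal for such layers and the resulting generalization-error bound; this sketch only illustrates the residual, greedy structure of the training.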