Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness

DOI: 10.1016/J.NEUNET.2020.06.024
zbMATH Open: 1475.68315
arXiv: 1905.11427
OpenAlex: W3039204554
Wikidata: Q97517817
Scholia: Q97517817
MaRDI QID: Q2057701
FDO: Q2057701


Authors: Pengzhan Jin, Lu Lu, Yifa Tang, George Em Karniadakis


Publication date: 7 December 2021

Published in: Neural Networks

Abstract: The accuracy of deep learning, i.e., of deep neural networks, can be characterized by dividing the total error into three main types: approximation error, optimization error, and generalization error. Whereas there are some satisfactory answers to the problems of approximation and optimization, much less is known about the theory of generalization. Most existing theoretical works on generalization fail to explain the performance of neural networks in practice. To derive a meaningful bound, we study the generalization error of neural networks for classification problems in terms of the data distribution and neural network smoothness. We introduce the cover complexity (CC) to measure the difficulty of learning a data set and the inverse of the modulus of continuity to quantify neural network smoothness. A quantitative bound for the expected accuracy/error is derived by considering both the CC and neural network smoothness. Although most of the analysis is general and not specific to neural networks, we validate our theoretical assumptions and results numerically for neural networks on several image data sets. The numerical results confirm that the expected error of trained networks, scaled by the square root of the number of classes, depends linearly on the CC. We also observe a clear consistency between the test loss and neural network smoothness during training. In addition, we demonstrate empirically that the network smoothness decreases as the network size increases, whereas it is insensitive to the size of the training set.
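As a rough illustration of the smoothness measure named in the abstract, the sketch below estimates an empirical modulus of continuity of a trained classifier by Monte Carlo sampling and then inverts it numerically to obtain a smoothness score. This is an assumption-laden reading of the idea, not the authors' construction: the function f, the data array X, the tolerance eps, and the sampling scheme are hypothetical placeholders chosen only to make the notion concrete.

```python
# Illustrative sketch only: Monte Carlo estimate of a network's modulus of
# continuity, omega_f(delta) = sup over ||x - y|| <= delta of ||f(x) - f(y)||,
# and a crude numerical inverse used as a smoothness score. The function f,
# the data X, and all parameter values are hypothetical placeholders.
import numpy as np

def empirical_modulus(f, X, delta, n_pairs=10000, seed=0):
    # X: array of flattened inputs, shape (N, d); f maps (m, d) -> (m, k).
    rng = np.random.default_rng(seed)
    x = X[rng.integers(0, len(X), size=n_pairs)]
    u = rng.normal(size=x.shape)
    # Scale each perturbation to a random norm in [0, delta].
    u *= (delta * rng.random((n_pairs, 1))) / np.linalg.norm(u, axis=1, keepdims=True)
    # Empirical sup of the output variation over the sampled nearby pairs.
    return np.max(np.linalg.norm(f(x) - f(x + u), axis=1))

def inverse_modulus(f, X, eps, deltas):
    # Largest tested delta whose empirical modulus stays below eps;
    # a larger value indicates a smoother (more slowly varying) network.
    best = 0.0
    for d in sorted(deltas):
        if empirical_modulus(f, X, d) <= eps:
            best = d
    return best

if __name__ == "__main__":
    # Toy stand-in for a trained network: a fixed random linear map.
    rng = np.random.default_rng(1)
    W = rng.normal(size=(10, 3)) * 0.1
    f = lambda x: x @ W
    X = rng.normal(size=(500, 10))
    print(inverse_modulus(f, X, eps=0.5, deltas=np.linspace(0.1, 5.0, 20)))
```

A larger inverse modulus means the network's outputs vary little under small input perturbations, which is the sense in which smoothness could be tracked alongside the test loss during training.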


Full work available at URL: https://arxiv.org/abs/1905.11427




Cited in 11 documents.
