Optimal approximation rate of ReLU networks in terms of width and depth


DOI: 10.1016/j.matpur.2021.07.009
zbMATH Open: 1501.41010
arXiv: 2103.00502
OpenAlex: W3185971845
MaRDI QID: Q2065073


Authors: Zuowei Shen, Haizhao Yang, Shijun Zhang


Publication date: 7 January 2022

Published in: Journal de Mathématiques Pures et Appliquées. Neuvième Série

Abstract: This paper concentrates on the approximation power of deep feed-forward neural networks in terms of width and depth. It is proved by construction that ReLU networks with width $\mathcal{O}\big(\max\{d\lfloor N^{1/d}\rfloor,\, N+2\}\big)$ and depth $\mathcal{O}(L)$ can approximate a Hölder continuous function on $[0,1]^d$ with an approximation rate $\mathcal{O}\big(\lambda\sqrt{d}\,(N^2L^2\ln N)^{-\alpha/d}\big)$, where $\alpha\in(0,1]$ and $\lambda>0$ are the Hölder order and constant, respectively. Such a rate is optimal up to a constant in terms of width and depth separately, while existing results are only nearly optimal without the logarithmic factor in the approximation rate. More generally, for an arbitrary continuous function $f$ on $[0,1]^d$, the approximation rate becomes $\mathcal{O}\big(\sqrt{d}\,\omega_f\big((N^2L^2\ln N)^{-1/d}\big)\big)$, where $\omega_f(\cdot)$ is the modulus of continuity. We also extend our analysis to any continuous function $f$ on a bounded set. In particular, if ReLU networks with depth $31$ and width $\mathcal{O}(N)$ are used to approximate one-dimensional Lipschitz continuous functions on $[0,1]$ with a Lipschitz constant $\lambda>0$, the approximation rate in terms of the total number of parameters, $W=\mathcal{O}(N^2)$, becomes $\mathcal{O}\big(\tfrac{\lambda}{W\ln W}\big)$, a rate not previously established in the literature for fixed-depth ReLU networks.
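The step from the width-depth rate to the parameter-count rate is worth spelling out. The following is a minimal sketch, assuming the rate $\mathcal{O}\big(\lambda\sqrt{d}\,(N^2L^2\ln N)^{-\alpha/d}\big)$ quoted above, specialized to $d=1$, $\alpha=1$, and fixed depth; it is a reconstruction of the argument, not a derivation taken verbatim from the paper.

```latex
% Sketch: how the O(lambda / (W ln W)) rate follows for fixed-depth networks.
% Assumes the width/depth rate stated in the abstract, with d = 1, alpha = 1,
% and depth L = O(1) (e.g., depth 31); not quoted verbatim from the paper.
\begin{align*}
  \text{error}(N) &= \mathcal{O}\!\left(\lambda\,(N^{2}\ln N)^{-1}\right)
      && \text{set } d=1,\ \alpha=1,\ L=\mathcal{O}(1) \\
  W &= \mathcal{O}(N^{2})
      && \text{parameters of a fixed-depth, width-}N\text{ network} \\
  \ln W &= 2\ln N + \mathcal{O}(1) = \Theta(\ln N)
      && \text{so logs in } N \text{ and } W \text{ are interchangeable} \\
  \text{error}(W) &= \mathcal{O}\!\left(\frac{\lambda}{W\ln W}\right)
      && \text{substitute } N^{2}=\Theta(W)
\end{align*}
```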


Full work available at URL: https://arxiv.org/abs/2103.00502
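As a quick, self-contained illustration of the claim $W=\mathcal{O}(N^2)$ for depth-$31$, width-$\mathcal{O}(N)$ networks, the sketch below counts the weights and biases of a fully connected ReLU network with 31 hidden layers of constant width $N$. The constant-width, fully connected architecture is an assumption made here for illustration and may differ from the exact construction in the paper.

```python
def relu_net_param_count(width: int, depth: int = 31,
                         d_in: int = 1, d_out: int = 1) -> int:
    """Count weights and biases of a fully connected ReLU network
    with `depth` hidden layers, each of size `width`.

    Illustrative assumption: constant hidden width, not the paper's
    exact architecture."""
    dims = [d_in] + [width] * depth + [d_out]
    # Each affine layer contributes fan_in * fan_out weights + fan_out biases.
    return sum(a * b + b for a, b in zip(dims[:-1], dims[1:]))

# W grows like N^2: here W = 30*N^2 + 33*N + 1, so W / N^2 -> 30.
for n in (10, 100, 1000):
    w = relu_net_param_count(n)
    print(f"N = {n:5d}  W = {w:10d}  W / N^2 = {w / n**2:.3f}")
```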















