Neural network approximation: three hidden layers are enough (Q6054944)
scientific article; zbMATH DE number 7743433
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Neural network approximation: three hidden layers are enough | scientific article; zbMATH DE number 7743433 | |
Statements
Neural network approximation: three hidden layers are enough (English)
0 references
28 September 2023
0 references
It is well known that a sufficiently deep neural network has great power in approximating high-dimensional complex functions. Surprisingly, the paper constructs a neural network with only three hidden layers that nevertheless possesses super approximation power. The network uses a different activation function at each of the three hidden layers. Specifically, the network approximating a function \(f\) is given by
\[
\phi(\mathbf{x})=2\omega_f(2\sqrt{d})\sum_{j=1}^N2^{-j}\sigma_3\biggl(a_j\cdot \sigma_2\Bigl(1+\sum_{i=1}^d2^{(i-1)N}\sigma_1(2^{N-1}x_i)\Bigr)\biggr)+f(\mathbf{0})-\omega_f(2\sqrt{d}),\quad \mathbf{x}=(x_1,x_2,\ldots,x_d)\in\mathbf{R}^d,
\]
where \(N\) denotes the width of the network, \(\omega_f(\cdot)\) is the modulus of continuity of \(f\), \(a_i\in[0,\frac12)\) for \(1\le i\le N\), and
\[
\sigma_1(x):=\lfloor x\rfloor,\quad \sigma_2(x):=2^x, \quad \sigma_3(x):=\mathcal{T}\bigl(x-\lfloor x\rfloor-\tfrac12\bigr),\qquad x\in\mathbf{R},
\]
with
\[
\mathcal{T}(x):=\left\{
\begin{array}{ll}
1,&x\ge 0,\\
0,&x<0.
\end{array}
\right.
\]
The paper proves that for every continuous function \(f\) on \([0,1]^d\) there exist \(a_1,a_2,\ldots,a_N\in[0,\frac12)\) such that
\[
|f(\mathbf{x})-\phi(\mathbf{x})|\le 2\omega_f(2\sqrt{d})2^{-N}+\omega_f(2\sqrt{d}\,2^{-N}),\quad \mathbf{x}\in[0,1]^d.
\]
Consequently, when \(f\) is Hölder continuous of order \(\alpha\in(0,1]\) with Hölder constant \(\lambda\), there exist \(a_1,a_2,\ldots,a_N\in[0,\frac12)\) such that
\[
|f(\mathbf{x})-\phi(\mathbf{x})|\le 3\lambda(2\sqrt{d})^\alpha2^{-\alpha N},\quad \mathbf{x}\in[0,1]^d,
\]
so the approximation error of the proposed three-hidden-layer network decays exponentially in the width \(N\). The results reveal an interesting and important property of the expressive power of deep neural networks. Applications of the results to machine learning are also discussed in the paper.
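As a rough illustration, the following minimal Python sketch evaluates \(\phi\) exactly as displayed above. The coefficients `a`, the values `omega` \(=\omega_f(2\sqrt{d})\) and `f0` \(=f(\mathbf{0})\), and all identifiers are illustrative placeholders rather than the paper's construction: in the theorem the \(a_j\) are chosen, depending on \(f\), so that the stated error bound holds. Exact rational arithmetic is used because the intermediate powers of two produced by \(\sigma_2\) are far too large for floating point to retain the fractional parts that \(\sigma_3\) reads off.

```python
from fractions import Fraction
from math import floor

def sigma1(x):
    # first hidden layer: floor activation
    return floor(x)

def sigma2(k):
    # second hidden layer: 2^x; the argument is always an integer here,
    # so the result is an exact power of two
    return Fraction(2) ** k

def sigma3(x):
    # third hidden layer: step function of the shifted fractional part,
    # i.e. T(x - floor(x) - 1/2)
    return 1 if (x - floor(x)) >= Fraction(1, 2) else 0

def phi(x, a, omega, f0, N):
    """Evaluate phi at x in [0,1]^d with width N, coefficients a[0..N-1],
    omega = omega_f(2*sqrt(d)) and f0 = f(0)."""
    d = len(x)
    # integer fed into the second hidden layer
    s = 1 + sum(2 ** (i * N) * sigma1(2 ** (N - 1) * x[i]) for i in range(d))
    z = sigma2(s)
    total = sum(Fraction(1, 2 ** j) * sigma3(a[j - 1] * z) for j in range(1, N + 1))
    return 2 * omega * total + f0 - omega

# Illustrative call with placeholder values (not the a_j from the paper's proof)
N = 4
a = [Fraction(1, 3), Fraction(1, 5), Fraction(2, 7), Fraction(3, 11)]  # arbitrary a_j in [0, 1/2)
x = [Fraction(1, 4), Fraction(3, 5)]                                   # a point in [0,1]^2
print(phi(x, a, omega=1, f0=0, N=N))
```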
0 references
exponential convergence
0 references
curse of dimensionality
0 references
deep neural network
0 references
floor-exponential-step activation function
0 references
continuous function
0 references