Depth separations in neural networks: what is actually being separated? (Q2117335)
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Depth separations in neural networks: what is actually being separated? | scientific article | |
Statements
Depth separations in neural networks: what is actually being separated? (English)
21 March 2022
The authors consider approximation properties of depth 2 networks \[ N_2(\mathbf{x})=\sum_{i=1}^w u_i\sigma(\mathbf{w}_i^{\mathsf{T}}\mathbf{x}+b_i). \] The main results are given in three subsections of Section 2. Subsection 2.1 contains a formal result implying that radial functions can be approximated to any constant accuracy \( \epsilon \) by depth 2 networks whose width \( w \) is poly(\( d \)), where \( d \) is the input dimension (\( \mathbf{x},\mathbf{w}_i\in\mathbb{R}^d \)). This result is proved for networks employing any activation function \( \sigma \) satisfying a mild assumption, which implies that the activation can be used to approximate univariate functions well; the assumption is satisfied by all standard activations such as the ReLU and sigmoidal functions. In Subsection 2.2 the authors show how Lipschitz radial functions can be approximated by depth 2 ReLU networks of width poly(\( 1/\epsilon \)). The results of Subsection 2.3 complement these positive approximation results with negative results. Section 3 contains the proofs.
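To make the network class in the displayed formula concrete, here is a minimal sketch (not taken from the paper) of evaluating a depth 2 ReLU network \( N_2 \); the width, input dimension, and randomly drawn parameters are illustrative assumptions only.

```python
import numpy as np

# Minimal sketch of the depth 2 network
#   N_2(x) = sum_{i=1}^w u_i * sigma(w_i^T x + b_i)
# with sigma = ReLU. Width w, dimension d, and parameters are illustrative.

def relu(z):
    return np.maximum(z, 0.0)

def depth2_network(x, W, b, u):
    """Evaluate N_2(x) = sum_i u_i * relu(w_i^T x + b_i).

    x : (d,) input vector
    W : (w, d) matrix whose rows are the inner weights w_i
    b : (w,) biases
    u : (w,) outer-layer weights
    """
    return u @ relu(W @ x + b)

# Example: a width-8 network on a 3-dimensional input.
rng = np.random.default_rng(0)
d, w = 3, 8
W = rng.standard_normal((w, d))
b = rng.standard_normal(w)
u = rng.standard_normal(w)
x = rng.standard_normal(d)
print(depth2_network(x, W, b, u))
```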
deep learning
neural network
approximation theory
depth separation