Depth separations in neural networks: what is actually being separated? (Q2117335)

From MaRDI portal
Reviewed by: Alexey L. Lukashov
MaRDI profile type: MaRDI publication profile

Cites work:
    Proof of the Achievability Conjectures for the General Stochastic Block Model
    Q4222737
    Universal approximation bounds for superpositions of a sigmoidal function
    Theory of Classification: a Survey of Some Recent Advances
    Q4828422
    Approximation and learning of convex superpositions
    Agnostically Learning Halfspaces
    Q2969663
    Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with \( \ell^1 \) and \( \ell^0 \) Controls
    A note on approximation of a ball by polytopes
    Theory of probability and random processes
    Inverses of Vandermonde Matrices
    Provable approximation properties for deep neural networks
    Understanding Machine Learning
    Q4558174
    Error bounds for approximations with deep ReLU networks


Language: English
Label: Depth separations in neural networks: what is actually being separated?
Description: scientific article

    Statements

    Depth separations in neural networks: what is actually being separated? (English)
    Publication date: 21 March 2022
    The authors consider approximation properties of depth-2 networks \[ N_2(\mathbf{x})=\sum_{i=1}^w u_i\sigma(\mathbf{w}_i^{\mathsf{T}}\mathbf{x}+b_i). \] The main results are given in three subsections of Section 2. Subsection 2.1 contains a formal result implying that radial functions can be approximated to any constant accuracy \( \epsilon \) by networks of depth 2 and width (the parameter \( w \)) poly(\( d \)), where \( \mathbf{x},\mathbf{w}_i\in\mathbb{R}^d \). This result is proved for networks employing any activation function \( \sigma \) satisfying a mild assumption, which implies that the activation can be used to approximate univariate functions well; this assumption is satisfied by all standard activations, such as the ReLU and sigmoidal functions. In Subsection 2.2 the authors show how Lipschitz radial functions can be approximated by depth-2 ReLU networks of width poly(\( 1/\epsilon \)). Subsection 2.3 complements these positive approximation results with negative ones. Section 3 contains the proofs.
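    As an illustration of the depth-2 architecture in the displayed formula, the following minimal NumPy sketch evaluates \( N_2(\mathbf{x}) \) with a ReLU activation. The function and parameter names (depth2_network, W, b, u) are illustrative choices and are not taken from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def depth2_network(x, W, b, u, sigma=relu):
    """Evaluate N_2(x) = sum_{i=1}^w u_i * sigma(w_i^T x + b_i).

    W: (w, d) matrix whose rows are the hidden weight vectors w_i
    b: (w,) vector of biases b_i
    u: (w,) vector of output-layer weights u_i
    """
    return u @ sigma(W @ x + b)

# Example: a random depth-2 ReLU network on R^d with width w
d, w = 5, 100
rng = np.random.default_rng(0)
W = rng.standard_normal((w, d))
b = rng.standard_normal(w)
u = rng.standard_normal(w) / w
x = rng.standard_normal(d)
print(depth2_network(x, W, b, u))
```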
    Keywords: deep learning; neural network; approximation theory; depth separation

    Identifiers

    OpenAlex ID: W2966686587
    arXiv ID: 1904.06984