Depth separations in neural networks: what is actually being separated? (Q2117335)


Language: English
Label: Depth separations in neural networks: what is actually being separated?
Description: scientific article

    Statements

    Depth separations in neural networks: what is actually being separated? (English)
    21 March 2022
    The authors consider approximation properties of depth-2 networks \[ N_2(\mathbf{x})=\sum_{i=1}^w u_i\,\sigma(\mathbf{w}_i^{\mathsf{T}}\mathbf{x}+b_i). \] The main results are given in the three subsections of Section 2. Subsection 2.1 contains a formal result implying that radial functions can be approximated to any constant accuracy \( \epsilon \) by depth-2 networks of width \( w=\mathrm{poly}(d) \) (where \( \mathbf{x},\mathbf{w}_i\in\mathbb{R}^d \)). This result is proved for networks employing any activation function \( \sigma \) satisfying a mild assumption, which guarantees that the activation can be used to approximate univariate functions well; the assumption is satisfied by all standard activations, such as the ReLU and sigmoidal functions. In Subsection 2.2 the authors show how Lipschitz radial functions can be approximated by depth-2 ReLU networks of width \( \mathrm{poly}(1/\epsilon) \). Subsection 2.3 complements the preceding positive approximation results with negative results. Section 3 contains the proofs.
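    The displayed formula can be made concrete with a short sketch. The following is a minimal NumPy implementation of the depth-2 architecture \( N_2 \) above, assuming a ReLU activation; the function names and random parameters are purely illustrative and are not a construction from the paper.

    import numpy as np

    def relu(z):
        # ReLU activation; any activation satisfying the paper's mild
        # assumption (e.g., a sigmoid) could be substituted here.
        return np.maximum(z, 0.0)

    def depth2_network(x, W, b, u, sigma=relu):
        # Computes N_2(x) = sum_{i=1}^w u_i * sigma(w_i^T x + b_i),
        # where the rows of W are the weight vectors w_i.
        return u @ sigma(W @ x + b)

    # Illustrative (hypothetical) usage with random parameters:
    # input dimension d, width w.
    d, w = 10, 100
    rng = np.random.default_rng(0)
    W = rng.standard_normal((w, d))
    b = rng.standard_normal(w)
    u = rng.standard_normal(w)
    x = rng.standard_normal(d)
    print(depth2_network(x, W, b, u))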
    Keywords: deep learning; neural network; approximation theory; depth separation

    Identifiers

    arXiv ID: 1904.06984