Depth separations in neural networks: what is actually being separated? (Q2117335)


Language: English
Label: Depth separations in neural networks: what is actually being separated?
Description: scientific article

    Statements

    Depth separations in neural networks: what is actually being separated? (English)
    21 March 2022
    The authors consider approximation properties of depth-2 networks \[ N_2(\mathbf{x})=\sum_{i=1}^w u_i\,\sigma(\mathbf{w}_i^{\mathsf{T}}\mathbf{x}+b_i). \] The main results are given in the three subsections of Section 2. Subsection 2.1 contains a formal result implying that radial functions can be approximated to any constant accuracy \( \epsilon \) by depth-2 networks of width \( w=\mathrm{poly}(d) \) (where \( \mathbf{x},\mathbf{w}_i\in\mathbb{R}^d \)). This result is proved for networks employing any activation function \( \sigma \) satisfying a mild assumption, which guarantees that the activation can be used to approximate univariate functions well; the assumption is satisfied by all standard activations, such as the ReLU and sigmoidal functions. In Subsection 2.2 the authors show how Lipschitz radial functions can be approximated by depth-2 ReLU networks of width \( \mathrm{poly}(1/\epsilon) \). Subsection 2.3 complements the preceding positive approximation results with negative results. Section 3 contains the proofs.
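    The displayed formula can be made concrete with a short sketch. The following is a minimal NumPy implementation of the depth-2 architecture \( N_2 \) above, assuming a ReLU activation; the function names and random parameters are purely illustrative and are not a construction from the paper.

    import numpy as np

    def relu(z):
        # ReLU activation; any activation satisfying the paper's mild
        # assumption (e.g., a sigmoid) could be substituted here.
        return np.maximum(z, 0.0)

    def depth2_network(x, W, b, u, sigma=relu):
        # Computes N_2(x) = sum_{i=1}^w u_i * sigma(w_i^T x + b_i),
        # where the rows of W are the weight vectors w_i.
        return u @ sigma(W @ x + b)

    # Illustrative (hypothetical) usage with random parameters:
    # input dimension d, width w.
    d, w = 10, 100
    rng = np.random.default_rng(0)
    W = rng.standard_normal((w, d))
    b = rng.standard_normal(w)
    u = rng.standard_normal(w)
    x = rng.standard_normal(d)
    print(depth2_network(x, W, b, u))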
    Keywords: deep learning; neural network; approximation theory; depth separation

    Identifiers

    arXiv ID: 1904.06984