A multivariate Riesz basis of ReLU neural networks (Q6144893)

scientific article; zbMATH DE number 7796953

    Statements

    A multivariate Riesz basis of ReLU neural networks (English)
    30 January 2024
    Over the last decade, artificial neural networks have driven enormous progress on questions in approximation theory and on learning tasks across a vast range of scientific areas, for example computer vision, speech recognition, natural language processing, game theory, signal processing, neuroscience, and the social sciences. This paper contributes to the theoretical framework for understanding neural networks in the following sense.

    A key property of artificial neural networks is that they approximate functions of many variables remarkably well, which often allows one to avoid the curse of dimensionality. The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces and that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman in connection with problems in dynamic programming. The curse generally refers to issues that arise when the number of data points is small (in a suitably defined sense) relative to the intrinsic dimension of the data.

    The aim of this paper is to advance the use of artificial neural networks for the approximation of multivariate functions by constructing a new system of ReLU neural networks that forms a Riesz basis of the space \(L_2([0, 1]^d)\) for every \(d\geq 1\). To this end, the authors consider the trigonometric-like system of piecewise linear functions introduced recently by Daubechies, DeVore, Foucart, Hanin, and Petrova, and use Gershgorin's theorem to prove that these functions indeed form a Riesz basis of \(L_2([0, 1])\). The authors then generalize this system to dimensions \(d>1\), doing so without resorting to tensor products. The paper is well written, with a good set of references.
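    For orientation, the reviewer recalls the two standard tools just mentioned (standard statements, not quoted from the paper). A complete system \((f_n)_n\) in \(L_2([0, 1]^d)\) is a Riesz basis if there exist constants \(0<A\le B<\infty\) such that
    \[
      A \sum_n |c_n|^2 \;\le\; \Bigl\| \sum_n c_n f_n \Bigr\|_{L_2}^2 \;\le\; B \sum_n |c_n|^2
      \qquad \text{for all } (c_n)_n \in \ell_2,
    \]
    and Gershgorin's theorem states that every eigenvalue of a matrix \(M=(m_{ij})\) lies in one of the discs \(\{z\in\mathbb{C} : |z-m_{ii}|\le \sum_{j\ne i}|m_{ij}|\}\). Consequently, if the Gram matrix of the system is diagonally dominant with a uniform margin, its spectrum is bounded away from \(0\) and \(\infty\), which is exactly the two-sided inequality above.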
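    To illustrate how such piecewise linear, trigonometric-like functions are realized by ReLU networks, the following sketch (the reviewer's own illustration in Python; the function names are hypothetical, and the paper's actual basis functions are normalized and indexed differently) builds the hat function as a one-hidden-layer ReLU network and obtains oscillating sawtooth functions by composition, a well-known construction:

        import numpy as np

        def relu(x):
            return np.maximum(x, 0.0)

        def hat(x):
            # Hat function on [0, 1]: 0 at the endpoints, 1 at x = 1/2,
            # written exactly as a one-hidden-layer ReLU network with 3 neurons.
            return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

        def sawtooth(x, k):
            # k-fold composition of the hat function: a depth-k ReLU network
            # whose graph on [0, 1] has 2**(k - 1) "teeth", the piecewise
            # linear analogue of a trigonometric function of frequency 2**(k - 1).
            y = np.asarray(x, dtype=float)
            for _ in range(k):
                y = hat(y)
            return y

        # Example: the 2-tooth sawtooth peaks at x = 1/4 and x = 3/4.
        print(sawtooth(np.linspace(0.0, 1.0, 9), 2))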
    Keywords: Riesz basis; rectified linear unit; artificial neural networks; Euler product; Möbius function