A multivariate Riesz basis of ReLU neural networks (Q6144893)
scientific article; zbMATH DE number 7796953
Statements
A multivariate Riesz basis of ReLU neural networks (English)
30 January 2024
Over the last decade, artificial neural networks have been responsible for enormous progress on questions in approximation theory and on learning tasks in a vast number of scientific areas, for example computer vision, speech recognition, natural language processing, game theory, signal processing, neuroscience, and the social sciences. This paper contributes to the theoretical framework for understanding neural networks in the following sense.

An extremely important property of artificial neural networks is that they can approximate functions of many variables very well, which often allows one to avoid the curse of dimensionality. The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces and that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming; it generally refers to the difficulties that arise when the number of data points is small (in a suitably defined sense) relative to the intrinsic dimension of the data.

The aim of this paper is to understand the use of artificial neural networks for the approximation of multivariate functions by constructing a new system of ReLU neural networks which forms a Riesz basis of the space \(L_2([0,1]^d)\) for every \(d \ge 1\), i.e., a complete system \(\{f_n\}\) satisfying \(A \sum_n |c_n|^2 \le \bigl\| \sum_n c_n f_n \bigr\|_{L_2}^2 \le B \sum_n |c_n|^2\) for some constants \(0 < A \le B < \infty\) and all square-summable coefficients. To this end, the authors consider the trigonometric-like system of piecewise linear functions recently introduced by Daubechies, DeVore, Foucart, Hanin, and Petrova, and use Gershgorin's theorem to prove that these functions indeed form a Riesz basis of \(L_2([0,1])\). They then generalize this system to higher dimensions \(d > 1\), avoiding the use of tensor products. The paper is well written, with a good set of references.
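To make the one-dimensional ingredients of the construction concrete, here is a minimal numerical sketch in Python. It assumes a simple piecewise linear analogue of \(\sin(k\pi x)\) (linear interpolation of alternating \(\pm 1\) values on a uniform grid) as an illustrative stand-in for the trigonometric-like system, not the exact functions from the paper; the helper names `pl_sine_nodes` and `as_shallow_relu` are hypothetical. The sketch shows two things the review alludes to: any such piecewise linear function is realized exactly by a one-hidden-layer ReLU network, and the Riesz property of a finite section of the normalized system can be probed by applying Gershgorin's theorem to its Gram matrix.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def pl_sine_nodes(k):
    """Breakpoints and values of a piecewise linear analogue of
    sin(k*pi*x) on [0, 1]: zero at the grid points j/k, alternating
    +-1 at the midpoints (an illustrative stand-in only)."""
    nodes = np.arange(2 * k + 1) / (2 * k)
    vals = np.zeros(2 * k + 1)
    vals[1::2] = (-1.0) ** np.arange(k)
    return nodes, vals

def as_shallow_relu(nodes, vals):
    """Realize the piecewise linear interpolant of (nodes, vals)
    exactly as a one-hidden-layer ReLU network
    f(x) = vals[0] + sum_i c_i * relu(x - t_i)."""
    slopes = np.diff(vals) / np.diff(nodes)
    coeffs = np.concatenate([[slopes[0]], np.diff(slopes)])  # slope jumps
    breaks = nodes[:-1]
    return lambda x: vals[0] + sum(c * relu(x - t) for c, t in zip(coeffs, breaks))

K = 12                                    # size of the finite section examined
x = np.linspace(0.0, 1.0, 20001)
w = np.full_like(x, x[1] - x[0])          # trapezoid quadrature weights
w[0] = w[-1] = w[0] / 2

# Sanity check: the shallow ReLU network reproduces the interpolant exactly.
nodes, vals = pl_sine_nodes(3)
assert np.allclose(as_shallow_relu(nodes, vals)(x), np.interp(x, nodes, vals))

# L2-normalized system and its Gram matrix.
F = np.stack([np.interp(x, *pl_sine_nodes(k)) for k in range(1, K + 1)])
F /= np.sqrt((F**2 * w).sum(axis=1))[:, None]
G = (F * w) @ F.T

# Gershgorin: strict diagonal dominance of every row bounds the
# eigenvalues of G away from zero, so this finite section of the
# system behaves like a Riesz sequence.
margin = 2 * np.diag(G) - np.abs(G).sum(axis=1)
print("Gershgorin lower bounds per row:", np.round(margin, 3))
```

Strictly positive printed margins mean that the Gram matrix of this finite section has eigenvalues bounded away from zero, a small-scale analogue of the diagonal-dominance argument used in the paper for the full infinite system, where (as the keywords below suggest) Euler products and the Möbius function enter the explicit estimates of the off-diagonal inner products.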
Riesz basis
rectified linear unit
artificial neural networks
Euler product
Möbius function