Analysis of a two-layer neural network via displacement convexity (Q1996787)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Analysis of a two-layer neural network via displacement convexity | scientific article | |
Statements
Analysis of a two-layer neural network via displacement convexity (English)
26 February 2021
This is a contribution to the problem of approximating functions from given data by means of neural networks. Let $\Omega\subset \mathbb R^d$ be a compact convex set with regular boundary, and assume that the data $\{(y_j,x_j)\}_{j \geq 1}$ are i.i.d. with $x_j\sim\mathrm{Unif}(\Omega)$ and $y_j = f(x_j) +\epsilon_j$, where the function $f:\Omega\to \mathbb R_{\geq 0}$ is assumed to be concave and smooth. The problem is to fit these data by the output of a two-layer neural network, $\hat{f}(x;\omega)=\frac{1}{N}\sum_{i=1}^{N}K^{\delta}(x-\omega_i)$. Here $K$ is a first-order kernel with compact support, $\delta>0$ is a scale parameter, and $\omega_1,\dots,\omega_N$ are the parameters of the model, namely the network weights. The authors introduce the risk function $R_N(\omega)=\mathbb{E}\big[(f(x)-\hat{f}(x;\omega))^2\big]$, which is to be minimized with respect to the parameters $\omega_i$, and they apply the stochastic gradient descent (SGD) method: at each step $k$, the current weights $\omega_i^k$ and the sample $(y_k,x_k)$ are used to compute the updated weights $\omega_i^{k+1}$ according to a specific update rule. Under suitable assumptions, it is proved that the dynamics of SGD is well approximated by a partial differential equation with initial and boundary conditions, for which existence and uniqueness of a weak solution are established. The main result of the paper is that the SGD method ensures convergence to a model with nearly optimal risk. The fourth section of the paper presents numerical results. Proofs and additional technical details are provided in the supplementary material, see \url{doi:10.1214/20-AOS1945SUPP}.
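The review specifies the model $\hat{f}(x;\omega)=\frac{1}{N}\sum_{i=1}^{N}K^{\delta}(x-\omega_i)$ and the squared-error objective but not the explicit update law. The following Python sketch is only an illustration under stated assumptions, not the authors' algorithm: it takes $K^{\delta}$ to be a compactly supported bump kernel rescaled by $\delta$ (a hypothetical choice), and it performs a plain one-sample gradient step on $(y_k-\hat{f}(x_k;\omega))^2$; the helper names `K_delta`, `grad_K_delta`, `sgd_step` and all numerical constants are invented for illustration.

```python
import numpy as np

def K_delta(z, delta=0.1):
    """Compactly supported bump kernel rescaled to bandwidth delta (illustrative choice, not the paper's kernel)."""
    r2 = np.sum((z / delta) ** 2, axis=-1)
    vals = np.where(r2 < 1.0, np.exp(-1.0 / np.maximum(1.0 - r2, 1e-12)), 0.0)
    return vals / delta ** z.shape[-1]

def grad_K_delta(z, delta=0.1, eps=1e-6):
    """Central-difference gradient of K_delta with respect to its argument z (shape (..., d))."""
    g = np.zeros_like(z)
    for j in range(z.shape[-1]):
        e = np.zeros_like(z)
        e[..., j] = eps
        g[..., j] = (K_delta(z + e, delta) - K_delta(z - e, delta)) / (2.0 * eps)
    return g

def sgd_step(w, x_k, y_k, step, delta=0.1):
    """One stochastic gradient step on the single-sample loss (y_k - f_hat(x_k; w))^2.

    w : array of shape (N, d) holding the current weights omega_i^k; returns omega_i^{k+1}.
    """
    N = w.shape[0]
    f_hat = K_delta(x_k - w, delta).mean()        # f_hat(x_k; w) = (1/N) sum_i K_delta(x_k - w_i)
    residual = y_k - f_hat
    # chain rule: d/dw_i (y - f_hat)^2 = (2/N) * (y - f_hat) * (grad K_delta)(x - w_i)
    grad_loss = (2.0 / N) * residual * grad_K_delta(x_k - w, delta)
    return w - step * grad_loss                   # descent step

# Tiny usage example in d = 2: one pass over a synthetic data stream.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, N = 2, 200
    f = lambda x: 1.0 - 0.5 * np.sum(x ** 2, axis=-1)   # a concave, nonnegative target on [-1, 1]^2
    w = rng.uniform(-1.0, 1.0, size=(N, d))             # initial weights spread over Omega
    for k in range(5000):
        x_k = rng.uniform(-1.0, 1.0, size=d)
        y_k = f(x_k) + 0.01 * rng.normal()
        w = sgd_step(w, x_k, y_k, step=0.5, delta=0.3)
```

Heuristically, each $\omega_i$ plays the role of one neuron, and it is in the regime of many neurons and small step sizes that the SGD dynamics is described by the partial differential equation mentioned in the review.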
neural networks
stochastic gradient descent
Wasserstein gradient flow
function regression
convergence rate
displacement convexity