Deep relaxation: partial differential equations for optimizing deep neural networks
DOI: 10.1007/s40687-018-0148-y
zbMath: 1427.82032
arXiv: 1704.04932
OpenAlex: W2963480765
MaRDI QID: Q2319762
Guillaume Carlier, Pratik Chaudhari, Adam M. Oberman, Stefano Soatto, Stanley J. Osher
Publication date: 20 August 2019
Published in: Research in the Mathematical Sciences
Full work available at URL: https://arxiv.org/abs/1704.04932
Keywords: optimal control, neural networks, partial differential equations, stochastic gradient descent, deep learning
Mathematics Subject Classification: Numerical optimization and variational techniques (65K10); KdV equations (Korteweg-de Vries equations) (35Q53); Neural networks for/in biological studies, artificial life and related topics (92B20); Optimal stochastic control (93E20); Neural nets applied to problems in time-dependent statistical mechanics (82C32); Viscosity solutions to PDEs (35D40)
Cites Work
- Minimizing finite sums with the stochastic average gradient
- Semiconcave functions, Hamilton-Jacobi equations, and optimal control
- Smoothing methods for nonsmooth, nonconvex minimization
- The Fokker-Planck equation. Methods of solution and applications
- Contractions in the 2-Wasserstein length space and thermalization of granular media
- Mean field games
- Optimal transport for applied mathematicians. Calculus of variations, PDEs, and modeling
- Replica symmetry breaking condition exposed by random matrix calculation of landscape complexity
- Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle
- Controlled Markov processes and viscosity solutions
- Local entropy as a measure for sampling solutions in constraint satisfaction problems
- Monotone Operators and the Proximal Point Algorithm
- The Variational Formulation of the Fokker-Planck Equation
- Optimization Methods for Large-Scale Machine Learning
- Stochastic Processes and Applications
- Learning representations by back-propagating errors
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
- Multiscale Methods
- Convergent Difference Schemes for Degenerate Elliptic and Parabolic Equations: Hamilton-Jacobi Equations and Free Boundary Problems
- Proximité et dualité dans un espace hilbertien
- A Stochastic Approximation Method
- Entropy-SGD: biasing gradient descent into wide valleys
- Inequalities: theory of majorization and its applications
- Unnamed Item (10 entries)