Flat Minima
From MaRDI portal
Publication: 3123284
DOI: 10.1162/NECO.1997.9.1.1 · zbMath: 0872.68150 · OpenAlex: W2912811302 · Wikidata: Q34422981 · Scholia: Q34422981 · MaRDI QID: Q3123284
Sepp Hochreiter, Jürgen Schmidhuber
Publication date: 6 March 1997
Published in: Neural Computation
Full work available at URL: https://doi.org/10.1162/neco.1997.9.1.1
Related Items (28)
- Deep networks on toroids: removing symmetries reveals the structure of flat regions in the landscape geometry
- Global optimization issues in deep network regression: an overview
- ‘Place-cell’ emergence and learning of invariant data with restricted Boltzmann machines: breaking and dynamical restoration of continuous symmetries in the weight space
- On Different Facets of Regularization Theory
- Machine learning the kinematics of spherical particles in fluid flows
- Archetypal landscapes for deep neural networks
- The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima
- Lipschitzness is all you need to tame off-policy generative adversarial imitation learning
- Unnamed Item
- Geometric characterization of the Eyring-Kramers formula
- Lotka-Volterra model with mutations and generative adversarial networks
- Diametrical risk minimization: theory and computations
- Optimization for deep learning: an overview
- Unnamed Item
- Unnamed Item
- Interpretable machine learning: fundamental principles and 10 grand challenges
- A spin glass model for the loss surfaces of generative adversarial networks
- Noise-induced degeneration in online learning
- Minimum description length revisited
- Unnamed Item
- Entropy-SGD: biasing gradient descent into wide valleys
- Universal statistics of Fisher information in deep neural networks: mean field approach
- Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures
- Structure-preserving deep learning
- Hausdorff dimension, heavy tails, and generalization in neural networks
- Entropic gradient descent algorithms and wide flat minima
- Adaptive regularization parameter selection method for enhancing generalization capability of neural networks
- Prediction errors for penalized regressions based on generalized approximate message passing
Cites Work
- A Mathematical Theory of Communication
- Modeling by shortest data description
- Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation
- Statistical predictor identification
- Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter
- An Information Measure for Classification
This page was built for publication: Flat Minima