The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions

From MaRDI portal
Publication:5291707

DOI10.1142/S0218488598000094zbMath1087.68616WikidataQ56621169 ScholiaQ56621169MaRDI QIDQ5291707

Sepp Hochreiter

Publication date: 19 May 2006

Published in: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems (Search for Journal in Brave)




Related Items (max. 100)

Application of neural network to model rainfall pattern of EthiopiaActivation function design for deep networks: linearity and effective initialisationState estimation with limited sensors -- a deep learning based approachA Taxonomy for Spatiotemporal Connectionist Networks Revisited: The Unsupervised CaseLearning deep implicit Fourier neural operators (IFNOs) with applications to heterogeneous material modelingA homotopy gated recurrent unit for predicting high dimensional hyperchaosRecurrent neural network-based internal model control design for stable nonlinear systemsNonlocal kernel network (NKN): a stable and resolution-independent deep neural networkA two-stage deep learning architecture for model reduction of parametric time-dependent problemsLSTM-based approach for predicting periodic motions of an impacting system via transient dynamicsSensitivity -- local index to control chaoticity or gradient globallyLearning model predictive control with long short‐term memory networksSuccessfully and efficiently training deep multi-layer perceptrons with logistic activation function simply requires initializing the weights with an appropriate negative meanApplication of Attention Technique for Digital Pre-distortionPhysics-informed online learning of gray-box models by moving horizon estimationQSurfNet: a hybrid quantum convolutional neural network for surface defect recognitionA mathematical perspective of machine learningNSNO: Neumann series neural operator for solving Helmholtz equations in inhomogeneous mediumRobust High-Dimensional Regression with Coefficient Thresholding and Its Application to Imaging Data AnalysisIncorporating financial news for forecasting Bitcoin prices based on long short-term memory networksSparsity through evolutionary pruning prevents neuronal networks from overfittingRHONN identifier-control scheme for nonlinear discrete-time systems with unknown time-delaysA turbulent eddy-viscosity surrogate modeling framework for Reynolds-averaged Navier-Stokes simulationsBayesian framework for simulation of dynamical systems from multidimensional data using recurrent neural networkThe Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural NetworksA Local Deep Learning Method for Solving High Order Partial Differential EquationsGeneralised latent assimilation in heterogeneous reduced spaces with machine learning surrogate modelsA neural network ensemble approach for GDP forecasting


Uses Software



This page was built for publication: The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions