Training recurrent neural networks by sequential least squares and the alternating direction method of multipliers
Publication: Q6136124
Abstract: This paper proposes a novel algorithm for training recurrent neural network models of nonlinear dynamical systems from an input/output training dataset. Arbitrary convex and twice-differentiable loss functions and regularization terms are handled by sequential least squares, combined with either a line-search (LS) or a trust-region method of Levenberg-Marquardt (LM) type to ensure convergence. In addition, to handle non-smooth regularization terms such as ℓ₁, ℓ₀, and group-Lasso regularizers, as well as to impose possibly non-convex constraints such as integer and mixed-integer constraints, we combine sequential least squares with the alternating direction method of multipliers (ADMM). We call the resulting algorithm NAILS (nonconvex ADMM iterations and least squares) when line search (LS) is used, or NAILM when a trust-region method (LM) is employed instead. The training method, which is also applicable to feedforward neural networks as a special case, is tested on three nonlinear system identification problems.
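To make the core building block concrete, the following is a minimal sketch (not the paper's NAILS/NAILM implementation, which additionally handles recurrent network training through sequential linearization) of how a least-squares step can be combined with ADMM to handle a non-smooth ℓ₁ regularizer via soft-thresholding. The function name `admm_lasso`, the penalty parameter `rho`, and the iteration count are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, iters=100):
    """Illustrative ADMM for min_x 0.5*||A x - b||^2 + lam*||x||_1 (scaled dual form)."""
    n = A.shape[1]
    z = np.zeros(n)   # auxiliary copy of x that carries the non-smooth term
    u = np.zeros(n)   # scaled dual variable
    # Factor the regularized normal equations once; reused every iteration
    AtA = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        # x-update: an ordinary (regularized) least-squares solve
        x = np.linalg.solve(AtA, Atb + rho * (z - u))
        # z-update: proximal operator of the l1 term = soft-thresholding
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # dual update
        u = u + x - z
    return z

# Usage example: recover a sparse vector from a noisy random linear model
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20); x_true[:3] = [1.5, -2.0, 0.7]
b = A @ x_true + 0.01 * rng.standard_normal(50)
print(np.round(admm_lasso(A, b, lam=0.1), 2))
```

In the paper's setting, the least-squares solve above would be replaced by a sequential least-squares step over the linearized network model, with the ADMM splitting carrying the non-smooth regularizer or the (mixed-)integer constraints.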
Recommendations
- Recurrent Neural Networks Training Using Derivative Free Nonlinear Bayesian Filters
- Variable projections neural network training
- Advances in Neural Networks – ISNN 2005
- Nonlinear dynamical system modeling via recurrent neural networks and a weighted state space search algorithm
- Active neuron least squares: a training method for multivariate rectified neural networks
Cites work
- scientific article; zbMATH DE number 5060482
- A BFGS-SQP method for nonsmooth, nonconvex, constrained optimization and its evaluation using relative minimization profiles
- A method for the solution of certain non-linear problems in least squares
- A simple effective heuristic for embedded mixed-integer quadratic programming
- An Algorithm for Least-Squares Estimation of Nonlinear Parameters
- An augmented Lagrangian based algorithm for distributed nonconvex optimization
- CasADi: a software framework for nonlinear optimization and optimal control
- Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems
- Douglas–Rachford Splitting and ADMM for Nonconvex Optimization: Tight Convergence Results
- Global convergence of ADMM in nonconvex nonsmooth optimization
- Identification of Hammerstein systems without explicit parameterisation of non-linearity
- Learning nonlinear state-space models using autoencoders
- Model Selection and Estimation in Regression with Grouped Variables
- On the smoothness of nonlinear system identification
- PSwarm: a hybrid solver for linearly constrained global derivative-free optimization
- Recurrent Neural Network Training With Convex Loss and Regularization Functions by Extended Kalman Filtering
- Survey of sequential convex programming and generalized Gauss-Newton methods
- Variable Elimination in Model Predictive Control Based on K-SVD and QR Factorization
Cited in (7)
- Discriminative training of feed-forward and recurrent sum-product networks by extended Baum-Welch
- A constrained regularization approach for input-driven recurrent neural networks
- 17th Workshop on Logic, Language, Information and Computation (WoLLIC 2010)
- An augmented Lagrangian method for training recurrent neural networks
- A sequential quadratic Hamiltonian algorithm for training explicit RK neural networks
- Active neuron least squares: a training method for multivariate rectified neural networks
- A conjugate gradient learning algorithm for recurrent neural networks