torchopt
Advanced Optimizers for Torch
Daniel Falbel, Rolf Simoes, Felipe Souza, Gilberto Camara
Last update: 6 June 2023
Copyright license: Apache License
Software version identifier: 0.1.1, 0.1.2, 0.1.3, 0.1.4
Optimizers for the 'torch' deep learning library. These functions implement recent results from the literature and are not among the optimizers offered in 'torch' itself. Prospective users should test these optimizers with their own data, since performance depends on the specific problem being solved. The package includes the following optimizers: (a) 'adabelief' by Zhuang et al. (2020) <arXiv:2010.07468>; (b) 'adabound' by Luo et al. (2019) <arXiv:1902.09843>; (c) 'adahessian' by Yao et al. (2021) <arXiv:2006.00719>; (d) 'adamw' by Loshchilov & Hutter (2019) <arXiv:1711.05101>; (e) 'madgrad' by Defazio and Jelassi (2021) <arXiv:2101.11075>; (f) 'nadam' by Dozat (2016) <https://openreview.net/pdf/OM0jvwB8jIp57ZJjtNEZ.pdf>; (g) 'qhadam' by Ma and Yarats (2019) <arXiv:1810.06801>; (h) 'radam' by Liu et al. (2019) <arXiv:1908.03265>; (i) 'swats' by Keskar and Socher (2017) <arXiv:1712.07628>; (j) 'yogi' by Zaheer et al. (2018) <https://papers.nips.cc/paper/8186-adaptive-methods-for-nonconvex-optimization>.
- AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients
- Adaptive Gradient Methods with Dynamic Bound of Learning Rate
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
- Decoupled Weight Decay Regularization
- Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization
- Incorporating Nesterov Momentum into Adam
- Quasi-hyperbolic momentum and Adam for deep learning
- On the Variance of the Adaptive Learning Rate and Beyond
- Improving Generalization Performance by Switching from Adam to SGD
- Adaptive Methods for Nonconvex Optimization
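A minimal usage sketch (not taken from the package manual) is shown below. It assumes that 'torchopt' exports optimizer constructors following the 'torch' naming convention, for example optim_adabelief(), and that they take the model parameters followed by the usual hyperparameters such as lr.

# Sketch of a typical training loop with a torchopt optimizer.
# optim_adabelief() is an assumed constructor name following the
# 'torch' convention; any other optimizer from the package could
# be swapped in to compare behaviour on a given problem.
library(torch)
library(torchopt)

# toy regression data: y = 2x + 1 plus noise
x <- torch_randn(100, 1)
y <- 2 * x + 1 + 0.1 * torch_randn(100, 1)

model <- nn_linear(1, 1)
opt <- optim_adabelief(model$parameters, lr = 0.01)

for (epoch in seq_len(200)) {
  opt$zero_grad()                      # reset accumulated gradients
  loss <- nnf_mse_loss(model(x), y)    # forward pass and loss
  loss$backward()                      # backpropagate
  opt$step()                           # update parameters
}

loss$item()                            # final training loss

Because performance is problem dependent, it is usually worth comparing two or three of these optimizers on a held-out set before settling on one.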