Accelerating gradient descent and Adam via fractional gradients (Q6057934)
From MaRDI portal
scientific article; zbMATH DE number 7755615
Language | Label | Description | Also known as
---|---|---|---
English | Accelerating gradient descent and Adam via fractional gradients | scientific article; zbMATH DE number 7755615 |
Statements
Accelerating gradient descent and Adam via fractional gradients (English)
0 references
26 October 2023
0 references
The paper proposes a general class of fractional-order optimization algorithms based on Caputo fractional derivatives. Theorem 2.5 serves as the theoretical motivation for using Caputo fractional derivatives in optimization. Building on this theorem, the authors define the Caputo fractional-based gradient, which generalizes the standard integer-order gradient, and develop an efficient implementation of it. By replacing integer-order gradients with Caputo fractional-based ones, they extend gradient descent (GD) and Adam to the Caputo fractional gradient descent (CfGD) and the Caputo fractional Adam (CfAdam), respectively. The superiority of CfGD and CfAdam is demonstrated on several large-scale optimization problems arising from scientific machine learning applications, such as an ill-conditioned least-squares problem on real-world data and the training of neural networks with non-convex objective functions. Numerical examples show that both CfGD and CfAdam yield acceleration over GD and Adam, respectively.
0 references
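For orientation, the Caputo fractional derivative of order \(\alpha \in (0,1)\) with lower terminal \(a\) is the standard definition recalled below; the update rule that follows is only a schematic sketch of how a fractional gradient might replace the integer-order gradient in GD, not necessarily the paper's exact construction (the notation \(\nabla^{\alpha} f\), the step size \(\eta\), and the terminal \(a\) are assumptions made here for illustration).
$$
{}^{C}D_{a}^{\alpha} f(x) \;=\; \frac{1}{\Gamma(1-\alpha)} \int_{a}^{x} \frac{f'(t)}{(x-t)^{\alpha}}\, dt, \qquad 0 < \alpha < 1,
$$
$$
x_{k+1} \;=\; x_k - \eta\, \nabla^{\alpha} f(x_k),
$$
where \(\nabla^{\alpha} f\) collects componentwise Caputo derivatives of order \(\alpha\) and reduces to the ordinary gradient as \(\alpha \to 1\). CfAdam is obtained analogously by feeding such a fractional gradient into Adam's moment estimates in place of the integer-order gradient.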
Caputo fractional derivative
0 references
non-local calculus
0 references
optimization
0 references
Adam
0 references
neural networks
0 references