Steepest descent method with random step lengths (Q2397745): Difference between revisions
From MaRDI portal
Revision as of 19:35, 2 February 2024
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Steepest descent method with random step lengths | scientific article | |
Statements
Steepest descent method with random step lengths (English)
0 references
23 May 2017
In the present paper, the steepest descent method applied to the minimization of a twice continuously differentiable function is studied. It is proved that, under certain conditions, a random choice of the step length parameter generates a process that is almost surely \(R\)-convergent for quadratic functions. The convergence properties of this random procedure are characterized in terms of the mean value function associated with the distribution of the step length parameter. The distribution of the random step length that guarantees the maximum asymptotic convergence rate, independently of the detailed properties of the Hessian matrix of the minimized function, is found, and its uniqueness is proved. It is shown that the asymptotic convergence rate of this optimally tuned random procedure equals the convergence rate of the Chebyshev polynomial method. The proposed random procedure is also applied to the minimization of general non-quadratic functions, and an algorithm for estimating the relevant bounds on the spectrum of the Hessian matrix is developed.

In the introduction, the problem of finding the minimum of a twice continuously differentiable function \(V(x) : {\mathbb R}^M \to {\mathbb R}\) that is bounded from below is formulated. It is shown that, for a quadratic function \(V\), this problem is equivalent to solving the matrix equation \(A x = b\), where \(A\) is the Hessian matrix of \(V\). The distribution of the inverse step length is discussed. In Section 2, the basic algorithm and statements on the convergence properties of the generated sequence of iterates are formulated for a general distribution of the inverse step length parameter. Algorithm 1 is presented as a tool for solving the main matrix equation, and its properties are characterized in Theorem 1.
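The quadratic case described above can be sketched in a few lines. The following is a minimal illustration, not the paper's Algorithm 1: the gradient step for \(V(x) = \tfrac{1}{2}x^{\top}Ax - b^{\top}x\) is taken with a random step length whose inverse is drawn from the arcsine distribution on the spectral interval \([m, L]\) — that particular distribution is an assumption made here to mirror the Chebyshev-rate result mentioned in the review, and the bounds `m`, `L` are assumed known.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_step_descent(A, b, x0, m, L, iters=500):
    """Steepest descent for V(x) = x.T A x / 2 - b.T x, i.e. for A x = b,
    with a random step length: the inverse step length is sampled from the
    arcsine distribution on [m, L] (an assumed choice, m and L being lower
    and upper bounds on the spectrum of the SPD matrix A)."""
    x = x0.copy()
    for _ in range(iters):
        r = A @ x - b                      # gradient of V at x
        u = rng.uniform(0.0, 1.0)
        # arcsine-distributed inverse step length on [m, L]
        inv_t = (m + L) / 2.0 + (L - m) / 2.0 * np.cos(np.pi * u)
        x = x - r / inv_t                  # random-length gradient step
    return x

# Toy SPD system with known spectrum bounds m = 1, L = 9.
A = np.diag([1.0, 4.0, 9.0])
b = np.array([1.0, 2.0, 3.0])
x = random_step_descent(A, b, np.zeros(3), m=1.0, L=9.0)
```

Individual random steps may temporarily increase the error, but the product of the contraction factors shrinks almost surely, which is the sense of the \(R\)-convergence discussed in the review.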
In Section 3, conditions necessary for the best convergence properties of the random process, without knowledge of the spectrum of the matrix \(A\), are formulated; the distribution of the inverse step length parameter satisfying these conditions is calculated, and the asymptotic convergence rate of the process with this optimal distribution is computed. In Section 4, results for a finite difference scheme for the homogeneous Dirichlet problem are presented. Here, Algorithm 1 is modified to Algorithm 2 for solving the matrix equation \(A x = b\), and the results computed with Algorithms 1 and 2 are compared and illustrated graphically.

In Section 5, the complete random steepest descent algorithm for the minimization of a continuously differentiable function of several variables is formulated as Algorithm 3 and applied to concrete functions. The results of the procedure generated by Algorithm 3, with concrete distribution functions associated with given densities, are compared with various competing methods, such as the Polak-Ribière conjugate gradient method, the Barzilai-Borwein algorithm, and the Pronzato-Zhigljavsky algorithm. The random character of the generated optimization process may be advantageous for problems that require randomized processing; a typical representative of such a problem is the search for the global minimum of a function with several local minima. Section 6 documents the capabilities of the suggested method, and Section 7 discusses possible improvements of the random procedure. The paper ends with six appendices, in which the presented statements are proved.
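For comparison, one of the competing methods named above, the Barzilai-Borwein gradient method, is easy to sketch on the same quadratic model. This is the standard BB1 step length, not the authors' implementation; the initial step length is an arbitrary assumed choice.

```python
import numpy as np

def barzilai_borwein(A, b, x0, iters=200, tol=1e-12):
    """Barzilai-Borwein (BB1) gradient method for V(x) = x.T A x / 2 - b.T x.
    The step length t_k = (s.T s) / (s.T y), with s = x_k - x_{k-1} and
    y = g_k - g_{k-1}, adapts to the local curvature without line search."""
    x = x0.copy()
    g = A @ x - b                       # gradient at the starting point
    t = 1.0 / np.linalg.norm(g)         # crude initial step (assumed choice)
    for _ in range(iters):
        x_new = x - t * g
        g_new = A @ x_new - b
        s, y = x_new - x, g_new - g
        x, g = x_new, g_new
        denom = s @ y                   # equals s.T A s > 0 for SPD A
        if denom <= 0.0 or np.linalg.norm(g) < tol:
            break
        t = (s @ s) / denom             # BB1 step length
    return x

# Same kind of toy SPD system as before.
A = np.diag([2.0, 5.0, 10.0])
b = np.array([2.0, 5.0, 10.0])
x = barzilai_borwein(A, b, np.zeros(3))
```

Unlike the random-step procedure, BB is deterministic and nonmonotone; the review's point is that the randomized process can match the Chebyshev rate while also providing the randomization useful for global searches.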
optimization
gradient methods
stochastic processes
algorithms
convergence rate
computational experiments