Steepest descent method with random step lengths (Q2397745)
Property / MaRDI profile type: MaRDI publication profile
Property / full work available at URL: https://doi.org/10.1007/s10208-015-9290-8
Property / OpenAlex ID: W2224301425
Property / cites work: On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method
Property / cites work: Two-Point Step Size Gradient Methods
Property / cites work: Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming
Property / cites work: R-linear convergence of the Barzilai and Borwein gradient method
Property / cites work: Function minimization by conjugate gradients
Property / cites work: On the asymptotic directions of the s-dimensional optimum gradient method
Property / cites work: Gradient Method with Retards and Generalizations
Property / cites work: Relaxed steepest descent and Cauchy-Barzilai-Borwein method
Property / cites work: Gradient method with dynamical retards for large-scale optimization problems
Property / cites work: Q3161692
Property / cites work: A Dynamical-System Analysis of the Optimum s-Gradient Algorithm
Property / cites work: Gradient algorithms for quadratic optimization with fast convergence rates
Property / cites work: The Barzilai and Borwein Gradient Method for the Large Scale Unconstrained Minimization Problem
 


Language: English
Label: Steepest descent method with random step lengths
Description: scientific article

Statements

Title: Steepest descent method with random step lengths (English)
Publication date: 23 May 2017
Approximation methods for finding the minimum of a twice continuously differentiable function are a modern approach to this classical problem. The present paper studies the steepest descent method applied to the minimization of a twice continuously differentiable function. It is proved that, under certain conditions, a random choice of the step length parameter generates a process that is almost surely \(R\)-convergent for quadratic functions. The convergence properties of this random procedure are characterized in terms of the mean value function associated with the distribution of the step length parameter. The distribution of the random step length that guarantees the maximum asymptotic convergence rate, independently of the detailed properties of the Hessian matrix of the minimized function, is found, and its uniqueness is proved. It is shown that the asymptotic convergence rate of this optimally designed random procedure equals the convergence rate of the Chebyshev polynomial method. The proposed random procedure is also applied to the minimization of general non-quadratic functions, and an algorithm for estimating the relevant bounds on the spectrum of the Hessian matrix is developed.

In the introduction, the problem of finding the minimum of a twice continuously differentiable function \(V(x) : {\mathbb R}^M \to {\mathbb R}\) that is bounded from below is formulated. It is shown that for a quadratic function, \(V(x) = \frac{1}{2} x^\top A x - b^\top x\), this problem is equivalent to solving the matrix equation \(A x = b\), where \(A\) is the Hessian matrix of \(V\). The distribution of the inverse step length is discussed.

Section 2 formulates the basic algorithm and the statements concerning the convergence properties of the generated sequence of iterates for a general distribution of the inverse step length parameter. Algorithm 1 is presented as a tool for solving the main matrix equation, and its properties are characterized in Theorem 1 (a minimal code sketch of an iteration of this type is given after the review).

Section 3 formulates the conditions necessary for the best convergence properties of the random process when the spectrum of the matrix \(A\) is unknown. The distribution of the inverse step length parameter satisfying these conditions is calculated, and the asymptotic convergence rate of the process with this optimal step length distribution is computed.

Section 4 presents numerical results for a finite difference scheme for the homogeneous Dirichlet problem. Here, Algorithm 1 is modified to Algorithm 2 for solving the matrix equation \(A x = b\), and the results obtained with the two algorithms are compared and illustrated graphically.

In Section 5, the complete random steepest descent algorithm for the minimization of a continuously differentiable function of several variables is formulated as Algorithm 3 and applied to concrete test functions. The results of the procedure generated by Algorithm 3, with concrete distribution functions associated with given densities, are compared with competing methods such as the Polak-Ribière conjugate gradient method, the Barzilai-Borwein algorithm and the Pronzato-Zhigljavsky algorithm. The random character of the generated optimization process may be advantageous when solving problems that require randomized processing; a typical representative of such problems is the search for the global minimum of a function with several local minima. Section 6 documents the capabilities of the suggested method in this setting.
In Section 7, possible improvements of the random procedure are discussed. The paper ends with six appendices, in which the presented statements are proved.
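To make the basic construction concrete, the following is a minimal Python sketch of a random-step-length gradient iteration for the quadratic case \(A x = b\). It is not the paper's Algorithm 1: sampling the inverse step length from the arcsine law on the spectral interval \([m, M]\) is an illustrative assumption (a natural candidate, since the asymptotic rate it yields matches the Chebyshev bound), and all names below are ours, not the paper's.

    import numpy as np

    def random_steepest_descent(A, b, x0, m, M, tol=1e-10, max_iter=10_000, rng=None):
        """Iterate x_{k+1} = x_k - t_k (A x_k - b) with random step lengths t_k.

        The inverse step lengths 1/t_k are drawn from the arcsine law on the
        spectral interval [m, M] -- an illustrative choice, not necessarily the
        optimal distribution derived in the paper.
        """
        rng = np.random.default_rng() if rng is None else rng
        x = np.array(x0, dtype=float)
        for k in range(max_iter):
            g = A @ x - b                 # gradient of V(x) = x^T A x / 2 - b^T x
            if np.linalg.norm(g) <= tol:  # stop once the residual is small
                return x, k
            u = rng.uniform()
            inv_t = m + (M - m) * np.sin(np.pi * u / 2.0) ** 2  # 1/t_k ~ arcsine on [m, M]
            x = x - g / inv_t             # steepest descent step of random length
        return x, max_iter

    # Usage on a random SPD system with spectrum contained in [1, 100].
    rng = np.random.default_rng(0)
    n = 50
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A = Q @ np.diag(np.linspace(1.0, 100.0, n)) @ Q.T
    b = rng.standard_normal(n)
    x, iters = random_steepest_descent(A, b, np.zeros(n), m=1.0, M=100.0, rng=rng)
    print(iters, np.linalg.norm(A @ x - b))

For a non-quadratic objective, the spectral bounds \(m\) and \(M\) of the Hessian are not known a priori; the paper's Algorithm 3 estimates them, and a simple stand-in for such an estimate would be a few power-method iterations on the Hessian.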
Keywords: optimization; gradient methods; stochastic processes; algorithms; convergence rate; computational experiments
