Steepest descent method with random step lengths (Q2397745)

From MaRDI portal

Language: English
Label: Steepest descent method with random step lengths
Description: scientific article

    Statements

    Steepest descent method with random step lengths (English)
    publication date: 23 May 2017
    Approximation methods for finding the minimum of a twice continuously differentiable function form a modern approach to this problem. In the present paper, the steepest descent method applied to the minimization of a twice continuously differentiable function is studied. It is proved that, under certain conditions, a random choice of the step length parameter generates a process that is almost surely \(R\)-convergent for quadratic functions. The convergence properties of this random procedure are characterized in terms of a mean value function related to the distribution of the step length parameter. The distribution of the random step length that guarantees the maximum asymptotic convergence rate, independently of the detailed properties of the Hessian matrix of the minimized function, is found, and its uniqueness is proved. It is shown that the asymptotic convergence rate of this optimally designed random procedure equals the convergence rate of the Chebyshev polynomial method. The proposed random procedure is also applied to the minimization of general non-quadratic functions, and an algorithm for estimating the relevant bounds on the spectrum of the Hessian matrix is developed.

    In the introduction, the problem of finding the minimum of a twice continuously differentiable function \(V(x) : {\mathbb R}^M \to {\mathbb R}\) that is bounded from below is formulated. It is shown that this problem is equivalent to the solution of the matrix equation \(A x = b\), where \(A\) is the Hessian matrix of the function \(V\). The distribution of the inverse step length is discussed.

    In Section 2, the basic algorithm is formulated, together with statements on the convergence properties of the generated sequence of iterates for a general distribution of the inverse step length parameter. Algorithm 1 is presented as a tool for solving the main matrix equation, and its properties are characterized in Theorem 1.

    In Section 3, conditions necessary for the best convergence properties of the random process without knowledge of the spectrum of the matrix \(A\) are formulated. The distribution of the inverse step length parameter satisfying these conditions is calculated, and the asymptotic convergence rate of the process with this optimal step length distribution is computed.

    In Section 4, results for the finite difference scheme for the homogeneous Dirichlet problem are presented. Here, Algorithm 1 is modified to Algorithm 2 for solving the matrix equation \(A x = b\). The results of the computations obtained with Algorithms 1 and 2 are compared and graphically illustrated.

    In Section 5, the complete random steepest descent algorithm for the minimization of continuously differentiable functions of several variables is formulated as Algorithm 3 and applied to concrete functions. The results of the procedure generated by Algorithm 3, with concrete distribution functions associated with given densities, are compared with several competing methods, such as the Polak-Ribière conjugate gradient method, the Barzilai-Borwein algorithm, and the Pronzato-Zhigljavsky algorithm. The random character of the generated optimization process may be advantageous in problems that call for randomized processing; a typical representative of such problems is the search for the global minimum of a function with several local minima. Section 6 documents the capabilities of the suggested method.
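    For the quadratic case, the procedure reduces to a gradient iteration for \(A x = b\) in which the inverse step length \(\theta\) is redrawn at every step. The following minimal Python sketch illustrates this idea under assumed spectral bounds \([m, M]\) for \(A\); the uniform distribution of \(\theta\) used here is an illustrative choice, not the paper's Algorithm 1 or its optimal distribution.

```python
import numpy as np

def random_steepest_descent(A, b, x0, sample_theta, n_iter=200, rng=None):
    """Steepest descent for V(x) = 0.5 x'Ax - b'x (i.e. for Ax = b),
    with a random inverse step length theta drawn afresh at every
    iteration.  A sketch of the idea only, not the paper's Algorithm 1
    verbatim."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(n_iter):
        g = A @ x - b                  # gradient of V at the current iterate
        x = x - g / sample_theta(rng)  # step length is 1/theta
    return x

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])             # SPD Hessian; spectrum inside [2, 5]
b = np.array([1.0, 2.0])
m, M = 2.0, 5.0                        # assumed bounds on the spectrum of A

# Illustrative choice: theta uniform on [m, M].
x = random_steepest_descent(A, b, np.zeros(2),
                            lambda rng: rng.uniform(m, M),
                            rng=np.random.default_rng(0))
print(x, np.linalg.solve(A, b))        # random iterate vs. exact solution
```

    Note that individual steps may increase the error, since the contraction factor \(|1 - \lambda/\theta|\) can exceed one for some eigenvalues \(\lambda\) of \(A\); the almost sure \(R\)-convergence asserted in the paper concerns the geometric decay of the error along the whole random trajectory, not at every single step.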
    In Section 7, possible improvements of the random procedure are discussed. The paper ends with six appendices, where the presented statements are proved.
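    The review states that the optimal distribution of the inverse step length attains the asymptotic convergence rate of the Chebyshev polynomial method, but it does not reproduce the density of that distribution. In view of the classical connection between Chebyshev acceleration and randomized relaxation parameters, a natural candidate sampler, offered here purely as an assumption rather than as the paper's result, is the arcsine distribution on the spectral interval \([m, M]\):

```python
import numpy as np

def arcsine_inverse_step(m, M, rng):
    """Draw theta from the arcsine distribution on [m, M], with density
    1 / (pi * sqrt((theta - m) * (M - theta))).  An assumed candidate
    for the optimal inverse step length distribution; the review itself
    only asserts that a unique optimal distribution exists."""
    u = rng.random()                  # U ~ Uniform(0, 1)
    return 0.5 * (m + M) + 0.5 * (M - m) * np.cos(np.pi * u)
```

    For a condition number \(\kappa = M/m\), the classical asymptotic convergence factor of the Chebyshev method is \((\sqrt{\kappa} - 1)/(\sqrt{\kappa} + 1)\) per iteration, which is the rate the optimal random procedure is reported to match.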
    optimization
    gradient methods
    stochastic processes
    algorithms
    convergence rate
    computational experiments
