From perturbation analysis to Markov decision processes and reinforcement learning (Q1870309)

From MaRDI portal
scientific article
    Statements

    From perturbation analysis to Markov decision processes and reinforcement learning (English)
    11 May 2003
    Performance optimization of a dynamical system can be achieved in various ways, including perturbation analysis (PA), Markov decision processes (MDPs), and reinforcement learning (RL). The author studies the relationships among these closely related fields and shows that performance potentials play a crucial role in PA, MDPs, and other optimization approaches. RL, neuro-dynamic programming, and related methods are efficient sample-path-based ways of estimating the performance potentials and \(Q\)-factors. The potential-based approach is practically important because it can be applied on-line to real systems, as the author illustrates with an example.
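    The connection between potentials, the Poisson equation, and sample-path estimation can be sketched numerically. The following is a minimal illustration, assuming a hypothetical three-state ergodic Markov chain (the chain, rewards, and estimator are invented for illustration and do not come from the paper): it solves the Poisson equation for the performance potentials directly, then recovers them from simulated sample paths in the spirit of the RL-style estimators discussed in the review.

```python
import numpy as np

# Hypothetical 3-state ergodic chain and one-step rewards (not from the paper).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])   # transition probability matrix
r = np.array([1.0, 2.0, 3.0])    # one-step rewards per state
n = len(r)

# Stationary distribution pi (left eigenvector of P for eigenvalue 1)
# and long-run average reward eta = pi . r.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()
eta = float(pi @ r)

# Performance potentials g solve the Poisson equation
#   (I - P) g = r - eta * 1,   normalized so that pi . g = 0.
A = np.vstack([np.eye(n) - P, pi])
b = np.append(r - eta, 0.0)
g = np.linalg.lstsq(A, b, rcond=None)[0]

# Sample-path estimate of the same potentials:
#   g(i) ~ E[ sum_{t=0}^{T-1} (r(X_t) - eta) | X_0 = i ].
rng = np.random.default_rng(0)

def potential_estimate(i, T=30, runs=20000):
    x = np.full(runs, i)           # simulate `runs` paths in parallel
    total = np.zeros(runs)
    for _ in range(T):
        total += r[x] - eta
        u = rng.random(runs)       # draw next states from the rows P[x]
        x = (u[:, None] > np.cumsum(P[x], axis=1)).sum(axis=1)
    return total.mean()

g_hat = np.array([potential_estimate(i) for i in range(n)])
print("Poisson solution:    ", np.round(g, 3))
print("sample-path estimate:", np.round(g_hat, 3))
```

    The two estimates agree closely; RL methods such as TD(\(\lambda\)) and \(Q\)-learning can be viewed as stochastic-approximation refinements of this kind of sample-path averaging.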
    on-line algorithms
    Poisson equations
    gradient-based policy iteration
    perturbation analysis
    Q-learning
    TD(\(\lambda\))
    Markov decision processes
    reinforcement learning
    performance potentials