From perturbation analysis to Markov decision processes and reinforcement learning (Q1870309)
scientific article
Language | Label | Description | Also known as
---|---|---|---
English | From perturbation analysis to Markov decision processes and reinforcement learning | scientific article |
Statements
From perturbation analysis to Markov decision processes and reinforcement learning (English)
11 May 2003
Performance optimization of a dynamical system can be approached in several ways, including perturbation analysis (PA), Markov decision processes (MDPs), and reinforcement learning (RL). The author studies the relationships among these closely related fields and shows that performance potentials play a crucial role in PA, MDPs, and other optimization approaches. RL, neuro-dynamic programming, and related methods provide efficient sample-path-based ways of estimating the performance potentials and \(Q\)-factors. The potential-based approach is of practical importance because it can be applied on-line to real systems, which the author illustrates with an example.
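The performance potentials mentioned in the review are the solution of a Poisson equation for the underlying Markov chain, and the sample-path estimation the review alludes to can be sketched with a temporal-difference update. The following minimal Python sketch uses a made-up 3-state chain (the transition matrix, rewards, and step sizes are illustrative assumptions, not taken from the paper) and a TD(0)-style estimator, which is one standard sample-path approach rather than the paper's specific algorithm:

```python
import numpy as np

# Illustrative 3-state ergodic Markov chain (numbers are assumptions,
# not from the paper): transition matrix P and reward vector r.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
r = np.array([1.0, 0.0, 2.0])
n = len(r)

# Exact reference solution: stationary distribution pi, average reward
# eta, and potentials g from the Poisson equation
#   (I - P) g = r - eta * 1   (unique up to an additive constant).
pi = np.linalg.solve((np.eye(n) - P + np.ones((n, n))).T, np.ones(n))
eta = pi @ r
g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), r - eta)

def td0_potentials(P, r, steps=200_000, alpha=0.01, seed=0):
    """Estimate eta and the potentials g from a single sample path,
    via a TD(0)-style stochastic approximation of the Poisson equation."""
    rng = np.random.default_rng(seed)
    n = len(r)
    g_hat = np.zeros(n)
    eta_hat = 0.0
    x = 0
    for t in range(steps):
        y = rng.choice(n, p=P[x])
        eta_hat += (r[x] - eta_hat) / (t + 1)         # running average reward
        delta = r[x] - eta_hat + g_hat[y] - g_hat[x]  # temporal-difference error
        g_hat[x] += alpha * delta
        x = y
    return eta_hat, g_hat

eta_hat, g_hat = td0_potentials(P, r)
# Potentials are only determined up to a constant, so compare centered values.
g_c, g_hat_c = g - g.mean(), g_hat - g_hat.mean()
```

Because the potentials only enter performance-gradient and policy-iteration formulas through their differences, the additive constant is irrelevant, which is why the sketch centers both the exact and the estimated vectors before comparing them.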
Keywords: on-line algorithms; Poisson equations; gradient-based policy iteration; perturbation analysis; Q-learning; TD(\(\lambda\)); Markov decision processes; reinforcement learning; performance potentials