From perturbation analysis to Markov decision processes and reinforcement learning (Q1870309)
scientific article
Language | Label | Description | Also known as
---|---|---|---
English | From perturbation analysis to Markov decision processes and reinforcement learning | scientific article |
Statements
From perturbation analysis to Markov decision processes and reinforcement learning (English)
11 May 2003
Performance optimization of a dynamical system can be approached in several ways, including perturbation analysis (PA), Markov decision processes (MDPs), and reinforcement learning (RL). The author studies the relationships among these closely related fields and shows that performance potentials play a crucial role in PA, MDPs, and other optimization approaches. RL, neuro-dynamic programming, and related methods provide efficient sample-path-based ways of estimating the performance potentials and \(Q\)-factors. The potential-based approach is of practical importance because it can be applied on-line to real systems, which the author illustrates with an example.
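The performance potentials mentioned in the review are the solution of a Poisson equation for the underlying Markov chain, and the sample-path estimation the review alludes to can be sketched with a temporal-difference update. The following minimal Python sketch uses a made-up 3-state chain (the transition matrix, rewards, and step sizes are illustrative assumptions, not taken from the paper) and a TD(0)-style estimator, which is one standard sample-path approach rather than the paper's specific algorithm:

```python
import numpy as np

# Illustrative 3-state ergodic Markov chain (numbers are assumptions,
# not from the paper): transition matrix P and reward vector r.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
r = np.array([1.0, 0.0, 2.0])
n = len(r)

# Exact reference solution: stationary distribution pi, average reward
# eta, and potentials g from the Poisson equation
#   (I - P) g = r - eta * 1   (unique up to an additive constant).
pi = np.linalg.solve((np.eye(n) - P + np.ones((n, n))).T, np.ones(n))
eta = pi @ r
g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), r - eta)

def td0_potentials(P, r, steps=200_000, alpha=0.01, seed=0):
    """Estimate eta and the potentials g from a single sample path,
    via a TD(0)-style stochastic approximation of the Poisson equation."""
    rng = np.random.default_rng(seed)
    n = len(r)
    g_hat = np.zeros(n)
    eta_hat = 0.0
    x = 0
    for t in range(steps):
        y = rng.choice(n, p=P[x])
        eta_hat += (r[x] - eta_hat) / (t + 1)         # running average reward
        delta = r[x] - eta_hat + g_hat[y] - g_hat[x]  # temporal-difference error
        g_hat[x] += alpha * delta
        x = y
    return eta_hat, g_hat

eta_hat, g_hat = td0_potentials(P, r)
# Potentials are only determined up to a constant, so compare centered values.
g_c, g_hat_c = g - g.mean(), g_hat - g_hat.mean()
```

Because the potentials only enter performance-gradient and policy-iteration formulas through their differences, the additive constant is irrelevant, which is why the sketch centers both the exact and the estimated vectors before comparing them.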
Keywords: on-line algorithms; Poisson equations; gradient-based policy iteration; perturbation analysis; Q-learning; TD(\(\lambda\)); Markov decision processes; reinforcement learning; performance potentials