Approximate stochastic annealing for online control of infinite horizon Markov decision processes (Q1937498): Difference between revisions

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1016/j.automatica.2012.06.010
+Normal rank
@@ Property / OpenAlex ID @@
+W1988071557
@@ Property / OpenAlex ID: W1988071557 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4209222
@@ Property / cites work: Q4209222 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+New algorithms of the Q-learning type
@@ Property / cites work: New algorithms of the Q-learning type / rank @@
+Normal rank
@@ Property / cites work @@
+A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+An Adaptive Sampling Algorithm for Solving Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Recursive Learning Automata Approach to Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+A survey of some simulation-based algorithms for Markov decision processes
+Normal rank
@@ Property / cites work @@
+An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
+Normal rank
@@ Property / cites work @@
+Conditions for the uniqueness of optimal policies of discounted Markov decision processes
+Normal rank
@@ Property / cites work @@
+On the almost sure convergence of a general stochastic approximation procedure
+Normal rank
@@ Property / cites work @@
+Reinforcement Learning: A Tutorial Survey and Recent Advances
+Normal rank
@@ Property / cites work @@
+Cooling Schedules for Optimal Annealing
@@ Property / cites work: Cooling Schedules for Optimal Annealing / rank @@
+Normal rank
@@ Property / cites work @@
+Probability Inequalities for Sums of Bounded Random Variables
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation methods for constrained and unconstrained systems
+Normal rank
@@ Property / cites work @@
+Q4346705
@@ Property / cites work: Q4346705 / rank @@
+Normal rank
@@ Property / cites work @@
+A Stochastic Approximation Method
@@ Property / cites work: A Stochastic Approximation Method / rank @@
+Normal rank
@@ Property / cites work @@
+Convergence results for single-step on-policy reinforcement-learning algorithms
+Normal rank
@@ Property / cites work @@
+Introduction to Stochastic Search and Optimization
+Normal rank
@@ Property / cites work @@
+Asynchronous stochastic approximation and Q-learning
+Normal rank