Pages that link to "Item:Q5322077"
From MaRDI portal
The following pages link to An Adaptive Sampling Algorithm for Solving Markov Decision Processes (Q5322077):
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes (Q378753)
- Sampled fictitious play for approximate dynamic programming (Q547121)
- A variable neighborhood search based algorithm for finite-horizon Markov decision processes (Q613296)
- Sensitivity-based nested partitions for solving finite-horizon Markov decision processes (Q1728313)
- Approximate stochastic annealing for online control of infinite horizon Markov decision processes (Q1937498)
- From reinforcement learning to optimal control: a unified framework for sequential decisions (Q2094027)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
- Multi-armed bandits based on a variant of simulated annealing (Q2520136)
- Online Sequential Optimization with Biased Gradients: Theory and Applications to Censored Demand (Q2967620)
- Dynamic Pricing and Learning with Finite Inventories (Q3465597)
- Decomposition and Adaptive Sampling for Data-Driven Inverse Linear Optimization (Q5058012)
- Nonasymptotic Analysis of Monte Carlo Tree Search (Q5060499)
- Nonstationary Bandits with Habituation and Recovery Dynamics (Q5144777)
- Optimistic Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds (Q5144789)