Pages that link to "Item:Q399890"
From MaRDI portal
The following pages link to Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model (Q399890):
Displaying 12 items.
- Near-optimal PAC bounds for discounted MDPs (Q465258)
- Toward theoretical understandings of robust Markov decision processes: sample complexity and asymptotics (Q2112808)
- A lexicographic approach to constrained MDP admission control (Q2792714)
- Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods (Q3299845)
- Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes (Q4607932)
- Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization (Q5106383)
- Randomized Linear Programming Solves the Markov Decision Problem in Nearly Linear (Sometimes Sublinear) Time (Q5119845)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis (Q5162625)
- Softmax policy gradient methods can take exponential time to converge (Q6110457)
- Recent advances in reinforcement learning in finance (Q6146668)
- Robustness and sample complexity of model-based MARL for general-sum Markov games (Q6159508)
- Settling the sample complexity of model-based offline reinforcement learning (Q6192326)