Pages that link to "Item:Q1583226"
From MaRDI portal
The following pages link to A near-optimal polynomial time algorithm for learning in certain classes of stochastic games (Q1583226):
- Efficient learning equilibrium (Q814627)
- Perspectives on multiagent learning (Q1028921)
- Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
- Computer science and decision theory (Q2271874)
- Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies (Q2318167)
- AWESOME: a general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents (Q2384141)
- Value iteration for simple stochastic games: stopping criterion and learning algorithm (Q2672267)