Pages that link to "Item:Q1583226"

From MaRDI portal

The following pages link to A near-optimal polynomial time algorithm for learning in certain classes of stochastic games (Q1583226):

Displaying 8 items.

Efficient learning equilibrium (Q814627)
Perspectives on multiagent learning (Q1028921)
Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590)
Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
Computer science and decision theory (Q2271874)
Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies (Q2318167)
AWESOME: a general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents (Q2384141)
Value iteration for simple stochastic games: stopping criterion and learning algorithm (Q2672267)

Retrieved from "https://portal.mardi4nfdi.de/wiki/Special:WhatLinksHere/Item:Q1583226"