Online regret bounds for Markov decision processes with deterministic transitions

From MaRDI portal
Publication:982638