Uniform convergence of value iteration policies for discounted Markov decision processes

From MaRDI portal
Publication:2467010