scientific article; zbMATH DE number 3822951
From MaRDI portal
Publication:3668675
zbMATH Open: 0519.62065; MaRDI QID: Q3668675; FDO: Q3668675
Authors: Radu Theodorescu, Dieter Kalin
Publication date: 1982
Title of this publication is not available
finite horizon; learning algorithm; two-armed bandit problem; characterizations of optimal policies; monotonicity properties for expected cumulative discounted reward
Learning and adaptive systems in artificial intelligence (68T05); Sequential statistical design (62L05); Dynamic programming (90C39); Optimal stopping in statistics (62L15)