scientific article; zbMATH DE number 3822951

From MaRDI portal

Publication:3668675

MaRDI QID: Q3668675

Authors: Radu Theodorescu, Dieter Kalin

Publication date: 1982

zbMATH Keywords

finite horizon; learning algorithm; two-armed bandit problem; characterizations of optimal policies; monotonicity properties for expected cumulative discounted reward

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05); Sequential statistical design (62L05); Dynamic programming (90C39); Optimal stopping in statistics (62L15)
