Multi-armed bandit processes with optimal selection of the operating times (Q2387146)

From MaRDI portal
scientific article

    Statements

    Multi-armed bandit processes with optimal selection of the operating times (English)
    1 September 2005
    A multi-armed bandit problem is considered in which, at each decision epoch, one decides both the next project to be undertaken and the span of time to be spent on it, instead of reconsidering the choice of project at every stage. This extended model, inspired by sequentially planned decision procedures [\textit{W. Schmitz}, ``Optimal sequentially planned decision procedures''. Lect. Notes Stat. 79. New York: Springer-Verlag (1993; Zbl 0771.62057)], is formulated in Section 1 and seeks to exploit the cost reduction obtained by devoting longer periods to the same activity. Following the method of \textit{P. Whittle} [J. R. Stat. Soc., Ser. B 42, 143--149 (1980; Zbl 0439.90096)], Section 2 introduces a retirement option with a variable reward \(M\), and Section 3 extends Gittins indices to this case. Another relevant conclusion is that the optimal period of activity for each project does not depend on the retirement reward \(M\). Finally, it is shown that the optimal strategy is to choose the project with the highest Gittins index.
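
    The retirement-option idea mentioned in the review can be illustrated in the classical setting only, that is, without the choice of operating times that the article adds. The sketch below assumes a single project modelled as a finite-state Markov chain with transition matrix `P`, reward vector `r` and discount factor `beta`; these names, and the functions `retirement_value` and `gittins_index`, are hypothetical and are not taken from the article. For a retirement reward \(M\), the value function satisfies \(V(x, M) = \max\{M,\ r(x) + \beta \sum_y P(x, y) V(y, M)\}\), and the index of state \(x\) is \((1-\beta)\) times the smallest \(M\) for which \(V(x, M) = M\).

```python
import numpy as np


def retirement_value(P, r, beta, M, tol=1e-10, max_iter=100_000):
    """Value iteration for one project with retirement reward M (classical case)."""
    V = np.full(len(r), float(M))
    for _ in range(max_iter):
        # Either retire and collect M, or operate the project for one more step.
        V_new = np.maximum(M, r + beta * P @ V)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V


def gittins_index(P, r, beta, state, tol=1e-8, eps=1e-9):
    """Bisection on M for the smallest retirement reward at which retiring
    immediately is optimal in `state`; rescaled to a per-step reward."""
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    lo = r.min() / (1.0 - beta)  # retiring is never strictly better below this
    hi = r.max() / (1.0 - beta)  # retiring is always optimal at or above this
    while hi - lo > tol:
        M = 0.5 * (lo + hi)
        V = retirement_value(P, r, beta, M)
        if V[state] > M + eps:
            lo = M  # continuing is still strictly better, so M is too small
        else:
            hi = M  # retiring is optimal, so the critical M is at or below this
    return (1.0 - beta) * hi
```

    With several independent projects, the classical index policy operates, at each step, the project whose current state has the largest such index. The extended model of the article would additionally optimise over the length of each operating period, which this sketch does not attempt.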
    multi-armed bandit processes
    Gittins index