Indices for families of competing Markov decision processes with influence (Q1313068)
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Indices for families of competing Markov decision processes with influence | scientific article | |
Statements
Indices for families of competing Markov decision processes with influence (English)
14 September 1994
In 1980, Whittle (a) explained Gittins indices for multiarmed bandit processes (MBPs) in terms of equivalent retirement rewards and (b) gave conditions under which an optimal policy of index type exists for MBPs in which each arm has its own decision apparatus. In the same year, Nash proved indexation for a further class of MBPs in which the current states of all the arms can influence the reward earned from the active one. In the present paper, the author develops analogues of (a) and (b) for Nash's model. He then illustrates the theory by examining stoppable MBPs, in which each arm has two actions: continue and stop.
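The retirement-reward reading of the Gittins index mentioned in (a) can be illustrated with a minimal sketch. It covers only the classical setting of a single finite-state Markov arm with discount factor `beta`, not Nash's model with influence and not the construction used in the paper under review. The function names `retirement_value` and `gittins_index`, the list-based transition matrix `P`, and the reward vector `r` are assumptions made for this illustration. The sketch searches for the lump-sum retirement reward `M` at which continuing from state `x` and retiring immediately are equally attractive, then rescales it by `(1 - beta)` to obtain a per-period index.

```python
from typing import List


def retirement_value(P: List[List[float]], r: List[float], beta: float,
                     M: float, tol: float = 1e-10,
                     max_iter: int = 100_000) -> List[float]:
    """Value iteration for V(x) = max(M, r[x] + beta * sum_y P[x][y] * V(y)).

    M is a lump-sum retirement reward that can be taken in any state.
    """
    n = len(r)
    V = [M] * n
    for _ in range(max_iter):
        V_new = [
            max(M, r[x] + beta * sum(P[x][y] * V[y] for y in range(n)))
            for x in range(n)
        ]
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new
    return V


def gittins_index(P: List[List[float]], r: List[float], beta: float,
                  x: int, tol: float = 1e-8) -> float:
    """Bisect on M to find the retirement reward that makes state x indifferent.

    Returns (1 - beta) * M, i.e. the index expressed as a reward per period.
    """
    n = len(r)
    lo = min(r) / (1.0 - beta)  # retiring can never be worth less than this
    hi = max(r) / (1.0 - beta)  # continuing can never be worth more than this
    while hi - lo > tol:
        M = 0.5 * (lo + hi)
        V = retirement_value(P, r, beta, M)
        continue_value = r[x] + beta * sum(P[x][y] * V[y] for y in range(n))
        if continue_value > M:
            lo = M  # continuing is strictly better, so M is too small
        else:
            hi = M  # retiring is at least as good, so M is large enough
    return (1.0 - beta) * 0.5 * (lo + hi)


# Hypothetical two-state arm: state 0 pays 0 and moves to state 1,
# which pays 1 forever.
P_example = [[0.0, 1.0], [0.0, 1.0]]
r_example = [0.0, 1.0]
index_state0 = gittins_index(P_example, r_example, beta=0.5, x=0)
index_state1 = gittins_index(P_example, r_example, beta=0.5, x=1)
```

Bisection is used here because, in this classical setting, the advantage of continuing over retiring shrinks as `M` grows, so the indifference point is unique. Under an index policy, the arm whose current state has the largest such value is the one to activate.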
Gittins indices
multiarmed bandit processes