Combining multiple strategies for multiarmed bandit problems and asymptotic optimality (Q892592)

From MaRDI portal
Created claim: Wikidata QID (P12): Q59112383, #quickstatements; #temporary_batch_1703710783098
Property / MaRDI profile type: MaRDI publication profile
Property / full work available at URL: https://doi.org/10.1155/2015/264953
Property / OpenAlex ID: W2010356817
Property / cites work: Online Learning Methods for Networking
Property / cites work: Prediction, Learning, and Games
Property / cites work: Some aspects of the sequential design of experiments
Property / cites work: Q3329417
Property / cites work: Randomised allocation of treatments in sequential trials
Property / cites work: Finite-time analysis of the multiarmed bandit problem
Property / cites work: Q4057976
Property / cites work: The Nonstochastic Multiarmed Bandit Problem
Property / cites work: Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
Property / cites work: Combining expert advice in reactive environments

Latest revision as of 02:12, 11 July 2024

Language: English
Label: Combining multiple strategies for multiarmed bandit problems and asymptotic optimality
Description: scientific article

    Statements

    Combining multiple strategies for multiarmed bandit problems and asymptotic optimality (English)
    19 November 2015
    Summary: This brief paper presents a simple algorithm that, at each time step, selects one strategy from a given set of strategies for stochastic multiarmed bandit problems and plays the arm chosen by that strategy. The algorithm follows the idea of probabilistic \(\epsilon_t\)-switching in the \(\epsilon_t\)-greedy strategy and is asymptotically optimal in the sense that the selected strategy converges to the best strategy in the set, under some conditions on the strategies in the set and on the sequence \(\epsilon_t\).
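The summary does not spell out the algorithm, so the following is only a minimal sketch of the general idea of \(\epsilon_t\)-switching over a set of strategies, not the paper's method. Everything named here is an assumption made for illustration: the function names `combine_strategies` and `eps_schedule`, the toy `FixedArm` strategy, the `select_arm`/`update` interface, the schedule \(\epsilon_t = \min(1, c/t)\), and the constant `c`. The paper's actual conditions on the strategies and on \(\epsilon_t\) are not reproduced.

```python
import random


def eps_schedule(t, c=5.0):
    """Hypothetical exploration schedule: eps_t = min(1, c / t)."""
    return min(1.0, c / t)


class FixedArm:
    """Toy stand-in for a bandit strategy: always plays the same arm."""

    def __init__(self, arm):
        self.arm = arm

    def select_arm(self):
        return self.arm

    def update(self, arm, reward):
        # A real strategy (e.g. UCB1 or eps-greedy) would update its state here.
        pass


def combine_strategies(strategies, pull, horizon, rng, c=5.0):
    """Pick one strategy per step by eps_t-switching (illustrative sketch).

    With probability eps_t a strategy is chosen uniformly at random;
    otherwise the strategy with the highest empirical mean reward so far
    is chosen. The chosen strategy then decides which arm is played.
    """
    k = len(strategies)
    counts = [0] * k    # number of times each strategy was selected
    means = [0.0] * k   # empirical mean reward obtained under each strategy
    for t in range(1, horizon + 1):
        if rng.random() < eps_schedule(t, c):
            s = rng.randrange(k)                       # explore
        else:
            s = max(range(k), key=lambda i: means[i])  # exploit
        arm = strategies[s].select_arm()
        reward = pull(arm)
        strategies[s].update(arm, reward)
        counts[s] += 1
        means[s] += (reward - means[s]) / counts[s]    # incremental mean
    return counts, means
```

To try it, pass a list of strategy objects, a `pull(arm)` function that returns a reward, a horizon, and a `random.Random` instance. Because the schedule decays, the selection counts should concentrate on the strategy with the best empirical mean, which is the sense of convergence the summary describes.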
    multiarmed bandit problems
    asymptotic optimality
    multiple strategies
