A dynamic programming strategy to balance exploration and exploitation in the bandit problem (Q647433)

From MaRDI portal

scientific article; zbMATH DE number 5977712

      Statements

      A dynamic programming strategy to balance exploration and exploitation in the bandit problem (English)
      23 November 2011
      multi-armed bandit problem
      greedy
      estimation

      Identifiers