Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

From MaRDI portal
Publication:6233074

arXiv1205.4094MaRDI QIDQ6233074FDOQ6233074

Rémi Munos, Alexandra Carpentier

Publication date: 18 May 2012

Abstract: We consider a linear stochastic bandit problem where the dimension K of the unknown parameter heta is larger than the sampling budget n. In such cases, it is in general impossible to derive sub-linear regret bounds since usual linear bandit algorithms have a regret in O(Ksqrtn). In this paper we assume that heta is Ssparse, i.e. has at most Snon-zero components, and that the space of arms is the unit ball for the ||.||2 norm. We combine ideas from Compressed Sensing and Bandit Theory and derive algorithms with regret bounds in O(Ssqrtn).












This page was built for publication: Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6233074)