Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

Abstract: We consider a linear stochastic bandit problem where the dimension $K$ of the unknown parameter $\theta$ is larger than the sampling budget $n$. In such cases, it is in general impossible to derive sub-linear regret bounds since usual linear bandit algorithms have a regret in $O(K\sqrt{n})$. In this paper we assume that $\theta$ is $S$-sparse, i.e. has at most $S$ non-zero components, and that the space of arms is the unit ball for the $\|\cdot\|_2$ norm. We combine ideas from Compressed Sensing and Bandit Theory and derive algorithms with regret bounds in $O(S\sqrt{n})$.
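The setting in the abstract can be illustrated with a small simulation. This is a minimal sketch of the problem setup only, not the paper's algorithm. The Gaussian noise model, the normalization of the parameter to unit Euclidean norm, the helper names (`make_sparse_theta`, `pull`, `regret`) and the numerical values are all assumptions made for illustration.

```python
import numpy as np

def make_sparse_theta(K, S, rng):
    """Draw a hypothetical S-sparse parameter in R^K, scaled to unit Euclidean norm (assumption)."""
    theta = np.zeros(K)
    support = rng.choice(K, size=S, replace=False)  # S distinct non-zero coordinates
    theta[support] = rng.standard_normal(S)
    return theta / np.linalg.norm(theta)

def pull(theta, x, rng, noise_std=0.1):
    """Noisy linear reward <x, theta> + noise for an arm x in the unit ball.

    Gaussian noise with standard deviation noise_std is an assumption of this sketch.
    """
    assert np.linalg.norm(x) <= 1 + 1e-9, "arm must lie in the unit Euclidean ball"
    return float(x @ theta + noise_std * rng.standard_normal())

def regret(theta, arms):
    """Cumulative pseudo-regret against the best arm theta / ||theta||_2.

    On the unit ball, the best expected reward equals ||theta||_2.
    """
    best = np.linalg.norm(theta)
    return float(sum(best - x @ theta for x in arms))

rng = np.random.default_rng(0)
K, S, n = 1000, 5, 200  # dimension K exceeds the budget n
theta = make_sparse_theta(K, S, rng)

# Naive baseline: play n uniformly random directions on the unit sphere.
arms = [v / np.linalg.norm(v) for v in rng.standard_normal((n, K))]
rewards = [pull(theta, x, rng) for x in arms]
print("regret of random play:", regret(theta, arms))
```

Random play ignores the sparsity of the parameter, so its regret grows linearly in the budget. This is the behaviour that a bound of order $S\sqrt{n}$ improves on when the dimension exceeds the budget.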
