Reading policies for joins: an asymptotic analysis
Publication: 2467117
DOI: 10.1214/105051606000000646
zbMATH Open: 1163.90789
arXiv: math/0703019
OpenAlex: W3101031228
MaRDI QID: Q2467117
Nariankadu D. Shyamalkumar, Ralph P. Russo
Publication date: 18 January 2008
Published in: The Annals of Applied Probability
Abstract: Suppose that observations are read from two sources, each governed by its own distribution. Associate with each pair, one observation from each source, a nonnegative score. An optimal reading policy is one that yields a reading sequence maximizing the expected sum of the observed scores, uniformly in the number of reads. The alternating policy, which switches between the two sources, is the optimal nonadaptive policy. In contrast, the greedy policy, which chooses its source to maximize the expected gain on the next step, is shown to be the optimal policy. Asymptotics are provided for the case where the two distributions are discrete and the score of a pair is 1 or 0 according as the observations match or not. Specifically, an invariance result is proved which guarantees that, for a wide class of policies including the alternating and the greedy, the total score M(n) obeys the same CLT and LIL. A more delicate analysis of the reading sequence and the sample paths of M(n), for both the alternating and the greedy policy, reveals the slender sense in which the latter is asymptotically superior to the former, as well as a sense of equivalence of the two and robustness of the former.
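The two policies compared in the abstract can be illustrated with a short simulation. The sketch below is not from the paper: the finite alphabets, the known source distributions, the matching score (1 when the two observations are equal, 0 otherwise) and the `simulate` helper are all assumptions made for illustration. Each read of a symbol pairs with every earlier read from the other source, so the running total counts the matching join pairs M(n).

```python
import random

def simulate(policy, p, q, n, seed=0):
    """Read n observations from two discrete sources and return M(n),
    the number of matching (x, y) pairs formed across the two sources.

    p, q -- dicts mapping symbol -> probability for sources 1 and 2
    policy -- "alternating" or "greedy"
    """
    rng = random.Random(seed)
    syms_p, w_p = zip(*p.items())
    syms_q, w_q = zip(*q.items())
    a = {s: 0 for s in set(p) | set(q)}  # counts of symbols read from source 1
    b = dict(a)                          # counts of symbols read from source 2
    score = 0                            # M(n): matching pairs formed so far
    for t in range(n):
        if policy == "alternating":
            pick_first = (t % 2 == 0)    # simply switch between the sources
        else:
            # greedy: expected number of new matches from reading each source
            gain1 = sum(p[s] * b[s] for s in p)
            gain2 = sum(q[s] * a[s] for s in q)
            pick_first = gain1 >= gain2
        if pick_first:
            x = rng.choices(syms_p, w_p)[0]
            score += b[x]                # x matches every earlier read of x from source 2
            a[x] += 1
        else:
            y = rng.choices(syms_q, w_q)[0]
            score += a[y]
            b[y] += 1
    return score
```

With point-mass sources every pair matches, so after `n` reads both policies reach `(n/2) * (n/2)` pairs; for genuinely random sources one can average `simulate` over seeds to compare the two policies empirically.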
Full work available at URL: https://arxiv.org/abs/math/0703019
Mathematics Subject Classification:
- Central limit and other weak theorems (60F05)
- Strong limit theorems (60F15)
- Stopping times; optimal stopping problems; gambling theory (60G40)
- Markov and semi-Markov decision processes (90C40)
Cites Work
- Approximation Theorems of Mathematical Statistics
- Probability
- On the Gittins index for multiarmed bandits
- Extensions of the multiarmed bandit problem: The discounted case
- Tauberian Theory
- Some results on increments of the Wiener process with applications to lag sums of i.i.d. random variables
- Sample path optimality for a Markov optimization problem
- Dynamic productivity improvement in a model with multiple processes
- Complete convergence of triangular arrays and the law of the iterated logarithm for U-statistics
- Upper and lower functions for martingales and mixing processes
- On the convergence of the empirical mass function
- Optimal policies to obtain the most join results
Recommendations
- Optimal policies to obtain the most join results
- Optimal adaptive policies for sequential allocation problems
- Asymptotically efficient adaptive allocation rules
- Optimal Online Selection of an Alternating Subsequence: A Central Limit Theorem