Information-regret compromise in covariate-adaptive treatment allocation (Q1687118)

The article considers a sequential treatment allocation problem with \(K\) treatments. The response \(Y_k=Y_k(X)\) of a subject to treatment \(k\) satisfies \(E\{Y_k|X=x\}=\eta_k(x,\theta_k)\), \(k=1,\dots,K\). Here, \(E\) denotes the mathematical expectation, \(\theta_k\) is an unknown model parameter, \(\eta_k(\cdot)\) is a known model and \(X\) is observable side information, called covariates. It is assumed that the covariates \(\{X_i\}\) are i.i.d. with common measure \(\mu\), and that the vectors \(\{X_i,Y_1(X_i),\dots,Y_K(X_i)\}\) are i.i.d. as well, where \(i=1,2,\dots,n\) indexes the treatment allocations. If the parameters \(\theta_k\) in each model \(\eta_k\) were known, one could determine the best treatment \(k^*\) such that \(\eta_{*}(X)=\eta_{k^*}(X,\theta_{k^*})=\max_{k=1,\dots,K}\eta_k(X,\theta_k)\). For unknown \(\theta_k\), one defines the cumulative regret after \(n\) allocations, normalized by \(n\), i.e. \[ R_n(\theta)=n^{-1} \sum_{i=1}^n[\eta_*(X_i)-\eta_{k_i}(X_i,\theta_{k_i})], \] where \(k_i\) is the treatment assigned at the \(i\)th allocation and \(\theta\) denotes the vector of all parameters \(\theta_k\) in the \(K\) models. For a fixed allocation rule \(\pi\), which assigns treatment \(k\) with probability \(\pi_k(X)\), the regret \(R_n(\theta)\) can be expressed as a functional \(\phi(\pi,\theta)\). Minimizing the regret, which encourages the use of the most effective treatments, is the classical goal of the multi-armed bandit problem, which has been applied to sequential treatment allocation by many authors. However, a rule \(\pi\) chosen to minimize the regret alone is not optimal, since it is ``myopic''. Therefore, an additional functional \(\psi(\pi,\theta)\) is introduced, which depends on the expected Fisher information matrix and encourages the use of all treatments. The compound design problem \[ (1-\alpha)\psi(\pi,\theta)+\alpha \phi(\pi,\theta) \to \min_{\pi} \] is then considered, which allows one to find an optimal \(\pi\) for any \(0<\alpha<1\). Sequential treatment allocations are implemented according to \(\pi\) as follows: the \((n+1)\)st patient with covariates \(X_{n+1}\) is assigned to treatment \(k\) with probability \(\pi_k(X_{n+1},\hat{\theta}^n)\), where \(\hat{\theta}^n\) denotes the current estimate of \(\theta\). Several examples are considered.
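The sequential scheme described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the rule derived in the article: the linear form of \(\eta_k\), the uniform covariate measure \(\mu\), the Gaussian noise, the least-squares estimator and, in particular, the placeholder rule `allocation_probs` (weight \(\alpha\) on the currently estimated best treatment, weight \(1-\alpha\) spread uniformly over all treatments) are hypothetical choices made for illustration. The optimal \(\pi\) of the compound design problem would replace `allocation_probs`.

```python
import random

def eta(x, theta):
    """Assumed linear model eta_k(x, theta_k) = a + b * x."""
    a, b = theta
    return a + b * x

def fit_line(xs, ys):
    """Least-squares fit of a + b * x; returns (mean, 0) if x has no spread."""
    n = len(xs)
    if n == 0:
        return (0.0, 0.0)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx < 1e-12:
        return (my, 0.0)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return (my - b * mx, b)

def allocation_probs(x, theta_hat, alpha):
    """Hypothetical stand-in for pi_k(x, theta_hat): weight alpha on the
    estimated best treatment, weight 1 - alpha spread uniformly."""
    K = len(theta_hat)
    best = max(range(K), key=lambda k: eta(x, theta_hat[k]))
    return [(1 - alpha) / K + (alpha if k == best else 0.0) for k in range(K)]

def simulate(theta_true, alpha, n, seed=0, noise=0.1):
    """Run n sequential allocations; return (R_n, final parameter estimates)."""
    rng = random.Random(seed)
    K = len(theta_true)
    data = [([], []) for _ in range(K)]   # (covariates, responses) per treatment
    theta_hat = [(0.0, 0.0)] * K          # initial estimate of theta
    total_regret = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)        # covariate X_i drawn from mu (assumed uniform)
        probs = allocation_probs(x, theta_hat, alpha)
        k = rng.choices(range(K), weights=probs)[0]
        y = eta(x, theta_true[k]) + rng.gauss(0.0, noise)
        data[k][0].append(x)
        data[k][1].append(y)
        theta_hat[k] = fit_line(*data[k])  # update the estimate for treatment k
        eta_star = max(eta(x, th) for th in theta_true)
        total_regret += eta_star - eta(x, theta_true[k])
    return total_regret / n, theta_hat

if __name__ == "__main__":
    truth = [(0.0, 1.0), (0.0, -1.0)]     # treatment 0 is best for x > 0, treatment 1 for x < 0
    for alpha in (0.1, 0.5, 0.9):
        regret, _ = simulate(truth, alpha, n=2000, seed=1)
        print(f"alpha={alpha}: R_n={regret:.3f}")
```

In this sketch a larger \(\alpha\) puts more weight on the regret term and concentrates allocations on the apparently best treatment, while a smaller \(\alpha\) keeps all treatments in use so that every \(\theta_k\) continues to be estimated.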

reviewed by

Alex V. Kolnogorov

zbMATH Keywords

optimal design

treatment allocation

bounded design measure

equivalence theorem