Learning the distribution of latent variables in paired comparison models with round-robin scheduling (Q2203618)

The authors study the paired comparison problem for a large number \(N\) of individuals who are compared in pairs. The abilities of the individuals are modelled by independent and identically distributed random variables \(V_i\), \(i=1,\dots,N\), taking values in a measurable set \(\mathcal V\) with common unknown distribution \(\pi\). These variables are observed only indirectly, through discrete random variables \(X_{i,j}\) taking values in a finite set \(\mathcal X\): conditionally on \(V = (V_1,\dots,V_N)\), the random variables \(X_{i,j}\), \((i,j)\in E=\{ (i,j):1\leq i<j\leq N\}\), are independent with conditional distributions given by \[ \mathbb P(X_{i,j} = x\mid V ) = k(x,V_i,V_j), \] where \(k:\mathcal{X}\times\mathcal{V}\times\mathcal{V} \to [0,1]\) is a known function. The sets \(\mathcal X\), \(\mathcal V\) and the scores \(X^E=(X_{i,j})_{(i,j)\in E}\) are assumed to be available, while the vector \(V\) is unobserved; the objective is to estimate the common distribution \(\pi\) of the hidden variables \(V_1,\dots,V_N\) from the observations \(X^E\).

Let \(\mathcal A\) be a \(\sigma\)-field on \(\mathcal V\) and let \(\Pi\) be a set of probability measures on \((\mathcal{V}, \mathcal{A})\). For every \(\pi\in \Pi\), the joint distribution of \((X^E,V )\) is given, for any \(x^E \in\mathcal{X}^{|E|}\) and any \(A \in\mathcal{A}^{\otimes N}\), by \[ \mathbb{P}^E_{\pi}(X^E=x^E,V\in A) = \int \mathbb{I}_A (v) \prod_{(i,j)\in E } k(x^E_{i,j},v_i,v_j )\,\pi^{\otimes N } (dv), \] where \(\mathbb{I}_A\) is the indicator function of the set \(A\).

In this paper, \(\pi\) is estimated by the maximum likelihood estimator \(\hat{\pi}^E\), defined as any maximizer of the log-likelihood: \[ \hat{\pi}^E\in \arg\max_{\pi\in\Pi} \left\{ \log \mathbb{P}^E_{\pi}(X^E=x^E,V\in \mathcal{V}^N) \right\}. \] Risk bounds for this estimator are obtained by applying sub-Gaussian deviation results for Markov chains to the graphical model.
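The model and the likelihood above can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the kernel \(k\) is taken to be a Bradley-Terry win probability with \(\mathcal X=\{0,1\}\), the distribution \(\pi\) is supported on a finite set of positive abilities, and \(\Pi\) is a finite list of candidate weight vectors. The names `bt_kernel`, `simulate`, `log_likelihood` and `mle` are hypothetical. The integral over \(\pi^{\otimes N}\) is evaluated by brute-force enumeration, which is only feasible for very small \(N\).

```python
import itertools
import math
import random


def bt_kernel(x, u, v):
    """Assumed Bradley-Terry kernel k(x, u, v) on X = {0, 1}.

    x = 1 means the first individual wins; abilities u, v must be positive.
    """
    p = u / (u + v)
    return p if x == 1 else 1.0 - p


def simulate(n, support, weights, kernel, rng):
    """Draw V_1..V_n i.i.d. from pi, then one score X_{i,j} per pair i < j."""
    v = rng.choices(support, weights=weights, k=n)
    scores = {}
    for i, j in itertools.combinations(range(n), 2):
        scores[(i, j)] = 1 if rng.random() < kernel(1, v[i], v[j]) else 0
    return v, scores


def log_likelihood(scores, n, support, weights, kernel):
    """log P_pi(X^E = x^E, V in V^N) for a finitely supported pi.

    Sums the product of kernels over every assignment of abilities,
    so the cost grows like len(support) ** n.
    """
    total = 0.0
    for idx in itertools.product(range(len(support)), repeat=n):
        term = 1.0
        for a in idx:
            term *= weights[a]  # pi^{(x) N}(v)
        for (i, j), x in scores.items():
            term *= kernel(x, support[idx[i]], support[idx[j]])
        total += term
    return math.log(total)


def mle(scores, n, support, candidates, kernel):
    """Return a maximizer of the log-likelihood over a finite candidate set Pi."""
    return max(
        candidates,
        key=lambda w: log_likelihood(scores, n, support, w, kernel),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    support = [1.0, 4.0]
    true_weights = [0.3, 0.7]
    _, scores = simulate(6, support, true_weights, bt_kernel, rng)
    candidates = [[q / 10, 1 - q / 10] for q in range(1, 10)]
    print(mle(scores, 6, support, candidates, bt_kernel))
```

With so few individuals the maximizer is very noisy; the sketch is meant only to make the definitions concrete, not to reproduce the risk bounds discussed in the review.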

zbMATH Keywords

latent variables

nonasymptotic risk bounds

nonparametric estimation

paired comparisons data

multiple comparisons
