Rates of convergence for Gibbs sampling in the analysis of almost exchangeable data (Q6056577)
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Rates of convergence for Gibbs sampling in the analysis of almost exchangeable data | scientific article; zbMATH DE number 7757102 | |
Statements
Rates of convergence for Gibbs sampling in the analysis of almost exchangeable data (English)
30 October 2023
The paper analyses a Gibbs sampler for sampling \(\mathbf p = (p_1, \ldots, p_d) \in [0,1]^d\) from a distribution with density proportional to \[ \exp\Bigl( - A^2 \sum_{i,j \in [d] : i < j} c_{i,j} (p_i - p_j)^2 \Bigr), \] where \(A\) is large and the \(c_{i,j}\) are non-negative weights. The Gibbs sampler used is \textit{Glauber dynamics}: in each step, a single, uniformly chosen coordinate is updated according to its conditional distribution given all the others.

The weights can be viewed as a weighted graph on the vertex set \([d]\). Collecting them into a symmetric matrix \(C = (c_{i,j})_{i,j=1}^d\) with zero diagonal and setting \(c_i := \sum_{j \in [d] : j \ne i} c_{i,j}\), the weights generate a discrete-time Markov chain on \([d]\) with transition matrix \(P = (p_{i,j})\), where \[ \textstyle p_{i,j} = c_{i,j} / c_i. \] The chain is reversible with respect to the measure \(c = (c_i)_{i=1}^d\), since \(c_i p_{i,j} = c_{i,j} = c_j p_{j,i}\); normalizing \(c\) gives the equilibrium distribution. Letting \(\Delta = I - P\) denote the Laplacian of the Markov chain, \[ \textstyle \sum_{i,j \in [d] : i < j} c_{i,j} (p_i - p_j)^2 = \langle \mathbf p, \Delta \mathbf p \rangle, \quad\text{where}\quad \langle \mathbf x, \mathbf y \rangle := \sum_{i \in [d]} c_i x_i y_i. \]

The main result analyses the mixing time of the Gibbs sampler in the limit \(A \to \infty\) with all other quantities fixed. The mixing metric is the \(\ell_\infty\)-Wasserstein distance \[ \textstyle d_\infty(\mu, \nu) := \inf_{(X, Y)} \mathbb E( \| X - Y \|_\infty ), \] where the infimum is over all couplings \((X, Y)\) of \(\mu\) and \(\nu\). For \(\varepsilon \in (0,1)\), define \[ \textstyle t_\mathsf{mix}(\varepsilon) := \inf\{ t \ge 0 \mid \sup_{\mathbf p \in [0,1]^d} d_\infty(K_t(\mathbf p), \pi_{A,C}) \le \varepsilon \}, \] where \(K_t(\mathbf p)\) denotes the time-\(t\) transition probabilities of the Gibbs sampler started from \(\mathbf p \in [0,1]^d\) and \(\pi_{A,C}\) is the target distribution \[ \frac{d\pi_{A,C}}{d\mathbf p} \propto \exp\Bigl( - A^2 \sum_{i,j \in [d] : i < j} c_{i,j} (p_i - p_j)^2 \Bigr). \] In short, the mixing time is shown to be of order \(A^2\): \[ t_\mathsf{mix}(\tfrac14 - \varepsilon) \gtrsim \tfrac \lambda\gamma A^2 \quad\text{and}\quad t_\mathsf{mix}(\varepsilon) \lesssim d A^2, \] where \(\lambda\) is the spectral gap of \(\Delta\), \(\gamma\) is the absolute spectral gap, and the implicit constants in \(\gtrsim\) and \(\lesssim\) depend only on \(\varepsilon\).

The paper is motivated by work of de Finetti, who was particularly interested in situations where \(\mathbf p\) is approximately concentrated around the main diagonal of the hypercube, \[ \mathcal D := \{ p \mathbf 1 \mid p \in [0,1] \} \subseteq [0,1]^d, \quad\text{where}\quad \mathbf 1 := (1, \ldots, 1). \] This situation is referred to as \textit{almost exchangeability}, because it captures the idea that the \(d\) variables are almost indistinguishable.
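To make the update rule concrete: since the exponent is quadratic in each coordinate, completing the square shows that the conditional law of \(p_i\) given the other coordinates is a Gaussian with mean \(m_i = \sum_{j \ne i} c_{i,j} p_j / c_i\) and variance \(1/(2 A^2 c_i)\), truncated to \([0,1]\). The following Python sketch of one Glauber update is ours, not from the paper; it assumes SciPy's truncated normal and a weight matrix with connected support, and the name `glauber_step` is hypothetical:

```python
import numpy as np
from scipy.stats import truncnorm

def glauber_step(p, C, A, rng):
    """One Glauber-dynamics update for the density proportional to
    exp(-A^2 * sum_{i<j} c_ij (p_i - p_j)^2) on [0,1]^d.

    C: symmetric non-negative weight matrix with zero diagonal.
    """
    d = len(p)
    i = rng.integers(d)                      # uniformly chosen coordinate
    c_i = C[i].sum()                         # c_i = sum_{j != i} c_ij
    m = C[i] @ p / c_i                       # conditional mean
    sigma = 1.0 / (A * np.sqrt(2.0 * c_i))   # conditional standard deviation
    # Gaussian(m, sigma^2) truncated to [0, 1], in standardized units:
    a, b = (0.0 - m) / sigma, (1.0 - m) / sigma
    p[i] = truncnorm.rvs(a, b, loc=m, scale=sigma, random_state=rng)
    return p

# Example run: complete graph with unit weights. For large A the
# chain concentrates near the diagonal {p * 1 : p in [0,1]}.
rng = np.random.default_rng(0)
d, A = 5, 50.0
C = np.ones((d, d)) - np.eye(d)
p = rng.random(d)
for _ in range(10_000):
    p = glauber_step(p, C, A, rng)
```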
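The spectral quantities \(\lambda\) and \(\gamma\) appearing in the bounds depend only on the weight matrix. Since \(P\) is reversible with respect to \(c\), it is similar to the symmetric matrix \(S = D^{1/2} P D^{-1/2}\) with \(D = \mathrm{diag}(c)\), so its spectrum is real and a symmetric eigensolver applies. A minimal sketch under the same assumptions as above (the helper `spectral_gaps` is ours):

```python
import numpy as np

def spectral_gaps(C):
    """Spectral gap of Delta = I - P and absolute spectral gap of P,
    where p_ij = c_ij / c_i. Assumes C symmetric, non-negative,
    zero diagonal, with connected support.
    """
    c = C.sum(axis=1)
    # S = D^{1/2} P D^{-1/2} has entries c_ij / sqrt(c_i c_j); it is
    # symmetric and shares P's (real) eigenvalues.
    S = C / np.sqrt(np.outer(c, c))
    eta = np.sort(np.linalg.eigvalsh(S))[::-1]   # eta[0] == 1
    lam = 1.0 - eta[1]                           # spectral gap of Delta
    gamma = 1.0 - np.abs(eta[1:]).max()          # absolute spectral gap
    return lam, gamma
```

As a sanity check, on the complete graph with unit weights every non-trivial eigenvalue of \(P\) equals \(-1/(d-1)\), so \(\lambda = d/(d-1)\) and \(\gamma = (d-2)/(d-1)\); there the lower and upper bounds above differ only by a factor of order \(d\).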
Gibbs sampler
Markov chain
mixing time