Conditional formulae for Gibbs-type exchangeable random partitions (Q373830): Difference between revisions

Let \((X_{n})_{n\geq 1}\) be an \({\mathcal X}\)-valued exchangeable sequence and \(\operatorname{P}\) the random probability on \({\mathcal X}\) in its de Finetti representation. \(\operatorname{P}\) is assumed to be concentrated on the set of discrete probabilities, with representation \(\operatorname{P}=\sum_{i\in I}p_{i}\varepsilon_{Y_{i}}\), where \((p_{i})\) and \((Y_{i})\) are independent. For every \(n\), consider the random partition \(\Pi_{n}\) of \(\{1,\dots,n\}\) defined by the exchangeable equivalence relation \(i\sim j\) if \(X_{i}=X_{j}\). Its distribution is characterized by the probabilities \[ p_{k}^{(n)}(n_{1},\dots,n_{k}), \text{ where }\sum_{i=1}^{k}n_{i}=n, \] that \(\Pi_{n}\) equals a given partition into \(k\) sets of cardinalities \(n_{1},\dots,n_{k}\). The number of sets of cardinality \(i\) in \(\Pi_{n}\) is denoted \(M_{i,n}\), and the total number \(k\) of sets is denoted \(K_{n}\).

The partition is said to be of Gibbs type if \[ p_{k}^{(n)}(n_{1},\dots,n_{k})=V_{n,k}\prod_{i=1}^{k}(1-\sigma )_{n_{i}-1},\quad \sigma \in (-\infty ,1), \] where \((a)_{n}= a(a+1)\cdots (a+n-1)\) denotes the rising factorial and the weights satisfy \[ V_{n,k}=V_{n+1,k+1}+(n-\sigma k) V_{n+1,k},\quad k\leq n,\text{ with } V_{1,1}=1. \]

Let \(O_{i,m}^{(n)}\) be the number of sets of size \(i\) in \(\Pi_{n+m}\) intersecting \(\{1,\dots,n\}\), \(N_{i,m}^{(n)}\) the number of sets of size \(i\) in \(\Pi_{n+m}\) not intersecting \(\{1,\dots,n\}\), and \(M_{i,m}^{(n)}=O_{i,m}^{(n)}+N_{i,m}^{(n)}\). The authors establish formulas for \(\operatorname{E}((M_{i,n})_{[q]})\), where \(a_{[q]} =a(a-1)\cdots (a-q+1)\) is the falling factorial, and for \[ \operatorname{E}((O_{i,m}^{(n)})_{[q]}),\quad \operatorname{E}((N_{i,m}^{(n)})_{[q]}),\quad \text{and}\quad \operatorname{E}((M_{i,m}^{(n)})_{[q]}), \] the expectations in the latter three being conditional on \((K_{n},M_{1,n},\dots,M_{K_{n},n})\).
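
As a worked check of the recursion for \(V_{n,k}\), here is a short sketch under the usual reading of \(p_{k}^{(n)}(n_{1},\dots,n_{k})\) as the probability of one fixed partition of \(\{1,\dots,n\}\) into \(k\) sets of cardinalities \(n_{1},\dots,n_{k}\) (an assumption of this note, not a statement taken from the paper). Element \(n+1\) either forms a new set or joins the \(j\)-th existing set, so consistency between \(\Pi_{n}\) and \(\Pi_{n+1}\) requires \[ p_{k}^{(n)}(n_{1},\dots,n_{k}) = p_{k+1}^{(n+1)}(n_{1},\dots,n_{k},1)+\sum_{j=1}^{k}p_{k}^{(n+1)}(n_{1},\dots,n_{j}+1,\dots,n_{k}). \] Inserting the Gibbs product form and using \((1-\sigma )_{0}=1\), \((1-\sigma )_{n_{j}}=(1-\sigma )_{n_{j}-1}(n_{j}-\sigma )\) and \(\sum_{j=1}^{k}(n_{j}-\sigma )=n-\sigma k\), the right-hand side becomes \[ \Bigl[V_{n+1,k+1}+(n-\sigma k)V_{n+1,k}\Bigr]\prod_{i=1}^{k}(1-\sigma )_{n_{i}-1}, \] which gives the stated recursion once the common product is cancelled against the left-hand side \(V_{n,k}\prod_{i=1}^{k}(1-\sigma )_{n_{i}-1}\).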
The results are applied to three examples: the Dirichlet case (D), with \(\sigma =0\) and \(V_{n,k}=\theta^{k}/(\theta )_{n}\), \(\theta >0\), where \((\theta )_{n}=\theta (\theta +1)\cdots (\theta +n-1)\); the two-parameter Poisson-Dirichlet case (PD), with \[ \sigma \in (0,1),\quad V_{n,k}=\prod_{i=0}^{k-1}(\theta +i\sigma )/(\theta )_{n},\quad \theta > -\sigma ; \] and the Gnedin model, with \[ \sigma =-1,\quad V_{n,k}=(\gamma )_{n-k}\prod_{i=1}^{k-1}(i^{2}-\gamma i)\prod_{i=1}^{n-1}(i^{2}+\gamma i)^{-1},\quad \gamma \in [0,1). \] Explicit formulas for the distributions of \(O_{i,m}^{(n)}\), \(N_{i,m}^{(n)}\), \(M_{i,m}^{(n)}\) and for their means are obtained.

Convergence in distribution results are also given. In D, \(M_{i,n}\rightarrow \pi_{\theta /i}\) as \(n\rightarrow \infty\), where \(\pi_{\lambda }\) denotes a Poisson random variable with parameter \(\lambda\), and \[ M_{i,m}^{(n)},\ N_{i,m}^{(n)}\rightarrow \pi_{(\theta +n)/i} \] as \(m\rightarrow \infty\). In PD, \[ N_{i,m}^{(n)}/ m^{\sigma },\ M_{i,m}^{(n)}/ m^{\sigma }\rightarrow \sigma (1-\sigma )_{i-1}(i!)^{-1}B Y, \] \[ K_{m}^{(n)}/ m^{\sigma }\rightarrow BY, \] where \(B\) and \(Y\) are independent, \(B\sim \beta (j+\theta /\sigma ,n/\sigma -j)\) with \(j=K_{n}\), and \(Y\) has density \[ \frac{\Gamma (q\sigma +1)}{\sigma \Gamma (q+1)}\, y^{q-1/\sigma -1}f_{\sigma }(y^{-1/\sigma }), \] with \(q=(\theta +n)/\sigma\) and \(f_{\sigma }\) the density of a positive \(\sigma\)-stable random variable. In the Gnedin model, \(M_{i,m}^{(n)}, N_{i,m}^{(n)}\rightarrow 0\).

In the section ``Genomic applications'', the authors analyze a data set of 2586 observations under the PD model, estimating the parameters by maximizing the corresponding \(p_{k}^{(n)}(n_{1},\dots,n_{k})\). They study \(O_{\tau }^{(n)}= O_{1,m}^{(n)}+\dots+O_{\tau ,m}^{(n)}\) (the number of new genes appearing at most \(\tau\) times in the \(m\) experiments following the first \(n\)) for \(\tau =3,4,5\), and similarly for \(N\) and \(M\). They split the data into \(n=1000\) and \(m=1586\), compare the observed \(O\), \(N\), \(M\) with the predicted values (computed from the conditional expectations \(\operatorname{E}\)), and then give predictions for \(n=2586\) and \(m= 250,500,750,1000\).
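
The three weight families quoted above can be checked against the Gibbs recursion \(V_{n,k}=V_{n+1,k+1}+(n-\sigma k)V_{n+1,k}\) with exact rational arithmetic. The following is only a minimal sketch: the function names (`rising`, `v_dirichlet`, `v_pd`, `v_gnedin`, `satisfies_recursion`) and the parameter values are illustrative choices that do not come from the paper, and \(\theta_{n}\), \(\gamma_{n-k}\) are read as rising factorials.

```python
from fractions import Fraction


def rising(a, n):
    """Rising factorial (a)_n = a(a+1)...(a+n-1), with (a)_0 = 1."""
    out = Fraction(1)
    for j in range(n):
        out *= a + j
    return out


def v_dirichlet(n, k, theta):
    """D case (sigma = 0): V_{n,k} = theta^k / (theta)_n."""
    return Fraction(theta) ** k / rising(Fraction(theta), n)


def v_pd(n, k, sigma, theta):
    """PD case: V_{n,k} = prod_{i=0}^{k-1} (theta + i*sigma) / (theta)_n."""
    num = Fraction(1)
    for i in range(k):
        num *= theta + i * sigma
    return num / rising(Fraction(theta), n)


def v_gnedin(n, k, gamma):
    """Gnedin case (sigma = -1), with the weights as quoted in the review."""
    num = rising(Fraction(gamma), n - k)
    for i in range(1, k):
        num *= i * i - gamma * i
    den = Fraction(1)
    for i in range(1, n):
        den *= i * i + gamma * i
    return num / den


def satisfies_recursion(v, sigma, n_max=8):
    """Check V_{n,k} = V_{n+1,k+1} + (n - sigma*k) V_{n+1,k} for 1 <= k <= n <= n_max."""
    return all(
        v(n, k) == v(n + 1, k + 1) + (n - sigma * k) * v(n + 1, k)
        for n in range(1, n_max + 1)
        for k in range(1, n + 1)
    )


# Illustrative parameter values (hypothetical, chosen only for the check).
theta, sigma, gamma = Fraction(1, 3), Fraction(1, 2), Fraction(1, 4)
print(satisfies_recursion(lambda n, k: v_dirichlet(n, k, theta), 0))
print(satisfies_recursion(lambda n, k: v_pd(n, k, sigma, theta), sigma))
print(satisfies_recursion(lambda n, k: v_gnedin(n, k, gamma), -1))
```

Using `Fraction` rather than floating point keeps the equality test exact, so a failure would indicate a genuine mismatch between a weight formula and the recursion rather than rounding error.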

zbMATH Keywords

de Finetti representation

exchangeable partitions

conditional distributions

Dirichlet (D)

two parameter Poisson Dirichlet (PD) process

Gnedin model

genomic applications

reviewed by

Ioan Cuculescu

MaRDI profile type

MaRDI publication profile

cites work

Poisson process approximations for the Ewens sampling formula

Logarithmic combinatorial structures: A probabilistic approach

Refined Approximations for the Ewens Sampling Formula

Combinatorial Methods in Discrete Distributions

The sampling theory of selectively neutral alleles

A Bayesian analysis of some nonparametric problems

A species sampling model with finitely many types

Q5695526

Record indices and age-ordered frequencies in exchangeable Gibbs partitions

Lamperti-type laws

The Representation of Partition Structures

The coalescent

Bayesian Nonparametric Estimation of the Probability of Discovering New Species

Bayesian nonparametric estimators derived from conditional Gibbs structures

Exchangeable and partially exchangeable random partitions

Combinatorial stochastic processes. Ecole d'Eté de Probabilités de Saint-Flour XXXII -- 2002.

The number of small blocks in exchangeable random partitions

Identifiers

zbMATH Open document ID

1287.60046

DOI

10.1214/12-AAP843

Sitelinks

Mathematics (1 entry)

mardi Publication:373830

Revision as of 23:22, 6 July 2024 by ReferenceBot: Changed an Item
Latest revision as of 09:50, 30 July 2024 by Openalex240730090724: Set OpenAlex properties.
Property / OpenAlex ID: W2012039831 (Normal rank)