The polyserial correlation coefficient (Q792043)

Suppose the joint distribution of the variables \(X\), with \(E(X)=\mu\) and \(\operatorname{Var} X=\sigma^2\), and \(\eta\), with \(E(\eta)=0\) and \(\operatorname{Var}\eta=1\), is bivariate normal with correlation \(\rho_{X\eta}=\rho\). Instead of the underlying continuous variable \(\eta\), the authors consider the ordinal categorical variable \(Y\) defined by the monotonic step function \(Y=y_j\) if \(\tau_{j-1}\leq \eta<\tau_j\) \((j=1,2,\dots,r)\), with \(y_{j-1}<y_j\), \(\tau_0=-\infty\), \(\tau_{j-1}<\tau_j\), and \(\tau_r=\infty\). Its category probabilities are \(p_j=P(Y=y_j)=\Phi(\tau_j)-\Phi(\tau_{j-1})\), where \(\Phi(\tau)=\int^{\tau}_{-\infty}\phi(t)\,dt\) and \(\phi(t)=\exp(-t^2/2)/\sqrt{2\pi}\). From this the authors derive the ''point-polyserial'' correlation between \(X\) and \(Y\): \[ \tilde\rho=\frac{\rho}{\sigma_y}\sum^{r-1}_{j=1}\phi(\tau_j)\,(y_{j+1}-y_j), \] where \(\sigma_y\) is the standard deviation of \(Y\). This general relation depends on \(r\), on the threshold values \(\tau_j\), and on the scores \(y_j\). It generalizes known results on biserial correlation \((r=2)\) and on other special scoring systems, such as those studied by \textit{N. R. Cox} [Biometrics 30, 171-178 (1974; Zbl 0292.62022)] and \textit{N. Jaspen} [Serial correlation. Psychometrika 11, 23-30 (1946)]. The relation is used to estimate the polyserial correlation \(\rho\) from a sample of \(N\) observations \((x_i,y_i)\), \(i=1,\dots,N\), on the variable \((X,Y)\).
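The point-polyserial relation can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `category_probs` and `point_polyserial` are hypothetical, and \(\sigma_y\) is computed from the category probabilities \(p_j\) and the scores \(y_j\) as the population standard deviation of \(Y\).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def category_probs(thresholds):
    """p_j = Phi(tau_j) - Phi(tau_{j-1}), with tau_0 = -inf and tau_r = +inf.

    `thresholds` holds the r-1 finite values tau_1 < ... < tau_{r-1}.
    """
    cdf = [0.0] + [STD_NORMAL.cdf(t) for t in thresholds] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def point_polyserial(rho, thresholds, scores):
    """rho_tilde = (rho / sigma_y) * sum_j phi(tau_j) * (y_{j+1} - y_j).

    `scores` holds the r increasing scores y_1 < ... < y_r.
    """
    probs = category_probs(thresholds)
    mean_y = sum(p * y for p, y in zip(probs, scores))
    sigma_y = sqrt(sum(p * (y - mean_y) ** 2 for p, y in zip(probs, scores)))
    # Pair each threshold tau_j with the score step y_{j+1} - y_j.
    weight = sum(
        STD_NORMAL.pdf(t) * (y_next - y)
        for t, y, y_next in zip(thresholds, scores, scores[1:])
    )
    return rho * weight / sigma_y


if __name__ == "__main__":
    # Biserial special case (r = 2) with a single threshold at 0.
    print(point_polyserial(0.5, [0.0], [0, 1]))
    # Three categories scored 1, 2, 3 with symmetric thresholds.
    print(point_polyserial(0.5, [-0.5, 0.5], [1, 2, 3]))
```

In the biserial case with \(\tau_1=0\) and scores 0 and 1, the function reduces to \(\rho\,\phi(0)/\sqrt{p_1p_2}\), which matches the classical point-biserial relation.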
Assuming a scoring system in which the \(y_j\) are consecutive integers, the unknown model parameters \(\rho,\mu,\sigma,\tau_1,\dots,\tau_{r-1}\) remain to be estimated. The authors study three methods: 1) simultaneous estimation of all parameters by maximum likelihood, which requires solving a complicated non-linear system of equations; 2) the two-step method, in which \(\mu\) and \(\sigma^2\) are first estimated by the sample statistics \(\bar x\) and \(s^2_x\), and \(\tau_1,\dots,\tau_{r-1}\) by the inverse of the normal distribution function applied to the observed marginal distribution of \(Y\), after which a conditional maximum likelihood estimate of \(\rho\) is computed; 3) an ad hoc estimator \(\hat\rho=r_{xy}\cdot s_y/\sum_{j=1}^{r-1}\phi(\hat\tau_j)\) of \(\rho\), obtained by inserting into the above-mentioned relation the sample estimates \(r_{xy}\) for \(\tilde\rho\), \(s_y\) for \(\sigma_y\), and \(\hat\tau_j\) for \(\tau_j\). The three methods are compared by Monte Carlo simulation (a four-way \(2\cdot 2\cdot 3\cdot 2\) factorial design with factors \(N\), symmetry or asymmetry of the threshold system \((\tau)\), \(\rho\), and \(r\), with 50 replications in each cell). All three methods perform well, whereas the direct use of \(r_{xy}\) as an estimate of \(\rho\) would be rather misleading.
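The ad hoc estimator (method 3) can be illustrated on simulated data. This is a hedged sketch under stated assumptions rather than the authors' implementation: the helper names (`estimate_thresholds`, `pearson_and_sd`, `ad_hoc_polyserial`, `simulate`) are hypothetical, \(Y\) is coded with the consecutive integers \(1,\dots,r\), \(s_y\) uses the divisor \(N\) (the review does not say which divisor is used), and every category is assumed to be non-empty in the sample.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def estimate_thresholds(y, r):
    """tau_hat_j = Phi^{-1}(proportion of Y <= j) for j = 1..r-1.

    Assumes Y is coded 1..r; inv_cdf raises an error if a cumulative
    proportion is exactly 0 or 1 (empty outer categories).
    """
    n = len(y)
    taus, cumulative = [], 0
    for j in range(1, r):
        cumulative += sum(1 for v in y if v == j)
        taus.append(STD_NORMAL.inv_cdf(cumulative / n))
    return taus


def pearson_and_sd(x, y):
    """Return (r_xy, s_y), with s_y computed using divisor N (an assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy), sqrt(syy / n)


def ad_hoc_polyserial(x, y, r):
    """rho_hat = r_xy * s_y / sum_j phi(tau_hat_j), for integer scores."""
    taus = estimate_thresholds(y, r)
    r_xy, s_y = pearson_and_sd(x, y)
    return r_xy * s_y / sum(STD_NORMAL.pdf(t) for t in taus)


def simulate(n, rho, thresholds, seed=0):
    """Draw (X, eta) bivariate normal with correlation rho, then categorize eta."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        xs.append(rho * eta + sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0))
        # Y = j when tau_{j-1} <= eta < tau_j, scored 1..r.
        ys.append(1 + sum(eta >= t for t in thresholds))
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate(5000, 0.6, [-0.5, 0.5], seed=1)
    print("r_xy    :", pearson_and_sd(xs, ys)[0])
    print("rho_hat :", ad_hoc_polyserial(xs, ys, 3))
```

For positive \(\rho\), the raw product-moment correlation \(r_{xy}\) is attenuated relative to \(\rho\) by the factor \(\sum_j\phi(\tau_j)/\sigma_y\), and the ad hoc estimator simply undoes that attenuation. The two-step and full maximum likelihood methods need a numerical optimizer and are not sketched here.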

zbMATH Keywords

dichotomous variables
polychotomous variables
latent variables
ordinal categorical variable
monotonic step function
polyserial correlation
simultaneous estimation
maximum likelihood
non-linear equation system
two-step method
conditional maximum likelihood estimate
ad hoc estimator

MaRDI profile type: MaRDI publication profile

cites work: Q3667792

Identifiers

zbMATH Open document ID: 0536.62045

DOI: 10.1007/BF02294164

Mathematics Subject Classification ID

