Order statistics for discrete case with a numerical application to the binomial distribution (Q767612)
From MaRDI portal
scientific article
Statements
Order statistics for discrete case with a numerical application to the binomial distribution (English)
1956
Let \(x\) be a random variable which takes the value \(i\) with probability \(p(i)\) \((i = 0, 1,\ldots, M)\). Suppose that \(k\) independent observations on \(x\) are made. Let \(x_1\) be the largest value among these \(k\), and \(x_2\) the smallest. The probability distributions of \(x_1\), \(x_2\) and \(x_1- x_2 = R_k\) (the range of the \(k\) observed values), together with the means and variances of these distributions, are computed, both for finite and infinite \(M\). The methods of proof closely resemble those for continuously distributed \(x\), but the algebra is necessarily more meticulous since the probability that more than one observation attains, for instance, the maximum value is no longer negligible. The results are applied to the case in which \(x\) follows the binomial distribution with parameters \(p\) and \(N\). This is used as a model for the following situation in taste testing: \(N\) \((= 17)\) judges taste each of \(k\) \((= 8)\) brands of wine and decide for each brand whether or not it is first class; \(x\) is the number of judges deciding that it is, and obviously \(x\) may, but need not, take different values for different brands. The range \(R_k\) is used to test whether the brands are equally good. The largest value \(x_1\) is used to test against the alternative that one of the brands is substantially better. The discussion of examples of this situation seems somewhat unsatisfactory. No mention is made of pitfalls such as the possibility that the observed values of \(x\) might depend on the order in which the brands are tasted by the judges. It is not clear what the point is of stating that the observed value of the range exceeds its mean by 2.98 times the standard deviation, since the range is not normally distributed.
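The quantities described in the review can be illustrated numerically. The following is a minimal Python sketch, not the author's own derivation or tables: it uses the standard identities \(P(x_1 \le m) = F(m)^k\) and \(P(x_2 \ge m) = P(x \ge m)^k\), and obtains the range by inclusion-exclusion over the smallest and largest values. The function names are hypothetical, and the value \(p = 0.5\) is an assumption made only for illustration, since the review does not state the value of \(p\).

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(x = i) for i = 0..n] for a binomial(n, p) variable."""
    return [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]

def cdf_from_pmf(pmf):
    """Cumulative sums F(i) = P(x <= i)."""
    total, out = 0.0, []
    for q in pmf:
        total += q
        out.append(total)
    return out

def max_pmf(pmf, k):
    """P(x_1 = m) = F(m)^k - F(m-1)^k for the largest of k observations."""
    F = cdf_from_pmf(pmf)
    return [F[m]**k - (F[m - 1]**k if m > 0 else 0.0) for m in range(len(pmf))]

def min_pmf(pmf, k):
    """P(x_2 = m) = S(m)^k - S(m+1)^k, where S(m) = P(x >= m)."""
    M = len(pmf) - 1
    S = [sum(pmf[m:]) for m in range(M + 2)]  # S[M + 1] is 0
    return [S[m]**k - S[m + 1]**k for m in range(M + 1)]

def range_pmf(pmf, k):
    """P(R_k = r), summing over the smallest value a with largest value a + r."""
    M = len(pmf) - 1
    F = [0.0] + cdf_from_pmf(pmf)  # F[i + 1] = P(x <= i), F[0] = 0

    def mass(lo, hi):
        # P(lo <= x <= hi); zero when the interval is empty
        return F[hi + 1] - F[lo] if hi >= lo else 0.0

    out = []
    for r in range(M + 1):
        total = 0.0
        for a in range(M - r + 1):
            b = a + r
            if r == 0:
                # all k observations equal: ties are not negligible here
                total += pmf[a]**k
            else:
                # all in [a, b], minus "no a", minus "no b", plus "neither"
                total += (mass(a, b)**k - mass(a + 1, b)**k
                          - mass(a, b - 1)**k + mass(a + 1, b - 1)**k)
        out.append(total)
    return out

def mean_var(dist):
    """Mean and variance of a distribution on 0, 1, ..., len(dist) - 1."""
    mean = sum(i * q for i, q in enumerate(dist))
    var = sum((i - mean)**2 * q for i, q in enumerate(dist))
    return mean, var

# N = 17 judges and k = 8 brands as in the review; p = 0.5 is an assumed value.
N, k, p = 17, 8, 0.5
pmf = binomial_pmf(N, p)
print("mean, variance of range:", mean_var(range_pmf(pmf, k)))
print("mean, variance of largest value:", mean_var(max_pmf(pmf, k)))
```

The `r == 0` branch makes explicit the point raised in the review: in the discrete case the probability that several observations share the extreme value has to be accounted for.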
If a group of \(k\) values of \(x\) is found to deviate significantly (tail probability 2.8\%) from the hypothesis of homogeneity, the author divides these \(x\)-values into 2 groups (``by simple inspection''), and these 2 groups, one of 6 values and the other of 2, are subsequently tested for homogeneity at the 5\% level without adjustment for the preceding grouping. The author replaces the parameter \(p\) of the binomial distribution of \(x\) by its estimate from the pooled \(k\) samples without any change in the distribution theory. He thereby acts on the assumption ``that we are in the situation that \dots \(N k\) is large enough'' so that this pooled estimate of \(p\) may ``be taken for the true value under the hypothesis \(H\)''; this hypothesis states that the parameter \(p\) of the binomial distribution assumed to underlie the \(k\) distributions of \(x\) is the same for each of them. It should be mentioned that the author announces the publication of tables of the distributions of the range \(R_k\) and of the largest value \(x_1\), and of the mean and variance of the range, all four for the case in which \(x\) is binomially distributed.
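The plug-in step criticised above can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the author's computation: the pooled estimate is taken to be the total number of favourable verdicts divided by \(N k\), the tail probability \(P(R_k \ge r_{\text{obs}})\) is evaluated as if that estimate were the true \(p\), and the counts below are hypothetical values, not data from the paper. The function names are likewise hypothetical.

```python
from math import comb

def pooled_p(counts, n_judges):
    """Pooled estimate of p: total favourable verdicts divided by N * k."""
    return sum(counts) / (n_judges * len(counts))

def range_cdf(pmf, k, r):
    """P(R_k <= r): sum over the smallest value a of P(min = a, max <= a + r)."""
    if r < 0:
        return 0.0
    M = len(pmf) - 1
    total = 0.0
    for a in range(M + 1):
        hi = min(a + r, M)
        with_a = sum(pmf[a:hi + 1])         # P(a <= x <= hi)
        without_a = sum(pmf[a + 1:hi + 1])  # P(a + 1 <= x <= hi)
        total += with_a**k - without_a**k
    return total

def range_tail_probability(counts, n_judges):
    """P(R_k >= observed range) under H, treating the pooled estimate as the true p."""
    k = len(counts)
    p_hat = pooled_p(counts, n_judges)
    pmf = [comb(n_judges, i) * p_hat**i * (1 - p_hat)**(n_judges - i)
           for i in range(n_judges + 1)]
    observed = max(counts) - min(counts)
    return 1.0 - range_cdf(pmf, k, observed - 1)

# Hypothetical counts for k = 8 brands and N = 17 judges (not the paper's data).
counts = [9, 11, 8, 10, 12, 9, 15, 14]
print("pooled estimate of p:", pooled_p(counts, 17))
print("tail probability of the range:", range_tail_probability(counts, 17))
```

Because the same counts supply both the estimate of \(p\) and the test statistic, the resulting tail probability is only approximate, which is the reservation expressed in the review.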
order statistics
binomial distribution