Statistical evaluation of algebraic constraints for social networks (Q1599169)

To evaluate constraints in a situation where the relational data are assumed to be subject to random variation, we need to make some distributional assumptions. There are three families of multivariate random (directed) graph distributions: (a) basic random graph distributions, including Bernoulli graphs and their generalizations; (b) the dyad-independent \(p_{1}\) model and its relatives; (c) the \(p^{*}\) model, including Markov random graphs and their generalizations. The proposed procedures are based on conditional uniform random multigraph distributions. To describe these distributions, we assume that each multigraph \(M\) defined on \(N\) has a fixed set of \(r\) relation labels. The multigraph is regarded as a random variable, with realizations that are elements of some sample space \(\Omega\) of random (directed) multigraphs. Every element of this sample space is assumed to have the same probability of occurrence; that is, the distribution is uniform. The multigraphs in \(\Omega\) may be restricted in some way; that is, they may be subject to some ``marginal'' constraints. Consider now the general problem of how to evaluate the conformity between an observed multigraph and a constraint set. The approach is to evaluate the constraint set with respect to one or more conditional uniform random multigraph distributions. That is, one can evaluate a constraint set against the assumption that the multigraph under study is random and governed by a particular conditional uniform distribution \(U\mid Q\), the uniform distribution conditional on a set of properties \(Q\). There is rarely a single distribution \(U\mid Q\) that is appropriate to the evaluation, so the evaluation is instead conducted using several such distributions. For a particular \(U\mid Q\), we consider the distribution of the index \(v\) that is generated by this distributional hypothesis.
We compute a \(p\)-value for a particular constraint set, assuming the distribution \(U\mid Q\), as a tail area of the distribution of the statistic \(v\). We interpret an extreme (small or large) observed value of \(v\) in this distribution as evidence that the (high or low) degree of conformity of the observed relations to the constraint set is very unlikely if the underlying stochastic mechanism is \(U\mid Q\). In general, this procedure is not statistical inference in the usual sense. In particular, it is not claimed that an extreme, small value of \(v\) assuming some distribution \(U\mid Q\) provides a simple endorsement of the constraint set, since one can never be sure that the underlying stochastic mechanism is indeed \(U\mid Q\). Rather, the computation of \(p\)-values signifying the extremity of \(v\) under various distributions \(U\mid Q\) is regarded as a series of ``thought experiments'', providing useful evidence on which a deeper understanding of the structural features of observed relations might be based. In particular, this type of evidence is seen as informing the development of more precise parametric models of multigraph structure. It is clear, therefore, that an important aspect of the proposed framework for evaluating a constraint set is the choice of the properties \(Q\) on which the evaluations are conditioned. A number of issues are relevant to the choice of properties. For example, the constraints may be hypothesized to characterize constraints among relational ties between the classes of some hypothesized partition of the nodes. As a result, conformity to the constraints would be expected to be extreme with respect to a random multigraph distribution that does not condition on aspects of the hypothesized partition, but would not be expected to be extreme with respect to a distribution for which the association of the relational ties with the partition had been taken into account.
Thus, if some algebraic constraint is exactly associated with an induced multigraph under a hypothesized partition \(\varphi\), then the distribution of the statistic \(v\) will be degenerate. More generally, if the algebraic constraint is associated with a hypothesized partition \(\varphi\) and if the stochastic mechanism is \(U\mid Q_{\theta}\), where \(Q_{\theta}\) fixes some aspect of the multigraph induced by \(\varphi\), then we would not expect the statistic \(v\) to be extreme in the randomization distribution associated with \(U\mid Q_{\theta}\). A second general form of the multivariate association model that may be helpful in assessing a set of algebraic constraints specifies that, for any pair \(i\) and \(j\) of nodes, ties of any type are equally likely to occur. This uniform ordering assumption implies that \[ \text{Pr}(X_{1}(i,j)=1)= \text{Pr}(X_{2}(i,j)=1)=\dots= \text{Pr}(X_{r}(i,j)=1)\quad \text{for each } i,j \in N \] and may be represented by the uniform random multigraph distribution \(U\mid \{M(+,i,j)\}\) that is conditional on the values \(\{M(+,i,j)\}\). If the observed value of \(v\) for some constraint set is small and extreme relative to the randomization distribution of \(v\), then there is evidence in favor of the hypothesized ordering constraints relative to the assumption of uniform ordering. Generalization of the uniform ordering hypothesis to properties that fix the number of ties from \(i\) to \(j\) for particular subsets of relations may also be useful. In summary, the authors suggest that it is often useful to conduct a series of evaluations of a particular constraint set with respect to a variety of conditional uniform random multigraph distributions.
They note, however, that in those (currently rare) contexts in which assumptions about underlying stochastic mechanisms can be strongly defended, a single evaluation of a hypothesized constraint set with respect to the corresponding distribution may be preferable, because it permits a more standard permutation-test interpretation of the computed \(p\)-values.
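
The randomization procedure described in this review can be illustrated with a minimal Monte Carlo sketch for the uniform ordering distribution \(U\mid \{M(+,i,j)\}\). Several choices here are assumptions made for illustration and are not taken from the paper: the multigraph is stored as a binary array of shape (r, n, n); the function names are hypothetical; the index \(v\) is replaced by a simple stand-in that counts violations of a single inclusion constraint (every tie of type a is also a tie of type b); and the tail area is approximated by simulation rather than computed exactly.

```python
import numpy as np


def violations(X, a=0, b=1):
    """Stand-in for the index v (an assumption, not the paper's definition):
    number of ordered pairs (i, j) with a tie of type a but no tie of type b."""
    return int(np.sum((X[a] == 1) & (X[b] == 0)))


def sample_uniform_ordering(X, rng):
    """Draw one multigraph with the same M(+, i, j) as X.

    For each ordered pair (i, j), the r relation indicators X[:, i, j] are
    permuted independently, so every assignment of the M(+, i, j) ties to
    relation labels is equally likely."""
    keys = rng.random(X.shape)           # independent random sort keys
    order = np.argsort(keys, axis=0)     # a random permutation per (i, j)
    return np.take_along_axis(X, order, axis=0)


def lower_tail_p_value(X, stat=violations, n_draws=2000, seed=0):
    """Monte Carlo estimate of the lower tail area of stat under U | {M(+,i,j)}."""
    rng = np.random.default_rng(seed)
    observed = stat(X)
    sims = np.array([stat(sample_uniform_ordering(X, rng)) for _ in range(n_draws)])
    # Add-one correction keeps the estimated p-value strictly positive.
    p = (1 + int(np.sum(sims <= observed))) / (n_draws + 1)
    return observed, p


# Toy data (hypothetical): r = 3 relations on n = 5 nodes,
# with relation 0 contained in relation 1.
X = np.zeros((3, 5, 5), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]:
    X[1, i, j] = 1
for i, j in [(0, 1), (1, 2), (2, 3)]:
    X[0, i, j] = 1
X[2, 3, 4] = 1

observed, p = lower_tail_p_value(X)
print(f"observed v = {observed}, estimated lower-tail p = {p:.3f}")
```

A small estimated tail area would be read, in the spirit of the review, only as evidence relative to the uniform ordering assumption; repeating the computation with samplers for other conditioning properties \(Q\) would give the series of evaluations that the authors recommend.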

Mathematics Subject Classification ID

91D30

zbMATH DE Number

1750030

zbMATH Keywords

social networks

algebraic constraints
