Are extreme value estimation methods useful for network data? (Q2303029)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Are extreme value estimation methods useful for network data? | scientific article | |
Statements
Are extreme value estimation methods useful for network data? (English)
28 February 2020
Let \(G(n)=(V(n),E(n))\) be a directed network, where \(V(n)\) is the set of nodes, \(E(n)\) is the set of edges, and \(n\) is the number of edges. Let \(N(n)\) denote the number of nodes in \(G(n)\) and let \(N_n(i,j)\) be the number of nodes with in-degree \(i\) and out-degree \(j\). The marginal counts of nodes with in-degree \(i\) and of nodes with out-degree \(j\) are \[ N_i^{\mathrm{in}}(n)=\sum_{j=0}^\infty N_n(i,j), \quad N_j^{\mathrm{out}}(n)=\sum_{i=0}^\infty N_n(i,j). \] The empirical degree frequencies are assumed to converge almost surely, i.e. \[ \frac{N_n(i,j)}{N(n)}\mathop{\longrightarrow}_{n\to\infty}p_{ij}\quad\mathrm{a.s.}, \] where \((p_{ij})\) is the joint probability mass function of a bivariate integer-valued random vector. The network is further assumed to exhibit power-law behavior, i.e. the marginal distributions satisfy \[ p_i^{\mathrm{in}}=\sum_{j=0}^\infty p_{ij}\mathop{\sim}_{i\to\infty}C_{\mathrm{in}}i^{-(1+\iota_{\mathrm{in}})},\quad p_j^{\mathrm{out}}=\sum_{i=0}^\infty p_{ij}\mathop{\sim}_{j\to\infty}C_{\mathrm{out}}j^{-(1+\iota_{\mathrm{out}})}, \] for some positive constants \(C_{\mathrm{in}}\), \(C_{\mathrm{out}}\), \(\iota_{\mathrm{in}}\) and \(\iota_{\mathrm{out}}\).
The authors describe two classes of preferential attachment models that generate networks with power-law degree distributions. They also consider semiparametric estimation of the model parameters based on an extreme value approach, compare this method with existing parametric approaches, and demonstrate that it can provide more robust parameter estimates when the data are corrupted or the model is misspecified.
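As a rough illustration of the quantities defined in the review, the following Python sketch computes the marginal degree counts from an edge list and estimates a tail index from the largest degrees. The review does not name a specific estimator; the Hill estimator used here is a standard extreme value choice and is an assumption, as are the function names `degree_counts` and `hill_tail_index` and the choice of the number of upper order statistics `k`. Since a probability mass function decaying like \(i^{-(1+\iota)}\) has a tail decaying like \(i^{-\iota}\), the Hill estimate applied to in-degrees (or out-degrees) targets the corresponding tail index.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Return (N_in, N_out, N) for a directed edge list of pairs (u, v), u -> v.

    N_in[i] is the number of nodes with in-degree i, N_out[j] the number of
    nodes with out-degree j, and N the total number of nodes.
    """
    in_deg, out_deg = Counter(), Counter()
    nodes = set()
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
        nodes.update((u, v))
    # A Counter returns 0 for missing keys, so degree-0 nodes are counted too.
    n_in = Counter(in_deg[w] for w in nodes)
    n_out = Counter(out_deg[w] for w in nodes)
    return n_in, n_out, len(nodes)


def hill_tail_index(values, k):
    """Hill estimate of the tail index from the k largest positive values.

    Assumption: the Hill estimator stands in for the "extreme value
    approach"; the choice of k is left to the caller.
    """
    x = sorted((v for v in values if v > 0), reverse=True)
    if not 1 <= k < len(x):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = x[k]  # the (k+1)-th largest value
    mean_log_excess = sum(math.log(x[i] / threshold) for i in range(k)) / k
    if mean_log_excess == 0:
        raise ValueError("top k+1 values are tied; choose a larger k")
    return 1.0 / mean_log_excess
```

In practice the estimate depends noticeably on `k`, and integer-valued degrees produce ties, so a plot of the estimate against a range of `k` values is usually inspected before settling on one.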
power laws
multivariate heavy-tailed statistics
preferential attachment
regular variation
estimation
extreme value
corrupted data
tail index
directed network