Are extreme value estimation methods useful for network data? (Q2303029)

From MaRDI portal
scientific article

    Statements

    Are extreme value estimation methods useful for network data? (English)
    28 February 2020
    Let \(G(n)=(V(n),E(n))\) be a directed network, where \(V(n)\) is the set of nodes, \(E(n)\) is the set of edges, and \(n\) is the number of edges. Let \(N(n)\) denote the number of nodes in \(G(n)\), and let \(N_n(i,j)\) be the number of nodes with in-degree \(i\) and out-degree \(j\). The marginal counts of nodes with in-degree \(i\) and of nodes with out-degree \(j\) are, respectively, \[ N_i^{\mathrm{in}}(n)=\sum_{j=0}^\infty N_n(i,j), \quad N_j^{\mathrm{out}}(n)=\sum_{i=0}^\infty N_n(i,j). \] It is assumed that the empirical joint degree frequencies converge almost surely, i.e. \[ \frac{N_n(i,j)}{N(n)}\to p_{ij}\quad\text{a.s. as } n\to\infty, \] where \((p_{ij})\) is the probability mass function of a bivariate integer-valued random vector. It is further assumed that the network exhibits power-law behavior, i.e. that the marginal degree distributions satisfy \[ p_i^{\mathrm{in}}=\sum_{j=0}^\infty p_{ij}\sim C_{\mathrm{in}}\,i^{-(1+\iota_{\mathrm{in}})}\ \text{as } i\to\infty,\quad p_j^{\mathrm{out}}=\sum_{i=0}^\infty p_{ij}\sim C_{\mathrm{out}}\,j^{-(1+\iota_{\mathrm{out}})}\ \text{as } j\to\infty, \] for some positive constants \(C_{\mathrm{in}}\), \(C_{\mathrm{out}}\), \(\iota_{\mathrm{in}}\) and \(\iota_{\mathrm{out}}\). The authors describe two classes of preferential attachment models that generate networks with power-law degree distributions. They then consider semiparametric estimation of the model parameters based on an extreme value approach, compare this method with existing parametric approaches, and demonstrate how it can provide more robust parameter estimates when the data are corrupted or the model is misspecified.
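The quantities defined in the review can be illustrated with a minimal Python sketch. It computes \(N(n)\), the joint counts \(N_n(i,j)\) and the two marginal counts from a directed edge list, and then estimates a tail index from the upper order statistics of a sample. The review does not say which extreme value estimator the authors use; the Hill estimator below is an assumption, chosen because it is a standard estimator of a tail index. The function names `degree_counts` and `hill_tail_index`, the toy edge list, and the choice of `k` are all hypothetical.

```python
import math
import random
from collections import Counter


def degree_counts(edges):
    """Return (N, joint, n_in, n_out) for a directed edge list.

    N      : number of nodes appearing in at least one edge
    joint  : Counter mapping (i, j) to the number of nodes with
             in-degree i and out-degree j
    n_in   : Counter mapping i to the number of nodes with in-degree i
    n_out  : Counter mapping j to the number of nodes with out-degree j
    """
    in_deg, out_deg = Counter(), Counter()
    nodes = set()
    for u, v in edges:  # edge u -> v
        out_deg[u] += 1
        in_deg[v] += 1
        nodes.update((u, v))
    # A missing key in a Counter reads as 0, which is the right degree.
    joint = Counter((in_deg[w], out_deg[w]) for w in nodes)
    n_in, n_out = Counter(), Counter()
    for (i, j), count in joint.items():
        n_in[i] += count    # sum over j of N_n(i, j)
        n_out[j] += count   # sum over i of N_n(i, j)
    return len(nodes), joint, n_in, n_out


def hill_tail_index(data, k):
    """Hill estimate of the tail index from the k largest positive values.

    Assumption: this is a generic extreme value estimator, not necessarily
    the one used in the reviewed paper.
    """
    xs = sorted((x for x in data if x > 0), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive values")
    threshold = xs[k]  # the (k+1)-th largest value
    h = sum(math.log(x / threshold) for x in xs[:k]) / k
    if h == 0.0:
        raise ValueError("top k values are all tied with the threshold")
    return 1.0 / h


if __name__ == "__main__":
    # Hypothetical toy network with four nodes and four edges.
    N, joint, n_in, n_out = degree_counts([(1, 2), (1, 3), (2, 3), (4, 3)])
    print("nodes:", N)
    print("empirical joint frequencies:",
          {ij: c / N for ij, c in joint.items()})

    # Synthetic Pareto sample with tail index 2, standing in for degree data.
    random.seed(0)
    sample = [random.paretovariate(2.0) for _ in range(20000)]
    print("Hill estimate:", hill_tail_index(sample, k=2000))
```

In practice the estimator would be applied to the observed in-degrees or out-degrees of the network, and the result depends on the number `k` of upper order statistics used, so it is common to inspect the estimate over a range of `k` rather than a single value.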
    power laws
    multivariate heavy-tailed statistics
    preferential attachment
    regular variation
    estimation
    extreme value
    corrupted data
    tail index
    directed network
