Additive partially linear models for massive heterogeneous data (Q1722060)

scientific article

    Statements

    Additive partially linear models for massive heterogeneous data (English)
    14 February 2019
    The authors generalize the partially linear model (PLM) presented in [\textit{T. Zhao} et al., Ann. Stat. 44, No. 4, 1400--1437 (2016; Zbl 1358.62050)] and propose an additive partially linear model (APLM) for massive heterogeneous data. Let $\{(Y_i,\mathbf{X}_i,\mathbf{Z}_i)\}$, $i=1,2,\ldots,N$, be the observations from a sample of $N$ subjects. Under the APLM there are $s$ independent sub-populations, and the data from the $j$-th sub-population satisfy \[ Y^{(j)}=\mathbf{X}^T\,\overrightarrow{\beta}_0^{\,(j)}+\sum_{k=1}^{K}g_{0k}(Z_k)+\varepsilon,\qquad j\in\{1,2,\dots,s\}, \] where $\mathbf{X}=(X_1,\dots,X_d)^T$, $\mathbf{Z}=(Z_1,\dots,Z_K)$, $\overrightarrow{\beta}_0^{\,(j)}=(\beta_{01}^{(j)},\dots,\beta_{0d}^{(j)})^T$ is the vector of unknown parameters for the $j$-th sub-population, $g_{01},\dots,g_{0K}$ are unknown smooth functions, and the random variable $\varepsilon$ has zero mean and finite variance. Thus $Y^{(j)}$ depends on $\mathbf{X}$ linearly, with coefficients that vary across sub-populations, whereas it depends on $\mathbf{Z}$ through additive nonlinear functions that are common to all sub-populations. \par The main assumptions on the data structure and on the unknown parameters are stated, hypothesis testing procedures are presented, and the asymptotic properties of the estimators are derived. The performance of the proposed methods is evaluated in simulation studies and on a real data set.
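To make the model structure in the review concrete, the following is a minimal Python sketch, not the authors' procedure. It simulates data from a toy APLM with $K=2$ additive components and fits it by pooled ordinary least squares, giving each sub-population its own block of linear coefficients and using one shared truncated-power spline basis for the additive part. The function names (`spline_basis`, `simulate_aplm`, `fit_aplm`), the choice of the functions $g_{0k}$, the noise level, and the knot placement are all assumptions made for illustration. The paper's divide-and-conquer estimator and its tuning are not reproduced here.

```python
import numpy as np


def spline_basis(z, knots, degree=3):
    """Truncated power basis (without intercept) for one covariate."""
    cols = [z ** p for p in range(1, degree + 1)]
    cols += [np.clip(z - t, 0.0, None) ** degree for t in knots]
    return np.column_stack(cols)


def simulate_aplm(n_per_group, betas, rng):
    """Draw data from a toy APLM; row j of `betas` is beta_0^(j)."""
    s, d = betas.shape
    group = np.repeat(np.arange(s), n_per_group)
    n = group.size
    X = rng.normal(size=(n, d))
    Z = rng.uniform(0.0, 1.0, size=(n, 2))
    # Hypothetical smooth components g_01 and g_02, shared by all groups.
    g = np.sin(2.0 * np.pi * Z[:, 0]) + (Z[:, 1] - 0.5) ** 2
    eps = rng.normal(scale=0.1, size=n)
    # Row-wise inner product X_i^T beta_0^(j(i)).
    y = np.einsum("ij,ij->i", X, betas[group]) + g + eps
    return y, X, Z, group


def fit_aplm(y, X, Z, group, s, knots):
    """Pooled least squares: group-specific betas, common spline part."""
    n, d = X.shape
    # Block design: subject i contributes X_i only to its own group's block.
    X_block = np.zeros((n, s * d))
    for j in range(s):
        rows = group == j
        X_block[rows, j * d:(j + 1) * d] = X[rows]
    # One spline basis per additive component, shared across groups.
    B = np.hstack([spline_basis(Z[:, k], knots) for k in range(Z.shape[1])])
    design = np.hstack([np.ones((n, 1)), X_block, B])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Return the estimated betas as an (s, d) array, one row per group.
    return coef[1:1 + s * d].reshape(s, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_betas = np.array([[1.0, -1.0], [0.5, 2.0], [-2.0, 0.0]])
    y, X, Z, group = simulate_aplm(2000, true_betas, rng)
    est = fit_aplm(y, X, Z, group, s=3, knots=np.linspace(0.1, 0.9, 9))
    print(np.round(est, 3))
```

The block design encodes the heterogeneity in $\overrightarrow{\beta}_0^{\,(j)}$, while the shared spline columns encode the commonality of the $g_{0k}$. A divide-and-conquer variant would instead fit each sub-population separately and then combine the estimates of the nonparametric part.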
    divide-and-conquer
    homogeneity
    heterogeneity
    oracle property
    regression splines