The dimensionality reduction principle for generalized additive models (Q1082747)

From MaRDI portal

scientific article
Language: English
Label: The dimensionality reduction principle for generalized additive models
Description: scientific article

    Statements

    The dimensionality reduction principle for generalized additive models (English)
    1986
    Consider an exponential family of distributions of the form \[ \int^{x} e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy) \] with a real parameter \(\eta\) and a \(\sigma\)-finite measure \(\nu\) on \({\mathbb{R}}\). Under suitable assumptions their expectations are given by \(b_3(\eta):=-b_2'(\eta)/b_1'(\eta)\). Now assume that for a random vector \((Y,X)\) with values in \({\mathbb{R}}\times [0,1]^J\), \(J\in {\mathbb{N}}\), the conditional distribution of \(Y\) given \(X=x\) belongs to this exponential family with \(\eta=f(x)\), \(x\in [0,1]^J\), and hence \(E(Y\mid X=x)=b_3(f(x))\). [This is an exponential response model, see e.g. \textit{S. J. Haberman}, ibid. 5, 815-841 (1977; Zbl 0368.62019); for linear \(f\) it is a generalized linear model as in \textit{J. A. Nelder} and \textit{R. W. M. Wedderburn}, J. R. Stat. Soc., Ser. A 135, 370-384 (1972).] It is shown that, under suitable assumptions on the conditional distribution, on \(f\) and on the density \(g(\cdot)\) of \(X\), the expected log-likelihood \[ \Delta(a)=\int \{b_1(a(x))b_3(f(x))+b_2(a(x))\}g(x)\,dx \] can be maximized with respect to \(a\) in the additive class \({\mathcal A}=\{a: a(x_1,\dots,x_J)=a_0+\sum^{J}_{j=1}a_j(x_j),\ E(a(X))=a_0,\ E(a_j(X_j))=0,\ 1\leq j\leq J\}\) (Theorem 1). The maximizer is called the best additive approximation to the response function \(f\); besides its advantage in interpretability compared to general approximations, it can be estimated from a sample \((Y_i,X_i)_{i=1}^n\) with an accuracy that does not deteriorate as the dimension \(J\) increases, and the speed of convergence is optimal in the \(L_2\) sense (Theorem 2). [For a related result for regression functions, see the author, Additive regression and other nonparametric models, Ann. Stat. 13, 689-705 (1985).] To show this, a spline estimator obtained by maximizing an empirical log-likelihood quantity is used.
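    As a concrete illustration of the setting (not taken from the paper): with \(\nu\) the counting measure on \(\{0,1\}\), \(b_1(\eta)=\eta\) and \(b_2(\eta)=-\log(1+e^{\eta})\), one obtains \(b_3(\eta)=e^{\eta}/(1+e^{\eta})\), i.e. logistic regression with an additive predictor. The Python sketch below fits such an additive logistic model by maximizing the empirical log-likelihood over coordinatewise cubic B-spline spaces; the basis, knot placement, optimizer and all function names are illustrative assumptions, not the estimator constructed in the paper.

    # Minimal sketch (assumed setup, not the paper's construction): additive
    # logistic regression fitted by maximizing the empirical log-likelihood
    # over a spline space for each coordinate of X in [0, 1]^J.
    import numpy as np
    from scipy.interpolate import BSpline
    from scipy.optimize import minimize

    def bspline_basis(x, n_knots=6, degree=3):
        """Cubic B-spline basis on [0, 1], evaluated at the points x."""
        interior = np.linspace(0.0, 1.0, n_knots)[1:-1]
        knots = np.r_[[0.0] * (degree + 1), interior, [1.0] * (degree + 1)]
        n_basis = len(knots) - degree - 1
        B = np.empty((len(x), n_basis))
        for k in range(n_basis):
            coef = np.zeros(n_basis)
            coef[k] = 1.0
            B[:, k] = BSpline(knots, coef, degree)(x)
        return B

    def fit_additive_logistic(X, y, n_knots=6, degree=3):
        """Maximize the empirical Bernoulli log-likelihood over additive
        functions a(x) = a_0 + sum_j a_j(x_j), each a_j a spline in x_j."""
        n, J = X.shape
        bases = [bspline_basis(X[:, j], n_knots, degree) for j in range(J)]
        # Drop one basis function per coordinate (the columns sum to one) and
        # centre the rest, mimicking the constraint E a_j(X_j) = 0 empirically.
        bases = [B[:, :-1] - B[:, :-1].mean(axis=0) for B in bases]
        design = np.hstack([np.ones((n, 1))] + bases)   # leading column = a_0

        def neg_loglik(beta):
            eta = design @ beta
            # Bernoulli log-likelihood: sum_i { y_i*eta_i - log(1 + e^{eta_i}) }
            return -(y * eta - np.logaddexp(0.0, eta)).sum()

        res = minimize(neg_loglik, np.zeros(design.shape[1]), method="L-BFGS-B")
        return res.x, design

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, J = 500, 3
        X = rng.uniform(size=(n, J))
        f = np.sin(2 * np.pi * X[:, 0]) + 4.0 * (X[:, 1] - 0.5) ** 2 - X[:, 2]
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-f)))
        beta, design = fit_additive_logistic(X, y)
        print("fitted additive predictor, first five observations:",
              np.round(design[:5] @ beta, 3))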
    dimensionality reduction principle
    generalized additive models
    logistic regression
    link function
    nonparametric additive estimator
    quasi maximum likelihood estimate
    exponential family
    conditional distribution
    exponential response model
    generalized linear model
    expected log-likelihood
    best additive approximation
    speed of convergence
    spline estimator
    empirical log-likelihood

    Identifiers