The dimensionality reduction principle for generalized additive models (Q1082747)

From MaRDI portal
scientific article

    Statements

    The dimensionality reduction principle for generalized additive models (English)
    1986
    Consider an exponential family of distributions of the form \[ \int^{x}e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy) \] with a real parameter \(\eta\) and a \(\sigma\)-finite measure \(\nu\) on \(\mathbb{R}\). Under suitable assumptions, the expectations of these distributions are given by \(b_3(\eta):=-b_2'(\eta)/b_1'(\eta)\). Now assume that for a random vector \((Y,X)\) with values in \(\mathbb{R}\times [0,1]^J\) (\(J\in\mathbb{N}\)), the conditional distribution of \(Y\) given \(X=x\) belongs to this exponential family with \(\eta=f(x)\), \(x\in [0,1]^J\), and hence \(E(Y|X=x)=b_3(f(x))\). [This is an exponential response model, see e.g. \textit{S. J. Haberman}, ibid. 5, 815-841 (1977; Zbl 0368.62019); when \(f\) is linear we have a generalized linear model as in \textit{J. A. Nelder} and \textit{R. W. M. Wedderburn}, J. R. Stat. Soc., Ser. A 135, 370-384 (1972).] It is shown that, under suitable assumptions on the conditional distribution, on \(f\) and on the density \(g(\cdot)\) of \(X\), the expected log-likelihood \[ \Delta(a)=\int \{b_1(a(x))b_3(f(x))+b_2(a(x))\}g(x)\,dx \] can be maximized with respect to \(a\in\mathcal{A}=\{a(x_1,\dots,x_J)=a_0+\sum_{j=1}^{J}a_j(x_j) : E(a(X))=a_0,\ E(a_j(X_j))=0,\ 1\leq j\leq J\}\) (Theorem 1). The maximizer is called the best additive approximation to the response function \(f\). Besides being easier to interpret than a general approximation, it can be estimated from a sample \((Y_i,X_i)_{i=1}^{n}\) with an accuracy that does not deteriorate as the dimension \(J\) increases; furthermore, the speed of convergence is optimal in the \(L_2\) sense (Theorem 2). [For a related result for regression functions, see the author, Additive regression and other nonparametric models, Ann. Stat. 13, 689-705 (1985).] The proof uses a spline estimator obtained by maximizing an empirical log-likelihood.
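    A brief sketch of why the mean takes the form \(b_3(\eta)\), assuming that differentiation under the integral sign is permitted and that \(b_1'(\eta)\neq 0\) (these two conditions are assumptions made here for illustration; the review only says "suitable assumptions"). Since each member of the family is a probability distribution, \[ 1=\int e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy) \quad\Longrightarrow\quad 0=\int \{b_1'(\eta)y+b_2'(\eta)\}e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy)=b_1'(\eta)E(Y)+b_2'(\eta), \] so that \(E(Y)=-b_2'(\eta)/b_1'(\eta)=b_3(\eta)\). As an illustrative special case, take \(\nu\) to be counting measure on \(\{0,1\}\), \(b_1(\eta)=\eta\) and \(b_2(\eta)=-\log(1+e^{\eta})\); then \[ b_3(\eta)=\frac{e^{\eta}}{1+e^{\eta}}, \] which is the logistic regression setting listed among the keywords.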
    dimensionality reduction principle
    generalized additive models
    logistic regression
    link function
    nonparametric additive estimator
    quasi maximum likelihood estimate
    exponential family
    conditional distribution
    Exponential response model
    generalized linear model
    expected log-likelihood
    best additive approximation
    speed of convergence
    spline estimator
    empirical log-likelihood

    Identifiers