A universal approximation theorem for mixture-of-experts models
Publication:5380595
Abstract: The mixture of experts (MoE) model is a popular neural network architecture for nonlinear regression and classification. The class of MoE mean functions is known to converge uniformly to any unknown target function, provided that the target function belongs to a Sobolev space of sufficiently differentiable functions and that the domain of estimation is a compact unit hypercube. We provide an alternative result, which shows that the class of MoE mean functions is dense in the class of all continuous functions over arbitrary compact domains of estimation. Our result can thus be viewed as a universal approximation theorem for MoE models.
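The density claim can be probed numerically. The sketch below fits a small softmax-gated MoE with linear experts to a continuous target on the compact domain [0, 1] and reports the sup-norm error; the gating and expert forms, the number of experts, and the least-squares fit are illustrative assumptions, not a procedure taken from the paper.

```python
# A minimal numerical sketch of the density claim, assuming a standard
# softmax-gated MoE with linear expert means (an illustrative choice,
# not the paper's construction).
import numpy as np
from scipy.optimize import minimize

K = 5                                   # number of experts (illustrative)
x = np.linspace(0.0, 1.0, 200)          # compact domain of estimation
target = np.sin(2 * np.pi * x)          # a continuous target function

def moe_mean(theta, x):
    """MoE mean function: sum_k g_k(x) * (a_k * x + b_k)."""
    w, c, a, b = theta.reshape(4, K)    # gate and expert parameters
    logits = np.outer(x, w) + c         # (n, K) gating logits
    logits -= logits.max(axis=1, keepdims=True)
    g = np.exp(logits)
    g /= g.sum(axis=1, keepdims=True)   # softmax gating weights
    experts = np.outer(x, a) + b        # linear expert means
    return (g * experts).sum(axis=1)

def loss(theta):
    return np.mean((moe_mean(theta, x) - target) ** 2)

rng = np.random.default_rng(0)
fit = minimize(loss, rng.normal(size=4 * K), method="L-BFGS-B")
print("sup-norm error:", np.abs(moe_mean(fit.x, x) - target).max())
```

Increasing K should drive the sup-norm error toward zero, consistent with the density result.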
Recommendations
- Approximations of conditional probability density functions in Lebesgue spaces via mixture of experts models
- Error bounds for functional approximation and estimation using mixtures of experts
- A flexible probabilistic framework for large-margin mixture of experts
- Mixture of experts architectures for neural networks as a special case of conditional expectation formula
- Hierarchical mixtures-of-experts for exponential family regression models: Approximation and maximum likelihood estimation
Cites work
- Approximation by superpositions of a sigmoidal function
- Approximation of conditional densities by smooth mixtures of regressions
- Error bounds for functional approximation and estimation using mixtures of experts
- Estimating the dimension of a model
- Finite mixture models
- Fitting finite mixtures of generalized linear regressions in R
- Hierarchical mixtures-of-experts for exponential family regression models: Approximation and maximum likelihood estimation
- Laplace mixture of linear experts
- New estimation and feature selection methods in mixture-of-experts models
- On convergence rates of mixtures of polynomial experts
- On the asymptotic normality of hierarchical mixtures-of-experts for generalized linear models
Cited in (8)
- Laplace mixture of linear experts
- A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models
- Error bounds for functional approximation and estimation using mixtures of experts
- Scientific article (zbMATH DE number 7625172; no title available)
- Conditional sum-product networks: modular probabilistic circuits via gate functions
- Approximations of conditional probability density functions in Lebesgue spaces via mixture of experts models
- A class of mixture of experts models for general insurance: theoretical developments
- Uniform consistency in nonparametric mixture models