Nonlinear approximation via compositions
Keywords: parallel computing; nonlinear approximation; function composition; deep neural networks; ReLU activation function; Hölder continuity
MSC classification: Artificial neural networks and deep learning (68T07); Multidimensional problems (41A63); Functional equations in the complex plane, iteration and composition of analytic functions of one complex variable (30D05); Rate of convergence, degree of approximation (41A25); Special approximation methods (nonlinear Galerkin, etc.) for infinite-dimensional dissipative dynamical systems (37L65)
Abstract: Given a function dictionary \(\mathcal{D}\) and an approximation budget \(N\in\mathbb{N}\), nonlinear approximation seeks the linear combination of the best \(N\) terms \(\{T_n\}_{1\le n\le N}\subseteq\mathcal{D}\) to approximate a given function \(f\) with the minimum approximation error \[\varepsilon_{L,f}:=\min_{\{g_n\}\subseteq\mathbb{R},\,\{T_n\}\subseteq\mathcal{D}}\Big\|f(x)-\sum_{n=1}^N g_n T_n(x)\Big\|.\] Motivated by the recent success of deep learning, we propose dictionaries with functions in the form of compositions, i.e., \[T(x)=T^{(L)}\circ T^{(L-1)}\circ\cdots\circ T^{(1)}(x)\] for all \(T\in\mathcal{D}\), and implement \(T\) using ReLU feed-forward neural networks (FNNs) with \(L\) hidden layers. We further quantify the improvement of the best \(N\)-term approximation rate in terms of \(N\) when \(L\) is increased from \(1\) to \(2\) or \(3\) to show the power of compositions. In the case when \(L>3\), our analysis shows that increasing \(L\) cannot improve the approximation rate in terms of \(N\). In particular, for any function \(f\) on \([0,1]\), regardless of its smoothness and even of its continuity, if \(f\) can be approximated using a dictionary when \(L=1\) with the best \(N\)-term approximation rate \(\varepsilon_{L,f}=\mathcal{O}(N^{-\eta})\), we show that dictionaries with \(L=2\) can improve the best \(N\)-term approximation rate to \(\varepsilon_{L,f}=\mathcal{O}(N^{-2\eta})\). We also show that for Hölder continuous functions of order \(\alpha\) on \([0,1]^d\), the application of a dictionary with \(L=3\) in nonlinear approximation can achieve an essentially tight best \(N\)-term approximation rate \(\varepsilon_{L,f}=\mathcal{O}(N^{-2\alpha/d})\). Finally, we show that dictionaries consisting of wide FNNs with a few hidden layers are more attractive in terms of computational efficiency than dictionaries with narrow and very deep FNNs for approximating Hölder continuous functions if the number of computer cores is larger than \(N\) in parallel computing.
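As a concrete illustration of the abstract's setup, the following minimal Python sketch builds a small dictionary of random ReLU FNNs of the compositional form \(T^{(L)}\circ\cdots\circ T^{(1)}\) and selects \(N\) terms greedily, with orthogonal matching pursuit standing in for the exact best-\(N\)-term minimization, which is intractable in general. All concrete choices here (random weights, the width, the dictionary size, and the Hölder-\(1/2\) target \(\sqrt{|x-1/2|}\)) are illustrative assumptions, not the constructions analyzed in the paper.

```python
# Minimal sketch: greedy N-term approximation from a dictionary of
# compositional ReLU networks. Illustrative only; not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def make_relu_fnn(L, width, rng):
    """Return T: [0,1] -> R realized as a composition of L hidden ReLU layers."""
    dims = [1] + [width] * L + [1]
    params = [(rng.standard_normal((dims[i + 1], dims[i])),
               rng.standard_normal(dims[i + 1]))
              for i in range(len(dims) - 1)]
    def T(x):
        h = x.reshape(1, -1)                 # shape (1, num_points)
        for W, b in params[:-1]:
            h = relu(W @ h + b[:, None])     # one hidden layer T^(l)
        W, b = params[-1]
        return (W @ h + b[:, None]).ravel()  # linear output layer
    return T

# Dictionary of compositional functions; L controls the composition depth.
L, width, dict_size = 3, 8, 200
dictionary = [make_relu_fnn(L, width, rng) for _ in range(dict_size)]

# Target: a Hölder continuous function of order 1/2 on [0,1].
x = np.linspace(0.0, 1.0, 512)
f = np.sqrt(np.abs(x - 0.5))

# Evaluate and column-normalize the dictionary on the grid.
A = np.stack([T(x) for T in dictionary], axis=1)        # (points, dict_size)
A = A / (np.linalg.norm(A, axis=0, keepdims=True) + 1e-12)

# Greedy selection (orthogonal matching pursuit): pick the term most
# correlated with the residual, then refit all coefficients by least squares.
N, residual, chosen = 10, f.copy(), []
for _ in range(N):
    scores = np.abs(A.T @ residual)
    if chosen:
        scores[chosen] = -np.inf                        # never reselect a term
    chosen.append(int(np.argmax(scores)))
    coeffs, *_ = np.linalg.lstsq(A[:, chosen], f, rcond=None)
    residual = f - A[:, chosen] @ coeffs
    print(f"N = {len(chosen):2d}, "
          f"L2 error = {np.linalg.norm(residual) / np.sqrt(len(x)):.4f}")
```

Re-running the loop with depth \(L=1\) versus \(L=3\) dictionaries gives a rough empirical feel for how composition depth affects the error decay in \(N\) that the paper quantifies.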
Cites work
- Scientific article, zbMATH DE number 1215245 (no title available)
- Adaptive subgradient methods for online learning and stochastic optimization
- Almost optimal estimates for approximation and learning by radial basis function networks
- Approximation by superpositions of a sigmoidal function
- Approximation of functions of finite variation by superpositions of a sigmoidal function
- Approximation results in Orlicz spaces for sequences of Kantorovich MAX-product neural network operators
- Approximation using scattered shifts of a multivariate function
- Compressed sensing
- Constructive approximate interpolation by neural networks
- Convergence for a family of neural network operators in Orlicz spaces
- Efficient distribution-free learning of probabilistic concepts
- Error bounds for approximations with deep ReLU networks
- Exponential convergence of the deep neural network approximation for analytic functions
- Matching pursuits with time-frequency dictionaries
- Multilayer feedforward networks are universal approximators
- Multivariate \(n\)-term rational and piecewise polynomial approximation
- Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position
- Nonlinear approximation using Gaussian kernels
- Optimal approximation of piecewise smooth functions using deep ReLU neural networks
- Saturation classes for MAX-product neural network operators activated by sigmoidal functions
- Ten Lectures on Wavelets
- The rate of approximation of Gaussian radial basis neural networks in continuous function space
- Universal approximation bounds for superpositions of a sigmoidal function
Cited in (29)
- Factor Augmented Sparse Throughput Deep ReLU Neural Networks for High Dimensional Regression
- Deep learning via dynamical systems: an approximation perspective
- Neural network approximation
- Chebyshev approximation of multivariable functions by the exponential expression
- Simultaneous neural network approximation for smooth functions
- Approximation results for gradient flow trained shallow neural networks in \(1d\)
- An efficient numerical method for solving dynamical systems with multiple time scales
- A deep network construction that adapts to intrinsic dimensionality beyond the domain
- Just least squares: binary compressive sampling with low generative intrinsic dimension
- Finite-Sample Two-Group Composite Hypothesis Testing via Machine Learning
- Approximation in shift-invariant spaces with deep ReLU neural networks
- Deep ReLU Networks Overcome the Curse of Dimensionality for Generalized Bandlimited Functions
- Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
- Approximation properties of Gaussian-binary restricted Boltzmann machines and Gaussian-binary deep belief networks
- Convergence rate analysis for deep Ritz method
- Int-Deep: a deep learning initialized iterative method for nonlinear problems
- Approximation properties of deep ReLU CNNs
- Deep Network Approximation for Smooth Functions
- Deep network with approximation error being reciprocal of width to power of square root of depth
- Learning the Hodgkin-Huxley model with operator learning techniques
- SelectNet: self-paced learning for high-dimensional partial differential equations
- Neural network approximation: three hidden layers are enough
- Can dictionary-based computational models outperform the best linear ones?
- Optimal approximation rate of ReLU networks in terms of width and depth
- A review on deep learning in medical image reconstruction
- Deep Neural Networks for Solving Large Linear Systems Arising from High-Dimensional Problems
- Deep network approximation characterized by number of neurons
- Full error analysis for the training of deep neural networks
- Deep nonparametric regression on approximate manifolds: nonasymptotic error bounds with polynomial prefactors