Robust and resource-efficient identification of two hidden layer neural networks
From MaRDI portal
Publication:2117339
Abstract: We address the structure identification and the uniform approximation of two fully nonlinear layer neural networks of the type \(f(x) = 1^T h(B^T g(A^T x))\) on \(\mathbb{R}^d\) from a small number of query samples. We approach the problem by actively sampling finite difference approximations to Hessians of the network. Gathering several approximate Hessians allows us to reliably approximate the matrix subspace \(\mathcal{W}\) spanned by the symmetric tensors \(a_1 \otimes a_1, \dots, a_{m_0} \otimes a_{m_0}\) formed by weights of the first layer, together with the entangled symmetric tensors \(v_1 \otimes v_1, \dots, v_{m_1} \otimes v_{m_1}\) formed by suitable combinations of the weights of the first and second layer as \(v_\ell = A G_0 b_\ell / \|A G_0 b_\ell\|_2\), \(\ell \in [m_1]\), for a diagonal matrix \(G_0\) depending on the activation functions of the first layer. The identification of the rank-1 symmetric tensors within \(\mathcal{W}\) is then performed by solving a robust nonlinear program. We provide guarantees of stable recovery under a posteriori verifiable conditions. We further address the correct attribution of approximate weights to the first or second layer. By using a suitably adapted gradient descent iteration, it is then possible to estimate, up to intrinsic symmetries, the shifts of the activation functions of the first layer and to compute the matrix \(G_0\) exactly. Our method of identification of the weights of the network is fully constructive, with quantifiable sample complexity, and therefore contributes to reducing the black-box nature of the network training phase. We corroborate our theoretical results by extensive numerical experiments.
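The first two steps of the pipeline (active Hessian sampling and subspace estimation) admit a compact numerical illustration. The following is a minimal sketch, not the paper's implementation: it assumes the network \(f\) can be queried pointwise, that the total number of entangled weights (the parameter `rank`, corresponding to \(m_0 + m_1\)) is known, and it replaces the paper's robust nonlinear program with a simple alternating rank-1 projection heuristic. All function names are hypothetical.

```python
import numpy as np

def fd_hessian(f, x, eps=1e-4):
    """Second-order central finite differences: entry (i, j) approximates
    d^2 f / dx_i dx_j at the query point x, using 4 network queries each."""
    d = x.size
    E = np.eye(d)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            H[i, j] = (f(x + eps * (E[i] + E[j])) - f(x + eps * (E[i] - E[j]))
                       - f(x - eps * (E[i] - E[j])) + f(x - eps * (E[i] + E[j]))) / (4 * eps**2)
            H[j, i] = H[i, j]
    return H

def estimate_subspace(f, d, n_hessians, rank):
    """Stack vectorized Hessians sampled at random points; the dominant right
    singular subspace approximates W = span{a_i (x) a_i, v_l (x) v_l}."""
    pts = np.random.randn(n_hessians, d) / np.sqrt(d)   # active query points
    M = np.stack([fd_hessian(f, x).ravel() for x in pts])
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:rank]                                    # orthonormal basis, one matrix per row

def nearest_rank_one(basis, n_iter=100, seed=0):
    """Heuristic stand-in for the robust nonlinear program: alternate the best
    rank-1 symmetric approximation with re-projection onto the subspace."""
    rng = np.random.default_rng(seed)
    d = int(round(np.sqrt(basis.shape[1])))
    M = (basis.T @ rng.standard_normal(basis.shape[0])).reshape(d, d)
    for _ in range(n_iter):
        w, V = np.linalg.eigh(0.5 * (M + M.T))   # symmetrize, then eigendecompose
        u = V[:, np.argmax(np.abs(w))]           # dominant eigenvector
        c = basis @ np.outer(u, u).ravel()       # project u u^T back onto W
        M = (basis.T @ c).reshape(d, d)
        M /= np.linalg.norm(M) + 1e-12
    return u                                     # candidate entangled weight, up to sign
```

Each recovered vector approximates some \(a_i\) or \(v_\ell\) up to sign; attributing it to the first or second layer and recovering the shifts and the matrix \(G_0\) are the paper's subsequent steps and are not sketched here.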
Recommendations
- Reconstructing a neural net from its output
- Efficient estimation of neural weights by polynomial approximation
- Neural network identifiability for a family of sigmoidal nonlinearities
- Affine symmetries and neural network identifiability
- On the approximation by neural networks with bounded number of neurons in hidden layers
Cites work
- scientific article; zbMATH DE number 3870398
- scientific article; zbMATH DE number 177322
- scientific article; zbMATH DE number 1083116
- scientific article; zbMATH DE number 1404611
- scientific article; zbMATH DE number 1405266
- scientific article; zbMATH DE number 6026126
- scientific article; zbMATH DE number 3288876
- scientific article; zbMATH DE number 967931
- A mathematical introduction to compressive sensing
- Active subspace methods in theory and practice: applications to kriging surfaces
- Active subspaces. Emerging ideas for dimension reduction in parameter studies
- Approximation by Ridge Functions and Neural Networks
- Breaking the curse of dimensionality with convex neural networks
- Capturing ridge functions in high dimensions from point queries
- Classes of finite equal norm Parseval frames
- Deep Neural Network Approximation Theory
- DeepStack: expert-level artificial intelligence in heads-up no-limit poker
- Direct estimation of the index coefficient in a single-index model
- Energy Propagation in Deep Convolutional Neural Networks
- Entropy and sampling numbers of classes of ridge functions
- Estimation of the mean of a multivariate normal distribution
- Finding a low-rank basis in a matrix subspace
- Finite normalized tight frames
- Greed is Good: Algorithmic Results for Sparse Approximation
- High-dimensional covariance decomposition into sparse Markov and independence models
- High-dimensional probability. An introduction with applications in data science
- Interpolation by ridge polynomials and its application in neural networks
- Learning functions of few arbitrary linear parameters in high dimensions
- Most tensor problems are NP-hard
- Neural Network Learning
- On Principal Hessian Directions for Data Visualization and Dimension Reduction: Another Application of Stein's Lemma
- Perturbation bounds in connection with singular value decomposition
- Provable approximation properties for deep neural networks
- Reconstructing a neural net from its output
- Robust and resource efficient identification of shallow neural networks by fewest samples
- Semiparametric least squares (SLS) and weighted SLS estimation of single-index models
- Size-independent sample complexity of neural networks
- Tensor Rank and the Ill-Posedness of the Best Low-Rank Approximation Problem
- Tensor rank is NP-complete
- Understanding machine learning. From theory to algorithms
- Weak convergence and empirical processes. With applications to statistics
Cited in (6)
- Information theory and recovery algorithms for data fusion in Earth observation
- Efficient Identification of Butterfly Sparse Matrix Factorizations
- Affine symmetries and neural network identifiability
- Stable recovery of entangled weights: towards robust identification of deep neural networks from minimal samples
- Parameter identifiability of a deep feedforward ReLU neural network
- Approximate real symmetric tensor rank