Affine symmetries and neural network identifiability
From MaRDI portal
Publication:2214101
Abstract: We address the following question of neural network identifiability: Suppose we are given a function and a nonlinearity. Can we specify the architecture, weights, and biases of all feed-forward neural networks employing that nonlinearity and giving rise to that function? Existing literature on the subject suggests that the answer should be yes, provided we are only concerned with finding networks that satisfy certain "genericity conditions". Moreover, the identified networks are mutually related by symmetries of the nonlinearity. For instance, the tanh function is odd, and so flipping the signs of the incoming weights and bias, as well as of the outgoing weights, of a neuron does not change the output map of the network. The results known hitherto, however, apply either to single-layer networks, or to networks satisfying specific structural assumptions (such as full connectivity), as well as to specific nonlinearities. In an effort to answer the identifiability question in greater generality, we consider arbitrary nonlinearities with potentially complicated affine symmetries, and we show that the symmetries can be used to find a rich set of networks giving rise to the same function. The set obtained in this manner is, in fact, exhaustive (i.e., it contains all networks giving rise to the function) unless there exists a network "with no internal symmetries" giving rise to the identically zero function. This result can thus be interpreted as an analog of the rank-nullity theorem for linear operators. We furthermore exhibit a class of "tanh-type" nonlinearities (including the tanh function itself) for which such a network does not exist, thereby solving the identifiability question for these nonlinearities in full generality. Finally, we show that this class contains nonlinearities with arbitrarily complicated symmetries.
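The sign-flip symmetry mentioned in the abstract can be checked numerically. The following is a minimal sketch (with hypothetical weights, not taken from the paper) of a one-hidden-layer tanh network: negating a neuron's incoming weight and bias together with its outgoing weight leaves the network's output map unchanged, because tanh is odd.

```python
import math

def net(x, W1, b1, W2, b2):
    """Scalar-input, scalar-output network with one tanh hidden layer."""
    hidden = [math.tanh(w * x + b) for w, b in zip(W1, b1)]
    return sum(v * h for v, h in zip(W2, hidden)) + b2

# Hypothetical parameters for a two-neuron hidden layer.
W1, b1 = [0.7, -1.3], [0.2, 0.5]
W2, b2 = [1.1, -0.4], 0.3

# Flip the signs of neuron 0's incoming weight and bias, and of its
# outgoing weight: since tanh(-t) = -tanh(t), the two negations cancel.
W1f, b1f = [-W1[0], W1[1]], [-b1[0], b1[1]]
W2f = [-W2[0], W2[1]]

for x in (-2.0, 0.0, 1.5):
    assert abs(net(x, W1, b1, W2, b2) - net(x, W1f, b1f, W2f, b2)) < 1e-12
```

This is exactly the kind of affine symmetry the paper exploits: distinct parameter settings that are indistinguishable from the input-output behavior of the network.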
Recommendations
- Neural network identifiability for a family of sigmoidal nonlinearities
- Reconstructing a neural net from its output
- Absence of bottlenecks in a neural network determines its generic functional properties
- Robust and resource-efficient identification of two hidden layer neural networks
Cites work
- scientific article; zbMATH DE number 683513
- scientific article; zbMATH DE number 1022658
- A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction
- Deep learning
- Group invariant scattering
- Optimal approximation of piecewise smooth functions using deep ReLU neural networks
- Optimal approximation with sparsely connected deep neural networks
- Reconstructing a neural net from its output
Cited in (9)
- Stable recovery of entangled weights: towards robust identification of deep neural networks from minimal samples
- Neural network identifiability for a family of sigmoidal nonlinearities
- Robust and resource-efficient identification of two hidden layer neural networks
- Metric entropy limits on recurrent neural network learning of linear dynamical systems
- Parameter identifiability of a deep feedforward ReLU neural network
- Gauge symmetry and neural networks
- Information theory and recovery algorithms for data fusion in Earth observation
- An embedding of ReLU networks and an analysis of their identifiability
- Identification in the presence of symmetry: oscillator networks