Abstract: This article studies the Gram random matrix model $G = \frac{1}{T}\Sigma^{\mathsf T}\Sigma$, $\Sigma = \sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X = [x_1, \ldots, x_T] \in \mathbb{R}^{p \times T}$ is a (data) matrix of bounded norm, $W \in \mathbb{R}^{n \times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma : \mathbb{R} \to \mathbb{R}$ is a Lipschitz continuous (activation) function, understood entry-wise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix arguments, we prove that, as $n, p, T$ grow large at the same rate, the resolvent $Q = (G + \gamma I_T)^{-1}$, for $\gamma > 0$, has a behavior similar to that met in sample covariance matrix models, involving notably the moment $\Phi = \frac{T}{n}\mathbb{E}[G]$, which in passing provides a deterministic equivalent for the empirical spectral measure of $G$. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
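As a minimal numerical sketch (not taken from the article itself), the Gram model, its resolvent, and its empirical spectral measure can be simulated directly; the dimensions, the ReLU activation, the data scaling, and the value of $\gamma$ below are illustrative choices only:

```python
import numpy as np

# Dimensions n (random features), p (data dimension), T (samples),
# taken comparable so that n, p, T "grow at the same rate".
n, p, T = 400, 200, 300
rng = np.random.default_rng(0)

# Data matrix X of bounded norm and W with i.i.d. zero-mean
# unit-variance entries, as in the model of the abstract.
X = rng.standard_normal((p, T)) / np.sqrt(p)
W = rng.standard_normal((n, p))

# Lipschitz activation applied entry-wise (ReLU chosen for illustration).
def sigma(t):
    return np.maximum(t, 0.0)

Sigma = sigma(W @ X)          # n x T random-feature matrix
G = Sigma.T @ Sigma / T       # T x T Gram matrix

# Resolvent Q = (G + gamma I_T)^{-1} for some gamma > 0.
gamma = 1.0
Q = np.linalg.inv(G + gamma * np.eye(T))

# Empirical spectral measure of G: the distribution of its eigenvalues,
# which the paper approximates by a deterministic equivalent.
eigs = np.linalg.eigvalsh(G)
print(f"G is {G.shape}, eigenvalues in [{eigs.min():.4f}, {eigs.max():.4f}]")
```

A histogram of `eigs`, compared across several independent draws of `W`, illustrates the concentration of the spectral measure that the deterministic equivalent captures.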
Recommendations
- Nonlinear random matrix theory for deep learning
- Eigenvalue distribution of some nonlinear models of random matrices
- On random matrices arising in deep neural networks: General I.I.D. case
- On Random Matrices Arising in Deep Neural Networks. Gaussian Case
- Products of many large random matrices and gradients in deep neural networks
Cites work
- scientific article; zbMATH DE number 876688
- A Central Limit Theorem for the SINR at the LMMSE Estimator Output for Large-Dimensional Signals
- A Large Dimensional Analysis of Least Squares Support Vector Machines
- An example in the theory of the spectrum of a function
- Almost sure localization of the eigenvalues in a Gaussian information plus noise model. Application to the spiked models.
- Analysis of the limiting spectral distribution of large dimensional random matrices
- Concentration of measure and spectra of random matrices: applications to correlation matrices, elliptical distributions and beyond
- Distribution of eigenvalues for some sets of random matrices
- Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?
- Eigenvalue distribution of large random matrices
- Hanson-Wright inequality and sub-Gaussian concentration
- Kernel spectral clustering of large dimensional data
- Multilayer feedforward networks are universal approximators
- No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices
- On the empirical distribution of eigenvalues of a class of large dimensional random matrices
- On the signal-to-interference ratio of CDMA systems in wireless communications
- Random Beamforming Over Quasi-Static and Fading Channels: A Deterministic Equivalent Approach
- Spectral analysis of large dimensional random matrices
- The random matrix regime of Maronna's M-estimator with elliptically distributed samples
- The singular values and vectors of low rank perturbations of large rectangular random matrices
- The spectrum of kernel random matrices
Cited in (21)
- A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent*
- On the empirical spectral distribution for certain models related to sample covariance matrices with different correlations
- Random interactions in higher order neural networks
- Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
- On Random Matrices Arising in Deep Neural Networks. Gaussian Case
- scientific article; zbMATH DE number 2233617
- Large-dimensional random matrix theory and its applications in deep learning and wireless communications
- Deep learning in random neural fields: numerical experiments via neural tangent kernel
- Nonlinear random matrix theory for deep learning
- Products of many large random matrices and gradients in deep neural networks
- Designing universal causal deep learning models: The geometric (Hyper)transformer
- Free dynamics of feature learning processes
- scientific article; zbMATH DE number 7038932
- Deep learning: a statistical viewpoint
- Generalisation error in learning with random features and the hidden manifold model*
- Eigenvalue distribution of some nonlinear models of random matrices
- Fast non-negative least-squares learning in the random neural network
- scientific article; zbMATH DE number 7415098
- Halting time is predictable for large models: a universality property and average-case analysis
- A note on the Pennington-Worah distribution
- The curse of overparametrization in adversarial training: precise analysis of robust generalization for random features regression
This page was built for publication: A random matrix approach to neural networks
MaRDI item Q1650102