Abstract: This article studies the Gram random matrix model $G = \frac{1}{T}\Sigma^{\mathsf T}\Sigma$, $\Sigma = \sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X = [x_1, \ldots, x_T] \in \mathbb{R}^{p \times T}$ is a (data) matrix of bounded norm, $W \in \mathbb{R}^{n \times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma : \mathbb{R} \to \mathbb{R}$ is a Lipschitz continuous (activation) function, understood entry-wise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix arguments, we prove that, as $n, p, T$ grow large at the same rate, the resolvent $Q = (G + \gamma I_T)^{-1}$, for $\gamma > 0$, has a behavior similar to that met in sample covariance matrix models, involving notably the moment $\Phi = \frac{T}{n}\mathbb{E}[G]$, which in passing provides a deterministic equivalent for the empirical spectral measure of $G$. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insights into the mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
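As a minimal numerical sketch (not taken from the article itself), the Gram model, its resolvent, and its empirical spectral measure can be simulated directly; the dimensions, the ReLU activation, the data scaling, and the value of $\gamma$ below are illustrative choices only:

```python
import numpy as np

# Dimensions n (random features), p (data dimension), T (samples),
# taken comparable so that n, p, T "grow at the same rate".
n, p, T = 400, 200, 300
rng = np.random.default_rng(0)

# Data matrix X of bounded norm and W with i.i.d. zero-mean
# unit-variance entries, as in the model of the abstract.
X = rng.standard_normal((p, T)) / np.sqrt(p)
W = rng.standard_normal((n, p))

# Lipschitz activation applied entry-wise (ReLU chosen for illustration).
def sigma(t):
    return np.maximum(t, 0.0)

Sigma = sigma(W @ X)          # n x T random-feature matrix
G = Sigma.T @ Sigma / T       # T x T Gram matrix

# Resolvent Q = (G + gamma I_T)^{-1} for some gamma > 0.
gamma = 1.0
Q = np.linalg.inv(G + gamma * np.eye(T))

# Empirical spectral measure of G: the distribution of its eigenvalues,
# which the paper approximates by a deterministic equivalent.
eigs = np.linalg.eigvalsh(G)
print(f"G is {G.shape}, eigenvalues in [{eigs.min():.4f}, {eigs.max():.4f}]")
```

A histogram of `eigs`, compared across several independent draws of `W`, illustrates the concentration of the spectral measure that the deterministic equivalent captures.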
Recommendations
- Nonlinear random matrix theory for deep learning
- Eigenvalue distribution of some nonlinear models of random matrices
- On random matrices arising in deep neural networks: General I.I.D. case
- On Random Matrices Arising in Deep Neural Networks. Gaussian Case
- Products of many large random matrices and gradients in deep neural networks
Cites work
- scientific article; zbMATH DE number 876688
- A Central Limit Theorem for the SINR at the LMMSE Estimator Output for Large-Dimensional Signals
- A Large Dimensional Analysis of Least Squares Support Vector Machines
- An example in the theory of the spectrum of a function
- Almost sure localization of the eigenvalues in a Gaussian information plus noise model. Application to the spiked models.
- Analysis of the limiting spectral distribution of large dimensional random matrices
- Concentration of measure and spectra of random matrices: applications to correlation matrices, elliptical distributions and beyond
- Distribution of eigenvalues for some sets of random matrices
- Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?
- Eigenvalue distribution of large random matrices
- Hanson-Wright inequality and sub-Gaussian concentration
- Kernel spectral clustering of large dimensional data
- Multilayer feedforward networks are universal approximators
- No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices
- On the empirical distribution of eigenvalues of a class of large dimensional random matrices
- On the signal-to-interference ratio of CDMA systems in wireless communications
- Random Beamforming Over Quasi-Static and Fading Channels: A Deterministic Equivalent Approach
- Spectral analysis of large dimensional random matrices
- The random matrix regime of Maronna's M-estimator with elliptically distributed samples
- The singular values and vectors of low rank perturbations of large rectangular random matrices
- The spectrum of kernel random matrices
Cited in (21)
- A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent*
- On the empirical spectral distribution for certain models related to sample covariance matrices with different correlations
- Random interactions in higher order neural networks
- Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks
- On Random Matrices Arising in Deep Neural Networks. Gaussian Case
- scientific article; zbMATH DE number 2233617
- Large-dimensional random matrix theory and its applications in deep learning and wireless communications
- Deep learning in random neural fields: numerical experiments via neural tangent kernel
- Nonlinear random matrix theory for deep learning
- Products of many large random matrices and gradients in deep neural networks
- Designing universal causal deep learning models: The geometric (Hyper)transformer
- Free dynamics of feature learning processes
- scientific article; zbMATH DE number 7038932
- Deep learning: a statistical viewpoint
- Generalisation error in learning with random features and the hidden manifold model*
- Eigenvalue distribution of some nonlinear models of random matrices
- Fast non-negative least-squares learning in the random neural network
- scientific article; zbMATH DE number 7415098
- Halting time is predictable for large models: a universality property and average-case analysis
- A note on the Pennington-Worah distribution
- The curse of overparametrization in adversarial training: precise analysis of robust generalization for random features regression
This page was built for publication: A random matrix approach to neural networks
MaRDI item Q1650102