A random matrix approach to neural networks

From MaRDI portal
Publication:1650102

DOI: 10.1214/17-AAP1328 · zbMATH Open: 1391.60010 · arXiv: 1702.05419 · OpenAlex: W2963650649 · Wikidata: Q130038206 · Scholia: Q130038206 · MaRDI QID: Q1650102 · FDO: Q1650102


Authors: Cosme Louart, Zhenyu Liao, R. Couillet


Publication date: 29 June 2018

Published in: The Annals of Applied Probability

Abstract: This article studies the Gram random matrix model $G=\frac{1}{T}\Sigma^{\mathsf{T}}\Sigma$, $\Sigma=\sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X=[x_1,\ldots,x_T]\in\mathbb{R}^{p\times T}$ is a (data) matrix of bounded norm, $W\in\mathbb{R}^{n\times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma:\mathbb{R}\to\mathbb{R}$ is a Lipschitz continuous (activation) function, $\sigma(WX)$ being understood entry-wise. By means of a key concentration of measure lemma arising from non-asymptotic random matrix arguments, we prove that, as $n,p,T$ grow large at the same rate, the resolvent $Q=(G+\gamma I_T)^{-1}$, for $\gamma>0$, behaves similarly to that found in sample covariance matrix models, involving notably the moment $\Phi=\frac{T}{n}\mathbb{E}[G]$, which in passing provides a deterministic equivalent for the empirical spectral measure of $G$. Application-wise, this result enables the estimation of the asymptotic performance of single-layer random neural networks. This in turn provides practical insight into the mechanisms at play in random neural networks, entailing several unexpected consequences, as well as a fast practical means to tune the network hyperparameters.
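The model described in the abstract is straightforward to simulate. Below is a minimal NumPy sketch, not code from the paper: it assumes ReLU as the Lipschitz activation $\sigma$, Gaussian entries for $W$ and $X$, and illustrative (finite) choices of $n$, $p$, $T$, and $\gamma$, whereas the paper's results hold in the regime where $n,p,T$ grow large at the same rate.

```python
import numpy as np

# Illustrative dimensions (assumed); the theory concerns n, p, T large
# and of the same order.
n, p, T = 400, 200, 300
gamma = 1.0  # regularization parameter gamma > 0

rng = np.random.default_rng(0)
X = rng.standard_normal((p, T)) / np.sqrt(p)  # data matrix of bounded norm
W = rng.standard_normal((n, p))               # i.i.d. zero-mean unit-variance entries

Sigma = np.maximum(W @ X, 0.0)            # entry-wise Lipschitz activation (ReLU, assumed)
G = Sigma.T @ Sigma / T                   # Gram matrix G = (1/T) Sigma^T Sigma
Q = np.linalg.inv(G + gamma * np.eye(T))  # resolvent Q = (G + gamma I_T)^{-1}

# Empirical spectral measure of G, whose deterministic equivalent the paper
# characterizes via the moment Phi = (T/n) E[G].
eigs = np.linalg.eigvalsh(G)
```

The histogram of `eigs` can then be compared against the deterministic equivalent derived in the paper; for tuning hyperparameters such as $\gamma$, the asymptotic characterization replaces repeated Monte Carlo runs of this kind.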


Full work available at URL: https://arxiv.org/abs/1702.05419




Recommendations




Cites Work


Cited In (21)

Uses Software





This page was built for publication: A random matrix approach to neural networks
