Effects of depth, width, and initialization: a convergence analysis of layer-wise training for deep linear neural networks (Q5037872)
From MaRDI portal
scientific article; zbMATH DE number 7484162
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | Effects of depth, width, and initialization: a convergence analysis of layer-wise training for deep linear neural networks | scientific article; zbMATH DE number 7484162 | |
Statements
title: Effects of depth, width, and initialization: A convergence analysis of layer-wise training for deep linear neural networks (English)

publication date: 4 March 2022

keywords: deep linear neural networks; layer-wise training; block coordinate gradient descent
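The keywords above name the paper's setting. As a minimal sketch only (not the authors' algorithm), and assuming square layers, a quadratic loss, and a near-identity initialization, layer-wise training of a deep linear network by block coordinate gradient descent can look like this: each sweep updates one weight matrix at a time while holding the others fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: targets generated by a single linear map A.
d, n, L = 4, 50, 3
X = rng.standard_normal((d, n))
A = rng.standard_normal((d, d))
Y = A @ X

# Near-identity initialization: one d x d weight matrix per layer.
W = [np.eye(d) + 0.01 * rng.standard_normal((d, d)) for _ in range(L)]

def loss(W):
    """Squared loss of the end-to-end product W_L ... W_1 applied to X."""
    P = np.eye(d)
    for Wl in W:
        P = Wl @ P
    return 0.5 * np.linalg.norm(P @ X - Y) ** 2 / n

def grad_layer(W, l):
    """Gradient of the loss with respect to layer l only.

    With B = product of layers above l and C = product below l,
    the loss is 0.5/n * ||B W_l C X - Y||^2, whose gradient in W_l is
    (1/n) * B^T (B W_l C X - Y) (C X)^T.
    """
    C = np.eye(d)
    for Wk in W[:l]:
        C = Wk @ C
    B = np.eye(d)
    for Wk in W[l + 1:]:
        B = Wk @ B
    R = B @ W[l] @ C @ X - Y  # residual of the full network
    return B.T @ R @ (C @ X).T / n

init_loss = loss(W)

# Block coordinate gradient descent: cycle through layers,
# taking a gradient step on one layer per inner iteration.
lr = 0.02
for sweep in range(500):
    for l in range(L):
        W[l] = W[l] - lr * grad_layer(W, l)
```

The learning rate, depth, and width here are arbitrary illustration choices; the paper's contribution is precisely a convergence analysis of how such choices (depth, width, initialization scale) affect this kind of layer-wise scheme.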