An optimal time variable learning framework for deep neural networks
Publication: 6190852
DOI: 10.4310/AMSA.2023.V8.N3.A4
arXiv: 2204.08528
OpenAlex: W4388676682
MaRDI QID: Q6190852
FDO: Q6190852
Hugo S. Díaz, Harbir Antil, Evelyn Herberg
Publication date: 6 February 2024
Published in: Annals of Mathematical Sciences and Applications
Abstract: Feature propagation in Deep Neural Networks (DNNs) can be associated with nonlinear discrete dynamical systems. The novelty of this paper lies in letting the discretization parameter (time step-size) vary from layer to layer; these step-sizes are learned within an optimization framework. The proposed framework can be applied to any of the existing networks, such as ResNet, DenseNet, or Fractional-DNN, and is shown to help overcome the vanishing and exploding gradient issues. Stability of some of the existing continuous DNNs, such as Fractional-DNN, is also studied. The proposed approach is applied to an ill-posed 3D-Maxwell's equation.
Full work available at URL: https://arxiv.org/abs/2204.08528
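The abstract describes letting the time step-size vary from layer to layer and learning it alongside the network weights. The following is a minimal sketch of that idea in a ResNet-style architecture, assuming PyTorch; the names (TimeStepResidualBlock, tau) are illustrative and this is not the authors' implementation.

```python
# Sketch (not the authors' code): a residual block whose forward-Euler update
# x_{l+1} = x_l + tau_l * f(x_l; theta_l) uses a learnable, layer-dependent
# time step-size tau_l trained jointly with the weights.
import torch
import torch.nn as nn

class TimeStepResidualBlock(nn.Module):
    def __init__(self, width: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(width, width), nn.Tanh())
        # Learnable time step-size for this layer (initialized to 1.0).
        self.tau = nn.Parameter(torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual update with this layer's learned step size.
        return x + self.tau * self.f(x)

class TimeStepResNet(nn.Module):
    def __init__(self, width: int, depth: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            [TimeStepResidualBlock(width) for _ in range(depth)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x)
        return x

# Usage: the step-sizes tau_l appear in model.parameters() and are optimized
# together with the remaining network weights.
model = TimeStepResNet(width=16, depth=8)
y = model(torch.randn(4, 16))
```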
Keywords: deep learning; exploding gradients; deep neural network; fractional time derivatives; residual neural network; fractional neural network; vanishing gradients; optimal network architecture