An optimal time variable learning framework for deep neural networks
Publication: 6190852
DOI: 10.4310/AMSA.2023.V8.N3.A4
arXiv: 2204.08528
OpenAlex: W4388676682
MaRDI QID: Q6190852
FDO: Q6190852
Hugo S. Díaz, Harbir Antil, Evelyn Herberg
Publication date: 6 February 2024
Published in: Annals of Mathematical Sciences and Applications
Abstract: Feature propagation in Deep Neural Networks (DNNs) can be associated with nonlinear discrete dynamical systems. The novelty of this paper lies in letting the discretization parameter (time step-size) vary from layer to layer; these step-sizes are learned within an optimization framework. The proposed framework can be applied to any of the existing networks, such as ResNet, DenseNet, or Fractional-DNN, and is shown to help overcome the vanishing and exploding gradient issues. Stability of some of the existing continuous DNNs, such as Fractional-DNN, is also studied. The proposed approach is applied to an ill-posed 3D-Maxwell's equation.
Full work available at URL: https://arxiv.org/abs/2204.08528
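The abstract describes letting the time step-size vary from layer to layer and learning it alongside the network weights. The following is a minimal sketch of that idea in a ResNet-style architecture, assuming PyTorch; the names (TimeStepResidualBlock, tau) are illustrative and this is not the authors' implementation.

```python
# Sketch (not the authors' code): a residual block whose forward-Euler update
# x_{l+1} = x_l + tau_l * f(x_l; theta_l) uses a learnable, layer-dependent
# time step-size tau_l trained jointly with the weights.
import torch
import torch.nn as nn

class TimeStepResidualBlock(nn.Module):
    def __init__(self, width: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(width, width), nn.Tanh())
        # Learnable time step-size for this layer (initialized to 1.0).
        self.tau = nn.Parameter(torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual update with this layer's learned step size.
        return x + self.tau * self.f(x)

class TimeStepResNet(nn.Module):
    def __init__(self, width: int, depth: int):
        super().__init__()
        self.blocks = nn.ModuleList(
            [TimeStepResidualBlock(width) for _ in range(depth)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x)
        return x

# Usage: the step-sizes tau_l appear in model.parameters() and are optimized
# together with the remaining network weights.
model = TimeStepResNet(width=16, depth=8)
y = model(torch.randn(4, 16))
```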
Keywords: deep learning; exploding gradients; deep neural network; fractional time derivatives; residual neural network; fractional neural network; vanishing gradients; optimal network architecture