Categorical foundations of gradient-based learning
From MaRDI portal
Abstract: We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realized in the discrete setting of Boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python.
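The lens idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Lens`, `then`, `linear`, `mse_loss`, and `train` are hypothetical and chosen only for illustration, and the update rule is plain gradient descent written inline rather than modelled as a lens of its own. A lens here pairs a forward map with a backward map that takes an input and a change in output and returns a change in input; for the model, the backward map is its reverse derivative.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    """Hypothetical lens: a forward map plus a backward map that sends
    (input, change in output) to a change in input."""
    fwd: Callable[[Any], Any]
    bwd: Callable[[Any, Any], Any]

    def then(self, other: "Lens") -> "Lens":
        """Sequential composition: run self first, then other."""
        def fwd(a):
            return other.fwd(self.fwd(a))

        def bwd(a, dc):
            b = self.fwd(a)                      # recompute the intermediate value
            return self.bwd(a, other.bwd(b, dc))  # pull the change back through both

        return Lens(fwd, bwd)


# Parametrised map f(p, x) = p * x, acting on pairs (p, x).
# Its backward map is the reverse derivative: dy -> (x * dy, p * dy).
linear = Lens(
    fwd=lambda px: px[0] * px[1],
    bwd=lambda px, dy: (px[1] * dy, px[0] * dy),
)


def mse_loss(target: float) -> Lens:
    """Loss lens for the squared error 0.5 * (y - target)**2."""
    return Lens(
        fwd=lambda y: 0.5 * (y - target) ** 2,
        bwd=lambda y, dl: (y - target) * dl,
    )


def train(p: float, data, lr: float = 0.1, epochs: int = 50) -> float:
    """Plain gradient descent on the single parameter p (assumed update rule)."""
    for _ in range(epochs):
        for x, t in data:
            step = linear.then(mse_loss(t))
            dp, _dx = step.bwd((p, x), 1.0)  # gradient of the loss w.r.t. (p, x)
            p = p - lr * dp
    return p


# Toy data generated by y = 3 * x, so the fitted parameter should approach 3.
data = [(1.0, 3.0), (2.0, 6.0), (-1.0, -3.0)]
p_fit = train(0.0, data)
```

Composing the model with the loss and calling the backward map with a unit change yields the gradient with respect to both the parameter and the input, which is the compositional form of backpropagation. Swapping the inline update for ADAM or momentum would change only the `train` step in this sketch.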
Recommendations
- Backprop as functor. A compositional perspective on supervised learning
- Reverse derivative ascent: a categorical approach to learning Boolean circuits
- Categorical semantics of a simple differential programming language
- Learners' languages
- Categories of Differentiable Polynomial Circuits for Machine Learning
Cites work
- scientific article; zbMATH DE number 7650831
- A survey of graphical languages for monoidal categories
- Adaptive subgradient methods for online learning and stochastic optimization
- Boomerang, resourceful lenses for string data
- Categorical Stochastic Processes and Likelihood
- Categories in control
- Compositional game theory
- Control categories and duality: On the categorical semantics of the lambda-mu calculus
- Diagrammatic Semantics for Digital Circuits
- Evaluating Derivatives
- Functorial data migration
- Infinite Horizon Extensive Form Games, Coalgebraically
- Lenses, fibrations and universal translations
- Monoidal indeterminates and categories of possible worlds
- Picturing quantum processes. A first course in quantum theory and diagrammatic reasoning
- Reverse derivative ascent: a categorical approach to learning Boolean circuits
- Some methods of speeding up the convergence of iteration methods
- Support-vector networks
- The calculus of signal flow diagrams. I: Linear relations on streams
Cited in (20)
- Comodule representations of second-order functionals
- Effectful semantics in 2-dimensional categories: premonoidal and Freyd bicategories
- Constructor theory as process theory
- Monoidal structures on generalized polynomial categories
- Effectful semantics in bicategories: strong, commutative, and concurrent pseudomonads
- is for Dialectica
- Diegetic Representation of Feedback in Open Games
- Differentiable causal computations via delayed trace (extended version)
- Rewriting for symmetric monoidal categories with commutative (co)monoid structure
- Jacobians and gradients for Cartesian differential categories
- Learners' languages
- Categorical composable cryptography: extended version
- Categorical composable cryptography
- Algebraic dynamical systems in machine learning
- An ultrametric for Cartesian differential categories for Taylor series convergence
- Cartesian differential Kleisli categories
- The produoidal algebra of process decomposition
- Reverse tangent categories
- Monoidal closure of Grothendieck constructions via -tractable monoidal structures and Dialectica formulas
- Backprop as functor. A compositional perspective on supervised learning
MaRDI item Q6166781