Value Iteration is Optic Composition
Publication:6190614
DOI: 10.4204/eptcs.380.24, arXiv: 2206.04547, MaRDI QID: Q6190614
Publication date: 5 March 2024
Published in: Electronic Proceedings in Theoretical Computer Science
Full work available at URL: https://arxiv.org/abs/2206.04547
Cites Work
- Long-term values in Markov decision processes, (co)algebraically
- \({\mathcal Q}\)-learning
- Morphisms of open games
- A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics
- Optimal control of Markov processes with incomplete state information
- (Co)end Calculus
- A Probability Monad as the Colimit of Spaces of Finite Samples
- Double Categories of Open Dynamical Systems (Extended Abstract)
- Compositional Game Theory
- Categories in Control
- Contraction Mappings in the Theory Underlying Dynamic Programming
- Closed categories generated by commutative monads