scientific article; zbMATH DE number 7307478
From MaRDI portal
Publication:5149240
Authors:
Publication date: 8 February 2021
Full work available at URL: https://jmlr.csail.mit.edu/papers/v21/19-144.html
Title of this publication is not available
Recommendations
- Reinforcement learning for a class of continuous-time input constrained optimal control problems
- Continuous-time reinforcement learning for robust control under worst-case uncertainty
- Reinforcement Learning for Sequential Decision and Optimal Control
- Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning
- Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems
- From reinforcement learning to optimal control: a unified framework for sequential decisions
- Reinforcement learning control of unknown dynamic systems
Gaussian distribution, stochastic control, reinforcement learning, relaxed control, linear-quadratic, entropy regularization
Cites Work
- 10.1162/153244303765208377
- Title not available
- Title not available
- Finite-time analysis of the multiarmed bandit problem
- Compactification methods in the control of degenerate diffusions: existence of an optimal control
- On stochastic relaxed control for partially observed diffusions
- Finite state Markovian decision processes
- Reinforcement learning. An introduction
- Existence of Markov Controls and Characterization of Optimal Markov Controls
- Reinforcement learning in finite MDPs: PAC analysis
- Multi-armed bandits in discrete and continuous time
- Title not available
- An analysis of model-based interval estimation for Markov decision processes
- Stationary solutions and forward equations for controlled and singular martingale problems
- Continuous multi-armed bandits and multiparameter processes
- Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system
- Learning to optimize via posterior sampling
- On the Existence of Optimal Relaxed Controls of Stochastic Partial Differential Equations
- Linear Thompson sampling revisited
- End-to-end training of deep visuomotor policies
- Continuous‐time mean–variance portfolio selection: A reinforcement learning framework
- Deep exploration via randomized value functions
Cited In (32)
- Policy iterations for reinforcement learning problems in continuous time and space -- fundamental theory and methods
- Reinforcement learning for a class of continuous-time input constrained optimal control problems
- The reinforcement learning Kelly strategy
- Continuous‐time receding‐horizon reinforcement learning and its application to path‐tracking control of autonomous ground vehicles
- Exploratory HJB equations and their convergence
- Title not available
- Recent advances in reinforcement learning in finance
- Exploratory LQG mean field games with entropy regularization
- Robust risk-aware reinforcement learning
- Reinforcement learning and stochastic optimisation
- Zero-Sum Stackelberg Stochastic Linear-Quadratic Differential Games
- Connecting stochastic optimal control and reinforcement learning
- Uncertainty quantification and exploration for reinforcement learning
- Logarithmic regret bounds for continuous-time average-reward Markov decision processes
- Regularity and stability of feedback relaxed controls
- Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning
- Recent developments in machine learning methods for stochastic control and games
- Learning equilibrium mean‐variance strategy
- Choquet Regularization for Continuous-Time Reinforcement Learning
- Exploratory Control with Tsallis Entropy for Latent Factor Models
- Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality
- Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
- State-Dependent Temperature Control for Langevin Diffusions
- Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching
- Continuous time q-learning for mean-field control problems
- Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning
- A stochastic maximum principle approach for reinforcement learning with parameterized environment
- \(N\)-player and mean-field games in Itô-diffusion markets with competitive or homophilous interaction
- Mean-field linear-quadratic stochastic differential games
- Reinforcement learning for exploratory linear-quadratic two-person zero-sum stochastic differential games
- Tail probability estimates of continuous-time simulated annealing processes
- Survey on multi-period mean-variance portfolio selection model