scientific article; zbMATH DE number 7307478
From MaRDI portal
Publication:5149240
Recommendations
- Reinforcement learning for a class of continuous-time input constrained optimal control problems
- Continuous-time reinforcement learning for robust control under worst-case uncertainty
- Reinforcement Learning for Sequential Decision and Optimal Control
- Finite-horizon optimal control for continuous-time uncertain nonlinear systems using reinforcement learning
- Reinforcement Learning for Adaptive Optimal Stationary Control of Linear Stochastic Systems
- From reinforcement learning to optimal control: a unified framework for sequential decisions
- Reinforcement learning control of unknown dynamic systems
Cites work
- scientific article; zbMATH DE number 51724
- scientific article; zbMATH DE number 3474804
- scientific article; zbMATH DE number 1325009
- 10.1162/153244303765208377
- An analysis of model-based interval estimation for Markov decision processes
- Compactification methods in the control of degenerate diffusions: existence of an optimal control
- Continuous multi-armed bandits and multiparameter processes
- Continuous‐time mean–variance portfolio selection: A reinforcement learning framework
- Deep exploration via randomized value functions
- End-to-end training of deep visuomotor policies
- Existence of Markov Controls and Characterization of Optimal Markov Controls
- Finite state Markovian decision processes
- Finite-time analysis of the multiarmed bandit problem
- Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system
- Learning to optimize via posterior sampling
- Linear Thompson sampling revisited
- Multi-armed bandits in discrete and continuous time
- On stochastic relaxed control for partially observed diffusions
- On the Existence of Optimal Relaxed Controls of Stochastic Partial Differential Equations
- Reinforcement learning in finite MDPs: PAC analysis
- Reinforcement learning. An introduction
- Stationary solutions and forward equations for controlled and singular martingale problems
Cited in (32)
- Survey on multi-period mean-variance portfolio selection model
- Tail probability estimates of continuous-time simulated annealing processes
- Policy iterations for reinforcement learning problems in continuous time and space -- fundamental theory and methods
- Reinforcement learning for a class of continuous-time input constrained optimal control problems
- The reinforcement learning Kelly strategy
- Continuous‐time receding‐horizon reinforcement learning and its application to path‐tracking control of autonomous ground vehicles
- Exploratory HJB equations and their convergence
- scientific article; zbMATH DE number 6795315
- Exploratory LQG mean field games with entropy regularization
- Recent advances in reinforcement learning in finance
- Reinforcement learning and stochastic optimisation
- Robust risk-aware reinforcement learning
- Zero-Sum Stackelberg Stochastic Linear-Quadratic Differential Games
- Connecting stochastic optimal control and reinforcement learning
- Uncertainty quantification and exploration for reinforcement learning
- Logarithmic regret bounds for continuous-time average-reward Markov decision processes
- Regularity and stability of feedback relaxed controls
- Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning
- Recent developments in machine learning methods for stochastic control and games
- Learning equilibrium mean‐variance strategy
- Choquet Regularization for Continuous-Time Reinforcement Learning
- Exploratory Control with Tsallis Entropy for Latent Factor Models
- Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality
- Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
- State-Dependent Temperature Control for Langevin Diffusions
- Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching
- Continuous time q-learning for mean-field control problems
- Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning
- A stochastic maximum principle approach for reinforcement learning with parameterized environment
- \(N\)-player and mean-field games in Itô-diffusion markets with competitive or homophilous interaction
- Mean-field linear-quadratic stochastic differential games
- Reinforcement learning for exploratory linear-quadratic two-person zero-sum stochastic differential games