scientific article; zbMATH DE number 7307478
From MaRDI portal
Publication:5149240
No author found.
Publication date: 8 February 2021
Full work available at URL: https://jmlr.csail.mit.edu/papers/v21/19-144.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
stochastic control, Gaussian distribution, reinforcement learning, relaxed control, linear-quadratic, entropy regularization
Related Items
Zero-Sum Stackelberg Stochastic Linear-Quadratic Differential Games ⋮ Exploratory HJB Equations and Their Convergence ⋮ N-Player and Mean-Field Games in Itô-Diffusion Markets with Competitive or Homophilous Interaction ⋮ Robust Risk-Aware Reinforcement Learning ⋮ State-Dependent Temperature Control for Langevin Diffusions ⋮ The reinforcement learning Kelly strategy ⋮ Survey on multi-period mean-variance portfolio selection model ⋮ Choquet Regularization for Continuous-Time Reinforcement Learning ⋮ A stochastic maximum principle approach for reinforcement learning with parameterized environment ⋮ Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality ⋮ Recent advances in reinforcement learning in finance ⋮ Tail probability estimates of continuous-time simulated annealing processes ⋮ Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning ⋮ Learning equilibrium mean‐variance strategy ⋮ Exploratory Control with Tsallis Entropy for Latent Factor Models ⋮ Reinforcement learning for exploratory linear-quadratic two-person zero-sum stochastic differential games ⋮ Mean-field linear-quadratic stochastic differential games ⋮ Regularity and Stability of Feedback Relaxed Controls ⋮ Reinforcement learning and stochastic optimisation ⋮ Exploratory LQG mean field games with entropy regularization
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- An analysis of model-based interval estimation for Markov decision processes
- Continuous multi-armed bandits and multiparameter processes
- Multi-armed bandits in discrete and continuous time
- Linear Thompson sampling revisited
- Finite state Markovian decision processes
- DOI: 10.1162/153244303765208377
- Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system
- On the Existence of Optimal Relaxed Controls of Stochastic Partial Differential Equations
- Existence of Markov Controls and Characterization of Optimal Markov Controls
- Compactification methods in the control of degenerate diffusions: existence of an optimal control
- On stochastic relaxed control for partially observed diffusions
- Learning to Optimize via Posterior Sampling
- Continuous‐time mean–variance portfolio selection: A reinforcement learning framework
- Stationary solutions and forward equations for controlled and singular martingale problems
- Finite-time analysis of the multiarmed bandit problem