Approxrl
Software:26214
swMATH: 14312, MaRDI QID: Q26214, FDO: Q26214
Author name not available
Cited In (37)
- A Markov decision process for response-adaptive randomization in clinical trials
- Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty
- Error bounds for constant step-size \(Q\)-learning
- Predictive market making via machine learning
- Dynamic treatment regimes: technical challenges and applications
- Decentralized reinforcement learning of robot behaviors
- An overview on recent machine learning techniques for port Hamiltonian systems
- Reinforcement learning endowed with safe veto policies to learn the control of linked-multicomponent robotic systems
- Population based optimization via differential evolution and adaptive fractional gradient descent
- Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems
- Active network management for electrical distribution systems: problem formulation, benchmark, and approximate solution
- Adaptive critic design with graph Laplacian for online learning control of nonlinear systems
- Optimized look-ahead tree policies: a bridge between look-ahead tree policies and direct policy search
- Chaotic dynamics and convergence analysis of temporal difference algorithms with bang-bang control
- Robust adaptive dynamic programming for linear and nonlinear systems: an overview
- A systematic study on meta-heuristic approaches for solving the graph coloring problem
- Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses
- Finding multiple Nash equilibria via machine learning-supported Gröbner bases
- Adaptive cruise control via adaptive dynamic programming with experience replay
- Self-triggered control of probabilistic Boolean control networks: a reinforcement learning approach
- A lexicographic approach to constrained MDP admission control
- On the effect of probing noise in optimal control LQR via Q-learning using adaptive filtering algorithms
- Design and comparison base analysis of adaptive estimator for completely unknown linear systems in the presence of OE noise and constant input time delay
- Event-triggered optimal tracking control of nonlinear systems
- Fitted Q-iteration by functional networks for control problems
- Data-driven adaptive dynamic programming for partially observable nonzero-sum games via Q-learning method
- A deep reinforcement learning framework for continuous intraday market bidding
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Approximate policy iteration: a survey and some new methods
- A linear programming methodology for approximate dynamic programming
- Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
- Bayesian Exploration for Approximate Dynamic Programming
- Batch mode reinforcement learning based on the synthesis of artificial trajectories
- Reinforcement learning algorithms with function approximation: recent advances and applications
- Proximal algorithms and temporal difference methods for solving fixed point problems
- Model-free event-triggered control algorithm for continuous-time linear systems with optimal performance
- A unified framework for stochastic optimization