The following pages link to (Q3093261):
Displayed 38 items.
- Extreme state aggregation beyond Markov decision processes (Q329613) (← links)
- Making friends on the fly: cooperating with new teammates (Q343919) (← links)
- Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762) (← links)
- Model selection in reinforcement learning (Q415618) (← links)
- Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains (Q460616) (← links)
- Hessian matrix distribution for Bayesian policy gradient reinforcement learning (Q545311) (← links)
- Regularized feature selection in reinforcement learning (Q747290) (← links)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601) (← links)
- Approximate dynamic programming with a fuzzy parameterization (Q980910) (← links)
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (Q1009248) (← links)
- Learning output reference model tracking for higher-order nonlinear systems with unknown dynamics (Q2004902) (← links)
- A deep reinforcement learning framework for continuous intraday market bidding (Q2071376) (← links)
- Challenges of real-world reinforcement learning: definitions, benchmarks and analysis (Q2071388) (← links)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040) (← links)
- Batch policy learning in average reward Markov decision processes (Q2112817) (← links)
- Data-driven switching modeling for MPC using regression trees and random forests (Q2178240) (← links)
- Recovery of simultaneous low rank and two-way sparse coefficient matrices, a nonconvex approach (Q2286374) (← links)
- Fitted Q-iteration by functional networks for control problems (Q2293779) (← links)
- Scalable transfer learning in heterogeneous, dynamic environments (Q2407414) (← links)
- A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (Q2633537) (← links)
- Efficient approximate dynamic programming based on design and analysis of computer experiments for infinite-horizon optimization (Q2664400) (← links)
- Estimating optimal shared-parameter dynamic regimens with application to a multistage depression clinical trial (Q2827199) (← links)
- Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer (Q2893403) (← links)
- Towards Min Max Generalization in Reinforcement Learning (Q3006026) (← links)
- Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies (Q3646118) (← links)
- (Q5053314) (← links)
- Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials (Q5072150) (← links)
- The QLBS Q-Learner goes NuQLear: fitted Q iteration, inverse RL, and option portfolios (Q5234379) (← links)
- Epoch-incremental reinforcement learning algorithms (Q5396438) (← links)
- Quadratic approximate dynamic programming for input‐affine systems (Q5409145) (← links)
- Learning When-to-Treat Policies (Q5857115) (← links)
- Extremely randomized trees (Q5898262) (← links)
- Extremely randomized trees (Q5920614) (← links)
- Optimized ensemble value function approximation for dynamic programming (Q6112619) (← links)
- Tutorial on Amortized Optimization (Q6139544) (← links)
- Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning (Q6148353) (← links)
- Approximated multi-agent fitted Q iteration (Q6174070) (← links)
- Evolving interpretable decision trees for reinforcement learning (Q6193099) (← links)