The following pages link to 10.1162/1532443041827907 (Q4826001):
Displaying 50 items.
- Potential-based least-squares policy iteration for a parameterized feedback control system (Q289143)
- Approximate dynamic programming for the dispatch of military medical evacuation assets (Q323422)
- Probabilistic inference for determining options in reinforcement learning (Q331688)
- Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762)
- Model selection in reinforcement learning (Q415618)
- Analysis and improvement of policy gradient estimation (Q448295)
- Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains (Q460616)
- Temporal difference-based policy iteration for optimal control of stochastic systems (Q467477)
- Parameterized Markov decision process and its application to service rate control (Q492972)
- Reducing reinforcement learning to KWIK online regression (Q616761)
- Optimization of heuristic search using recursive algorithm selection and reinforcement learning (Q647446)
- Approximate dynamic programming via direct search in the space of value function approximations (Q713118)
- Proximal algorithms and temporal difference methods for solving fixed point problems (Q721950)
- Regularized feature selection in reinforcement learning (Q747290)
- Learning with policy prediction in continuous state-action multi-agent decision processes (Q780283)
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning (Q859737)
- Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation (Q889297)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
- Approximate dynamic programming with a fuzzy parameterization (Q980910)
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (Q1009248)
- Natural actor-critic algorithms (Q1049136)
- An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868)
- Reinforcement learning-based design of sampling policies under cost constraints in Markov random fields: application to weed map reconstruction (Q1623384)
- An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797)
- Least squares approximate policy iteration for learning bid prices in choice-based revenue management (Q1652041)
- Heuristic decision rules for short-term trading of renewable energy with co-located energy storage (Q1652308)
- Dynamic appointment scheduling with wait-dependent abandonment (Q1681154)
- Offline reinforcement learning with task hierarchies (Q1698854)
- Approximate dynamic programming for missile defense interceptor fire control (Q1751900)
- Adaptive importance sampling for value function approximation in off-policy reinforcement learning (Q1784527)
- Efficient exploration through active learning for value function approximation in reinforcement learning (Q1784573)
- Hybrid least-squares algorithms for approximate policy evaluation (Q1959511)
- Sell or store? An ADP approach to marketing renewable energy (Q2011830)
- A linear programming methodology for approximate dynamic programming (Q2023646)
- Rollout sampling approximate policy iteration (Q2036256)
- Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling (Q2051259)
- Challenges of real-world reinforcement learning: definitions, benchmarks and analysis (Q2071388)
- Model-free optimal control of discrete-time systems with additive and multiplicative noises (Q2103660)
- Batch policy learning in average reward Markov decision processes (Q2112817)
- Approximate dynamic programming for the military inventory routing problem (Q2173135)
- MREKLM: a fast multiple empirical kernel learning machine (Q2289596)
- Kernel dynamic policy programming: applicable reinforcement learning to robot systems with high dimensional states (Q2292214)
- An approximate dynamic programming approach for comparing firing policies in a networked air defense environment (Q2297577)
- Restricted gradient-descent algorithm for value-function approximation in reinforcement learning (Q2389624)
- Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots (Q2407444)
- Anticipatory action selection for human-robot table tennis (Q2407448)
- A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (Q2633537)
- Estimating optimal shared-parameter dynamic regimens with application to a multistage depression clinical trial (Q2827199)
- A simulation-based approach to stochastic dynamic programming (Q2863720)
- Approximate policy iteration: a survey and some new methods (Q2887629)