Pages that link to "Item:Q1009248"

From MaRDI portal

The following pages link to Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (Q1009248):

Displaying 13 items.

Model selection in reinforcement learning (Q415618)
Hybrid least-squares algorithms for approximate policy evaluation (Q1959511)
Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains (Q1959632)
Rollout sampling approximate policy iteration (Q2036256)
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling (Q2051259)
Batch policy learning in average reward Markov decision processes (Q2112817)
Policy space identification in configurable environments (Q2163245)
Estimating optimal shared-parameter dynamic regimens with application to a multistage depression clinical trial (Q2827199)
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications (Q2887630)
A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation (Q5003727)
Deep reinforcement trading with predictable returns (Q6098411)
Off-policy evaluation in partially observed Markov decision processes under sequential ignorability (Q6183750)
Value iteration for streaming data on a continuous space with gradient method in an RKHS (Q6488837)