Pages that link to "Item:Q5405216"

From MaRDI portal

The following pages link to (Q5405216):

Displaying 7 items.

Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762)
Offline reinforcement learning with task hierarchies (Q1698854)
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling (Q2051259)
Toward theoretical understandings of robust Markov decision processes: sample complexity and asymptotics (Q2112808)
Policy space identification in configurable environments (Q2163245)
A Q-learning predictive control scheme with guaranteed stability (Q2220029)
A concentration bound for \(\operatorname{LSPE}(\lambda)\) (Q2677709)