The following pages link to (Q4576234):
- An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868)
- An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method (Q1631797)
- An Incremental Fast Policy Search Using a Single Sample Path (Q5045345)