The following pages link to (Q4422978):
- An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868)
- A tutorial on the cross-entropy method (Q2485925)
- The cross-entropy method for network reliability estimation (Q2485928)
- Application of the cross-entropy method to the buffer allocation problem in a simulation-based environment (Q2485930)
- Basis function adaptation in temporal difference reinforcement learning (Q2485935)