scientific article; zbMATH DE number 6902561
Publication:4576234
DOI: 10.3233/978-1-61499-672-9-1026; zbMath: 1396.90053; MaRDI QID: Q4576234
Shalabh Bhatnagar, Ajin George Joseph
Publication date: 12 July 2018
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Nonconvex programming, global optimization (90C26); Learning and adaptive systems in artificial intelligence (68T05); Stochastic programming (90C15)
Related Items (3)
An incremental off-policy search in a model-free Markov decision process using a single sample path ⋮ An Incremental Fast Policy Search Using a Single Sample Path ⋮ An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method