Mathematical Research Data Initiative

10.1162/jmlr.2003.3.4-5.921

From MaRDI portal
Publication:4656015

DOI: 10.1162/JMLR.2003.3.4-5.921
zbMATH Open: 1112.68452
OpenAlex: W4238778767
MaRDI QID: Q4656015
FDO: Q4656015

Andrew W. Moore, Malcolm J. A. Strens

Publication date: 8 March 2005

Published in: CrossRef Listing of Deleted DOIs

Full work available at URL: https://doi.org/10.1162/jmlr.2003.3.4-5.921



zbMATH Keywords

reinforcement learning


Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)



Cited In (1)

  • Inverse modeling of a solar collector involving Fourier and non-Fourier heat conduction


Recommendations

  • Policy gradient in continuous time
  • Reward-weighted regression with sample reuse for direct policy search in reinforcement learning
  • Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm
  • Compatible natural gradient policy search
  • Dynamic programming or direct comparison?







Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Publication:4656015&oldid=18857279"
This page was last edited on 7 February 2024, at 16:52. Warning: Page may not contain recent updates.