scientific article; zbMATH DE number 7370594

From MaRDI portal
Publication:4998982
MaRDI QID: Q4998982, FDO: Q4998982


Authors: Yasuhiro Fujita, Prabhat Nagarajan, Toshiki Kataoka, Takahiro Ishikawa


Publication date: 9 July 2021


Full work available at URL: https://arxiv.org/abs/1912.03905

Title of this publication is not available.




zbMATH Keywords

reproducibility, reinforcement learning, open source software, deep reinforcement learning, Chainer


Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)


Cites Work

  • Simple statistical gradient-following algorithms for connectionist reinforcement learning
  • A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
  • QT-Opt
  • End-to-end training of deep visuomotor policies
  • Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents


Cited In (2)

  • Title not available
  • ChainerRL

Uses Software

  • Dopamine
  • rlpyt
  • RLlib
  • Catalyst.RL
  • AlphaZero
  • Stable Baselines
  • Baselines
  • GitHub
  • QT-Opt





Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Publication:4998982&oldid=19452525"
This page was last edited on 8 February 2024, at 09:59. Warning: Page may not contain recent updates.