A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
DOI: 10.1126/SCIENCE.AAR6404; zbMATH Open: 1433.68320; OpenAlex: W2902907165; Wikidata: Q59594962; Scholia: Q59594962; MaRDI QID: Q5218653; FDO: Q5218653
Authors: David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy P. Lillicrap, Karen Simonyan, Demis Hassabis
Publication date: 4 March 2020
Published in: Science
Full work available at URL: https://discovery.ucl.ac.uk/id/eprint/10069050/
Recommendations
- Reinforcement learning and its application to the game of Go
- Review of deep reinforcement learning and discussions on the development of computer Go
- DeepStack: expert-level artificial intelligence in heads-up no-limit poker
- Superhuman AI for multiplayer poker
- Superhuman AI for heads-up no-limit poker: Libratus beats top professionals
Mathematics Subject Classification: Learning and adaptive systems in artificial intelligence (68T05); Artificial neural networks and deep learning (68T07); 2-person games (91A05); General topics in artificial intelligence (68T01)
Cited In (78)
- Comparison of deep neural networks and deep hierarchical models for spatio-temporal data
- Meta-modeling game for deriving theory-consistent, microstructure-based traction-separation laws via deep reinforcement learning
- Scalable Online Planning for Multi-Agent MDPs
- Recent progress of deep reinforcement learning: from AlphaGo to AlphaGo Zero
- World-class interpretable poker
- On solving the problem of 7-piece chess endgames
- Deep statistical model checking
- Making sense of sensory input
- The Hanabi challenge: a new frontier for AI research
- A Proof that Artificial Neural Networks Overcome the Curse of Dimensionality in the Numerical Approximation of Black–Scholes Partial Differential Equations
- Enhancing differential-neural cryptanalysis
- Reinforcement learning and its application to the game of Go
- Construction of symmetric orthogonal designs with deep Q-network and orthogonal complementary design
- Unsupervised basis function adaptation for reinforcement learning
- A non-cooperative meta-modeling game for automated third-party calibrating, validating and falsifying constitutive laws with parallelized adversarial attacks
- Comparative analysis of machine learning methods for active flow control
- Topological properties of the set of functions generated by neural networks of fixed size
- Review of deep reinforcement learning and discussions on the development of computer Go
- Dynamic selective maintenance optimization for multi-state systems over a finite horizon: a deep reinforcement learning approach
- Deep reinforcement learning for the optimal placement of cryptocurrency limit orders
- A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation
- Efficient multi-objective reinforcement learning via multiple-gradient descent with iteratively discovered weight-vector sets
- Benchmark and survey of automated machine learning frameworks
- Constrained multiagent Markov decision processes: a taxonomy of problems and algorithms
- A machine learning framework for LES closure terms
- Artificial intelligence, chaos, prediction and understanding in science
- Induction and exploitation of subgoal automata for reinforcement learning
- Optimal production ramp‐up in the smartphone manufacturing industry
- Learning to win by reading manuals in a Monte-Carlo framework
- Title not available
- Planning for potential: efficient safe reinforcement learning
- A reinforcement learning approach to the stochastic cutting stock problem
- Title not available
- Risk-aware shielding of partially observable Monte Carlo planning policies
- Inductive general game playing
- What will drive global economic growth in the digital age?
- Deliberative acting, planning and learning with hierarchical operational models
- Title not available
- Explore and Exploit with Heterotic Line Bundle Models
- Reward is enough
- Smoothing policies and safe policy gradients
- Model-based Reinforcement Learning: A Survey
- Archetypal landscapes for deep neural networks
- Compact and efficient encodings for planning in factored state and action spaces with learned binarized neural network transition models
- The unreasonable effectiveness of deep learning in artificial intelligence
- The explanation game: a formal framework for interpretable machine learning
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Automatic discovery of interpretable planning strategies
- Deep policy dynamic programming for vehicle routing problems
- Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management
- Full gradient DQN reinforcement learning: a provably convergent scheme
- Deep reinforcement learning for FlipIt security game
- Reinforcement learning: an industrial perspective
- What may lie ahead in reinforcement learning
- Working with machines in mathematics
- Improving strategic decisions in sequential games by exploiting positional similarity
- Deep learning of first-order nonlinear hyperbolic conservation law solvers
- A Comparative Tutorial of Bayesian Sequential Design and Reinforcement Learning
- A \(K\)-means supported reinforcement learning framework to multi-dimensional knapsack
- Exploring the constraints on artificial general intelligence: a game-theoretic model of human vs machine interaction
- Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm
- A policy-based learning beam search for combinatorial optimization
- A Reinforcement Learning Based Slope Limiter for Two-Dimensional Finite Volume Schemes
- Pessimistic value iteration for multi-task data sharing in offline reinforcement learning
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms
- Learning key steps to attack deep reinforcement learning agents
- Synthesizing explainable counterfactual policies for algorithmic recourse with program synthesis
- Scalable imaginary time evolution with neural network quantum states
- DSMC evaluation stages: fostering robust and safe behavior in deep reinforcement learning -- extended version
- Quantum circuit compilation for nearest-neighbor architecture based on reinforcement learning
- Routing in reinforcement learning Markov chains
- Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
- Simulation-based search
- Solving optimal predictor-feedback control using approximate dynamic programming
- Almost surely safe exploration and exploitation for deep reinforcement learning with state safety estimation
- Is there a role for statistics in artificial intelligence?
- Spatial state-action features for general games
- Forecasting Hamiltonian dynamics without canonical coordinates