A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Artificial neural networks and deep learning (68T07)
2-person games (91A05)
General topics in artificial intelligence (68T01)

Recommendations

Reinforcement learning and its application to the game of Go
Review of deep reinforcement learning and discussions on the development of computer Go
DeepStack: expert-level artificial intelligence in heads-up no-limit poker
Superhuman AI for multiplayer poker
Superhuman AI for heads-up no-limit poker: Libratus beats top professionals

Cited in (78)

Meta-modeling game for deriving theory-consistent, microstructure-based traction-separation laws via deep reinforcement learning
Comparison of deep neural networks and deep hierarchical models for spatio-temporal data
Working with machines in mathematics
Improving strategic decisions in sequential games by exploiting positional similarity
Scalable Online Planning for Multi-Agent MDPs
World-class interpretable poker
On solving the problem of 7-piece chess endgames
Recent progress of deep reinforcement learning: from AlphaGo to AlphaGo Zero
Deep statistical model checking
Deep learning of first-order nonlinear hyperbolic conservation law solvers
Making sense of sensory input
The Hanabi challenge: a new frontier for AI research
A Comparative Tutorial of Bayesian Sequential Design and Reinforcement Learning
A \(K\)-means supported reinforcement learning framework to multi-dimensional knapsack
Exploring the constraints on artificial general intelligence: a game-theoretic model of human vs machine interaction
A policy-based learning beam search for combinatorial optimization
A Reinforcement Learning Based Slope Limiter for Two-Dimensional Finite Volume Schemes
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm
Pessimistic value iteration for multi-task data sharing in offline reinforcement learning
A Proof that Artificial Neural Networks Overcome the Curse of Dimensionality in the Numerical Approximation of Black–Scholes Partial Differential Equations
Enhancing differential-neural cryptanalysis
Reinforcement learning and its application to the game of Go
A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms
Learning key steps to attack deep reinforcement learning agents
Synthesizing explainable counterfactual policies for algorithmic recourse with program synthesis
Construction of symmetric orthogonal designs with deep Q-network and orthogonal complementary design
Unsupervised basis function adaptation for reinforcement learning
A non-cooperative meta-modeling game for automated third-party calibrating, validating and falsifying constitutive laws with parallelized adversarial attacks
Scalable imaginary time evolution with neural network quantum states
DSMC evaluation stages: fostering robust and safe behavior in deep reinforcement learning -- extended version
Quantum circuit compilation for nearest-neighbor architecture based on reinforcement learning
Comparative analysis of machine learning methods for active flow control
Routing in reinforcement learning Markov chains
Topological properties of the set of functions generated by neural networks of fixed size
Review of deep reinforcement learning and discussions on the development of computer Go
Dynamic selective maintenance optimization for multi-state systems over a finite horizon: a deep reinforcement learning approach
Deep reinforcement learning for the optimal placement of cryptocurrency limit orders
A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation
Efficient multi-objective reinforcement learning via multiple-gradient descent with iteratively discovered weight-vector sets
Benchmark and survey of automated machine learning frameworks
Constrained multiagent Markov decision processes: a taxonomy of problems and algorithms
A machine learning framework for LES closure terms
Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
Artificial intelligence, chaos, prediction and understanding in science
Induction and exploitation of subgoal automata for reinforcement learning
Optimal production ramp‐up in the smartphone manufacturing industry
Learning to win by reading manuals in a Monte-Carlo framework
Planning for potential: efficient safe reinforcement learning
scientific article; zbMATH DE number 7306889
Simulation-based search
A reinforcement learning approach to the stochastic cutting stock problem
Solving optimal predictor-feedback control using approximate dynamic programming
scientific article; zbMATH DE number 7370594
Risk-aware shielding of partially observable Monte Carlo planning policies
Inductive general game playing
Deliberative acting, planning and learning with hierarchical operational models
What will drive global economic growth in the digital age?
scientific article; zbMATH DE number 1759680
Reward is enough
Explore and Exploit with Heterotic Line Bundle Models
Compact and efficient encodings for planning in factored state and action spaces with learned binarized neural network transition models
Smoothing policies and safe policy gradients
Archetypal landscapes for deep neural networks
Model-based Reinforcement Learning: A Survey
Almost surely safe exploration and exploitation for deep reinforcement learning with state safety estimation
The unreasonable effectiveness of deep learning in artificial intelligence
Multi-agent reinforcement learning: a selective overview of theories and algorithms
Automatic discovery of interpretable planning strategies
Is there a role for statistics in artificial intelligence?
The explanation game: a formal framework for interpretable machine learning
Deep policy dynamic programming for vehicle routing problems
Spatial state-action features for general games
Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management
Forecasting Hamiltonian dynamics without canonical coordinates
Deep reinforcement learning for \textsf{FlipIt} security game
Full gradient DQN reinforcement learning: a provably convergent scheme
Reinforcement learning: an industrial perspective
What may lie ahead in reinforcement learning

Uses Software

AlphaZero

MaRDI item Q5218653