swMATH: 31400 | MaRDI QID: Q43111 | FDO: Q43111
Author name not available
Official website: https://deepmind.com/blog/article/alphazero-shedding-new-light-grand-games-chess-shogi-and-go
Cited In (99)
- Pomp++
- XPOMCP
- Unsupervised basis function adaptation for reinforcement learning
- DSAC
- Efficient multi-objective reinforcement learning via multiple-gradient descent with iteratively discovered weight-vector sets
- Benchmark and survey of automated machine learning frameworks
- Constrained multiagent Markov decision processes: a taxonomy of problems and algorithms
- Artificial intelligence, chaos, prediction and understanding in science
- Teaching People by Justifying Tree Search Decisions: An Empirical Study in Curling
- Sophisticated inference
- Comparison of deep neural networks and deep hierarchical models for spatio-temporal data
- Meta-modeling game for deriving theory-consistent, microstructure-based traction-separation laws via deep reinforcement learning
- Scalable Online Planning for Multi-Agent MDPs
- DataSHIELD
- On solving the problem of 7-piece chess endgames
- TensorFlow Quantum
- Making sense of sensory input
- The Hanabi challenge: a new frontier for AI research
- DARLA
- Construction of symmetric orthogonal designs with deep Q-network and orthogonal complementary design
- Metric entropy limits on recurrent neural network learning of linear dynamical systems
- A non-cooperative meta-modeling game for automated third-party calibrating, validating and falsifying constitutive laws with parallelized adversarial attacks
- A neural network multigrid solver for the Navier-Stokes equations
- Topological properties of the set of functions generated by neural networks of fixed size
- Dynamic selective maintenance optimization for multi-state systems over a finite horizon: a deep reinforcement learning approach
- Deep reinforcement learning for the optimal placement of cryptocurrency limit orders
- A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation
- A machine learning framework for LES closure terms
- Discovering faster matrix multiplication algorithms with reinforcement learning
- Induction and exploitation of subgoal automata for reinforcement learning
- Manifold learning for parameter reduction
- MOPO
- Reinforcement learning for combinatorial optimization: a survey
- Planning for potential: efficient safe reinforcement learning
- EPANET
- Ent
- Cryst
- BitBlaze
- Dynare
- GameShrink
- Approxrl
- TEXPLORE
- OpenAI Gym
- Uhlig Toolkit
- Albany/FELIX
- SHOP2
- JSBSim
- Title not available
- Albany
- AI-Toolbox
- POMDPs.jl
- OccBin
- AlphaGo
- GENREG
- Gensys
- DeepStack
- DESPOT
- GDL
- Ray
- Libratus
- REBA
- ChainerRL
- SCIPPlan
- Dopamine
- Stockfish
- Pluribus
- MoHex
- adpProject
- Metagol
- Metaopt
- ORL
- Stable Baselines
- RLBench
- TORCS
- Stable Baselines3
- A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
- ActiveClean
- BigDansing
- AlphaClean
- Distil
- HyperparameterHunter
- SampleClean
- AIspace
- Agent57
- AWESOME
- D4RL
- Inductive general game playing
- Deliberative acting, planning and learning with hierarchical operational models
- Reward is enough
- Compact and efficient encodings for planning in factored state and action spaces with learned binarized neural network transition models
- AlphaTensor
- DeepSynth
- QT-Opt
- FactoredValueMCTS
- SUNRISE
- IMPALA
- Haiku
- Automatic discovery of interpretable planning strategies
- Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management
This page was built for software: AlphaZero