Simulation-based search (Q6198646)
scientific article; zbMATH DE number 7821712
Language | Label | Description | Also known as |
---|---|---|---|
English | Simulation-based search | scientific article; zbMATH DE number 7821712 | |
Statements
Simulation-based search (English)
20 March 2024
Summary: Planning is one of the oldest and most important problems in artificial intelligence. Simulation-based search algorithms, such as AlphaZero, have achieved superhuman performance in chess and Go and are used widely in real-world applications of planning. In this paper we provide a unified framework for simulation-based search. Algorithms in this framework interleave operators for policy evaluation (better estimating the value function of the current policy) and policy improvement (using the value function to form a better policy). These operators are applied to states and actions that are sampled in sequential trajectories, and that may branch recursively into other sampled trajectories. The value function and policy may also be represented by a function approximator. Our framework encompasses a broad family of search algorithms, including Monte-Carlo tree search, sparse sampling, nested Monte-Carlo search, classification-based policy iteration, and AlphaZero. For the entire collection see [Zbl 07816360].
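The interleaving of policy evaluation and policy improvement along sampled trajectories can be illustrated with a minimal sketch. This is not the framework from the paper: it is a plain tabular Monte-Carlo search on a hypothetical five-state chain, and every name in it (`step`, `simulation_based_search`, `N_STATES`, the reward scheme, and all parameter values) is an assumption made only for illustration. It also omits recursive branching and function approximation.

```python
import random
from collections import defaultdict

# Hypothetical toy MDP (an assumption, not from the paper): states 0..4 on a chain.
# Action 1 moves right, action 0 moves left (staying put at state 0).
# Entering state 4 ends the episode with reward 1; all other steps give reward 0.
N_STATES = 5
ACTIONS = (0, 1)

def step(state, action):
    """Simulator (model) of the toy MDP: returns (next_state, reward, done)."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def simulation_based_search(root, n_simulations=2000, horizon=20,
                            epsilon=0.2, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)      # action-value estimates Q(s, a)
    visits = defaultdict(int)   # visit counts N(s, a)

    def policy(state):
        # Policy improvement: epsilon-greedy with respect to the current Q,
        # breaking ties between equally valued actions at random.
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        best = max(q[(state, a)] for a in ACTIONS)
        return rng.choice([a for a in ACTIONS if q[(state, a)] == best])

    for _ in range(n_simulations):
        # Sample one trajectory from the root with the current policy.
        trajectory, state = [], root
        for _ in range(horizon):
            action = policy(state)
            next_state, reward, done = step(state, action)
            trajectory.append((state, action, reward))
            state = next_state
            if done:
                break
        # Policy evaluation: move each visited Q(s, a) toward the sampled
        # discounted return, using an incremental running mean.
        ret = 0.0
        for s, a, r in reversed(trajectory):
            ret = r + gamma * ret
            visits[(s, a)] += 1
            q[(s, a)] += (ret - q[(s, a)]) / visits[(s, a)]
    return q

q = simulation_based_search(root=0)
# Greedy action in each non-terminal state under the learned estimates.
greedy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

Here the policy is defined implicitly by the value estimates, so each new trajectory is sampled from a policy that has already been improved by the evaluations made so far. Algorithms such as Monte-Carlo tree search differ in how they select actions and where they store values, which this sketch does not attempt to reproduce.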
planning
reinforcement learning
Markov decision processes
Monte-Carlo tree search
Monte-Carlo simulation