Temporal-difference search in Computer Go (Q420936): Difference between revisions

@@ Property / author @@
+H. S. Yoon
@@ Property / author: H. S. Yoon / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+91A46
@@ Property / Mathematics Subject Classification ID: 91A46 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+91-08
@@ Property / Mathematics Subject Classification ID: 91-08 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+90C40
@@ Property / Mathematics Subject Classification ID: 90C40 / rank @@
+Normal rank
@@ Property / Mathematics Subject Classification ID @@
+65C05
@@ Property / Mathematics Subject Classification ID: 65C05 / rank @@
+Normal rank
@@ Property / zbMATH DE Number @@
+6037853
@@ Property / zbMATH DE Number: 6037853 / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+reinforcement learning
@@ Property / zbMATH Keywords: reinforcement learning / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+temporal-difference learning
@@ Property / zbMATH Keywords: temporal-difference learning / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+Monte Carlo search
@@ Property / zbMATH Keywords: Monte Carlo search / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+simulation based search
@@ Property / zbMATH Keywords: simulation based search / rank @@
+Normal rank
@@ Property / zbMATH Keywords @@
+Computer Go
@@ Property / zbMATH Keywords: Computer Go / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1007/s10994-012-5280-0
@@ Property / full work available at URL: https://doi.org/10.1007/s10994-012-5280-0 / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2153039919
@@ Property / OpenAlex ID: W2153039919 / rank @@
+Normal rank
@@ Property / cites work @@
+Finite-time analysis of the multiarmed bandit problem
@@ Property / cites work: Finite-time analysis of the multiarmed bandit problem / rank @@
+Normal rank
@@ Property / cites work @@
+Learning to play chess using temporal differences
@@ Property / cites work: Learning to play chess using temporal differences / rank @@
+Normal rank
@@ Property / cites work @@
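+Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength
@@ Property / cites work: Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength / rank @@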
+Normal rank
@@ Property / cites work @@
+Q4000104
@@ Property / cites work: Q4000104 / rank @@
+Normal rank
@@ Property / cites work @@
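+Analytical mean squared error curves for temporal difference learning
@@ Property / cites work: Analytical mean squared error curves for temporal difference learning / rank @@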
+Normal rank
@@ Property / cites work @@
+Amazons Discover Monte-Carlo
@@ Property / cites work: Amazons Discover Monte-Carlo / rank @@
+Normal rank
@@ Property / cites work @@
+Computer Go
@@ Property / cites work: Computer Go / rank @@
+Normal rank
@@ Property / cites work @@
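+Convergence results for single-step on-policy reinforcement-learning algorithms
@@ Property / cites work: Convergence results for single-step on-policy reinforcement-learning algorithms / rank @@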
+Normal rank
@@ Property / cites work @@
+An Analysis of UCT in Multi-player Games
@@ Property / cites work: An Analysis of UCT in Multi-player Games / rank @@
+Normal rank
@@ Property / cites work @@
+10.1162/153244303768966102
@@ Property / cites work: 10.1162/153244303768966102 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3724211
@@ Property / cites work: Q3724211 / rank @@
+Normal rank
@@ links / mardi / name @@
+Publication:420936