Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590): Difference between revisions

Latest revision as of 15:24, 7 June 2024

scientific article

Language	Label	Description	Also known as
English	Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function	scientific article

Statements

instance of

scholarly article

0 references

title

Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (English)

0 references

zbMATH Open document ID

1067.68591

0 references

DOI

10.1016/j.neunet.2004.05.004

0 references

18 November 2004

0 references

Mathematics Subject Classification ID

0 references

Internal prediction

0 references

Reliability

0 references

Model-free reinforcement learning

0 references

TD learning

0 references

Discount rate

0 references

Exploration-exploitation balance

0 references

Temperature parameter

0 references

Meta-learning

0 references

Wikidata QID

Q40489238

0 references

MaRDI profile type

MaRDI publication profile

0 references

full work available at URL

https://doi.org/10.1016/j.neunet.2004.05.004

0 references

OpenAlex ID

W2009424996

0 references

cites work

A near-optimal polynomial time algorithm for learning in certain classes of stochastic games

0 references

Dual-control theory. I

0 references

Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function

0 references

\({\mathcal Q}\)-learning

0 references

Mean, variance and probabilistic criteria in finite Markov decision processes: A review

0 references

Simple statistical gradient-following algorithms for connectionist reinforcement learning

0 references

The apparent conflict between estimation and control - a survey of the two-armed bandit problem

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:1886590

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1016/j.neunet.2004.05.004
+Normal rank
@@ Property / OpenAlex ID @@
+W2009424996
@@ Property / OpenAlex ID: W2009424996 / rank @@
+Normal rank
@@ Property / cites work @@
+A near-optimal polynomial time algorithm for learning in certain classes of stochastic games
+Normal rank
@@ Property / cites work @@
+Dual-control theory. I
@@ Property / cites work: Dual-control theory. I / rank @@
+Normal rank
@@ Property / cites work @@
+Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function
+Normal rank
@@ Property / cites work @@
+\({\mathcal Q}\)-learning
@@ Property / cites work: \({\mathcal Q}\)-learning / rank @@
+Normal rank
@@ Property / cites work @@
+Mean, variance and probabilistic criteria in finite Markov decision processes: A review
+Normal rank
@@ Property / cites work @@
+Simple statistical gradient-following algorithms for connectionist reinforcement learning
+Normal rank
@@ Property / cites work @@
+The apparent conflict between estimation and control - a survey of the two-armed bandit problem
+Normal rank