An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions (Q5380403): Difference between revisions

Revision as of 23:36, 13 November 2024

scientific article; zbMATH DE number 7062532

Language	Label	Description	Also known as
English	An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions	scientific article; zbMATH DE number 7062532

Statements

instance of

scholarly article

0 references

title

An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions (English)

0 references

4 June 2019

0 references

MaRDI profile type

Publication

0 references

cites work

Online Markov Decision Processes

0 references

Q2921693

0 references

Logarithmic Regret Algorithms for Online Convex Optimization

0 references

Efficient algorithms for online decision problems

0 references

An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions

0 references

Online Markov Decision Processes Under Bandit Feedback

0 references

Q4626283

0 references

Simple statistical gradient-following algorithms for connectionist reinforcement learning

0 references

Markov Decision Processes with Arbitrary Reward Processes

0 references

full work available at URL

https://doi.org/10.1162/neco_a_00808

0 references

Identifiers

zbMATH Open document ID

1472.68149

0 references

DOI

10.1162/NECO_a_00808

0 references

Mathematics Subject Classification ID

0 references

DBLP publication ID

journals/neco/MaZHS16

0 references

Sitelinks

Mathematics(1 entry)

mardi Publication:5380403

@@ Property / MaRDI profile type @@
+Publication
@@ Property / MaRDI profile type: Publication / rank @@
+Normal rank
@@ Property / cites work @@
+Online Markov Decision Processes
@@ Property / cites work: Online Markov Decision Processes / rank @@
+Normal rank
@@ Property / cites work @@
+Q2921693
@@ Property / cites work: Q2921693 / rank @@
+Normal rank
@@ Property / cites work @@
+Logarithmic Regret Algorithms for Online Convex Optimization
@@ Property / cites work: Logarithmic Regret Algorithms for Online Convex Optimization / rank @@
+Normal rank
@@ Property / cites work @@
+Efficient algorithms for online decision problems
@@ Property / cites work: Efficient algorithms for online decision problems / rank @@
+Normal rank
@@ Property / cites work @@
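+An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions
@@ Property / cites work: An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions / rank @@
+Normal rank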
@@ Property / cites work @@
+Online Markov Decision Processes Under Bandit Feedback
@@ Property / cites work: Online Markov Decision Processes Under Bandit Feedback / rank @@
+Normal rank
@@ Property / cites work @@
+Q4626283
@@ Property / cites work: Q4626283 / rank @@
+Normal rank
@@ Property / cites work @@
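+Simple statistical gradient-following algorithms for connectionist reinforcement learning
@@ Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning / rank @@
+Normal rank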
@@ Property / cites work @@
+Markov Decision Processes with Arbitrary Reward Processes
@@ Property / cites work: Markov Decision Processes with Arbitrary Reward Processes / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1162/neco_a_00808
@@ Property / full work available at URL: https://doi.org/10.1162/neco_a_00808 / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2225522132
@@ Property / OpenAlex ID: W2225522132 / rank @@
+Normal rank
@@ Property / DBLP publication ID @@
+journals/neco/MaZHS16
@@ Property / DBLP publication ID: journals/neco/MaZHS16 / rank @@
+Normal rank
@@ links / mardi / name @@
+Publication:5380403