An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions (Q5380403): Difference between revisions

From MaRDI portal
Created claim: Wikidata QID (P12): Q47600318, #quickstatements; #temporary_batch_1707252663060
Created claim: DBLP publication ID (P1635): journals/neco/MaZHS16, #quickstatements; #temporary_batch_1731530891435
(4 intermediate revisions by 4 users not shown)
Property / MaRDI profile type: MaRDI publication profile (Normal rank)
Property / cites work: Online Markov Decision Processes (Normal rank)
Property / cites work: Q2921693 (Normal rank)
Property / cites work: Logarithmic Regret Algorithms for Online Convex Optimization (Normal rank)
Property / cites work: Efficient algorithms for online decision problems (Normal rank)
Property / cites work: An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions (Normal rank)
Property / cites work: Online Markov Decision Processes Under Bandit Feedback (Normal rank)
Property / cites work: Q4626283 (Normal rank)
Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning (Normal rank)
Property / cites work: Markov Decision Processes with Arbitrary Reward Processes (Normal rank)
Property / full work available at URL: https://doi.org/10.1162/neco_a_00808 (Normal rank)
Property / OpenAlex ID: W2225522132 (Normal rank)
Property / DBLP publication ID: journals/neco/MaZHS16 (Normal rank)

Revision as of 22:36, 13 November 2024

scientific article; zbMATH DE number 7062532
Language: English
Label: An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions
Description: scientific article; zbMATH DE number 7062532
Also known as: none listed
