Accelerating actor-critic-based algorithms via pseudo-labels derived from prior knowledge (Q6126872)
From MaRDI portal
Property / DOI: 10.1016/j.ins.2024.120182
Property / full work available at URL: https://doi.org/10.1016/j.ins.2024.120182
Property / OpenAlex ID: W4391133699
Property / cites work: Q4626283
Property / cites work: MM Optimization Algorithms
Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning
Property / cites work: Overcoming catastrophic forgetting in neural networks
Property / cites work: Q5148970
Property / cites work: Q5053301
Property / Wikidata QID: Q129390077
Latest revision as of 18:42, 30 December 2024
scientific article; zbMATH DE number 7829860
Language | Label | Description | Also known as |
---|---|---|---|
English | Accelerating actor-critic-based algorithms via pseudo-labels derived from prior knowledge | scientific article; zbMATH DE number 7829860 | |
Statements
Accelerating actor-critic-based algorithms via pseudo-labels derived from prior knowledge (English)
10 April 2024
reinforcement learning
deep RL
actor-critic methods
policy optimization
sample efficiency
exploration