Near-optimal regret bounds for reinforcement learning (Q2896090)
From MaRDI portal
scientific article; zbMATH DE number 6055537
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | Near-optimal regret bounds for reinforcement learning | scientific article; zbMATH DE number 6055537 | |
Statements
13 July 2012
undiscounted reinforcement learning
Markov decision process
regret
online learning
sample complexity
Near-optimal regret bounds for reinforcement learning (English)
0.8222609162330627
0.8211166262626648
0.8152390122413635
0.8134222626686096