Off-policy temporal difference learning with distribution adaptation in fast mixing chains (Q1797759)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Off-policy temporal difference learning with distribution adaptation in fast mixing chains | scientific article | |
Statements
Off-policy temporal difference learning with distribution adaptation in fast mixing chains (English)
22 October 2018
reinforcement learning
off-policy evaluation
least-squares temporal difference
covariate shift adaptation
mixing time