Off-policy temporal difference learning with distribution adaptation in fast mixing chains (Q1797759)

From MaRDI portal

scientific article; zbMATH DE number 6960085

      Statements

      Off-policy temporal difference learning with distribution adaptation in fast mixing chains (English)
      22 October 2018
      reinforcement learning
      off-policy evaluation
      least-squares temporal difference
      covariate shift adaptation
      mixing time

      Identifiers