Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation (Q889297): Difference between revisions

Revision as of 00:23, 11 July 2024

scientific article

Language	Label	Description	Also known as
default for all languages	No label defined
English	Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation	scientific article

Statements

instance of

scholarly article

0 references

title

Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation (English)

0 references

6 November 2015

0 references

full work available at URL

https://arxiv.org/abs/1307.5118

0 references

zbMATH Keywords

reinforcement learning

0 references

transition model estimation

0 references

conditional density estimation

0 references

describes a project that uses

PILCO

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Efficient exploration through active learning for value function approximation in reinforcement learning

0 references

Using Expectation-Maximization for Reinforcement Learning

0 references

Q4704221

0 references

Adaptive importance sampling for value function approximation in off-policy reinforcement learning

0 references

A least-squares approach to direct importance estimation

0 references

Statistical analysis of kernel-based least-squares density-ratio estimation

0 references

Computational complexity of kernel-based density-ratio estimation: a condition number analysis

0 references

Policy search for motor primitives in robotics

0 references

Model-based contextual policy search for data-efficient generalization of robot skills

0 references

10.1162/1532443041827907

0 references

Q3394879

0 references

Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation

0 references

Sufficient Dimension Reduction via Squared-Loss Mutual Information Estimation

0 references

Simple statistical gradient-following algorithms for connectionist reinforcement learning

0 references

Analysis and improvement of policy gradient estimation

0 references

Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration

0 references

Identifiers

zbMATH Open document ID

1325.68200

0 references

DOI

10.1016/j.neunet.2014.06.006

0 references

Mathematics Subject Classification ID

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:889297

@@ Property / cites work @@
+Efficient exploration through active learning for value function approximation in reinforcement learning
+Normal rank
@@ Property / cites work @@
+Using Expectation-Maximization for Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Q4704221
@@ Property / cites work: Q4704221 / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive importance sampling for value function approximation in off-policy reinforcement learning
+Normal rank
@@ Property / cites work @@
+A least-squares approach to direct importance estimation
+Normal rank
@@ Property / cites work @@
+Statistical analysis of kernel-based least-squares density-ratio estimation
+Normal rank
@@ Property / cites work @@
+Computational complexity of kernel-based density-ratio estimation: a condition number analysis
+Normal rank
@@ Property / cites work @@
+Policy search for motor primitives in robotics
@@ Property / cites work: Policy search for motor primitives in robotics / rank @@
+Normal rank
@@ Property / cites work @@
+Model-based contextual policy search for data-efficient generalization of robot skills
+Normal rank
@@ Property / cites work @@
+10.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3394879
@@ Property / cites work: Q3394879 / rank @@
+Normal rank
@@ Property / cites work @@
+Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation
+Normal rank
@@ Property / cites work @@
+Sufficient Dimension Reduction via Squared-Loss Mutual Information Estimation
+Normal rank
@@ Property / cites work @@
+Simple statistical gradient-following algorithms for connectionist reinforcement learning
+Normal rank
@@ Property / cites work @@
+Analysis and improvement of policy gradient estimation
+Normal rank
@@ Property / cites work @@
+Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration
+Normal rank

ReferenceBot (talk | contribs)
