Efficient Multi-objective Reinforcement Learning via Multiple-gradient Descent with Iteratively Discovered Weight-Vector Sets (Q5145843): Difference between revisions

Latest revision as of 10:03, 24 July 2024

scientific article; zbMATH DE number 7299934

Language	Label	Description	Also known as
English	Efficient Multi-objective Reinforcement Learning via Multiple-gradient Descent with Iteratively Discovered Weight-Vector Sets	scientific article; zbMATH DE number 7299934

Statements

instance of

scholarly article

title

Efficient Multi-objective Reinforcement Learning via Multiple-gradient Descent with Iteratively Discovered Weight-Vector Sets (English)

Journal of Artificial Intelligence Research

publication date

22 January 2021

zbMATH Keywords

reinforcement learning

Markov decision processes

describes a project that uses

TensorFlow

MaRDI publication profile

full work available at URL

https://doi.org/10.1613/jair.1.12270

cites work

Multiple-gradient descent algorithm (MGDA) for multiobjective optimization

Steepest descent methods for multicriteria optimization.

Q3103669

On min-norm and min-max methods of multi-objective optimization

Q3093188

Simulation-based optimization of Markov reward processes

Interactive bundle-based method for nondifferentiable multiobjective optimization: NIMBUS

Sequential Approximate Multiobjective Optimization Using Computational Intelligence

Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation

Descent algorithm for nonsmooth stochastic multiobjective optimization

Scenarios and Policy Aggregation in Optimization Under Uncertainty

A Survey of Multi-Objective Sequential Decision-Making

Computing Convex Coverage Sets for Faster Multi-objective Coordination

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

Multivariate stochastic approximation using a simultaneous perturbation gradient approximation

Q4626283

Average cost temporal-difference learning

Finding intrinsic rewards by embodied evolution and constrained reinforcement learning

Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem

Generalized properly efficient solutions of vector optimization problems

Identifiers

zbMATH Open document ID

1497.68414

DOI

10.1613/jair.1.12270

Mathematics Subject Classification ID

Sitelinks

Mathematics (1 entry)

mardi Publication:5145843

@@ Property / describes a project that uses @@
+TensorFlow
@@ Property / describes a project that uses: TensorFlow / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1613/jair.1.12270
+Normal rank
@@ Property / OpenAlex ID @@
+W3121322767
@@ Property / OpenAlex ID: W3121322767 / rank @@
+Normal rank
@@ Property / cites work @@
+Multiple-gradient descent algorithm (MGDA) for multiobjective optimization
+Normal rank
@@ Property / cites work @@
+Steepest descent methods for multicriteria optimization.
+Normal rank
@@ Property / cites work @@
+Q3103669
@@ Property / cites work: Q3103669 / rank @@
+Normal rank
@@ Property / cites work @@
+On min-norm and min-max methods of multi-objective optimization
+Normal rank
@@ Property / cites work @@
+Q3093188
@@ Property / cites work: Q3093188 / rank @@
+Normal rank
@@ Property / cites work @@
+Simulation-based optimization of Markov reward processes
+Normal rank
@@ Property / cites work @@
+Interactive bundle-based method for nondifferentiable multiobjective optimization: NIMBUS
+Normal rank
@@ Property / cites work @@
+Sequential Approximate Multiobjective Optimization Using Computational Intelligence
+Normal rank
@@ Property / cites work @@
+Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation
+Normal rank
@@ Property / cites work @@
+Descent algorithm for nonsmooth stochastic multiobjective optimization
+Normal rank
@@ Property / cites work @@
+Scenarios and Policy Aggregation in Optimization Under Uncertainty
+Normal rank
@@ Property / cites work @@
+A Survey of Multi-Objective Sequential Decision-Making
+Normal rank
@@ Property / cites work @@
+Computing Convex Coverage Sets for Faster Multi-objective Coordination
+Normal rank
@@ Property / cites work @@
+A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play
+Normal rank
@@ Property / cites work @@
+Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
+Normal rank
@@ Property / cites work @@
+Q4626283
@@ Property / cites work: Q4626283 / rank @@
+Normal rank
@@ Property / cites work @@
+Average cost temporal-difference learning
@@ Property / cites work: Average cost temporal-difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+Finding intrinsic rewards by embodied evolution and constrained reinforcement learning
+Normal rank
@@ Property / cites work @@
+Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
+Normal rank
@@ Property / cites work @@
+Generalized properly efficient solutions of vector optimization problems
+Normal rank