A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (Q2633537)

scientific article

Language	Label	Description	Also known as
English	A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning	scientific article

Statements

scholarly article

A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (English)

Journal of Global Optimization

publication date

9 May 2019

This paper studies difference of convex functions (DC) programming and applies the DC Algorithm (DCA) to reinforcement learning. The objective is to estimate an optimal policy in a Markov decision process (MDP) model. The authors address the problem by seeking a zero of the empirical optimal Bellman residual (OBR) via linear approximation, within a unified approach based on DC programming and DCA. The main contributions are: 1) efficient DC algorithms based on minimisation of the $\ell_p$-norm of the empirical OBR; 2) a DCA with successive DC decomposition for the squared $\ell_2$-norm of the empirical OBR; 3) a new formulation of the OBR problem that does not use the $\ell_p$-norm. The results are illustrated by numerical examples.
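As a rough illustration of the idea summarised above, the following minimal sketch computes an empirical OBR under a linear Q-function approximation and rewrites its $\ell_1$-norm as a difference of two convex functions. It is not taken from the paper: the choice of a Q-function parameterisation, the identity $|g-h| = 2\max(g,h) - (g+h)$ used for the split, the discount factor, the toy batch, and every function name are assumptions made for this example.

```python
# Hypothetical sketch (not the paper's code): empirical optimal Bellman
# residual (OBR) for a linear Q-function Q_theta(s, a) = <phi(s, a), theta>,
# plus one possible DC split of its l1-norm.

GAMMA = 0.9  # discount factor (assumed value)


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def obr_parts(theta, sample):
    """Return (g, h) such that the OBR of one transition equals g - h.

    sample = (phi_sa, reward, next_phis), where next_phis holds the feature
    vectors phi(s', a') of all actions a' in the sampled next state.
    g is convex in theta (a max of affine functions); h is linear.
    """
    phi_sa, reward, next_phis = sample
    g = reward + GAMMA * max(dot(p, theta) for p in next_phis)
    h = dot(phi_sa, theta)
    return g, h


def l1_obr(theta, batch):
    """l1-norm of the empirical OBR over the batch."""
    return sum(abs(g - h) for g, h in (obr_parts(theta, s) for s in batch))


def l1_obr_dc(theta, batch):
    """Same value written as G - H, with G and H both convex in theta."""
    parts = [obr_parts(theta, s) for s in batch]
    G = sum(2 * max(g, h) for g, h in parts)
    H = sum(g + h for g, h in parts)
    return G - H


def subgradient_H(theta, batch):
    """One subgradient of H at theta (a DCA step would linearise H with it)."""
    y = [0.0] * len(theta)
    for phi_sa, _reward, next_phis in batch:
        best = max(next_phis, key=lambda p: dot(p, theta))  # greedy action
        for j in range(len(theta)):
            y[j] += GAMMA * best[j] + phi_sa[j]
    return y


# Toy batch of two transitions with 2-dimensional features (made-up data).
batch = [
    ([1.0, 0.0], 1.0, [[0.0, 1.0], [1.0, 1.0]]),
    ([0.0, 1.0], 0.0, [[1.0, 0.0], [0.5, 0.5]]),
]
theta = [0.5, -0.25]

print(l1_obr(theta, batch), l1_obr_dc(theta, batch))
print(subgradient_H(theta, batch))
```

Under these assumptions, a DCA iteration would replace H by its linearisation at the current theta and minimise the remaining convex, piecewise-linear function, which can be posed as a linear program. The sketch stops short of that step and only checks that the two expressions for the $\ell_1$-norm agree.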

Anna Jaśkiewicz

zbMATH Keywords

batch reinforcement learning

Markov decision process

DC programming

DCA

optimal Bellman residual

MaRDI profile type

MaRDI publication profile

full work available at URL

https://doi.org/10.1007/s10898-018-0698-y

Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path

Natural actor-critic algorithms

Optimization of the norm of a vector-valued DC function and applications

On the norm of a dc function

Approximate dynamic programming with a fuzzy parameterization

An interior proximal linearized method for DC programming based on Bregman distance or second-order homogeneous kernels

A Method for Finding Structured Sparse Solutions to Nonnegative Least Squares Problems with Applications

Minimizing nonsmooth DC functions via successive DC piecewise-affine approximations

A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning

Reinforcement Learning: A Tutorial Survey and Recent Advances

Solving an Infinite-Horizon Discounted Markov Decision Process by DC Programming and DCA

Double Bundle Method for finding Clarke Stationary Points in Nonsmooth DC Programming

A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes

Convergence of convex functions and duality

10.1162/1532443041827907

The DC (Difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems

Feature selection in machine learning: an exact penalty approach using a difference of convex function algorithm

Self-organizing maps by difference of convex functions optimization

A DC Programming Approach for Finding Communities in Networks

Solving a class of linearly constrained indefinite quadratic problems by DC algorithms

DC programming and DCA: thirty years of developments

DC approximation approaches for sparse optimization

Feature selection for linear SVMs under uncertain data: robust optimization based on difference of convex functions algorithms

Performance Bounds in $L_p$-norm for Approximate Value Iteration

Proximal bundle methods for nonsmooth DC programming

An inertial algorithm for DC programming

Convex analysis approach to d. c. programming: Theory, algorithms and applications

A D.C. Optimization Algorithm for Solving the Trust-Region Subproblem

Convex Analysis

On the relations between two types of convergence for convex functions

Discrete tomography by convex-concave regularization and D.C. programming

Generalized polynomial approximations in Markovian decision processes

Convergence results for single-step on-policy reinforcement-learning algorithms

Global convergence of a proximal linearized algorithm for difference of convex functions

Algorithms for Reinforcement Learning

Aggregate codifferential method for nonsmooth DC optimization

Reinforcement learning algorithms with function approximation: recent advances and applications

Identifiers

zbMATH Open document ID

DOI

10.1007/s10898-018-0698-y

Mathematics Subject Classification ID

zbMATH DE Number

Sitelinks

Mathematics(1 entry)

mardi Publication:2633537
