A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (Q2633537): Difference between revisions

From MaRDI portal
Property / MaRDI profile type: MaRDI publication profile / rank: Normal rank
Property / full work available at URL: https://doi.org/10.1007/s10898-018-0698-y / rank: Normal rank
Property / OpenAlex ID: W2888238026 / rank: Normal rank
Property / cites work: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path / rank: Normal rank
Property / cites work: Q3245701 / rank: Normal rank
Property / cites work: Q3795523 / rank: Normal rank
Property / cites work: Q4257216 / rank: Normal rank
Property / cites work: Natural actor-critic algorithms / rank: Normal rank
Property / cites work: Optimization of the norm of a vector-valued DC function and applications / rank: Normal rank
Property / cites work: On the norm of a dc function / rank: Normal rank
Property / cites work: Approximate dynamic programming with a fuzzy parameterization / rank: Normal rank
Property / cites work: Q4420767 / rank: Normal rank
Property / cites work: An interior proximal linearized method for DC programming based on Bregman distance or second-order homogeneous kernels / rank: Normal rank
Property / cites work: Q3093261 / rank: Normal rank
Property / cites work: A Method for Finding Structured Sparse Solutions to Nonnegative Least Squares Problems with Applications / rank: Normal rank
Property / cites work: Minimizing nonsmooth DC functions via successive DC piecewise-affine approximations / rank: Normal rank
Property / cites work: A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning / rank: Normal rank
Property / cites work: Reinforcement Learning: A Tutorial Survey and Recent Advances / rank: Normal rank
Property / cites work: Solving an Infinite-Horizon Discounted Markov Decision Process by DC Programming and DCA / rank: Normal rank
Property / cites work: Double Bundle Method for finding Clarke Stationary Points in Nonsmooth DC Programming / rank: Normal rank
Property / cites work: A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes / rank: Normal rank
Property / cites work: Convergence of convex functions and duality / rank: Normal rank
Property / cites work: 10.1162/1532443041827907 / rank: Normal rank
Property / cites work: The DC (Difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems / rank: Normal rank
Property / cites work: Feature selection in machine learning: an exact penalty approach using a difference of convex function algorithm / rank: Normal rank
Property / cites work: Self-organizing maps by difference of convex functions optimization / rank: Normal rank
Property / cites work: A DC Programming Approach for Finding Communities in Networks / rank: Normal rank
Property / cites work: Solving a class of linearly constrained indefinite quadratic problems by DC algorithms / rank: Normal rank
Property / cites work: DC programming and DCA: thirty years of developments / rank: Normal rank
Property / cites work: DC approximation approaches for sparse optimization / rank: Normal rank
Property / cites work: Feature selection for linear SVMs under uncertain data: robust optimization based on difference of convex functions algorithms / rank: Normal rank
Property / cites work: Performance Bounds in $L_p$-norm for Approximate Value Iteration / rank: Normal rank
Property / cites work: Proximal bundle methods for nonsmooth DC programming / rank: Normal rank
Property / cites work: An inertial algorithm for DC programming / rank: Normal rank
Property / cites work: Q3780016 / rank: Normal rank
Property / cites work: Convex analysis approach to d. c. programming: Theory, algorithms and applications / rank: Normal rank
Property / cites work: A D.C. Optimization Algorithm for Solving the Trust-Region Subproblem / rank: Normal rank
Property / cites work: Q4315289 / rank: Normal rank
Property / cites work: Convex Analysis / rank: Normal rank
Property / cites work: On the relations between two types of convergence for convex functions / rank: Normal rank
Property / cites work: Discrete tomography by convex--concave regularization and D.C. programming / rank: Normal rank
Property / cites work: Generalized polynomial approximations in Markovian decision processes / rank: Normal rank
Property / cites work: Q5850827 / rank: Normal rank
Property / cites work: Convergence results for single-step on-policy reinforcement-learning algorithms / rank: Normal rank
Property / cites work: Global convergence of a proximal linearized algorithm for difference of convex functions / rank: Normal rank
Property / cites work: Algorithms for Reinforcement Learning / rank: Normal rank
Property / cites work: Aggregate codifferential method for nonsmooth DC optimization / rank: Normal rank
Property / cites work: Q4261789 / rank: Normal rank
Property / cites work: Reinforcement learning algorithms with function approximation: recent advances and applications / rank: Normal rank
Property / Wikidata QID: Q129400449 / rank: Normal rank

Latest revision as of 23:35, 3 October 2024

scientific article
Language: English
Label: A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
Description: scientific article
Also known as: (none)

    Statements

    A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (English)
    9 May 2019
    This paper studies difference of convex functions (DC) programming and applies the DC algorithm (DCA) to reinforcement learning. The objective is to estimate an optimal learning policy in the Markov decision process (MDP) model. The authors solve the problem by finding a zero of the empirical optimal Bellman residual (OBR) via linear approximation. This is done by a unified approach based on DC programming and DC algorithms. The main contributions are as follows: 1) to develop attractive and efficient DC algorithms based on minimisation of the $l_p$-norm of the empirical OBR; 2) to propose DCA with successive DC decomposition for the squared $l_2$-norm of the empirical OBR; 3) to propose a new formulation of the OBR without using the $l_p$-norm. The results are illustrated by numerical examples.
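As a rough illustration of the DCA scheme the review refers to, the sketch below runs the generic two-step DCA iteration (take a subgradient y_k of h at x_k, then minimise g(x) - y_k * x) on a toy one-dimensional DC function f(x) = x^4 - 2x^2 with g(x) = x^4 and h(x) = 2x^2. This is not the paper's OBR objective or any of its algorithms; the toy function and the names `dca`, `grad_h` and `argmin_g_linear` are assumptions made purely for illustration.

```python
import math


def dca(grad_h, argmin_g_linear, x0, tol=1e-10, max_iter=1000):
    """Generic DCA loop for f = g - h with g and h convex (1-D sketch).

    grad_h(x)          -> a (sub)gradient of h at x
    argmin_g_linear(y) -> a minimiser of the convex function g(x) - y * x
    """
    x = x0
    for k in range(max_iter):
        y = grad_h(x)               # step 1: linearise h at the current iterate
        x_new = argmin_g_linear(y)  # step 2: solve the convex subproblem
        if abs(x_new - x) <= tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter


# Toy DC function (an assumption, not taken from the paper):
# f(x) = g(x) - h(x) with g(x) = x^4 and h(x) = 2 x^2.
def f(x):
    return x ** 4 - 2.0 * x ** 2


def grad_h(x):
    return 4.0 * x


def argmin_g_linear(y):
    # Minimising x^4 - y * x gives 4 x^3 = y, i.e. x = cbrt(y / 4).
    return math.copysign(abs(y / 4.0) ** (1.0 / 3.0), y)


x_pos, iters_pos = dca(grad_h, argmin_g_linear, x0=0.1)
x_neg, iters_neg = dca(grad_h, argmin_g_linear, x0=-0.1)
x_zero, iters_zero = dca(grad_h, argmin_g_linear, x0=0.0)
```

Started from 0.1 or -0.1 the iterates approach the minimisers x = 1 and x = -1, while a start at 0 stays at 0, which is a critical point but not a minimiser. This reflects the general fact that DCA is a local method whose limit depends on the starting point and on the chosen DC decomposition.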
    batch reinforcement learning
    Markov decision process
    DC programming
    DCA
    optimal Bellman residual

    Identifiers