Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm (Q300040): Difference between revisions

From MaRDI portal
Property / DOI
 
Property / DOI: 10.1016/j.ejor.2014.08.023 / rank
Normal rank
 
Property / author
 
Property / author: Edilson F. Arruda / rank
Normal rank
 
Property / author
 
Property / author: Marcelo Dutra Fragoso / rank
Normal rank
 
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 90C40 / rank
 
Normal rank
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 90C39 / rank
 
Normal rank
Property / zbMATH DE Number
 
Property / zbMATH DE Number: 6597117 / rank
 
Normal rank
Property / zbMATH Keywords
 
Property / zbMATH Keywords: dynamic programming / rank
 
Normal rank
Property / zbMATH Keywords
 
Property / zbMATH Keywords: Markov decision processes / rank
 
Normal rank
Property / zbMATH Keywords
 
Property / zbMATH Keywords: embedding / rank
 
Normal rank
Property / zbMATH Keywords
 
Property / zbMATH Keywords: time aggregation / rank
 
Normal rank
Property / zbMATH Keywords
 
Property / zbMATH Keywords: stochastic optimal control / rank
 
Normal rank
Property / describes a project that uses
 
Property / describes a project that uses: FODD-Planner / rank
 
Normal rank
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / full work available at URL
 
Property / full work available at URL: https://doi.org/10.1016/j.ejor.2014.08.023 / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2072370473 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Time aggregated Markov decision processes via standard dynamic programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate dynamic programming via direct search in the space of value function approximations / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stability and optimality of a multi-product production and storage system under demand uncertainty / rank
 
Normal rank
Property / cites work
 
Property / cites work: Accelerating the convergence of value iteration by using partial transition functions / rank
 
Normal rank
Property / cites work
 
Property / cites work: A New Value Iteration method for the Average Cost Dynamic Programming Problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2925454 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4257216 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4256521 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate dynamic programming with a fuzzy parameterization / rank
 
Normal rank
Property / cites work
 
Property / cites work: A time aggregation approach to Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based algorithms for Markov decision processes. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Sufficient Classes of Strategies in Discrete Dynamic Programming I: Decomposition of Randomized Strategies and Embedded Models / rank
 
Normal rank
Property / cites work
 
Property / cites work: LAO*: A heuristic search algorithm that finds solutions with loops / rank
 
Normal rank
Property / cites work
 
Property / cites work: Probabilistic Relational Planning with First Order Decision Diagrams / rank
 
Normal rank
Property / cites work
 
Property / cites work: Exact finite approximations of average-cost countable Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Reducing reinforcement learning to KWIK online regression / rank
 
Normal rank
Property / cites work
 
Property / cites work: Kernel-based reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Distributed Actor-Critic Algorithm and Applications to Mobile Sensor Network Coordination Problems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4315289 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Markov decision Processes with fractional costs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Incremental Value Iteration for Time-Aggregated Markov-Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: An analysis of temporal-difference learning with function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Lebesgue-Sampling-Based Optimal Control Problems With Time Aggregation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Performance gradient estimation for the very large finite Markov chains / rank
 
Normal rank
Property / DOI
 
Property / DOI: 10.1016/J.EJOR.2014.08.023 / rank
 
Normal rank
 

Latest revision as of 13:49, 9 December 2024

Language: English
Label: Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
Description: scientific article

    Statements

    Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm (English)
    23 June 2016
    dynamic programming
    Markov decision processes
    embedding
    time aggregation
    stochastic optimal control

    Identifiers