Learning Algorithms for Markov Decision Processes with Average Cost


Publication:2753225

DOI: 10.1137/S0363012999361974
zbMath: 1001.93091
OpenAlex: W2154204727
MaRDI QID: Q2753225

Dimitri P. Bertsekas, Vivek S. Borkar, Jinane Abounadi

Publication date: 29 October 2001

Published in: SIAM Journal on Control and Optimization

Full work available at URL: https://doi.org/10.1137/s0363012999361974




Related Items (26)

Multiscale Q-learning with linear function approximation
A sojourn-based approach to semi-Markov reinforcement learning
Deep reinforcement learning for wireless sensor scheduling in cyber-physical systems
Risk-Sensitive Reinforcement Learning via Policy Gradient Search
A framework for transforming specifications in reinforcement learning
Optimal sensor scheduling for remote state estimation with limited bandwidth: a deep reinforcement learning approach
Stochastic Fixed-Point Iterations for Nonexpansive Maps: Convergence and Error Bounds
Approachability in Stackelberg stochastic games with vector costs
Analyzing anonymity attacks through noisy channels
Solutions of the average cost optimality equation for Markov decision processes with weakly continuous kernel: the fixed-point approach revisited
Q-learning for Markov decision processes with a satisfiability criterion
Variance-penalized Markov decision processes: dynamic programming and reinforcement learning techniques
Learning dynamic prices in electronic retail markets with customer segmentation
Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation
Opportunistic Transmission over Randomly Varying Channels
Empirical Dynamic Programming
Fitted Q-iteration by functional networks for control problems
A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration
Natural actor-critic algorithms
Dynamic pricing models for electronic business
Fundamental design principles for reinforcement learning algorithms
Empirical Q-Value Iteration
Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
Batch policy learning in average reward Markov decision processes
Whittle index based Q-learning for restless bandits with average reward






