Pages that link to "Item:Q2504518"
The following pages link to An actor-critic algorithm for constrained Markov decision processes (Q2504518):
Displaying 15 items.
- An online actor-critic algorithm with function approximation for constrained Markov decision processes (Q438776)
- Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization (Q523576)
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes (Q616967)
- A new learning algorithm for optimal stopping (Q839001)
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603)
- Approachability in Stackelberg stochastic games with vector costs (Q1707454)
- Delay-aware online service scheduling in high-speed railway communication systems (Q1717936)
- Whittle index based Q-learning for restless bandits with average reward (Q2116660)
- A note on linear function approximation using random projections (Q2519761)
- Opportunistic Transmission over Randomly Varying Channels (Q3616977)
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria (Q4558492)
- Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation (Q5009779)
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search (Q5102286)
- Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation (Q5198538)
- Safety-constrained reinforcement learning with a distributional safety critic (Q6106435)