The following pages link to Natural actor-critic algorithms (Q1049136):
Displaying 39 items.
- A constrained optimization perspective on actor-critic algorithms and application to network routing (Q286519)
- Multiscale Q-learning with linear function approximation (Q312650)
- An online actor-critic algorithm with function approximation for constrained Markov decision processes (Q438776)
- Autonomous reinforcement learning with experience replay (Q461126)
- Parameterized Markov decision process and its application to service rate control (Q492972)
- Hessian matrix distribution for Bayesian policy gradient reinforcement learning (Q545311)
- The Borkar-Meyn theorem for asynchronous stochastic approximations (Q553371)
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes (Q616967)
- The factored policy-gradient planner (Q835832)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
- Natural actor-critic algorithms (Q1049136)
- An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868)
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603)
- Learning and control of exploration primitives (Q1732626)
- Real-time reinforcement learning by sequential actor-critics and experience replay (Q1784532)
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
- A stability criterion for two timescale stochastic approximation schemes (Q2409333)
- A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning (Q2633537)
- On linear and super-linear convergence of natural policy gradient algorithm (Q2670744)
- Dynamics and risk sharing in groups of selfish individuals (Q2693210)
- Adaptive critic design with graph Laplacian for online learning control of nonlinear systems (Q2795795)
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria (Q4558492)
- (Q4614110)
- (Q4969098)
- (Q4999029)
- (Q5020557)
- Actor-Critic Method for High Dimensional Static Hamilton-Jacobi-Bellman Partial Differential Equations based on Neural Networks (Q5021407)
- Temporal concatenation for Markov decision processes (Q5051192)
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search (Q5102286)
- Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization (Q5106383)
- Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies (Q5139670)
- Deep Reinforcement Learning: A State-of-the-Art Walkthrough (Q5145831)
- Actor-Critic Algorithms with Online Feature Adaptation (Q5270681)
- Nonconvex Policy Search Using Variational Inequalities (Q5380851)
- Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning (Q6092463)
- On the sample complexity of actor-critic method for reinforcement learning with function approximation (Q6134324)
- Multi-agent natural actor-critic reinforcement learning algorithms (Q6159507)
- Reinforced mixture learning (Q6488829)