Pages that link to "Item:Q4443033"

From MaRDI portal

The following pages link to On Actor-Critic Algorithms (Q4443033):

Displaying 50 items.

A constrained optimization perspective on actor-critic algorithms and application to network routing (Q286519) (← links)
Multiscale Q-learning with linear function approximation (Q312650) (← links)
Efficient model-based reinforcement learning for approximate online optimal control (Q340682) (← links)
Dynamic treatment regimes: technical challenges and applications (Q405345) (← links)
An online actor-critic algorithm with function approximation for constrained Markov decision processes (Q438776) (← links)
Stabilization of stochastic approximation by step size adaptation (Q450652) (← links)
Autonomous reinforcement learning with experience replay (Q461126) (← links)
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes (Q616967) (← links)
A new learning algorithm for optimal stopping (Q839001) (← links)
An adaptive actor-critic algorithm with multi-step simulated experiences for controlling nonholonomic mobile robots (Q855223) (← links)
Actor-critic algorithms for hierarchical Markov decision processes (Q856510) (← links)
Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms (Q862224) (← links)
Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning (Q867508) (← links)
Immediate return preference emerged from a synaptic learning rule for return maximization (Q889365) (← links)
Model-based reinforcement learning for approximate optimal regulation (Q899267) (← links)
Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601) (← links)
Reinforcement learning in the brain (Q1042310) (← links)
Natural actor-critic algorithms (Q1049136) (← links)
Estimation and approximation bounds for gradient-based reinforcement learning (Q1604222) (← links)
An incremental off-policy search in a model-free Markov decision process using a single sample path (Q1621868) (← links)
Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603) (← links)
Totally model-free actor-critic recurrent neural-network reinforcement learning in non-Markovian domains (Q1699932) (← links)
Asymptotic bias of stochastic gradient search (Q1704136) (← links)
Reinforcement learning for a class of continuous-time input constrained optimal control problems (Q1716659) (← links)
Real-time reinforcement learning by sequential actor-critics and experience replay (Q1784532) (← links)
Convergence rate of linear two-time-scale stochastic approximation. (Q1879892) (← links)
Stochastic optimization for real time service capacity allocation under random service demand (Q1931638) (← links)
Finding intrinsic rewards by embodied evolution and constrained reinforcement learning (Q1932114) (← links)
Approximate stochastic annealing for online control of infinite horizon Markov decision processes (Q1937498) (← links)
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130) (← links)
Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains (Q1959632) (← links)
Sell or store? An ADP approach to marketing renewable energy (Q2011830) (← links)
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling (Q2051259) (← links)
Fundamental design principles for reinforcement learning algorithms (Q2094028) (← links)
Mixed density methods for approximate dynamic programming (Q2094030) (← links)
Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040) (← links)
Neural circuits for learning context-dependent associations of stimuli (Q2182880) (← links)
Weak convergence of dynamical systems in two timescales (Q2203452) (← links)
Control strategy of speed servo systems based on deep reinforcement learning (Q2283851) (← links)
TD-regularized actor-critic methods (Q2320580) (← links)
Performance optimization for a class of generalized stochastic Petri nets (Q2348377) (← links)
Reinforcement learning for a biped robot based on a CPG-actor-critic method (Q2383520) (← links)
A tutorial on the cross-entropy method (Q2485925) (← links)
Linear stochastic approximation driven by slowly varying Markov chains (Q2503529) (← links)
An actor-critic algorithm for constrained Markov decision processes (Q2504518) (← links)
Dynamic programming and suboptimal control: a survey from ADP to MPC (Q2511993) (← links)
Reinforcement learning based algorithms for average cost Markov decision processes (Q2643632) (← links)
Two-timescale stochastic gradient descent in continuous time with applications to joint online parameter estimation and optimal sensor placement (Q2692526) (← links)
On Generalized Bellman Equations and Temporal-Difference Learning (Q3305109) (← links)
A Spiking Neural Network Model of an Actor-Critic Learning Agent (Q3612121) (← links)

Retrieved from "https://portal.mardi4nfdi.de/wiki/Special:WhatLinksHere/Item:Q4443033"