Pages that link to "Item:Q1812928"
The following pages link to Simple statistical gradient-following algorithms for connectionist reinforcement learning (Q1812928):
Displaying 50 items.
- Adaptive learning via selectionism and Bayesianism. I: Connection between the two (Q280319)
- Adaptive playouts for online learning of policies during Monte Carlo tree search (Q307776)
- A stochastic policy search model for matching behavior (Q350884)
- Active inference and agency: optimal control without cost functions (Q353847)
- Policy search for motor primitives in robotics (Q413874)
- Analysis and improvement of policy gradient estimation (Q448295)
- Autonomous reinforcement learning with experience replay (Q461126)
- Node perturbation learning without noiseless baseline (Q553265)
- Estimation of distributions involving unobservable events: the case of optimal search with unknown target distributions (Q710617)
- The factored policy-gradient planner (Q835832)
- Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning (Q867508)
- Synaptic dynamics: linear model and adaptation algorithm (Q889273)
- Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation (Q889297)
- Immediate return preference emerged from a synaptic learning rule for return maximization (Q889365)
- Reinforcement learning in the brain (Q1042310)
- Natural actor-critic algorithms (Q1049136)
- Autonomous vehicle navigation using evolutionary reinforcement learning (Q1296036)
- Estimation and approximation bounds for gradient-based reinforcement learning (Q1604222)
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603)
- Optimal node perturbation in linear perceptrons with uncertain eligibility trace (Q1784544)
- Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590)
- Continuous action set learning automata for stochastic optimization (Q1898166)
- Two forms of immediate reward reinforcement learning for exploratory data analysis (Q1932033)
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130)
- A projected primal-dual gradient optimal control method for deep reinforcement learning (Q1980960)
- Novelty detection improves performance of reinforcement learners in fluctuating, partially observable environments (Q2001717)
- Importance sampling in reinforcement learning with an estimated behavior policy (Q2051319)
- HNS: hierarchical negative sampling for network representation learning (Q2053869)
- Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation (Q2070010)
- Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems (Q2071401)
- Deep reinforcement learning for inventory control: a roadmap (Q2076812)
- Risk-averse policy optimization via risk-neutral policy optimization (Q2082514)
- Measurement error models: from nonparametric methods to deep neural networks (Q2092892)
- Neural large neighborhood search for routing problems (Q2093389)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
- Opportunities for reinforcement learning in stochastic dynamic vehicle routing (Q2108171)
- Heavy-tails and randomized restarting beam search in goal-oriented neural sequence decoding (Q2117205)
- Semi-discrete optimization through semi-discrete optimal transport: a framework for neural architecture search (Q2121586)
- Learning the travelling salesperson problem requires rethinking generalization (Q2152276)
- Learning to compute the metric dimension of graphs (Q2161840)
- Hybrid offline/online optimization for energy management via reinforcement learning (Q2170215)
- Approximate Bayesian model inversion for PDEs with heterogeneous and state-dependent coefficients (Q2222341)
- Deep reinforcement learning for the optimal placement of cryptocurrency limit orders (Q2242354)
- Enhance load forecastability: optimize data sampling policy by reinforcing user behaviors (Q2242379)
- Smoothed functional-based gradient algorithms for off-policy reinforcement learning: a non-asymptotic viewpoint (Q2242923)
- A review on deep reinforcement learning for fluid mechanics (Q2245392)
- Model-based reinforcement learning with dimension reduction (Q2281680)
- Branes with brains: exploring string vacua with deep reinforcement learning (Q2314876)
- Compatible natural gradient policy search (Q2320577)
- TD-regularized actor-critic methods (Q2320580)