Pages that link to "Item:Q1812928"
The following pages link to Simple statistical gradient-following algorithms for connectionist reinforcement learning (Q1812928):
Displaying 50 items.
- Adaptive learning via selectionism and Bayesianism. I: Connection between the two (Q280319)
- Adaptive playouts for online learning of policies during Monte Carlo tree search (Q307776)
- A stochastic policy search model for matching behavior (Q350884)
- Active inference and agency: optimal control without cost functions (Q353847)
- Policy search for motor primitives in robotics (Q413874)
- Analysis and improvement of policy gradient estimation (Q448295)
- Autonomous reinforcement learning with experience replay (Q461126)
- Node perturbation learning without noiseless baseline (Q553265)
- Estimation of distributions involving unobservable events: the case of optimal search with unknown target distributions (Q710617)
- The factored policy-gradient planner (Q835832)
- Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning (Q867508)
- Synaptic dynamics: linear model and adaptation algorithm (Q889273)
- Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation (Q889297)
- Immediate return preference emerged from a synaptic learning rule for return maximization (Q889365)
- Reinforcement learning in the brain (Q1042310)
- Natural actor-critic algorithms (Q1049136)
- Autonomous vehicle navigation using evolutionary reinforcement learning (Q1296036)
- Estimation and approximation bounds for gradient-based reinforcement learning (Q1604222)
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603)
- Optimal node perturbation in linear perceptrons with uncertain eligibility trace (Q1784544)
- Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590)
- Continuous action set learning automata for stochastic optimization (Q1898166)
- Two forms of immediate reward reinforcement learning for exploratory data analysis (Q1932033)
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130)
- A projected primal-dual gradient optimal control method for deep reinforcement learning (Q1980960)
- Novelty detection improves performance of reinforcement learners in fluctuating, partially observable environments (Q2001717)
- Importance sampling in reinforcement learning with an estimated behavior policy (Q2051319)
- HNS: hierarchical negative sampling for network representation learning (Q2053869)
- Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation (Q2070010)
- Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems (Q2071401)
- Deep reinforcement learning for inventory control: a roadmap (Q2076812)
- Risk-averse policy optimization via risk-neutral policy optimization (Q2082514)
- Measurement error models: from nonparametric methods to deep neural networks (Q2092892)
- Neural large neighborhood search for routing problems (Q2093389)
- Multi-agent reinforcement learning: a selective overview of theories and algorithms (Q2094040)
- Opportunities for reinforcement learning in stochastic dynamic vehicle routing (Q2108171)
- Heavy-tails and randomized restarting beam search in goal-oriented neural sequence decoding (Q2117205)
- Semi-discrete optimization through semi-discrete optimal transport: a framework for neural architecture search (Q2121586)
- Learning the travelling salesperson problem requires rethinking generalization (Q2152276)
- Learning to compute the metric dimension of graphs (Q2161840)
- Hybrid offline/online optimization for energy management via reinforcement learning (Q2170215)
- Approximate Bayesian model inversion for PDEs with heterogeneous and state-dependent coefficients (Q2222341)
- Deep reinforcement learning for the optimal placement of cryptocurrency limit orders (Q2242354)
- Enhance load forecastability: optimize data sampling policy by reinforcing user behaviors (Q2242379)
- Smoothed functional-based gradient algorithms for off-policy reinforcement learning: a non-asymptotic viewpoint (Q2242923)
- A review on deep reinforcement learning for fluid mechanics (Q2245392)
- Model-based reinforcement learning with dimension reduction (Q2281680)
- Branes with brains: exploring string vacua with deep reinforcement learning (Q2314876)
- Compatible natural gradient policy search (Q2320577)
- TD-regularized actor-critic methods (Q2320580)