Simple statistical gradient-following algorithms for connectionist reinforcement learning
From MaRDI portal
(Redirected from Publication:1812928)
Cites work
- scientific article; zbMATH DE number 4066707
- scientific article; zbMATH DE number 3657150
- scientific article; zbMATH DE number 3551675
- A new approach to the design of reinforcement schemes for learning automata
- An N-player sequential stochastic game with identical payoffs
- Associative search network: A reinforcement learning associative memory
- Decentralized learning in finite Markov chains
- Pattern-recognizing stochastic learning automata
Cited in
- Approximate Bayesian model inversion for PDEs with heterogeneous and state-dependent coefficients
- Deep reinforcement learning for option pricing and hedging under dynamic expectile risk measures
- Neural architecture search: a survey
- Automated Deep Learning: Neural Architecture Search Is Not the End
- Accelerating actor-critic-based algorithms via pseudo-labels derived from prior knowledge
- Efficient multi-objective neural architecture search framework via policy gradient algorithm
- Leveraging randomized smoothing for optimal control of nonsmooth dynamical systems
- Differentiable particle filters with smoothly jittered resampling
- Development of a machine learning-based design optimization method for crashworthiness analysis
- Pattern-recognizing stochastic learning automata
- Branes with brains: exploring string vacua with deep reinforcement learning
- Ancestral Gumbel-top-\(k\) sampling for sampling without replacement
- TD-regularized actor-critic methods
- Geometry and convergence of natural policy gradient methods
- Supervised Visual Attention for Simultaneous Multimodal Machine Translation
- A differential Hebbian framework for biologically-plausible motor control
- Knowledge graph embedding with shared latent semantic units
- Variational actor-critic algorithms
- The factored policy-gradient planner
- Estimation and approximation bounds for gradient-based reinforcement learning
- Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization
- Robust flow control and optimal sensor placement using deep reinforcement learning
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- Learning flexible sensori-motor mappings in a complex network
- Novelty detection improves performance of reinforcement learners in fluctuating, partially observable environments
- Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm
- Heavy-tails and randomized restarting beam search in goal-oriented neural sequence decoding
- Estimation of distributions involving unobservable events: the case of optimal search with unknown target distributions
- Recent advances in reinforcement learning in finance
- A novel online gait optimization approach for biped robots with point-feet
- Adaptive learning via selectionism and Bayesianism. I: Connection between the two
- Node perturbation learning without noiseless baseline
- Semi-discrete optimization through semi-discrete optimal transport: a framework for neural architecture search
- Learn and route: learning implicit preferences for vehicle routing
- Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function
- Search-engine-augmented dialogue response generation with cheaply supervised query production
- scientific article; zbMATH DE number 7306857
- Enhance load forecastability: optimize data sampling policy by reinforcing user behaviors
- Smoothed functional-based gradient algorithms for off-policy reinforcement learning: a non-asymptotic viewpoint
- Convergence of entropy-regularized natural policy gradient with linear function approximation
- Reconstruction of incomplete wildfire data using deep generative models
- Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
- Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning
- A reward-maximizing spiking neuron as a bounded rational decision maker
- Mining gold from implicit models to improve likelihood-free inference
- A tutorial survey of reinforcement learning
- A two-step algorithm for learning from unspecific reinforcement
- Nonconvex policy search using variational inequalities
- Connecting stochastic optimal control and reinforcement learning
- Deep learning in computational mechanics: a review
- Approximate Newton Policy Gradient Algorithms
- Employing reinforcement learning to enhance particle swarm optimization methods
- Autonomous vehicle navigation using evolutionary reinforcement learning
- Deep Reinforcement Learning: A State-of-the-Art Walkthrough
- Solving non-permutation flow-shop scheduling problem via a novel deep reinforcement learning approach
- Adaptive playouts for online learning of policies during Monte Carlo tree search
- Premium control with reinforcement learning
- Active inference and agency: optimal control without cost functions
- Solving the traveling salesperson problem with precedence constraints by deep reinforcement learning
- Zeroth-order optimization with orthogonal random directions
- A study of mechanisms for improving robotic group performance
- Artificial intelligence for games
- Reinforcement learning
- HiAM: a hierarchical attention based model for knowledge graph multi-hop reasoning
- Analysis and improvement of policy gradient estimation
- Machine Learning: ECML 2004
- STDP-compatible approximation of backpropagation in an energy-based model
- Stochastic learning approach for binary optimization: application to Bayesian optimal design of experiments
- Deep reinforcement learning for the optimal placement of cryptocurrency limit orders
- From Reinforcement Learning to Deep Reinforcement Learning: An Overview
- A review on deep reinforcement learning for fluid mechanics
- Natural actor-critic algorithms
- scientific article; zbMATH DE number 67800
- Continuous action set learning automata for stochastic optimization
- Stochastic dynamics of reinforcement learning
- Optimal node perturbation in linear perceptrons with uncertain eligibility trace
- Non-parametric policy search with limited information loss
- scientific article; zbMATH DE number 7370615
- Autonomous reinforcement learning with experience replay
- scientific article; zbMATH DE number 7307467
- Attention-based exploitation and exploration strategy for multi-hop knowledge graph reasoning
- Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation
- Immediate return preference emerged from a synaptic learning rule for return maximization
- Synaptic dynamics: linear model and adaptation algorithm
- Learning to attend: modeling the shaping of selectivity in infero-temporal cortex in a categorization task
- Posterior weighted reinforcement learning with state uncertainty
- Reinforcement learning for combinatorial optimization: a survey
- scientific article; zbMATH DE number 7453114
- Set-to-Sequence Methods in Machine Learning: A Review
- An Introduction to Neural Data Compression
- Greedy attack and Gumbel attack: generating adversarial examples for discrete data
- NeVAE: a deep generative model for molecular graphs
- Softmax policy gradient methods can take exponential time to converge
- Reinforcement learning for a biped robot based on a CPG-actor-critic method
- Learning to compute the metric dimension of graphs
- High generalization performance structured self-attention model for knapsack problem
- Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
- Fast global convergence of natural policy gradient methods with entropy regularization
- A learning framework for winner-take-all networks with stochastic synapses
- Multi-agent reinforcement learning aided sampling algorithms for a class of multiscale inverse problems