Reinforcement learning
Publication:6602227
DOI: 10.1007/978-3-030-06164-7_12
zbMATH Open: 1547.68648
MaRDI QID: Q6602227
FDO: Q6602227
Authors: Olivier Buffet, Olivier Pietquin, Paul Weng
Publication date: 11 September 2024
Cites Work
- Coherent measures of risk
- A tutorial on the cross-entropy method
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- The Linear Programming Approach to Approximate Dynamic Programming
- DOI: 10.1162/1532443041827907
- Polynomial Approximation--A New Computational Technique in Dynamic Programming: Allocation Processes
- Functional Approximations and Dynamic Programming
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
- A survey of multi-objective sequential decision-making
- Risk measurement with equivalent utility principles
- Title not available
- The logic of adaptive behavior. Knowledge representation and algorithms for adaptive sequential decision making under uncertainty in first-order and relational domains.
- Transfer learning for reinforcement learning domains: a survey
- Title not available
- Title not available
- Innovations in multi-agent systems and applications -- 1.
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
- Stochastic dynamic programming with factored representations
- The \(K\)-armed dueling bandits problem
- Linear least-squares algorithms for temporal difference learning
- Natural evolution strategies
- Title not available
- Risk-Constrained Markov Decision Processes
- Interactive policy learning through confidence-based autonomy
- Variance-penalized Markov decision processes: dynamic programming and reinforcement learning techniques
- Training parsers by inverse reinforcement learning
- Risk-sensitive reinforcement learning
- Bayesian reinforcement learning: a survey
- Inverse reinforcement learning in partially observable environments
- Title not available
- Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm
- Neuroevolution strategies for episodic reinforcement learning
- Kalman temporal differences