Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (Q1945130): Difference between revisions

From MaRDI portal
Added link to MaRDI item.
ReferenceBot
Changed an Item
 
(4 intermediate revisions by 4 users not shown)
Property / describes a project that uses
 
Property / describes a project that uses: WEKA / rank
 
Normal rank
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / full work available at URL
 
Property / full work available at URL: https://doi.org/10.1007/s10994-012-5313-8 / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2154023516 / rank
 
Normal rank
Property / Wikidata QID
 
Property / Wikidata QID: Q59195227 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4252717 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite-time analysis of the multiarmed bandit problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: Learning to play chess using temporal differences / rank
 
Normal rank
Property / cites work
 
Property / cites work: Temporal difference learning applied to game playing and the results of application to Shogi / rank
 
Normal rank
Property / cites work
 
Property / cites work: Natural actor-critic algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Modeling agents as qualitative decision makers / rank
 
Normal rank
Property / cites work
 
Property / cites work: Elevator group control using multiple reinforcement learning agents / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3093335 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Rollout sampling approximate policy iteration / rank
 
Normal rank
Property / cites work
 
Property / cites work: Integrating guidance into relational reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Qualitative decision theory with preference relations and comparative uncertainty: an axiomatic approach / rank
 
Normal rank
Property / cites work
 
Property / cites work: Relational reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3093188 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Qualitative decision under uncertainty: back to expected utility / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3623997 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Preference Learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Label ranking by learning pairwise preferences / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Survey and Empirical Comparison of Object Ranking Methods / rank
 
Normal rank
Property / cites work
 
Property / cites work: Policy search for motor primitives in robotics / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Actor-Critic Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5477865 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Orderings for Markov Processes on Partially Ordered Spaces / rank
 
Normal rank
Property / cites work
 
Property / cites work: Efficient prediction algorithms for binary decomposition techniques / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5305630 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2880944 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Practical issues in temporal difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Programming backgammon using self-teaching neural nets / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2896181 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Label Ranking Algorithms: A Survey / rank
 
Normal rank
Property / cites work
 
Property / cites work: \({\mathcal Q}\)-learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning / rank
 
Normal rank

Latest revision as of 08:31, 6 July 2024

scientific article
Language: English
Label: Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
Description: scientific article

    Statements

    Preference-based reinforcement learning: a formal framework and a policy iteration algorithm (English)
    2 April 2013
    reinforcement learning
    preference learning