Q4637066 (Q4637066): Difference between revisions

From MaRDI portal
Added link to MaRDI item.
ReferenceBot
Changed an Item
 
(5 intermediate revisions by 2 users not shown)
Property / describes a project that uses: MuJoCo / rank: Normal rank
Property / describes a project that uses: TAMER / rank: Normal rank
Property / describes a project that uses: C4.5 / rank: Normal rank
Property / describes a project that uses: PILCO / rank: Normal rank
Property / MaRDI profile type: MaRDI publication profile / rank: Normal rank
Property / cites work: An introduction to MCMC for machine learning / rank: Normal rank
Property / cites work: Swinging up a pendulum by energy control / rank: Normal rank
Property / cites work: Q4252717 / rank: Normal rank
Property / cites work: Convergence results for the (1,\(\lambda\))-SA-ES using the theory of \(\varphi\)-irreducible Markov chains / rank: Normal rank
Property / cites work: Q4893747 / rank: Normal rank
Property / cites work: A Survey of Preference-Based Online Learning with Bandit Algorithms / rank: Normal rank
Property / cites work: Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm / rank: Normal rank
Property / cites work: Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes / rank: Normal rank
Property / cites work: Rollout sampling approximate policy iteration / rank: Normal rank
Property / cites work: Q4403756 / rank: Normal rank
Property / cites work: Preference Learning / rank: Normal rank
Property / cites work: Preference-based reinforcement learning: a formal framework and a policy iteration algorithm / rank: Normal rank
Property / cites work: Probability Inequalities for Sums of Bounded Random Variables / rank: Normal rank
Property / cites work: Label ranking by learning pairwise preferences / rank: Normal rank
Property / cites work: A Survey and Empirical Comparison of Object Ranking Methods / rank: Normal rank
Property / cites work: Model-based contextual policy search for data-efficient generalization of robot skills / rank: Normal rank
Property / cites work: 10.1162/1532443041827907 / rank: Normal rank
Property / cites work: Introduction to Information Retrieval / rank: Normal rank
Property / cites work: Machine learning and knowledge discovery in databases. European conference, ECML PKDD 2011, Athens, Greece, September 5--9, 2011. Proceedings, Part III / rank: Normal rank
Property / cites work: An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback / rank: Normal rank
Property / cites work: Q4626283 / rank: Normal rank
Property / cites work: Label Ranking Algorithms: A Survey / rank: Normal rank
Property / cites work: Q5844986 / rank: Normal rank
Property / cites work: The \(K\)-armed dueling bandits problem / rank: Normal rank
Property / cites work: Reinforcement Learning Strategies for Clinical Trials in Nonsmall Cell Lung Cancer / rank: Normal rank

Latest revision as of 11:54, 15 July 2024

scientific article; zbMATH DE number 6860841
Language: English
Label: No label defined
Description: scientific article; zbMATH DE number 6860841
Also known as:

    Statements

    17 April 2018
    reinforcement learning
    preference learning
    qualitative feedback
    Markov decision process
    policy search
    temporal difference learning
    preference-based reinforcement learning

    Identifiers