| Publication | Date of Publication | Type |
|---|---|---|
| Sample-based planning and learning with function approximation. *Statistical Science* | 2026-02-10 | Paper |
| Exponential lower bounds for planning in MDPs with linearly-realizable optimal action-value functions | 2025-02-11 | Paper |
| Cleaning up the neighborhood: a full classification for adversarial partial monitoring | 2025-01-31 | Paper |
| An exponential Efron-Stein inequality for L_q stable learning rules | 2025-01-31 | Paper |
| Optimistic MLE: a generic model-based algorithm for partially observable sequential decision making | 2024-05-08 | Paper |
| scientific article; zbMATH DE number 7626742 (available as arXiv preprint) | 2022-12-06 | Paper |
| scientific article; zbMATH DE number 7626742 | 2022-12-06 | Paper |
| Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers (available as arXiv preprint) | 2020-10-05 | Paper |
| Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers | 2020-10-05 | Paper |
| Bandit algorithms | 2020-05-11 | Paper |
| A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, variance reduction, and variational bounds. *Theoretical Computer Science* | 2020-01-29 | Paper |
| Mixing time estimation in reversible Markov chains from a single sample path. *The Annals of Applied Probability* | 2019-10-22 | Paper |
| A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds | 2019-01-10 | Paper |
| A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds (available as arXiv preprint) | 2019-01-10 | Paper |
| Stochastic Optimization in a Cumulative Prospect Theory Framework. *IEEE Transactions on Automatic Control* | 2018-09-18 | Paper |
| A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. *IEEE Transactions on Automatic Control* | 2018-06-27 | Paper |
| Following the leader and fast rates in online linear prediction: curved constraint sets and other regularities | 2018-04-17 | Paper |
| Following the leader and fast rates in online linear prediction: curved constraint sets and other regularities (available as arXiv preprint) | 2018-04-17 | Paper |
| Online Markov Decision Processes Under Bandit Feedback. *IEEE Transactions on Automatic Control* | 2017-05-16 | Paper |
| Regularized policy iteration with nonparametric function spaces. *Journal of Machine Learning Research (JMLR)* | 2016-11-22 | Paper |
| Partial monitoring -- classification, regret bounds, and algorithms. *Mathematics of Operations Research* | 2015-04-24 | Paper |
| On Learning the Optimal Waiting Time. *Lecture Notes in Computer Science* | 2015-01-14 | Paper |
| Alignment based kernel learning with a continuous set of base kernels. *Machine Learning* | 2014-08-20 | Paper |
| \(X\)-armed bandits | 2014-02-03 | Paper |
| Toward a classification of finite partial-monitoring games. *Theoretical Computer Science* | 2013-03-04 | Paper |
| Partial monitoring with side information. *Lecture Notes in Computer Science* | 2012-10-16 | Paper |
| Model selection in reinforcement learning. *Machine Learning* | 2012-05-08 | Paper |
| Regularized least-squares regression: learning from a sequence. *Journal of Statistical Planning and Inference* | 2011-11-10 | Paper |
| Finite-time bounds for fitted value iteration | 2011-11-08 | Paper |
| Training parsers by inverse reinforcement learning. *Machine Learning* | 2010-10-07 | Paper |
| Toward a classification of finite partial-monitoring games. *Lecture Notes in Computer Science* | 2010-10-01 | Paper |
| Algorithms for reinforcement learning. *Synthesis Lectures on Artificial Intelligence and Machine Learning* | 2010-09-10 | Paper |
| Active learning in heteroscedastic noise. *Theoretical Computer Science* | 2010-07-07 | Paper |
| Models of active learning in group-structured state spaces. *Information and Computation* | 2010-04-08 | Paper |
| Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. *Theoretical Computer Science* | 2009-05-12 | Paper |
| Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path. *Machine Learning* | 2009-03-31 | Paper |
| Active Learning in Multi-armed Bandits. *Lecture Notes in Computer Science* | 2008-10-14 | Paper |
| Active Learning of Group-Structured Environments. *Lecture Notes in Computer Science* | 2008-10-14 | Paper |
| Tuning Bandit Algorithms in Stochastic Environments. *Lecture Notes in Computer Science* | 2008-08-19 | Paper |
| Machine Learning: ECML 2004. *Lecture Notes in Computer Science* | 2008-03-14 | Paper |
| Improved Rates for the Stochastic Continuum-Armed Bandit Problem. *Learning Theory* | 2008-01-03 | Paper |
| Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path. *Learning Theory* | 2007-09-14 | Paper |
| Computer Vision - ECCV 2004. *Lecture Notes in Computer Science* | 2005-12-27 | Paper |
| Efficient approximate planning in continuous space Markovian decision problems. *AI Communications* | 2002-05-02 | Paper |
| An asynchronous stochastic approximation theorem and some applications. *Alkalmazott Matematikai Lapok. A Magyar Tudomanyos Akademia. Matematikai es Fizikai Tudomanyok Osztalyanak Közlemenyei* | 2001-04-01 | Paper |
| scientific article; zbMATH DE number 1528670 | 2000-11-13 | Paper |
| Convergence results for single-step on-policy reinforcement-learning algorithms. *Machine Learning* | 2000-06-21 | Paper |
| scientific article; zbMATH DE number 1336305 | 1999-09-14 | Paper |
| Module-based reinforcement learning: Experiments with a real robot. *Machine Learning* | 1998-10-13 | Paper |
| An integrated architecture for motion-control and path-planning | 1998-06-08 | Paper |
| Robust control using inverse dynamics neurocontrollers. *Nonlinear Analysis: Theory, Methods & Applications* | 1998-05-11 | Paper |
| Approximate geometry representations and sensory fusion. *Neurocomputing* | 1997-03-31 | Paper |