| Publication | Date of Publication | Type |
|---|---|---|
| Optimistic MLE: a generic model-based algorithm for partially observable sequential decision making | 2024-05-08 | Paper |
| https://portal.mardi4nfdi.de/entity/Q5053235 | 2022-12-06 | Paper |
| Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers | 2020-10-05 | Paper |
| Bandit algorithms | 2020-05-11 | Paper |
| A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, variance reduction, and variational bounds | 2020-01-29 | Paper |
| Mixing time estimation in reversible Markov chains from a single sample path | 2019-10-22 | Paper |
| A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds | 2019-01-10 | Paper |
| Stochastic Optimization in a Cumulative Prospect Theory Framework | 2018-09-18 | Paper |
| A Linearly Relaxed Approximate Linear Program for Markov Decision Processes | 2018-06-27 | Paper |
| Following the leader and fast rates in online linear prediction: curved constraint sets and other regularities | 2018-04-17 | Paper |
| Online Markov Decision Processes Under Bandit Feedback | 2017-05-16 | Paper |
| Regularized policy iteration with nonparametric function spaces | 2016-11-22 | Paper |
| Partial monitoring -- classification, regret bounds, and algorithms | 2015-04-24 | Paper |
| On Learning the Optimal Waiting Time | 2015-01-14 | Paper |
| Alignment based kernel learning with a continuous set of base kernels | 2014-08-20 | Paper |
| \(X\)-armed bandits | 2014-02-03 | Paper |
| Toward a classification of finite partial-monitoring games | 2013-03-04 | Paper |
| Partial monitoring with side information | 2012-10-16 | Paper |
| Model selection in reinforcement learning | 2012-05-08 | Paper |
| Regularized least-squares regression: learning from a sequence | 2011-11-10 | Paper |
| Finite-time bounds for fitted value iteration | 2011-11-08 | Paper |
| Training parsers by inverse reinforcement learning | 2010-10-07 | Paper |
| Toward a classification of finite partial-monitoring games | 2010-10-01 | Paper |
| Algorithms for reinforcement learning | 2010-09-10 | Paper |
| Active learning in heteroscedastic noise | 2010-07-07 | Paper |
| Models of active learning in group-structured state spaces | 2010-04-08 | Paper |
| Exploration-exploitation tradeoff using variance estimates in multi-armed bandits | 2009-05-12 | Paper |
| Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path | 2009-03-31 | Paper |
| Active Learning in Multi-armed Bandits | 2008-10-14 | Paper |
| Active Learning of Group-Structured Environments | 2008-10-14 | Paper |
| Tuning Bandit Algorithms in Stochastic Environments | 2008-08-19 | Paper |
| Machine Learning: ECML 2004 | 2008-03-14 | Paper |
| Improved Rates for the Stochastic Continuum-Armed Bandit Problem | 2008-01-03 | Paper |
| Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path | 2007-09-14 | Paper |
| Computer Vision - ECCV 2004 | 2005-12-27 | Paper |
| Efficient approximate planning in continuous space Markovian decision problems | 2002-05-02 | Paper |
| An asynchronous stochastic approximation theorem and some applications | 2001-04-01 | Paper |
| https://portal.mardi4nfdi.de/entity/Q4515253 | 2000-11-13 | Paper |
| Convergence results for single-step on-policy reinforcement-learning algorithms | 2000-06-21 | Paper |
| https://portal.mardi4nfdi.de/entity/Q4258651 | 1999-09-14 | Paper |
| Module-based reinforcement learning: Experiments with a real robot | 1998-10-13 | Paper |
| An integrated architecture for motion-control and path-planning | 1998-06-08 | Paper |
| Robust control using inverse dynamics neurocontrollers | 1998-05-11 | Paper |
| Approximate geometry representations and sensory fusion | 1997-03-31 | Paper |