| Publication | Date of Publication | Type |
|---|---|---|
| Reinforcement Learning, Bit by Bit | 2023-12-19 | Paper |
| Satisficing in Time-Sensitive Bandit Learning | 2023-01-09 | Paper |
| Learning to Optimize via Information-Directed Sampling | 2020-10-05 | Paper |
| https://portal.mardi4nfdi.de/entity/Q5214215 | 2020-02-07 | Paper |
| A Tutorial on Thompson Sampling | 2018-11-23 | Paper |
| Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization | 2017-09-22 | Paper |
| Convergence of Min-Sum Message Passing for Quadratic Optimization | 2017-08-08 | Paper |
| Convergence of Min-Sum Message-Passing for Convex Optimization | 2017-07-27 | Paper |
| Universal Reinforcement Learning | 2017-07-27 | Paper |
| Gaussian-Dirichlet Posterior Dominance in Sequential Learning | 2017-02-14 | Paper |
| An information-theoretic analysis of Thompson sampling | 2016-06-06 | Paper |
| Adaptive execution: exploration and learning of price impact | 2016-03-22 | Paper |
| Learning to Optimize via Posterior Sampling | 2015-04-24 | Paper |
| Directed Principal Component Analysis | 2014-11-26 | Paper |
| Learning a factor model via regularized PCA | 2014-08-20 | Paper |
| Resource allocation via message passing | 2012-07-28 | Paper |
| Manipulation Robustness of Collaborative Filtering | 2012-02-27 | Paper |
| Dynamic pricing with a prior on market response | 2011-11-24 | Paper |
| Investment and market structure in industries with congestion | 2011-11-24 | Paper |
| Computational methods for oblivious equilibrium | 2011-11-17 | Paper |
| Industry dynamics: foundations for models with an infinite number of firms | 2011-10-28 | Paper |
| Control of diffusions via linear programming | 2011-05-31 | Paper |
| On regression-based stopping times | 2010-10-15 | Paper |
| A short proof of optimality for the MIN cache replacement algorithm | 2010-01-29 | Paper |
| A Nonparametric Approach to Multiproduct Pricing | 2009-08-13 | Paper |
| The Linear Programming Approach to Approximate Dynamic Programming | 2009-07-09 | Paper |
| Capacity of the Trapdoor Channel With Feedback | 2009-02-24 | Paper |
| Consensus Propagation | 2008-12-21 | Paper |
| Markov Perfect Industry Dynamics With Many Firms | 2008-12-15 | Paper |
| Performance Loss Bounds for Approximate Value Iteration with State Aggregation | 2008-05-27 | Paper |
| A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees | 2008-05-27 | Paper |
| Strategic Execution in the Presence of an Uninformed Arbitrageur | 2008-01-18 | Paper |
| An approximate dynamic programming approach to decentralized control of stochastic systems | 2007-09-03 | Paper |
| Convergence of the Min-Sum Algorithm for Convex Optimization | 2007-05-29 | Paper |
| A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning | 2007-01-18 | Paper |
| Feature-based methods for large scale dynamic programming | 2006-06-29 | Paper |
| Tetris: A study of randomized constraint sampling | 2006-04-18 | Paper |
| On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming | 2005-11-11 | Paper |
| Algorithms and Models for the Web-Graph | 2005-08-22 | Paper |
| Decentralized decision-making in a large team with local information | 2003-07-30 | Paper |
| https://portal.mardi4nfdi.de/entity/Q4547446 | 2002-08-21 | Paper |
| An analysis of belief propagation on the turbo decoding graph with Gaussian densities | 2002-08-04 | Paper |
| On average versus discounted reward temporal-difference learning | 2002-07-08 | Paper |
| On the existence of fixed points for approximate value iteration and temporal-difference learning | 2001-02-19 | Paper |
| Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives | 2000-10-17 | Paper |
| Average cost temporal-difference learning | 2000-02-28 | Paper |
| An analysis of temporal-difference learning with function approximation | 1999-05-06 | Paper |
| Feature-based methods for large scale dynamic programming | 1996-04-21 | Paper |