Publication | Date of Publication | Type |
---|---|---|
Reinforcement Learning, Bit by Bit | 2023-12-19 | Paper |
Satisficing in Time-Sensitive Bandit Learning | 2023-01-09 | Paper |
Learning to Optimize via Information-Directed Sampling | 2020-10-05 | Paper |
https://portal.mardi4nfdi.de/entity/Q5214215 | 2020-02-07 | Paper |
A Tutorial on Thompson Sampling | 2018-11-23 | Paper |
Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization | 2017-09-22 | Paper |
Convergence of Min-Sum Message Passing for Quadratic Optimization | 2017-08-08 | Paper |
Universal Reinforcement Learning | 2017-07-27 | Paper |
Convergence of Min-Sum Message-Passing for Convex Optimization | 2017-07-27 | Paper |
Gaussian-Dirichlet Posterior Dominance in Sequential Learning | 2017-02-14 | Paper |
https://portal.mardi4nfdi.de/entity/Q2810878 | 2016-06-06 | Paper |
Adaptive Execution: Exploration and Learning of Price Impact | 2016-03-22 | Paper |
Learning to Optimize via Posterior Sampling | 2015-04-24 | Paper |
Directed Principal Component Analysis | 2014-11-26 | Paper |
Learning a factor model via regularized PCA | 2014-08-20 | Paper |
Resource Allocation via Message Passing | 2012-07-28 | Paper |
Manipulation Robustness of Collaborative Filtering | 2012-02-27 | Paper |
Dynamic Pricing with a Prior on Market Response | 2011-11-24 | Paper |
Investment and Market Structure in Industries with Congestion | 2011-11-24 | Paper |
Computational Methods for Oblivious Equilibrium | 2011-11-17 | Paper |
Industry dynamics: foundations for models with an infinite number of firms | 2011-10-28 | Paper |
Control of Diffusions via Linear Programming | 2011-05-31 | Paper |
On regression-based stopping times | 2010-10-15 | Paper |
A short proof of optimality for the MIN cache replacement algorithm | 2010-01-29 | Paper |
A Nonparametric Approach to Multiproduct Pricing | 2009-08-13 | Paper |
The Linear Programming Approach to Approximate Dynamic Programming | 2009-07-09 | Paper |
Capacity of the Trapdoor Channel With Feedback | 2009-02-24 | Paper |
Consensus Propagation | 2008-12-21 | Paper |
Markov Perfect Industry Dynamics With Many Firms | 2008-12-15 | Paper |
Performance Loss Bounds for Approximate Value Iteration with State Aggregation | 2008-05-27 | Paper |
A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees | 2008-05-27 | Paper |
Strategic Execution in the Presence of an Uninformed Arbitrageur | 2008-01-18 | Paper |
https://portal.mardi4nfdi.de/entity/Q3590801 | 2007-09-03 | Paper |
Convergence of the Min-Sum Algorithm for Convex Optimization | 2007-05-29 | Paper |
A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning | 2007-01-18 | Paper |
https://portal.mardi4nfdi.de/entity/Q5477860 | 2006-06-29 | Paper |
https://portal.mardi4nfdi.de/entity/Q5201298 | 2006-04-18 | Paper |
On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming | 2005-11-11 | Paper |
Algorithms and Models for the Web-Graph | 2005-08-22 | Paper |
Decentralized decision-making in a large team with local information | 2003-07-30 | Paper |
https://portal.mardi4nfdi.de/entity/Q4547446 | 2002-08-21 | Paper |
An analysis of belief propagation on the turbo decoding graph with Gaussian densities | 2002-08-04 | Paper |
On average versus discounted reward temporal-difference learning | 2002-07-08 | Paper |
On the existence of fixed points for approximate value iteration and temporal-difference learning | 2001-02-19 | Paper |
Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives | 2000-10-17 | Paper |
Average cost temporal-difference learning | 2000-02-28 | Paper |
An analysis of temporal-difference learning with function approximation | 1999-05-06 | Paper |
Feature-based methods for large scale dynamic programming | 1996-04-21 | Paper |