| Publication | Venue | Date of Publication | Type |
|---|---|---|---|
| On strategic measures and optimality properties in discrete-time stochastic control with universally measurable policies | Mathematics of Operations Research | 2024-11-07 | Paper |
| Soliton molecules, multi-breathers and hybrid solutions in (2+1)-dimensional Korteweg-de Vries-Sawada-Kotera-Ramani equation | Chaos, Solitons and Fractals | 2023-01-12 | Paper |
| On linear programming for constrained and unconstrained average-cost Markov decision processes with countable action spaces and strictly unbounded costs | Mathematics of Operations Research | 2022-06-27 | Paper |
| On Strategic Measures and Optimality Properties in Discrete-Time Stochastic Control with Universally Measurable Policies | | 2022-06-13 | Paper |
| On structural properties of optimal average cost functions in Markov decision processes with Borel spaces and universally measurable policies | Journal of Mathematical Analysis and Applications | 2022-01-21 | Paper |
| Average-Cost Optimality Results for Borel-Space Markov Decision Processes with Universally Measurable Policies | | 2021-03-31 | Paper |
| Average cost optimality inequality for Markov decision processes with Borel spaces and universally measurable policies | SIAM Journal on Control and Optimization | 2020-10-30 | Paper |
| On generalized Bellman equations and temporal-difference learning | Lecture Notes in Computer Science | 2020-08-05 | Paper |
| On the Minimum Pair Approach for Average Cost Markov Decision Processes with Countable Discrete Action Spaces and Strictly Unbounded Costs | SIAM Journal on Control and Optimization | 2020-03-11 | Paper |
| On Markov Decision Processes with Borel Spaces and an Average Cost Criterion | | 2019-01-10 | Paper |
| On generalized Bellman equations and temporal-difference learning | Journal of Machine Learning Research (JMLR) | 2018-11-21 | Paper |
| Convergence Results for Some Temporal Difference Methods Based on Least Squares | IEEE Transactions on Automatic Control | 2017-08-08 | Paper |
| Weak convergence properties of constrained emphatic temporal-difference learning with constant and slowly diminishing stepsize | arXiv preprint | 2017-01-05 | Paper |
| A mixed value and policy iteration method for stochastic control with universally measurable policies | Mathematics of Operations Research | 2016-01-29 | Paper |
| On convergence of value iteration for a class of total cost Markov decision processes | SIAM Journal on Control and Optimization | 2015-08-18 | Paper |
| Stochastic Shortest Path Games and Q-Learning | | 2014-12-30 | Paper |
| On boundedness of Q-learning iterates for stochastic shortest path problems | Mathematics of Operations Research | 2014-07-11 | Paper |
| Scientific article (title unavailable); zbMATH DE number 6277636 | | 2014-04-02 | Paper |
| Q-learning and policy iteration algorithms for stochastic shortest path problems | Annals of Operations Research | 2013-11-12 | Paper |
| Least squares temporal difference methods: An analysis under general conditions | SIAM Journal on Control and Optimization | 2013-03-19 | Paper |
| Q-learning and enhanced policy iteration in discounted dynamic programming | Mathematics of Operations Research | 2012-05-24 | Paper |
| A unifying polyhedral approximation framework for convex optimization | SIAM Journal on Optimization | 2011-06-06 | Paper |
| Error bounds for approximations from projected linear equations | Mathematics of Operations Research | 2011-04-27 | Paper |
| Projected equation methods for approximate solution of large linear systems | Journal of Computational and Applied Mathematics | 2009-04-21 | Paper |
| On Near Optimality of the Set of Finite-State Controllers for Average Cost POMDP | Mathematics of Operations Research | 2008-05-27 | Paper |
| Scientific article (title unavailable); zbMATH DE number 2036376 | | 2004-02-02 | Paper |
| Scientific article (title unavailable); zbMATH DE number 4215407 | | 1991-01-01 | Paper |