Csaba Szepesvári

List of research outcomes

This list is not complete; at the moment it covers only items from zbMATH Open and arXiv. We are working on additional sources, so please check back here soon!

Publication	Date of Publication	Type
Sample-based planning and learning with function approximation Statistical Science	2026-02-10	Paper
Exponential lower bounds for planning in MDPs with linearly-realizable optimal action-value functions	2025-02-11	Paper
Cleaning up the neighborhood: a full classification for adversarial partial monitoring	2025-01-31	Paper
An exponential Efron-Stein inequality for L_q stable learning rules	2025-01-31	Paper
Optimistic MLE: a generic model-based algorithm for partially observable sequential decision making	2024-05-08	Paper
scientific article; zbMATH DE number 7626742 (available as arXiv preprint)	2022-12-06	Paper
scientific article; zbMATH DE number 7626742	2022-12-06	Paper
Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers (available as arXiv preprint)	2020-10-05	Paper
Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers	2020-10-05	Paper
Bandit algorithms	2020-05-11	Paper
A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, variance reduction, and variational bounds Theoretical Computer Science	2020-01-29	Paper
Mixing time estimation in reversible Markov chains from a single sample path The Annals of Applied Probability	2019-10-22	Paper
A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds	2019-01-10	Paper
A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, and variational bounds (available as arXiv preprint)	2019-01-10	Paper
Stochastic Optimization in a Cumulative Prospect Theory Framework IEEE Transactions on Automatic Control	2018-09-18	Paper
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes IEEE Transactions on Automatic Control	2018-06-27	Paper
Following the leader and fast rates in online linear prediction: curved constraint sets and other regularities	2018-04-17	Paper
Following the leader and fast rates in online linear prediction: curved constraint sets and other regularities (available as arXiv preprint)	2018-04-17	Paper
Online Markov Decision Processes Under Bandit Feedback IEEE Transactions on Automatic Control	2017-05-16	Paper
Regularized policy iteration with nonparametric function spaces Journal of Machine Learning Research (JMLR)	2016-11-22	Paper
Partial monitoring -- classification, regret bounds, and algorithms Mathematics of Operations Research	2015-04-24	Paper
On Learning the Optimal Waiting Time Lecture Notes in Computer Science	2015-01-14	Paper
Alignment based kernel learning with a continuous set of base kernels Machine Learning	2014-08-20	Paper
\(X\)-armed bandits	2014-02-03	Paper
Toward a classification of finite partial-monitoring games Theoretical Computer Science	2013-03-04	Paper
Partial monitoring with side information Lecture Notes in Computer Science	2012-10-16	Paper
Model selection in reinforcement learning Machine Learning	2012-05-08	Paper
Regularized least-squares regression: learning from a sequence Journal of Statistical Planning and Inference	2011-11-10	Paper
Finite-time bounds for fitted value iteration	2011-11-08	Paper
Training parsers by inverse reinforcement learning Machine Learning	2010-10-07	Paper
Toward a classification of finite partial-monitoring games Lecture Notes in Computer Science	2010-10-01	Paper
Algorithms for reinforcement learning. Synthesis Lectures on Artificial Intelligence and Machine Learning	2010-09-10	Paper
Active learning in heteroscedastic noise Theoretical Computer Science	2010-07-07	Paper
Models of active learning in group-structured state spaces Information and Computation	2010-04-08	Paper
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits Theoretical Computer Science	2009-05-12	Paper
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path Machine Learning	2009-03-31	Paper
Active Learning in Multi-armed Bandits Lecture Notes in Computer Science	2008-10-14	Paper
Active Learning of Group-Structured Environments Lecture Notes in Computer Science	2008-10-14	Paper
Tuning Bandit Algorithms in Stochastic Environments Lecture Notes in Computer Science	2008-08-19	Paper
Machine Learning: ECML 2004 Lecture Notes in Computer Science	2008-03-14	Paper
Improved Rates for the Stochastic Continuum-Armed Bandit Problem Learning Theory	2008-01-03	Paper
Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path Learning Theory	2007-09-14	Paper
Computer Vision - ECCV 2004 Lecture Notes in Computer Science	2005-12-27	Paper
Efficient approximate planning in continuous space Markovian decision problems AI Communications	2002-05-02	Paper
An asynchronous stochastic approximation theorem and some applications Alkalmazott Matematikai Lapok. A Magyar Tudományos Akadémia. Matematikai és Fizikai Tudományok Osztályának Közleményei	2001-04-01	Paper
scientific article; zbMATH DE number 1528670	2000-11-13	Paper
Convergence results for single-step on-policy reinforcement-learning algorithms Machine Learning	2000-06-21	Paper
scientific article; zbMATH DE number 1336305	1999-09-14	Paper
Module-based reinforcement learning: Experiments with a real robot Machine Learning	1998-10-13	Paper
An integrated architecture for motion-control and path-planning	1998-06-08	Paper
Robust control using inverse dynamics neurocontrollers Nonlinear Analysis: Theory, Methods & Applications	1998-05-11	Paper
Approximate geometry representations and sensory fusion Neurocomputing	1997-03-31	Paper
