Prediction, Learning, and Games

Publication:5470194

DOI: 10.1017/CBO9780511546921, zbMath: 1114.91001, Wikidata: Q59538584, Scholia: Q59538584, MaRDI QID: Q5470194

Nicolò Cesa-Bianchi, Gábor Lugosi

Publication date: 30 May 2006




Related Items

A normalized value for information purchases, Online joint bid/daily budget optimization of Internet advertising campaigns, PAC-Bayesian lifelong learning for multi-armed bandits, Distribution-free solutions to the extended multi-period newsboy problem, Dynamics in tree formation games, Improved second-order bounds for prediction with expert advice, Guest editorial: Learning theory, The platform design problem, Stochastic evolution of rules for playing finite normal form games, Fast learning rates in statistical inference through aggregation, Approachability with bounded memory, Hedge algorithm and dual averaging schemes, Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case, Dynamic benchmark targeting, Collaborative topic regression for online recommender systems: an online and Bayesian approach, Asymptotically optimal strategies for online prediction with history-dependent experts, Testing facility location and dynamic capacity planning for pandemics with demand uncertainty, Online ordering rules for the multi-period newsvendor problem with quantity discounts, An index-based deterministic convergent optimal algorithm for constrained multi-armed bandit problems, Optimality justifications: new foundations for foundation-oriented epistemology, Constrained no-regret learning, Exploration-exploitation in multi-agent learning: catastrophe theory meets game theory, On tightness of the Tsaknakis-Spirakis algorithm for approximate Nash equilibrium, Handling concept drift via model reuse, Online strongly convex optimization with unknown delays, Worst-case regret analysis of computationally budgeted online kernel selection, Logarithmic regret in online linear quadratic control using Riccati updates, Mean field games on prosumers, Gaussian two-armed bandit and optimization of batch data processing, Optimal non-asymptotic analysis of the Ruppert-Polyak averaging stochastic algorithm, Learning MAX-SAT from 
contextual examples for combinatorial optimisation, Regret minimization in online Bayesian persuasion: handling adversarial receiver's types under full and partial feedback models, Arbitrage of forecasting experts, High-dimensional VAR with low-rank transition, Generalized self-concordant analysis of Frank-Wolfe algorithms, Aggregation of estimators and stochastic optimization, Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit, Online multiple kernel classification, Relational networks of conditional preferences, Optimal probability aggregation based on generalized Brier scoring, On biased random walks, corrupted intervals, and learning under adversarial design, The secretary problem with reservation costs, Upper bounds and aggregation in bipartite ranking, Fictitious play in networks, Dynamics of Bayesian updating with dependent data and misspecified models, Approachability, regret and calibration: implications and equivalences, Training parsers by inverse reinforcement learning, Extracting certainty from uncertainty: regret bounded by variation in costs, The security of machine learning, Fano's inequality for random variables, Integer programming ensemble of temporal relations classifiers, Simultaneous adaptation to the margin and to complexity in classification, Non-asymptotic calibration and resolution, On the asymptotic optimality of the comb strategy for prediction with expert advice, Additive stacking for disaggregate electricity demand forecasting, Designing coalition-based fair and stable pricing mechanisms under private information on consumers' reservation prices, Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers, Regret bounded by gradual variation for online convex optimization, Generalised entropies and asymptotic complexities of languages, Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards, Probability-free solutions to the 
non-stationary newsvendor problem, Covariance-based dissimilarity measures applied to clustering wide-sense stationary ergodic processes, Sequential model aggregation for production forecasting, Committee, expert advice, and the weighted majority algorithm: an application to the pricing decision of a monopolist, Stable games and their dynamics, Convergence of stochastic proximal gradient algorithm, A differential game on Wasserstein space. Application to weak approachability with partial monitoring, An efficient algorithm for nonconvex-linear minimax optimization problem and its application in solving weighted maximin dispersion problem, Mistake bounds on the noise-free multi-armed bandit game, Approachability with constraints, Merging and testing opinions, Improved regret for zeroth-order adversarial bandit convex optimisation, Maximin effects in inhomogeneous large-scale data, A modular analysis of adaptive (non-)convex optimization: optimism, composite objectives, variance reduction, and variational bounds, Scale-invariant unconstrained online learning, Relative utility bounds for empirically optimal portfolios, Near-optimal no-regret algorithms for zero-sum games, Predictive Complexity for Games with Finite Outcome Spaces, On Martingale Extensions of Vapnik–Chervonenkis Theory with Applications to Online Learning, Unified Algorithms for Online Learning and Competitive Analysis, Zero-Sum Polymatrix Games: A Generalization of Minmax, A survey of network interdiction models and algorithms, Analysis of Hannan consistent selection for Monte Carlo tree search in simultaneous move games, Online Bayesian max-margin subspace learning for multi-view classification and regression, Prediction with expert advice: a PDE perspective, Learning (to disagree?) 
in large worlds, Exploiting problem structure in optimization under uncertainty via online convex optimization, Aggregating algorithm for prediction of packs, Two-armed bandit problem and batch version of the mirror descent algorithm, Unadjusted Langevin algorithm for sampling a mixture of weakly smooth potentials, A general equilibrium model of investor sentiment, Meta-inductive prediction based on attractivity weighting: mathematical and empirical performance evaluation, New insights on concentration inequalities for self-normalized martingales, Multi-agent reinforcement learning: a selective overview of theories and algorithms, Exponential weight approachability, applications to calibration and regret minimization, A unified framework for online trip destination prediction, Suboptimality of constrained least squares and improvements via non-linear predictors, On efficient randomized algorithms for finding the PageRank vector, On the efficiency of a randomized mirror descent algorithm in online optimization problems, MedleySolver: online SMT algorithm selection, Robust Bregman clustering, Robust option pricing: Hannan and Blackwell meet Black and Scholes, Robust probability updating, Lower bounds on individual sequence regret, One-pass AUC optimization, Optimal control with learning on the fly: a toy problem, Approximation algorithms for stochastic combinatorial optimization problems, A general internal regret-free strategy, Woodroofe's one-armed bandit problem revisited, Strategic conversations under imperfect information: epistemic message exchange games, Online passive-aggressive active learning, Sublinear time algorithms for approximate semidefinite programming, Consensus in opinion dynamics as a repeated game, Improving multi-armed bandit algorithms in online pricing settings, Bandit online optimization over the permutahedron, The query complexity of correlated equilibria, Robust mean field games, Online ordering policies for a two-product, multi-period 
stationary newsvendor problem, Using the Bayesian Shtarkov solution for predictions, A general procedure to combine estimators, The multi-armed bandit problem with covariates, Open problems in universal induction & intelligence, On data-based optimal stopping under stationarity and ergodicity, On minimaxity of follow the leader strategy in the stochastic setting, Wisdom of crowds versus groupthink: learning in groups and in isolation, Two queues with non-stochastic arrivals, Exploration and exploitation of scratch games, \(\lambda\)-perceptron: an adaptive classifier for data streams, Adaptive and optimal online linear regression on \(\ell^1\)-balls, Weak aggregating algorithm for the distribution-free perishable inventory problem, Weakly universally consistent static forecasting of stationary and ergodic time series via local averaging and least squares estimates, Stochastic optimization for real time service capacity allocation under random service demand, Reducing reinforcement learning to KWIK online regression, Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model, Aggregation of predictors for nonstationary sub-linear processes and online adaptive forecasting of time varying autoregressive processes, Combining multiple strategies for multiarmed bandit problems and asymptotic optimality, Stability in large Bayesian games with heterogeneous players, Classifier evaluation and attribute selection against active adversaries, Common learning with intertemporal dependence, Learning noisy linear classifiers via adaptive and selective sampling, Sharp oracle inequalities for aggregation of affine estimators, Approximate implementation in Markovian environments, Online variance minimization, PAMR: passive aggressive mean reversion strategy for portfolio selection, Optimization of relative arbitrage, Risk management strategies for finding universal portfolios, Forecasting electricity consumption by aggregating specialized 
experts, Mercer's theorem on general domains: on the interaction between measures, kernels, and RKHSs, QoS commitment between vertically integrated autonomous systems, How uncertain do we need to be?, Scale-free online learning, Combinatorial bandits, Sparse regression learning by aggregation and Langevin Monte-Carlo, Learning with stochastic inputs and adversarial outputs, Model selection for weakly dependent time series forecasting, Mirror averaging with sparsity priors, Gaussian process bandits with adaptive discretization, Online transfer learning, Real-time model learning using incremental sparse spectrum Gaussian process regression, Weighted last-step min-max algorithm with improved sub-logarithmic regret, Regret minimization in repeated matrix games with variable stage duration, Load balancing without regret in the bulletin board model, Strong approachability, Randomized prediction of individual sequences, Online aggregation of unbounded losses using shifting experts with confidence, Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD), A generalized online mirror descent with applications to classification and regression, Emergence of information transfer by inductive learning, Markets, correlation, and regret-matching, Online estimation of discrete, continuous, and conditional joint densities using classifier chains, Consistency of discrete Bayesian learning, On the possibility of learning in reactive environments with arbitrary dependence, Leading strategies in competitive on-line prediction, Asymptotic sequential Rademacher complexity of a finite function class, Learning by mirror averaging, A continuous-time approach to online optimization, Smooth calibration, leaky forecasts, finite recall, and Nash dynamics, Sampled fictitious play is Hannan consistent, Dominant-set clustering: a review, MSO: a framework for bound-constrained black-box global optimization algorithms, The weak aggregating algorithm 
and weak mixability, Note on universal conditional consistency, CRPS Learning, Optimal learning with Bernstein Online Aggregation, How long to equilibrium? The communication complexity of uncoupled equilibrium procedures, Supermartingales in prediction with expert advice, Window-games between TCP flows, Context tree selection: a unifying view, Quantization and clustering with Bregman divergences, A quasi-Bayesian perspective to online clustering, Generalized mirror averaging and \(D\)-convex aggregation, Robust forecast combinations, Logarithmic regret algorithms for online convex optimization, Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity, Regret to the best vs. regret to the average, On-line predictive linear regression, A nonmanipulable test, Multi-agent learning for engineers, On universal algorithms for adaptive forecasting, An asymptotically optimal strategy for constrained multi-armed bandit problems, Interpreting uninterpretable predictors: kernel methods, Shtarkov solutions, and random forests, Gambling Under Unknown Probabilities as Proxy for Real World Decisions Under Uncertainty, Unifying mirror descent and dual averaging, Online Prediction with History‐Dependent Experts: The General Case, A PDE Approach to the Prediction of a Binary Sequence with Advice from Two History‐Dependent Experts, Prior‐free dynamic allocation under limited liability, Optimal anytime regret with two experts, Meta-inductive probability aggregation, A stochastic variant of replicator dynamics in zero-sum games and its invariant measures, Multi-armed bandits with censored consumption of resources, Unadjusted Langevin algorithm with multiplicative noise: total variation and Wasserstein bounds, Robustness of dynamics in games: a contraction mapping decomposition approach, Minimax rates for conditional density estimation via empirical entropy, No-regret algorithms in on-line learning, games and convex optimization, A unified stochastic 
approximation framework for learning in games, User-friendly Introduction to PAC-Bayes Bounds, Universal regression with adversarial responses, Probabilistic truthlikeness, content elements, and meta-inductive probability optimization, Learning Stationary Nash Equilibrium Policies in \(n\)-Player Stochastic Games with Independent Chains, No-regret learning for repeated non-cooperative games with lossy bandits, Robustness and sample complexity of model-based MARL for general-sum Markov games, Provably efficient reinforcement learning in decentralized general-sum Markov games, Learning in games with cumulative prospect theoretic preferences, Treatment recommendation with distributional targets, Synthetic learner: model-free inference on treatments over time, First-order methods for convex optimization, “Calibeating”: Beating forecasters at their own game, Relaxing the i.i.d. assumption: adaptively minimax optimal regret via root-entropic regularization, Independent learning in stochastic games, Stochastic online convex optimization. Application to probabilistic time series forecasting, Unnamed Item, Near-Optimal Algorithms for Online Matrix Prediction, A Linearly Convergent Variant of the Conditional Gradient Algorithm under Strong Convexity, with Applications to Online and Stochastic Optimization, Randomized allocation with arm elimination in a bandit problem with covariates, Portfolio selection in non-stationary markets, Sequential Shortest Path Interdiction with Incomplete Information and Limited Feedback, Small-Loss Bounds for Online Learning with Partial Information, A Primal–Dual Learning Algorithm for Personalized Dynamic Pricing with an Inventory Constraint, No Regret Learning in Oligopolies: Cournot vs. 
Bertrand, Efficient numerical methods to solve sparse linear equations with application to PageRank, Prediction of weakly locally stationary processes by auto-regression, Semantic Image Segmentation: Two Decades of Research, Oracle-Based Robust Optimization via Online Learning, A Low Complexity Algorithm with $O(\sqrt{T})$ Regret and $O(1)$ Constraint Violations for Online Convex Optimization with Long Term Constraints, Meta Algorithms for Portfolio Optimization Using Reinforcement Learning, A linear response bandit problem, Smooth Contextual Bandits: Bridging the Parametric and Nondifferentiable Regret Regimes, ON THE TRUTH-CONVERGENCE OF OPEN-MINDED BAYESIANISM, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Statistical Problem Classes and Their Links to Information Theory, Stochastic Knapsack Revisited: The Service Level Perspective, A Dynamic Near-Optimal Algorithm for Online Linear Programming, Optimal Forecast Reconciliation for Hierarchical and Grouped Time Series Through Trace Minimization, Gambling, Computational Information and Encryption Security, Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards, Chasing Ghosts: Competing with Stateful Policies, Learning the distribution with largest mean: two bandit frameworks, Learning Equilibria of a Stochastic Game on Gaussian Interference Channels with Incomplete Information, Interference Mitigation via Pricing in Time-Varying Cognitive Radio Systems, Following the Perturbed Leader to Gamble at Multi-armed Bandits, Online Regression Competitive with Changing Predictors, An Online Learning Approach to a Multi-player N-armed Functional Bandit, Growth Optimal Investment with Transaction Costs, Supermartingales in Prediction with Expert Advice, Aggregating Algorithm for a Space of Analytic 
Functions, Online Learning Based on Online DCA and Application to Online Classification, Bandit-Based Task Assignment for Heterogeneous Crowdsourcing, Online First-Order Framework for Robust Convex Optimization, Finite-time 4-expert prediction problem, PORTFOLIO SELECTION AND ONLINE LEARNING, Catching up Faster by Switching Sooner: A Predictive Approach to Adaptive Estimation with an Application to the AIC–BIC Dilemma, Learning in Combinatorial Optimization: What and How to Explore, Unnamed Item, Repeated Games with Incomplete Information, Online algorithm for aggregating experts’ predictions with unbounded quadratic loss, RECURSIVE FORECAST COMBINATION FOR DEPENDENT HETEROGENEOUS DATA, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item, Nonparametric sequential prediction of time series, Prediction of time series by statistical learning: general losses and fast rates, Online portfolio selection, A survey on concept drift adaptation, Unnamed Item, Unnamed Item, Unnamed Item, A Robust Saturated Strategy for $n$-Player Prisoner's Dilemma, Sequential Shortest Path Interdiction with Incomplete Information, A PAC Approach to Application-Specific Algorithm Selection, Adaptive sequential machine learning, Some Universal Insights on Divergences for Statistics, Machine Learning and Artificial Intelligence, Tracking climate models, Explore First, Exploit Next: The True Shape of Regret in Bandit Problems, Sequential Interdiction with Incomplete Information and Learning, Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback, Bayesian Incentive-Compatible Bandit Exploration, Estimating latent feature-feature interactions in large feature-rich graphs, Unnamed Item, Learning Hurdles for Sleeping Experts, Sparse Learning for Large-Scale and High-Dimensional Data: A Randomized Convex-Concave Optimization Approach, Structural Online Learning, Things Bayes Can’t Do, Learning Volatility of Discrete Time Series Using Prediction with Expert Advice, On 
Minimaxity of Follow the Leader Strategy in the Stochastic Setting, On the Prior Sensitivity of Thompson Sampling, Unnamed Item, Prediction with Expert Evaluators’ Advice, The Follow Perturbed Leader Algorithm Protected from Unbounded One-Step Losses, Calibration and Internal No-Regret with Random Signals, Far from the madding crowd: collective wisdom in prediction markets, Scale-Free Algorithms for Online Linear Optimization, Online Linear Optimization for Job Scheduling Under Precedence Constraints, Online Learning over a Finite Action Set with Limited Switching, Robust Power Management via Learning and Game Design, Minimum description length revisited, Unnamed Item, Learning Optimal Forecast Aggregation in Partial Evidence Environments, Partial Monitoring—Classification, Regret Bounds, and Algorithms, Opportunistic Approachability and Generalized No-Regret Problems, Nonparametric Pricing Analytics with Customer Covariates, Online Learning of Nash Equilibria in Congestion Games, Unnamed Item, Unnamed Item, Unnamed Item, Unnamed Item