10.1162/153244303321897663



DOI: 10.1162/153244303321897663
zbMath: 1084.68543
MaRDI QID: Q4825350

Peter Auer

Publication date: 28 October 2004

Published in: CrossRef Listing of Deleted DOIs

Full work available at URL: https://doi.org/10.1162/153244303321897663


68Q32: Computational learning theory

68T05: Learning and adaptive systems in artificial intelligence


Related Items

Statistical Inference for Online Decision Making via Stochastic Gradient Descent
Setting Reserve Prices in Second-Price Auctions with Unobserved Bids
Ranking and Selection with Covariates for Personalized Decision Making
Regret bounds for Narendra-Shapiro bandit algorithms
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Dynamic Learning and Decision Making via Basis Weight Vectors
Online Resource Allocation with Personalized Learning
MNL-Bandit: A Dynamic Learning Approach to Assortment Selection
Bandits with Global Convex Constraints and Objective
Online Decision Making with High-Dimensional Covariates
Learning Enabled Constrained Black-Box Optimization
Derivative-free optimization methods
A linear response bandit problem
Per-Round Knapsack-Constrained Linear Submodular Bandits
Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
Model-based Reinforcement Learning: A Survey
Robust sequential design for piecewise-stationary multi-armed bandit problem in the presence of outliers
Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits
Online learning of network bottlenecks via minimax paths
Multi-armed bandits with censored consumption of resources
Dealing with expert bias in collective decision-making
Online learning of energy consumption for navigation of electric vehicles
Geiringer theorems: from population genetics to computational intelligence, memory evolutive systems and Hebbian learning
Algorithm portfolios for noisy optimization
Knows what it knows: a framework for self-aware learning
Learning with stochastic inputs and adversarial outputs
The \(K\)-armed dueling bandits problem
Reducing reinforcement learning to KWIK online regression
Multi-objective simultaneous optimistic optimization
Effective hybrid system falsification using Monte Carlo tree search guided by QB-robustness
An analysis of model-based interval estimation for Markov decision processes
Randomized prediction of individual sequences
Bayesian optimization of pump operations in water distribution systems
Multiclass classification with bandit feedback using adaptive regularization
Hyperparameter optimization for recommender systems through Bayesian optimization
Gorthaur-EXP3: bandit-based selection from a portfolio of recommendation algorithms balancing the accuracy-diversity dilemma
Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit
Two-armed bandit problem and batch version of the mirror descent algorithm
Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm
Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning
Ballooning multi-armed bandits
A survey on kriging-based infill algorithms for multiobjective simulation optimization
New bounds on the price of bandit feedback for mistake-bounded online multiclass learning
Anticipatory action selection for human-robot table tennis
Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit