10.1162/153244303321897663



DOI: 10.1162/153244303321897663
zbMath: 1084.68543
MaRDI QID: Q4825350

Peter Auer

Publication date: 28 October 2004

Published in: CrossRef Listing of Deleted DOIs

Full work available at URL: https://doi.org/10.1162/153244303321897663


68Q32: Computational learning theory

68T05: Learning and adaptive systems in artificial intelligence


Related Items

Statistical Inference for Online Decision Making via Stochastic Gradient Descent
Setting Reserve Prices in Second-Price Auctions with Unobserved Bids
Ranking and Selection with Covariates for Personalized Decision Making
Regret bounds for Narendra-Shapiro bandit algorithms
Feel-Good Thompson Sampling for Contextual Bandits and Reinforcement Learning
Dynamic Learning and Decision Making via Basis Weight Vectors
Online Resource Allocation with Personalized Learning
MNL-Bandit: A Dynamic Learning Approach to Assortment Selection
Bandits with Global Convex Constraints and Objective
Online Decision Making with High-Dimensional Covariates
Learning Enabled Constrained Black-Box Optimization
Derivative-free optimization methods
A linear response bandit problem
Per-Round Knapsack-Constrained Linear Submodular Bandits
Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
Model-based Reinforcement Learning: A Survey
Robust sequential design for piecewise-stationary multi-armed bandit problem in the presence of outliers
Greedy Algorithm Almost Dominates in Smoothed Contextual Bandits
Online learning of network bottlenecks via minimax paths
Multi-armed bandits with censored consumption of resources
Dealing with expert bias in collective decision-making
Online learning of energy consumption for navigation of electric vehicles
Geiringer theorems: from population genetics to computational intelligence, memory evolutive systems and Hebbian learning
Algorithm portfolios for noisy optimization
Knows what it knows: a framework for self-aware learning
Learning with stochastic inputs and adversarial outputs
The \(K\)-armed dueling bandits problem
Reducing reinforcement learning to KWIK online regression
Multi-objective simultaneous optimistic optimization
Effective hybrid system falsification using Monte Carlo tree search guided by QB-robustness
An analysis of model-based interval estimation for Markov decision processes
Randomized prediction of individual sequences
Bayesian optimization of pump operations in water distribution systems
Multiclass classification with bandit feedback using adaptive regularization
Hyperparameter optimization for recommender systems through Bayesian optimization
Gorthaur-EXP3: bandit-based selection from a portfolio of recommendation algorithms balancing the accuracy-diversity dilemma
Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit
Two-armed bandit problem and batch version of the mirror descent algorithm
Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm
Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning
Ballooning multi-armed bandits
A survey on kriging-based infill algorithms for multiobjective simulation optimization
New bounds on the price of bandit feedback for mistake-bounded online multiclass learning
Anticipatory action selection for human-robot table tennis
Customization of J. Bather's UCB strategy for a Gaussian multiarmed bandit