Pages that link to "Item:Q5959973"

From MaRDI portal

The following pages link to Finite-time analysis of the multiarmed bandit problem (Q5959973):

Displaying 50 items.

Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges (Q254442)
On two continuum armed bandit problems in high dimensions (Q260274)
General game playing with stochastic CSP (Q265712)
Batched bandit problems (Q282463)
Algorithms for computing strategies in two-player simultaneous move games (Q286381)
An analysis for strength improvement of an MCTS-based program playing Chinese dark chess (Q307781)
Modification of improved upper confidence bounds for regulating exploration in Monte-Carlo tree search (Q307787)
LinUCB applied to Monte Carlo tree search (Q307792)
Infomax strategies for an optimal balance between exploration and exploitation (Q310029)
Using reinforcement learning to find an optimal set of features (Q316296)
Response-adaptive designs for clinical trials: simultaneous learning from multiple patients (Q320737)
Control problems in online advertising and benefits of randomized bidding strategies (Q328167)
The multi-armed bandit problem with covariates (Q355096)
Wisdom of crowds versus groupthink: learning in groups and in isolation (Q361811)
Kullback-Leibler upper confidence bounds for optimal sequential allocation (Q366995)
Exploration and exploitation of scratch games (Q374139)
Hypervolume indicator and dominance reward based multi-objective Monte-Carlo tree search (Q374142)
Adaptive aggregation for reinforcement learning in average reward Markov decision processes (Q378753)
Robustness of stochastic bandit policies (Q391739)
An artificial bee colony algorithm for the job shop scheduling problem with random processing times (Q400942)
An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624)
Temporal-difference search in Computer Go (Q420936)
The \(K\)-armed dueling bandits problem (Q440003)
Information capture and reuse strategies in Monte Carlo Tree Search, with applications to games of hidden information (Q464622)
Regret bounds for restless Markov bandits (Q465253)
MSO: a framework for bound-constrained black-box global optimization algorithms (Q524912)
Sampled fictitious play for approximate dynamic programming (Q547121)
Optimal Bayesian strategies for the infinite-armed Bernoulli bandit (Q643377)
A dynamic programming strategy to balance exploration and exploitation in the bandit problem (Q647433)
Analyzing bandit-based adaptive operator selection mechanisms (Q647443)
Modeling item-item similarities for personalized recommendations on Yahoo! front page (Q652346)
UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803)
Corruption-tolerant bandit learning (Q669323)
Boundary crossing probabilities for general exponential families (Q722599)
Multi-armed bandits with episode context (Q766259)
Multi-objective simultaneous optimistic optimization (Q781163)
An asymptotically optimal strategy for constrained multi-armed bandit problems (Q784789)
A non-parametric solution to the multi-armed bandit problem with covariates (Q826996)
Optimal control with learning on the fly: a toy problem (Q832436)
Adaptive-treed bandits (Q888482)
Combining multiple strategies for multiarmed bandit problems and asymptotic optimality (Q892592)
Bandit-based Monte-Carlo structure learning of probabilistic logic programs (Q894703)
A perpetual search for talents across overlapping generations: a learning process (Q898767)
Truthful learning mechanisms for multi-slot sponsored search auctions with externalities (Q899160)
Online regret bounds for Markov decision processes with deterministic transitions (Q982638)
Active learning in heteroscedastic noise (Q982644)
Response adaptive designs that incorporate switching costs and constraints (Q997275)
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665)
Crowdsourcing with unsure option (Q1640566)
Improving multi-armed bandit algorithms in online pricing settings (Q1644914)

Retrieved from "https://portal.mardi4nfdi.de/wiki/Special:WhatLinksHere/Item:Q5959973"