Optimal learning with non-Gaussian rewards
DOI: 10.1017/APR.2015.9
zbMATH Open: 1345.60039
OpenAlex: W2297707299
MaRDI QID: Q2806349
FDO: Q2806349
Authors: Zi Ding, Ilya O. Ryzhov
Publication date: 17 May 2016
Published in: Advances in Applied Probability
Full work available at URL: https://projecteuclid.org/euclid.aap/1457466158
Keywords: multi-armed bandit, optimal stopping, partial integro-differential equation, optimal learning, Gittins indices, non-Gaussian rewards, probabilistic interpolation, Lévy process
Mathematics Subject Classification
- Processes with independent increments; Lévy processes (60G51)
- Integro-partial differential equations (45K05)
- Integro-partial differential equations (35R09)
- Stopping times; optimal stopping problems; gambling theory (60G40)
- Numerical methods for integral equations, integral transforms (65R99)
Cites Work
- Title not available
- Title not available
- Title not available
- Convergence properties of the expected improvement algorithm with fixed mean and covariance functions
- Title not available
- Introductory lectures on fluctuations of Lévy processes with applications.
- Comparison methods for stochastic models and risks
- Probability and stochastics.
- The learning component of dynamic allocation indices
- Multi-armed bandit allocation indices. With a foreword by Peter Whittle.
- Title not available
- Finite-time analysis of the multiarmed bandit problem
- Title not available
- Bandit problems with Lévy processes
- Processes that can be embedded in Brownian motion
- Dynamic pricing with a prior on market response
- Stalking information: Bayesian inventory management with unobserved lost sales
- Dynamic assortment with demand learning for seasonal consumer goods
- The Multi-Armed Bandit Problem: Decomposition and Computation
- Discrete multiarmed bandits and multiparameter processes
- The knowledge gradient algorithm for a general class of online learning problems
- Consistency of sequential Bayesian sampling policies
- A generalized Gittins index for a class of multiarmed bandits with general resource requirements
- A Knowledge-Gradient Policy for Sequential Information Collection
- On optimal stopping and free boundary problems
- Dynamic allocation problems in continuous time
- Optimal investment and consumption with stochastic dividends
- Conditional Lévy processes
- Explicit Gittins Indices for a Class of Superdiffusive Processes
- Optimal learning and experimentation in bandit problems.
- Continuous multi-armed bandits and multiparameter processes
- Lévy bandits: Multi-armed bandits driven by Lévy processes
- Optimal learning for sequential sampling with non-parametric beliefs
- Title not available
- Sur l'approximation des réduites. (On the approximation of residues)
- How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities?
- Properties of the Gittins index with application to optimal scheduling
- Sequential testing problems for Lévy processes
- Convergence of values in optimal stopping and convergence of optimal stopping times
Cited In (11)
- Undiscounted bandit games
- Optimal learning with Q-aggregation
- The ratio index for budgeted learning, with applications
- Optimal Learning for Stochastic Optimization with Nonlinear Parametric Belief Models
- ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS
- Lévy bandits under Poissonian decision times
- A Framework of Learning Through Empirical Gain Maximization
- Learning Preferences Under Noise and Loss Aversion: An Optimization Approach
- Lévy bandits: Multi-armed bandits driven by Lévy processes
- ε-Optimal nonlinear reinforcement scheme under a nonstationary multiteacher environment
- Optimal stopping problems in Lévy models with random observations