Optimal learning with non-Gaussian rewards
Publication:2806349
DOI: 10.1017/apr.2015.9
zbMath: 1345.60039
OpenAlex: W2297707299
MaRDI QID: Q2806349
Publication date: 17 May 2016
Published in: Advances in Applied Probability
Full work available at URL: https://projecteuclid.org/euclid.aap/1457466158
Keywords: optimal stopping, Lévy process, multi-armed bandit, partial integro-differential equation, optimal learning, Gittins indices, non-Gaussian rewards, probabilistic interpolation
Related Items (3)
ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS ⋮ Undiscounted bandit games ⋮ Optimal stopping problems in Lévy models with random observations
Cites Work
- Unnamed Item (8 entries)
- Conditional Lévy processes
- Optimal learning for sequential sampling with non-parametric beliefs
- Sur l'approximation des réduites. (On the approximation of residues)
- Convergence properties of the expected improvement algorithm with fixed mean and covariance functions
- Continuous multi-armed bandits and multiparameter processes
- The learning component of dynamic allocation indices
- On optimal stopping and free boundary problems
- Processes that can be embedded in Brownian motion
- Discrete multiarmed bandits and multiparameter processes
- Dynamic allocation problems in continuous time
- Optimal learning and experimentation in bandit problems.
- Lévy bandits: Multi-armed bandits driven by Lévy processes
- Convergence of values in optimal stopping and convergence of optimal stopping times
- Introductory lectures on fluctuations of Lévy processes with applications.
- The Knowledge Gradient Algorithm for a General Class of Online Learning Problems
- Consistency of Sequential Bayesian Sampling Policies
- Probability and Stochastics
- Multi‐Armed Bandit Allocation Indices
- Dynamic Pricing with a Prior on Market Response
- PROPERTIES OF THE GITTINS INDEX WITH APPLICATION TO OPTIMAL SCHEDULING
- Optimal investment and consumption with stochastic dividends
- Dynamic Assortment with Demand Learning for Seasonal Consumer Goods
- Stalking Information: Bayesian Inventory Management with Unobserved Lost Sales
- A Generalized Gittins Index for a Class of Multiarmed Bandits with General Resource Requirements
- A Knowledge-Gradient Policy for Sequential Information Collection
- The Multi-Armed Bandit Problem: Decomposition and Computation
- How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities?
- Sequential Testing Problems for Lévy Processes
- Bandit Problems with Lévy Processes
- Explicit Gittins Indices for a Class of Superdiffusive Processes
- Finite-time analysis of the multiarmed bandit problem