Pages that link to "Item:Q5305630"

The following pages link to (Q5305630):

Displayed 49 items.

(Q4986381) ‎ (← links)
(Q4998915) ‎ (← links)
Learning-Based Mean-Payoff Optimization in an Unknown MDP under Omega-Regular Constraints (Q5009420) ‎ (← links)
Concurrent MDPs with Finite Markovian Policies (Q5014502) ‎ (← links)
Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints (Q5035752) ‎ (← links)
An Incremental Fast Policy Search Using a Single Sample Path (Q5045345) ‎ (← links)
(Q5053310) ‎ (← links)
(Q5054652) ‎ (← links)
An Approach for Determining Stationary Equilibria in a Single-Controller Average Stochastic Game (Q5057963) ‎ (← links)
Average Cost Brownian Drift Control with Proportional Changeover Costs (Q5084489) ‎ (← links)
A Survey of Bidding Games on Graphs (Invited Paper) (Q5089263) ‎ (← links)
(Q5092369) ‎ (← links)
A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization (Q5093265) ‎ (← links)
Effective Scenarios in Multistage Distributionally Robust Optimization with a Focus on Total Variation Distance (Q5093650) ‎ (← links)
(Q5093676) ‎ (← links)
Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity (Q5147037) ‎ (← links)
(Q5168869) ‎ (← links)
(Q5179071) ‎ (← links)
Characterization of the Optimal Risk-Sensitive Average Cost in Denumerable Markov Decision Chains (Q5219681) ‎ (← links)
A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs (Q5227201) ‎ (← links)
Nash ε-equilibria for stochastic games with total reward functions: an approach through Markov decision processes (Q5227205) ‎ (← links)
A Continuous-Time Markov Decision Process for Infrastructure Surveillance (Q5232836) ‎ (← links)
Repeated Sequential Prisoner's Dilemma: The Stackelberg Variant (Q5245844) ‎ (← links)
Synchronization and control in intrinsic and designed computation: An information-theoretic analysis of competing models of stochastic computation (Q5251235) ‎ (← links)
A Convex Analytic Approach to Risk-Aware Markov Decision Processes (Q5258943) ‎ (← links)
On Nash Equilibria in Stochastic Positional Games with Average Payoffs (Q5270512) ‎ (← links)
OPTIMAL CONTROL OF A TWO-SERVER QUEUEING SYSTEM WITH FAILURES (Q5349305) ‎ (← links)
OPTIMIZATION OF OVERFLOW POLICIES IN CALL CENTERS (Q5358051) ‎ (← links)
Dynamic Pricing with a Poisson Bandit Model (Q5431470) ‎ (← links)
(Q5446613) ‎ (← links)
Iterative Improvement of Lower and Upper Bounds for Backward SDEs (Q5738163) ‎ (← links)
Empirical Q-Value Iteration (Q5856670) ‎ (← links)
(Q5857015) ‎ (← links)
Multiply Accelerated Value Iteration for NonSymmetric Affine Fixed Point Problems and Application to Markov Decision Processes (Q5862806) ‎ (← links)
Minimising average passenger waiting time in personal rapid transit systems (Q5963104) ‎ (← links)
An exponential lower bound for Zadeh's pivot rule (Q6038661) ‎ (← links)
Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands (Q6047879) ‎ (← links)
Formalization of methods for the development of autonomous artificial intelligence systems (Q6066037) ‎ (← links)
Optimal policies for stochastic clearing systems with time‐dependent delay penalties (Q6072160) ‎ (← links)
Block Policy Mirror Descent (Q6093281) ‎ (← links)
A unified algorithm framework for mean-variance optimization in discounted Markov decision processes (Q6096629) ‎ (← links)
Smoothing policies and safe policy gradients (Q6097096) ‎ (← links)
A dynamic analytic method for risk-aware controlled martingale problems (Q6104008) ‎ (← links)
A specification logic for programs in the probabilistic guarded command language (Q6109492) ‎ (← links)
A framework to measure the robustness of programs in the unpredictable environment (Q6135769) ‎ (← links)
Optimal Routing of Fixed Size Jobs to Two Parallel Servers (Q6160404) ‎ (← links)
Adaptive constraint satisfaction for Markov decision process congestion games: application to transportation networks (Q6163985) ‎ (← links)
Average cost minimization in a multi-server retrial queueing system with a controllable reserve group of servers (Q6169437) ‎ (← links)
Premium control with reinforcement learning (Q6174076) ‎ (← links)