The following pages link to (Q5305630):
Displayed 49 items.
- (Q4986381) (← links)
- (Q4998915) (← links)
- Learning-Based Mean-Payoff Optimization in an Unknown MDP under Omega-Regular Constraints (Q5009420) (← links)
- Concurrent MDPs with Finite Markovian Policies (Q5014502) (← links)
- Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints (Q5035752) (← links)
- An Incremental Fast Policy Search Using a Single Sample Path (Q5045345) (← links)
- (Q5053310) (← links)
- (Q5054652) (← links)
- An Approach for Determining Stationary Equilibria in a Single-Controller Average Stochastic Game (Q5057963) (← links)
- Average Cost Brownian Drift Control with Proportional Changeover Costs (Q5084489) (← links)
- A Survey of Bidding Games on Graphs (Invited Paper) (Q5089263) (← links)
- (Q5092369) (← links)
- A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization (Q5093265) (← links)
- Effective Scenarios in Multistage Distributionally Robust Optimization with a Focus on Total Variation Distance (Q5093650) (← links)
- (Q5093676) (← links)
- Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity (Q5147037) (← links)
- (Q5168869) (← links)
- (Q5179071) (← links)
- Characterization of the Optimal Risk-Sensitive Average Cost in Denumerable Markov Decision Chains (Q5219681) (← links)
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs (Q5227201) (← links)
- Nash ε-equilibria for stochastic games with total reward functions: an approach through Markov decision processes (Q5227205) (← links)
- A Continuous-Time Markov Decision Process for Infrastructure Surveillance (Q5232836) (← links)
- Repeated Sequential Prisoner's Dilemma: The Stackelberg Variant (Q5245844) (← links)
- Synchronization and control in intrinsic and designed computation: An information-theoretic analysis of competing models of stochastic computation (Q5251235) (← links)
- A Convex Analytic Approach to Risk-Aware Markov Decision Processes (Q5258943) (← links)
- On Nash Equilibria in Stochastic Positional Games with Average Payoffs (Q5270512) (← links)
- OPTIMAL CONTROL OF A TWO-SERVER QUEUEING SYSTEM WITH FAILURES (Q5349305) (← links)
- OPTIMIZATION OF OVERFLOW POLICIES IN CALL CENTERS (Q5358051) (← links)
- Dynamic Pricing with a Poisson Bandit Model (Q5431470) (← links)
- (Q5446613) (← links)
- Iterative Improvement of Lower and Upper Bounds for Backward SDEs (Q5738163) (← links)
- Empirical Q-Value Iteration (Q5856670) (← links)
- (Q5857015) (← links)
- Multiply Accelerated Value Iteration for NonSymmetric Affine Fixed Point Problems and Application to Markov Decision Processes (Q5862806) (← links)
- Minimising average passenger waiting time in personal rapid transit systems (Q5963104) (← links)
- An exponential lower bound for Zadeh's pivot rule (Q6038661) (← links)
- Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands (Q6047879) (← links)
- Formalization of methods for the development of autonomous artificial intelligence systems (Q6066037) (← links)
- Optimal policies for stochastic clearing systems with time‐dependent delay penalties (Q6072160) (← links)
- Block Policy Mirror Descent (Q6093281) (← links)
- A unified algorithm framework for mean-variance optimization in discounted Markov decision processes (Q6096629) (← links)
- Smoothing policies and safe policy gradients (Q6097096) (← links)
- A dynamic analytic method for risk-aware controlled martingale problems (Q6104008) (← links)
- A specification logic for programs in the probabilistic guarded command language (Q6109492) (← links)
- A framework to measure the robustness of programs in the unpredictable environment (Q6135769) (← links)
- Optimal Routing of Fixed Size Jobs to Two Parallel Servers (Q6160404) (← links)
- Adaptive constraint satisfaction for Markov decision process congestion games: application to transportation networks (Q6163985) (← links)
- Average cost minimization in a multi-server retrial queueing system with a controllable reserve group of servers (Q6169437) (← links)
- Premium control with reinforcement learning (Q6174076) (← links)