scientific article; zbMATH DE number 700091

From MaRDI portal
Publication: 4315289

zbMath: 0829.90134
MaRDI QID: Q4315289

Martin L. Puterman

Publication date: 6 December 1994


Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
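The record above catalogues what is evidently Puterman's monograph on Markov decision processes, the common thread running through the related items listed below. As context only, here is a minimal value-iteration sketch on a toy two-state, two-action MDP; all transition probabilities, rewards, and the discount factor are illustrative and not taken from this record:

```python
# Value iteration for a toy MDP (illustrative numbers only):
#   V_{k+1}(s) = max_a [ R(a,s) + gamma * sum_{s'} P(a,s,s') * V_k(s') ]

# P[a][s][s'] : transition probability to s' when taking action a in state s
P = [
    [[0.9, 0.1], [0.4, 0.6]],    # action 0
    [[0.2, 0.8], [0.05, 0.95]],  # action 1
]
R = [[1.0, 0.0], [0.0, 2.0]]     # R[a][s]: expected immediate reward
gamma = 0.9                      # discount factor

V = [0.0, 0.0]
for _ in range(10_000):
    V_new = [
        max(
            R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(2))
            for a in range(2)
        )
        for s in range(2)
    ]
    # stop once successive value functions agree to tight tolerance
    if max(abs(V_new[s] - V[s]) for s in range(2)) < 1e-10:
        V = V_new
        break
    V = V_new

# greedy policy with respect to the converged value function
policy = [
    max(
        range(2),
        key=lambda a: R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(2)),
    )
    for s in range(2)
]
```

With these particular numbers, action 1 turns out to be optimal in both states; the same loop works for any finite state and action sets once `P` and `R` are supplied.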



Related Items

Approximation metrics based on probabilistic bisimulations for general state-space Markov processes: a survey, An evidential approach to SLAM, path planning, and active exploration, Adaptive learning via selectionism and Bayesianism. II: The sequential case, Optimal control of a multiclass queueing system when customers can change types, Nonzero-sum constrained discrete-time Markov games: the case of unbounded costs, Convergence of controlled models and finite-state approximation for discounted continuous-time Markov decision processes with constraints, Approximate dynamic programming for stochastic linear control problems on compact state spaces, Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm, Synthesizing efficient systems in probabilistic environments, Multiscale Q-learning with linear function approximation, Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design, Discrete-time control for systems of interacting objects with unknown random disturbance distributions: a mean field approach, Modeling and optimization control of a demand-driven, conveyor-serviced production station, Optimal minimum bids and inventory scrapping in sequential, single-unit, Vickrey auctions with demand learning, A multi-step rolled forward chance-constrained model and a proactive dynamic approach for the wheat crop quality control problem, Circumventing the Slater conundrum in countably infinite linear programs, Response-adaptive designs for clinical trials: simultaneous learning from multiple patients, New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system, Modelling adherence behaviour for the treatment of obstructive sleep apnoea, Optimal policies of \(M(t)/M/c/c\) queues with two different levels of servers, Optimal policies for the berth allocation problem under stochastic 
nature, A multi-period ordering and clearance pricing model considering the competition between new and out-of-season products, Design and evaluation of norm-aware agents based on normative Markov decision processes, An application-oriented approach to dual control with excitation for closed-loop identification, Extreme state aggregation beyond Markov decision processes, A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces, Probabilistic inference for determining options in reinforcement learning, Decentralized stochastic control, Perspectives of approximate dynamic programming, Optimal control of queueing systems with non-collaborating servers, Online network design with outliers, Selecting malaria interventions: a top-down approach, Dynamic vehicle allocation control for automated material handling system in semiconductor manufacturing, Exploring the economic consequences of letting a supplier hold reserve storage, Multi-class, multi-resource advance scheduling with no-shows, cancellations and overbooking, Job control in heterogeneous computing systems, Constrained Markov decision processes with first passage criteria, Optimal assignment of servers to tasks when collaboration is inefficient, Q-learning and policy iteration algorithms for stochastic shortest path problems, (Approximate) iterated successive approximations algorithm for sequential decision processes, Adaptive aggregation for reinforcement learning in average reward Markov decision processes, Asymptotically optimal Bayesian sequential change detection and identification rules, Inventory replenishment control under supply uncertainty, A uniform framework for modeling nondeterministic, probabilistic, stochastic, or mixed processes and their behavioral equivalences, Compositional probabilistic verification through multi-objective model checking, A MDP approach to fault-tolerant routing, Safety verification for probabilistic hybrid systems, Determinacy 
and optimal strategies in infinite-state stochastic reachability games, Hybrid answer set programming, Robust adaptive dynamic programming for linear and nonlinear systems: an overview, Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model, Modular inverse reinforcement learning for visuomotor behavior, On real reward testing, Markov decision processes with state-dependent discount factors and unbounded rewards/costs, Algorithm portfolio selection as a bandit problem with unbounded losses, Stochastic relational processes: efficient inference and applications, Knows what it knows: a framework for self-aware learning, Finding optimal memoryless policies of POMDPs under the expected average reward criterion, Decision-theoretic planning with generalized first-order decision diagrams, R\&D pipeline management: task interdependencies and risk management, Finite horizon semi-Markov decision processes with application to maintenance systems, The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems, Dynamic lot-sizing in sequential online retail auctions, A reinforcement learning approach to call admission and call dropping control in links with variable capacity, New average optimality conditions for semi-Markov decision processes in Borel spaces, Monotone optimal control for a class of Markov decision processes, Performance optimization of queueing systems with perturbation realization, Optimal maintenance of systems with Markovian mission and deterioration, Optimizing a dynamic order-picking process, Three-valued abstraction for probabilistic systems, Average control of Markov decision processes with Feller transition probabilities and general action spaces, Discounted continuous-time Markov decision processes with unbounded rates and randomized history-dependent policies: the dynamic programming approach, An exact iterative search algorithm for 
constrained Markov decision processes, Value set iteration for Markov decision processes, Depth-based short-sighted stochastic shortest path problems, A tutorial on event-based optimization -- a new optimization framework, Event-based optimization of admission control in open queueing networks, Maximizing the set of recurrent states of an MDP subject to convex constraints, Regret bounds for restless Markov bandits, Maximizing entropy over Markov processes, Markov decision processes on Borel spaces with total cost and random horizon, Optimal call admission and call dropping control in links with variable capacity, Newton-based stochastic optimization using \(q\)-Gaussian smoothed functional algorithms, Profit maximization in flexible serial queueing networks, Markov limid processes for representing and solving renewal problems, Influence of temporal aggregation on strategic forest management under risk of wind damage, Improving active Mealy machine learning for protocol conformance testing, A counterexample on sample-path optimality in stable Markov decision chains with the average reward criterion, Towards a theory of game-based non-equilibrium control systems, Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation, Parameterized Markov decision process and its application to service rate control, Stochastic finite-state systems in control theory, Dynamic mechanism design with interdependent valuations, Planning and acting in partially observable stochastic domains, Optimal cost almost-sure reachability in POMDPs, A leader-follower partially observed, multiobjective Markov game, Value of information for a leader-follower partially observed Markov game, Optimal pricing for a \(\mathrm{GI}/\mathrm{M}/k/N\) queue with several customer types and holding costs, Efficient approximation of optimal control for continuous-time Markov games, A unified approach to time-aggregated Markov decision 
processes, Sequential variable sampling plan for normal distribution, Optimization of a large-scale water reservoir network by stochastic dynamic programming with efficient state space discretization, On the optimality equation for average cost Markov control processes with Feller transition probabilities, Event-based optimization approach for solving stochastic decision problems with probabilistic constraint, Tweaking the odds in probabilistic timed automata, Evaluation and prediction of an optimal control in a processor sharing queueing system with heterogeneous servers, Runtime monitors for Markov decision processes, Model-free reinforcement learning for branching Markov decision processes, SIR dynamics with vaccination in a large configuration model, Probabilistic planning with clear preferences on missing information, Practical solution techniques for first-order MDPs, Zero-sum stochastic games with average payoffs: new optimality conditions, Online stochastic reservation systems, Strategy optimization for controlled Markov process with descriptive complexity constraint, Stochastic constraint programming: A scenario-based approach, Zero-sum continuous-time Markov games with unbounded transition and discounted payoff rates, On ordinal comparison of policies in Markov reward processes, The dynamic shortest path problem with anticipation, A fuzzy approach to Markov decision processes with uncertain transition probabilities, Means-end relations and a measure of efficacy, Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue, Perfect information two-person zero-sum Markov games with imprecise transition probabilities, Keep or return? 
Managing ordering and return policies in start-up companies, Revenue management for a make-to-order company with limited inventory capacity, A formal mathematical framework for modeling probabilistic hybrid systems, A semimartingale characterization of average optimal stationary policies for Markov decision processes, Clinic scheduling models with overbooking for patients with heterogeneous no-show probabilities, Multi-objective optimization of water-using systems, Allocation of empty containers between multi-ports, Exact decomposition approaches for Markov decision processes: a survey, Interleaving solving and elicitation of constraint satisfaction problems based on expected cost, Risk-averse dynamic programming for Markov decision processes, The policy iteration algorithm for average continuous control of piecewise deterministic Markov processes, A variable neighborhood search based algorithm for finite-horizon Markov decision processes, Performance evaluation of direct heuristic dynamic programming using control-theoretic measures, Reducing reinforcement learning to KWIK online regression, An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes, On a multi-period supply chain system with supplementary order opportunity, Ranking policies in discrete Markov decision processes, Dynamic control of a single-server system with abandonments, Stochastic control via direct comparison, Time aggregated Markov decision processes via standard dynamic programming, Explicit solution of the average-cost optimality equation for a pest-control problem, Performance analysis for controlled semi-Markov systems with application to maintenance, Industry dynamics: foundations for models with an infinite number of firms, Completion-of-squares: revisited and extended, Using negotiable features for prescription problems, Specifying and computing preferred plans, The orienteering problem with stochastic travel and service times, A 
dynamic programming strategy to balance exploration and exploitation in the bandit problem, Optimization of heuristic search using recursive algorithm selection and reinforcement learning, Decentralized MDPs with sparse interactions, Discounted continuous-time constrained Markov decision processes in Polish spaces, Optimal resource allocation for multiqueue systems with a shared server pool, Approximation of Markov decision processes with general state space, Management of the risk of wind damage in forestry: a graph-based Markov decision process approach, Resource allocation in congested queueing systems with time-varying demand: an application to airport operations, Computing equilibria in discounted dynamic games, Analyzing anonymity attacks through noisy channels, Exact and approximate Nash equilibria in discounted Markov stopping games with terminal redemption, Integrating inventory control and capacity management at a maintenance service provider, Control-limit policies for a class of stopping time problems with termination restrictions, Value set iteration for two-person zero-sum Markov games, An exponential lower bound for Cunningham's rule, Continuous-time Markov decision processes with risk-sensitive finite-horizon cost criterion, Finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors, Quantitative model-checking of controlled discrete-time Markov processes, Policy iteration for robust nonstationary Markov decision processes, Pseudopolynomial iterative algorithm to solve total-payoff games and min-cost reachability games, Admission control in UMTS networks based on approximate dynamic programming, Performance optimization of semi-Markov decision processes with discounted-cost criteria, Finite approximation for finite-horizon continuous-time Markov decision processes, Meet your expectations with guarantees: beyond worst-case synthesis in quantitative games, Stochastic games with unbounded 
payoffs: applications to robust control in economics, Accuracy of fluid approximations to controlled birth-and-death processes: absorbing case, A policy iteration heuristic for constrained discounted controlled Markov chains, Semi-Markov control models with partially known holding times distribution: discounted and average criteria, Dynamic pricing and scheduling in a multi-class single-server queueing system, Dynamic resource allocation in a multi-product make-to-stock production system, Sampled fictitious play for approximate dynamic programming, General notions of indexability for queueing control and asset management, A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases, Teaching randomized learners with feedback, Approximate dynamic programming via direct search in the space of value function approximations, A tractable discrete fractional programming: application to constrained assortment optimization, Stochastic decomposition applied to large-scale hydro valleys management, Optimal and heuristic policies for assemble-to-order systems with different review periods, A stochastic dynamic programming approach for delay management of a single train line, M/G/\(1\) queue with event-dependent arrival rates, Heuristic procedures for a stochastic batch service problem, Program repair without regret, Policy gradient in Lipschitz Markov decision processes, Finite optimal control for time-bounded reachability in CTMDPs and continuous-time Markov games, Optimality, equilibrium, and curb sets in decision problems without commitment, On essential information in sequential decision processes, On mean reward variance in semi-Markov processes, On the optimality of a full-service policy for a queueing system with discounted costs, Solving factored MDPs using non-homogeneous partitions, A multigenerational game model to analyze sustainable development, Constraint solving in uncertain and 
dynamic environments: A survey, Continuous-time controlled Markov chains., Policy iteration type algorithms for recurrent state Markov decision processes, An empirical study of policy convergence in Markov decision process value iteration, Analysis of optimal and nearly optimal sequencing policies for a closed queueing network, Optimal patient and personnel scheduling policies for care-at-home service facilities, Optimizing contracted resource capacity with two advance cancelation modes, Twenty years of rewriting logic, Periodic capacity management under a lead-time performance constraint, The transformation method for continuous-time Markov decision processes, Cooperative Markov decision processes: time consistency, greedy players satisfaction, and cooperation maintenance, Probabilistic may/must testing: retaining probabilities by restricted schedulers, Optimal balanced control for call centers, Strong polynomiality of the Gass-Saaty shadow-vertex pivoting rule for controlled random walks, Sequential approaches for learning datum-wise sparse representations, Dynamic scheduling of a single-server two-class queue with constant retrial policy, Structural properties of the optimal resource allocation policy for single-queue systems, Markov decision processes in service facilities holding perishable inventory, Policy iteration for bounded-parameter POMDPs, Online stochastic optimization under time constraints, Hybrid least-squares algorithms for approximate policy evaluation, Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains, Semi-Markov control processes with unknown holding times distribution under an average cost criterion, Markov-achievable payoffs for finite-horizon decision models., Examining military medical evacuation dispatching policies utilizing a Markov decision process model of a controlled queueing system, Shape constraints in economics and operations research, On the link between infinite horizon control and 
quasi-stationary distributions, Multi-robot inverse reinforcement learning under occlusion with estimation of state transitions, A unified approach to Markov decision problems and performance sensitivity analysis, Verification and control for probabilistic hybrid automata with finite bisimulations, Markov decision processes with sequential sensor measurements, Optimal timing of airline promotions under dilution, The complexity of synchronizing Markov decision processes, \(L^\ast\)-based learning of Markov decision processes (extended version), Task-structured probabilistic I/O automata, Variance minimization of parameterized Markov decision processes, Discrete-time hybrid control in Borel spaces, Discrete-time hybrid control in Borel spaces: average cost optimality criterion, Optimal lateral transshipment policies for a two location inventory problem with multiple demand classes, Dynamic repositioning strategy in a bike-sharing system; how to prioritize and how to rebalance a bike station, On Bayesian index policies for sequential resource allocation, Optimization and customer utilities under dynamic lead time quotation in an \(M / M\) type base stock system, Approximate dynamic programming for missile defense interceptor fire control, Optimal control of branching diffusion processes: a finite horizon problem, Joint condition-based maintenance and inventory optimization for systems with multiple components, Economic design of memory-type control charts: the fallacy of the formula proposed by Lorenzen and Vance (1986), Piracy on the internet: accommodate it or fight it? 
A dynamic approach, Expiration dates and order quantities for perishables, Optimal control of a continuous-time \(W\)-configuration assemble-to-order system, Managing an integrated production and inventory system selling to a dual market: long-term and walk-in, On the scheduling of operations in a chat contact center, A multi-stage stochastic optimization model of a pastoral dairy farm, The jump start power method: a new approach for computing the ergodic projector of a finite Markov chain, Lagrangian relaxation and constraint generation for allocation and advanced scheduling, Illustrated review of convergence conditions of the value iteration algorithm and the rolling horizon procedure for average-cost MDPs, An analytic framework to develop policies for testing, prevention, and treatment of two-stage contagious diseases, Modelling and analysis of healthcare inventory management systems, Multi-policy improvement in stochastic optimization with forward recursive function criteria, Basic ideas for event-based optimization of Markov systems, A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains, Index policies for the maintenance of a collection of machines by a set of repairmen, Front-office multitasking between service encounters and back-office tasks, A discounted approach in communicating average Markov decision chains under risk-aversion, Policy-based branch-and-bound for infinite-horizon multi-model Markov decision processes, Dimensioning a queue with state-dependent arrival rates, Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming, Approximation of two-person zero-sum continuous-time Markov games with average payoff criterion, Inverse optimization in countably infinite linear programs, Improved bound on the worst case complexity of policy iteration, Optimal production and inventory rationing policies with selective-information sharing and two demand classes, Optimizing 
the interaction between residents and attending physicians, Condition-based inspection policies for boiler heat exchangers, Singularly perturbed linear programs and Markov decision processes, Robust analysis of discounted Markov decision processes with uncertain transition probabilities, Coupling based estimation approaches for the average reward performance potential in Markov chains, Long-term values in Markov decision processes, (co)algebraically, Inductive synthesis for probabilistic programs reaches new horizons, Multi-objective optimization of long-run average and total rewards, Markov decision processes with dynamic transition probabilities: an analysis of shooting strategies in basketball, Optimal stopping time on discounted semi-Markov processes, Importance sampling in reinforcement learning with an estimated behavior policy, Question selection for multi-attribute decision-aiding., Special subclass of generalized semi-Markov decision processes with discrete time, Age-based Markovian approximation of the G/M/1 queue, Ergodic inventory control with diffusion demand and general ordering costs, Efficient algorithms for risk-sensitive Markov decision processes with limited budget, Adaptive optimization and the harvest of biological populations, Enhancing probabilistic model checking with ontologies, Nash equilibria in a class of Markov stopping games with total reward criterion, Subgame maxmin strategies in zero-sum stochastic games with tolerance levels, Hidden Markov models: inverse filtering, belief estimation and privacy protection, Inverse reinforcement learning in contextual MDPs, Grounded action transformation for sim-to-real reinforcement learning, Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems, Reinforcement learning and stochastic optimisation, Contractive approximations in risk-sensitive average semi-Markov decision chains on a finite state space, Deep reinforcement learning 
for inventory control: a roadmap, Solving possibilistic games with incomplete information, Optimal policy for minimizing risk models in Markov decision processes, A class of dual fuzzy dynamic programs, Controlled Markov set-chains under average criteria, Hoeffding's inequality for uniformly ergodic Markov chains, Discounted stochastic games with voluntary transfers, An incremental off-policy search in a model-free Markov decision process using a single sample path, Replacement and inventory control for a multi-customer product service system with decreasing replacement costs, Reinforcement learning-based design of sampling policies under cost constraints in Markov random fields: application to weed map reconstruction, Axiomatising infinitary probabilistic weak bisimilarity of finite-state behaviours, Resource allocation and routing in parallel multi-server queues with abandonments for cloud profit maximization, Markov decision process measurement model, Scheduling in a multi-class series of queues with deterministic service times, On computing average cost optimal policies with application to routing to parallel queues, A risk-averse inventory model with Markovian purchasing costs, The value iteration algorithm is not strongly polynomial for discounted dynamic programming, Reformulation of the linear program for completely ergodic MDPs with average cost criteria, Average cost criterion induced by the regular utility function for continuous-time Markov decision processes, A pause control approach to the value iteration scheme in average Markov decision processes, A note on the convergence rate of the value iteration scheme in controlled Markov chains, Model-based average reward reinforcement learning, Maximizing the probability of pest extinction on a stochastic pest-predator model, Intervention to maximise the probability of epidemic fade-out, Mean-variance problems for finite horizon semi-Markov decision processes, A minimization problem of the risk probability 
in first passage semi-Markov decision processes with loss rates, Optimal control of a production-inventory system with product returns and two disposal options, Reduce shortage with self-reservation policy for a manufacturer paying both fixed and variable stockout expenditure, \textsc{ULTraS} at work: compositionality metaresults for bisimulation and trace semantics, Markov control models with unknown random state-action-dependent discount factors, Probabilistic timed automata with clock-dependent probabilities, A probability criterion for zero-sum stochastic games, The risk probability criterion for discounted continuous-time Markov decision processes, Optimal dispatching in a tandem queue, Variance-constrained actor-critic algorithms for discounted and average reward MDPs, Random search for constrained Markov decision processes with multi-policy improvement, Abstraction and approximate decision-theoretic planning., Continuous-time Markov decision processes under the risk-sensitive average cost criterion, Affect control processes: intelligent affective interaction using a partially observable Markov decision process, Real-time dynamic programming for Markov decision processes with imprecise probabilities, Dynamic quotation of leadtime and price for a make-to-order system with multiple customer classes and perfect information on customer preferences, Joint procurement and demand-side bidding strategies under price volatility, Verification and control of partially observable probabilistic systems, Approximate receding horizon approach for Markov decision processes: average reward case, Approachability in Stackelberg stochastic games with vector costs, Verifiable conditions for average optimality of continuous-time Markov decision processes, A risk minimization problem for finite horizon semi-Markov decision processes with loss rates, Verifiable conditions for the irreducibility and aperiodicity of Markov chains by analyzing underlying deterministic models, Optimal 
scheduling of multiple sensors over shared channels with packet transmission constraint, Analysis of averages over distributions of Markov processes, A multi-cluster time aggregation approach for Markov chains, Delay-aware online service scheduling in high-speed railway communication systems, Dynamic request routing for online video-on-demand service: a Markov decision process approach, A weakly monotonic backward induction algorithm on finite bounded subsets of vector lattices., Dynamic priority allocation via restless bandit marginal productivity indices, Continuous state dynamic programming via nonexpansive approximation, Lexicographic refinements in stationary possibilistic Markov decision processes, Robust topological policy iteration for infinite horizon bounded Markov decision processes, Relationship between least squares Monte Carlo and approximate linear programming, Learning agents in an artificial power exchange: Tacit collusion, market power and efficiency of two double-auction mechanisms, An index heuristic for transshipment decisions in multi-location inventory systems based on a pairwise decomposition, Saddle-point calculation for constrained finite Markov chains, Optimal integrated production and inventory control of an assemble-to-order system with multiple non-unitary demand classes, A note on negative dynamic programming for risk-sensitive control, An analysis of model-based interval estimation for Markov decision processes, Policy iteration for continuous-time average reward Markov decision processes in Polish spaces, Continuous-time Markov decision processes with \(n\)th-bias optimality criteria, Policy iteration for customer-average performance optimization of closed queueing systems, Conformant plans and beyond: principles and complexity, Asymptotically optimal parallel resource assignment with interference, Analyzing the dynamics of stigmergetic interactions through pheromone games, Online regret bounds for Markov decision processes with 
deterministic transitions, Timing of testing and treatment for asymptomatic diseases, Optimal control of a production-inventory system with customer impatience, Sensitivity analysis and optimal ultimately stationary deterministic policies in some constrained discounted cost models, Managing an integrated production inventory system with information on the production and demand status and multiple non-unitary demand classes, An innovative approach for strategic capacity portfolio planning under uncertainties, Average optimality for continuous-time Markov decision processes in Polish spaces, Characterizing extreme points as basic feasible solutions in infinite linear programs, Markov decision processes with exponentially representable discounting, Neighbourhood search for constructing Pareto sets, Transfer in variable-reward hierarchical reinforcement learning, Projected equation methods for approximate solution of large linear systems, Fast convergence to state-action frequency polytopes for MDPs, Advance demand information and a restricted production capacity: on the optimality of order base-stock policies, Simulation-based designs for multiperiod control, Utility-based on-line exploration for repeated navigation in an embedded graph, The value of information in a capacitated closed loop supply chain, Probabilistic weak simulation is decidable in polynomial time, Effects of system parameters on the optimal policy structure in a class of queueing control problems, Lower bounding aggregation and direct computation for an infinite horizon one-reservoir model, Conditional decision processes with recursive function, Single sample path-based optimization of Markov chains, Random walk, birth-and-death process and their fluid approximations: Absorbing case, Resource-constrained management of heterogeneous assets with stochastic deterioration, A tutorial on partially observable Markov decision processes, Theoretical tools for understanding and aiding dynamic decision 
making, Solutions of the average cost optimality equation for finite Markov decision chains: Risk-sensitive and risk-neutral criteria, Modeling secrecy and deception in a multiple-period attacker-defender signaling game, Stochastic dynamic programming with factored representations, Bounded-parameter Markov decision processes, Natural actor-critic algorithms, The value iteration method for countable state Markov decision processes, Generalizing Markov decision processes to imprecise probabilities, Notes on average Markov decision processes with a minimum-variance criterion, Synthesizing optimal bias in randomized self-stabilization, Optimal market thickness, A counter abstraction technique for verifying properties of probabilistic swarm systems, Analyzing generalized planning under nondeterminism, Variable demand and multi-commodity flow in Markovian network equilibrium, A new dissipativity condition for asymptotic stability of discounted economic MPC, Automatic verification of concurrent stochastic systems, A sojourn-based approach to semi-Markov reinforcement learning, Markov automata with multiple objectives, Sensitivity-based optimization for blockchain selfish mining, The platform design problem, Out of control: reducing probabilistic models by control-state elimination, Stochastic control of a class of dynamical systems via path limits, Stability-constrained Markov decision processes using MPC, Lipschitzness is all you need to tame off-policy generative adversarial imitation learning, Semi-Lipschitz functions and machine learning for discrete dynamical systems on graphs, Policy space identification in configurable environments, Computing transience bounds of emergency call centers: a hierarchical timed Petri net approach, Reinforcement learning with algorithms from probabilistic structure estimation, Costless delay in negotiations, Hybrid offline/online optimization for energy management via reinforcement learning, Convergence of deep fictitious play for 
stochastic differential games, Throughput maximization of complex resource allocation systems through timed-continuous-Petri-net modeling, Optimal energy-efficient policies for data centers through sensitivity-based optimization, Multi-machine preventive maintenance scheduling with imperfect interventions: a restless bandit approach, Efficient algorithms of pathwise dynamic programming for decision optimization in mining operations, On the probabilistic bisimulation spectrum with silent moves, Bias optimality of admission control in a non-stationary repairable queue, A novel decomposition-based method for solving general-product structure assemble-to-order systems, Improved value iteration for neural-network-based stochastic optimal control design, Strategies and trajectories of coral reef fish larvae optimizing self-recruitment, Reachability and safety objectives in Markov decision processes on long but finite horizons, Discounted approximations in risk-sensitive average Markov cost chains with finite state space, An arbitrary starting tracing procedure for computing subgame perfect equilibria, Dynamic optimization over infinite-time horizon: web-building strategy in an orb-weaving spider as a case study, Detection-averse optimal and receding-horizon control for Markov decision processes, Optimal strategies for managing complex authentication systems, First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function, Discrete-time control with non-constant discount factor, A conservative index heuristic for routing problems with multiple heterogeneous service facilities, The value of a draw, The role of information in system stability with partially observable servers, Stationary equilibria of mean field games with finite state and action space, Farsighted manipulation and exploitation in networks, Towards general axiomatizations for bisimilarity and trace semantics, Algorithms and conditional 
lower bounds for planning problems, A survey of inverse reinforcement learning: challenges, methods and progress, Counterfactual state explanations for reinforcement learning agents via generative deep learning, Managing mobile production-inventory systems influenced by a modulation process, Enhance load forecastability: optimize data sampling policy by reinforcing user behaviors, Joint condition-based maintenance and load-sharing optimization for two-unit systems with economic dependency, Symbolic algorithms for qualitative analysis of Markov decision processes with Büchi objectives, Unifying temporal and organizational scales in multiscale decision-making, Dynamic journeying under uncertainty, Integrated inventory management and supplier base reduction in a supply chain with multiple uncertainties, On the optimal control of manufacturing and remanufacturing activities with a single shared server, Average case analysis of the classical algorithm for Markov decision processes with Büchi objectives, Optimal policies and bounds for stochastic inventory systems with lost sales, A simple heuristic for load balancing in parallel processing networks with highly variable service time distributions, A theoretic and practical framework for scheduling in a stochastic environment, Optimal risk probability for first passage models in semi-Markov decision processes, Deciding probabilistic simulation between probabilistic pushdown automata and finite-state systems, Risk-sensitive average equilibria for discrete-time stochastic games, Optimal pricing for tandem queues with finite buffers, Dynamic exploitation of myopic best response, Optimality of admission control in a repairable queue, An approximate dynamic programming approach for comparing firing policies in a networked air defense environment, Partially observable game-theoretic agent programming in Golog, Convergence of Markov decision processes with constraints and state-action dependent discount factors, Admission 
control in a two-class loss system with periodically varying parameters and abandonments, Computing expected runtimes for constant probability programs, Admit or preserve? Addressing server failures in cloud computing task management, Admission control strategies for tandem Markovian loss systems, Controller exploitation-exploration reinforcement learning architecture for computing near-optimal policies, Quantitative static analysis of communication protocols using abstract Markov chains, A hybrid repair-replacement policy in the proportional hazards model, Semi-Markov decision processes with vector pay-offs, Approximation and mean field control of systems of large populations, Mean-semivariance optimality for continuous-time Markov decision processes, Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes, Late-rejection, a strategy to perform an overflow policy, An active-set strategy to solve Markov decision processes with good-deal risk measure, Dynamic matching with teams, Learning COVID-19 mitigation strategies using reinforcement learning, Dynamic bus dispatch policies, On the Whittle index of Markov modulated restless bandits, Operational optimization of wastewater treatment plants: a CMDP based decomposition approach, Optimal price-threshold control for battery operation with aging phenomenon: a quasiconvex optimization approach, Bounds for synchronizing Markov decision processes, Dynamic node packing, Economic MPC of Markov decision processes: dissipativity in undiscounted infinite-horizon optimal control, Optimal admission control in queues with abandonments, Comparison of algorithms for simple stochastic games, Essential stationary equilibria of mean field games with finite state and action space, Toward theoretical understandings of robust Markov decision processes: sample complexity and asymptotics, Batch policy learning in average reward Markov decision processes, Multi-objective dynamic programming 
with limited precision, Whittle index based Q-learning for restless bandits with average reward, Dynamic policies for resource reallocation in a robotic mobile fulfillment system with time-varying demand, Verification of multiplayer stochastic games via abstract dependency graphs, A dynamic programming model for effect of worker's type on wage arrears, Dynamic lead time quotation under responsive inventory and multiple customer classes, Meeting a deadline: shortest paths on stochastic directed acyclic graphs with information gathering, A partial history of the early development of continuous-time nonlinear stochastic systems theory, Outpatient appointment scheduling given individual day-dependent no-show predictions, Project planning with alternative technologies in uncertain environments, A Jackson network model and threshold policy for joint optimization of energy and delay in multi-hop wireless networks, Managing an assemble-to-order system with after sales market for components, A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning, Dynamic pricing of multiple home delivery options, Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains, Dynamic multiagent probabilistic inference, Interval iteration algorithm for MDPs and IMDPs, Central limit theorem for the estimator of the value of an optimal stopping problem, Comparative branching-time semantics for Markov chains, Dynamic routing to heterogeneous collections of unreliable servers, Reinforcement learning based algorithms for average cost Markov decision processes, Restricted gradient-descent algorithm for value-function approximation in reinforcement learning, Sequential Monte Carlo in reachability heuristics for probabilistic planning, Reachability analysis of uncertain systems using bounded-parameter Markov decision processes, Accelerated modified policy iteration algorithms for Markov decision 
processes, The stochastic shortest-path problem for Markov chains with infinite state space with applications to nearest-neighbor lattice chains, Optimal threshold control of a retrial queueing system with finite buffer, Sporadic overtaking optimality in Markov decision problems, An elective surgery scheduling problem considering patient priority, Complexity bounds for approximately solving discounted MDPs by value iterations, Remote state estimation with usage-dependent Markovian packet losses, The complexity of reachability in parametric Markov decision processes, Geometric backtracking for combined task and motion planning in robotic systems, Policy iterations for reinforcement learning problems in continuous time and space -- fundamental theory and methods, Symblicit algorithms for mean-payoff and shortest path in monotonic Markov decision processes, Supervisor synthesis of POMDP via automata learning, Bounding fixed points of set-based Bellman operator and Nash equilibria of stochastic games, Discounted approximations to the risk-sensitive average cost in finite Markov chains, Error bounds for stochastic shortest path problems, Optimization of Markov decision processes under the variance criterion, An approximate dynamic programming approach to the admission control of elective patients, On the optimality of a maintenance queueing system, Scheduling Markovian PERT networks to maximize the net present value: new results, Recursive stochastic games with positive rewards, Learning efficient logic programs, Optimization of a special case of continuous-time Markov decision processes with compact action set, First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs, Dynamic programming and minimum risk paths, Finding the \(K\) best policies in a finite-horizon Markov decision process, Policy iteration based feedback control, SLAP: specification logic of actions with probability, PageRank optimization by edge selection, 
Strong polynomiality of policy iterations for average-cost MDPs modeling replacement and maintenance problems, Optimal allocation policy for a multi-location inventory system with a quick response warehouse, Optimal admission control for tandem loss systems with two stations, State observation accuracy and finite-memory policy performance, Real-reward testing for probabilistic processes, Optimal schedulers vs optimal bases: an approach for efficient exact solving of Markov decision processes, Fuzzy optimality relation for perceptive MDPs-the average case, Optimal empty vehicle repositioning and fleet-sizing for two-depot service systems, A multi-period TSP with stochastic regular and urgent demands, A policy improvement method for constrained average Markov decision processes, Converging marriage in honey-bees optimization and application to stochastic dynamic programming, A structured pattern matrix algorithm for multichain Markov decision processes, Markov control processes with randomized discounted cost, Reading policies for joins: an asymptotic analysis, NP-hardness of checking the unichain condition in average cost MDPs, Adaptive stepsize selection for tracking in a regime-switching environment, Stability and optimality of a multi-product production and storage system under demand uncertainty, A class of algorithms for collision resolution with multiplicity estimation, Structural results on a batch acceptance problem for capacitated queues, Constrained continuous-time Markov decision processes with average criteria, A tutorial on the cross-entropy method, Basis function adaptation in temporal difference reinforcement learning, Monotonic robust optimal control policies for the time-quality trade-offs in concurrent new product development (NPD), On the optimal control of a two-queue polling model, Bisimulation and cocongruence for probabilistic systems, Using adaptive learning in credit scoring to estimate take-up probability distribution, Mum, why do you keep 
on growing? Impacts of environmental variability on optimal growth and reproduction allocation strategies of annual plants, Time consistent dynamic risk measures, An actor-critic algorithm for constrained Markov decision processes, Approximating infinite horizon stochastic optimal control in discrete time with constraints, Dynamic risk measures under model uncertainty, Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm, Service rate control of closed Jackson networks from game theoretic perspective, Sensitivity of optimal prices to system parameters in a steady-state service facility, Credibilistic Markov decision processes: The average case, Automatic EEG classification: a path to smart and connected sleep interventions, Approximated timed reachability graphs for the robust control of discrete event systems, Dynamic load balancing in parallel queueing systems: stability and optimal control, Hierarchical testing designs for pattern recognition, Stratified breast cancer follow-up using a continuous state partially observable Markov decision process, Perishable inventory management and dynamic pricing using RFID technology, Stochastic approximations of constrained discounted Markov decision processes, Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes, Integral control for population management, Incompleteness of results for the slow-server problem with an unreliable fast server, Semi-Markov decision processes with variance minimization criterion, Performance optimization for a class of generalized stochastic Petri nets, Continuity of the optimal average cost in Markov decision chains with small risk-sensitivity, Transshipment and rebalancing policies for library books, Accelerating the convergence of value iteration by using partial transition functions, A semi-Markov decision problem for proactive and reactive transshipments between 
multiple warehouses, On probabilistic snap-stabilization, Markov Decision Processes with Asymptotic Average Failure Rate Constraint, OPTIMALITY OF TRUNK RESERVATION FOR AN M/M/K/N QUEUE WITH SEVERAL CUSTOMER TYPES AND HOLDING COSTS, Index policies for the routing of background jobs, Approximate policy iteration: a survey and some new methods, A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications, Approximating Labelled Markov Processes Again!, Dynamic production control in a serial line with process queue time constraint, Synthesizing Efficient Controllers, The Expected Total Cost Criterion for Markov Decision Processes under Constraints: A Convex Analytic Approach, The Odds of Staying on Budget, Understanding Probabilistic Programs, Bayesian estimation of the mean holding time in average semi-Markov control processes, Model Checking Probabilistic Systems, Algorithms for Optimal Control of Stochastic Switching Systems, Concavely-Priced Probabilistic Timed Automata, Optimal Allocation Problem in the Machine Repairman System with Heterogeneous Servers, Minimum Average Value-at-Risk for Finite Horizon Semi-Markov Decision Processes in Continuous Time, Admission Control Policies in a Finite Capacity Geo/Geo/1 Queue Under Partial State Observations, Optimal Control of Finite-Valued Networks, A Tutorial on Interactive Markov Chains, A Theory for the Semantics of Stochastic and Non-deterministic Continuous Systems, Markov Reward Models and Markov Decision Processes in Discrete and Continuous Time: Performance Evaluation and Optimization, Dynamic decision making: a comparison of approaches, Multilevel Simulation Based Policy Iteration for Optimal Stopping--Convergence and Complexity, Constrained Markov decision processes with uncertain costs, A dynamic ``predict, then optimize'' preventive maintenance approach using operational intervention
data, Specification and optimal reactive synthesis of run-time enforcement shields, Value iteration for simple stochastic games: stopping criterion and learning algorithm, Red light green light method for solving large Markov chains, Sample-Path Optimal Stationary Policies in Stable Markov Decision Chains with the Average Reward Criterion, First Passage Optimality and Variance Minimisation of Markov Decision Processes with Varying Discount Factors, Asymptotic optimality and rates of convergence of quantized stationary policies in continuous-time Markov decision processes, Bisimulations for non-deterministic labelled Markov processes, MAXIMIZING THE THROUGHPUT OF TANDEM LINES WITH FLEXIBLE FAILURE-PRONE SERVERS AND FINITE BUFFERS, Quantitative controller synthesis for consumption Markov decision processes, Solving an Infinite-Horizon Discounted Markov Decision Process by DC Programming and DCA, Computing Game Values for Crash Games, Deciding Simulations on Probabilistic Automata, Policy mirror descent for reinforcement learning: linear convergence, new sampling complexity, and generalized problem classes, Probabilistic CEGAR, An average-value-at-risk criterion for Markov decision processes with unbounded costs, Bias and Overtaking Optimality for Continuous-Time Jump Markov Decision Processes in Polish Spaces, Controller Synthesis and Verification for Markov Decision Processes with Qualitative Branching Time Objectives, On optimality gaps for fuzzification in finite Markov decision processes, A Cost-Based Model and Algorithms for Interleaving Solving and Elicitation of CSPs, An axiomatic approach to Markov decision processes, Online Regret Bounds for Markov Decision Processes with Deterministic Transitions, STRONG AVERAGE OPTIMALITY FOR CONTROLLED NONHOMOGENEOUS MARKOV CHAINS*, Simultaneous determination of production and maintenance schedules using in‐line equipment condition and yield information, Optimal empty vehicle redistribution for hub‐and‐spoke
transportation systems, Deciding Fast Termination for Probabilistic VASS with Nondeterminism, Linear Temporal Logic Satisfaction in Adversarial Environments Using Secure Control Barrier Certificates, Solving for Best Responses and Equilibria in Extensive-Form Games with Reinforcement Learning Methods, Improving Strategies via SMT Solving, Optimal Translation of LTL to Limit Deterministic Automata, Long-Run Rewards for Markov Automata, Maximizing the Conditional Expected Reward for Reaching the Goal, On Generalized Bellman Equations and Temporal-Difference Learning, An Overview for Markov Decision Processes in Queues and Networks, A Subexponential Lower Bound for Zadeh’s Pivoting Rule for Solving Linear Programs and Games, Approximating the Termination Value of One-Counter MDPs and Stochastic Games, SOJOURN TIMES IN NON-HOMOGENEOUS QBD PROCESSES WITH PROCESSOR SHARING, Stopped decision processes in conjunction with general utility, An application of yield management for Internet Service Providers, Linear Programming and Zero-Sum Two-Person Undiscounted Semi-Markov Games, Control of parallel non-observable queues: asymptotic equivalence and optimality of periodic policies, Phase Transitions for Controlled Markov Chains on Infinite Graphs, Shrinking-horizon dynamic programming, Multiobjective Stopping Problem for Discrete-Time Markov Processes: Convex Analytic Approach, Successive approximations in partially observable controlled Markov chains with risk-sensitive average criterion, A finite exact algorithm to solve a dice game, New discount and average optimality conditions for continuous-time Markov decision processes, An Inductive Technique for Parameterised Model Checking of Degenerative Distributed Randomised Protocols, Robust Equilibria in
Mean-Payoff Games, Distributed Synthesis in Continuous Time, Mitigating Information Asymmetry in Liver Allocation, On Memoryless Quantitative Objectives, The Complexity of Nash Equilibria in Limit-Average Games, Sequential selection of a monotone subsequence from a random permutation, Probabilistic Time Petri Nets, Q(λ) with Off-Policy Corrections, OPTIMAL MIXING OF MARKOV DECISION RULES FOR MDP CONTROL, Optimal investment and consumption with stochastic dividends, Sensitivity Analysis in Markov Decision Processes with Uncertain Reward Parameters, Average optimality for Markov decision processes in Borel spaces: a new condition and approach, On the value function of the M/Cox(r)/1 queue, Influence of modeling structure in probabilistic sequential decision problems, Robust shortest path planning and semicontractive dynamic programming, Optimal Policies for Reducing Unnecessary Follow-Up Mammography Exams in Breast Cancer Diagnosis, Modeling and analysis of uncertain time-critical tasking problems, The Expected Total Cost Criterion for Markov Decision Processes under Constraints, Control of continuous-time Markov chains with safety constraints, Stable Optimal Control and Semicontractive Dynamic Programming, Play to Test, Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms, Good-for-MDPs Automata for Probabilistic Analysis and Reinforcement Learning, Simple Strategies in Multi-Objective MDPs, A Convex Programming Approach to Solve Posynomial Systems, Deep Statistical Model Checking, Bellman's principle of optimality and deep reinforcement learning for time-varying tasks, Undiscounted control policy generation for continuous-valued optimal control by approximate dynamic programming, Non-ergodic Markov decision processes with a constraint on the asymptotic failure rate: general class of policies, Temporal logics for the specification of performance and reliability, Linear waste of best fit bin packing on skewed distributions,
Temporal concatenation for Markov decision processes, OPTIMAL ADMISSION AND ROUTING WITH CONGESTION-SENSITIVE CUSTOMER CLASSES, Stochastic DP Based on Trained Database for Sub-optimal Energy Management of Hybrid Electric Vehicles, Conditions for indexability of restless bandits and an algorithm to compute Whittle index, Dynamic Programs with Shared Resources and Signals: Dynamic Fluid Policies and Asymptotic Optimality, Characterization and simplification of optimal strategies in positive stochastic games, On the optimal allocation of service to impatient tasks, Deterministic policies based on maximum regrets in MDPs with imprecise rewards, Approximation solution and suboptimality for discounted semi-Markov decision problems with countable state space, ONLINE CAPACITY PLANNING FOR REHABILITATION TREATMENTS: AN APPROXIMATE DYNAMIC PROGRAMMING APPROACH, Model Checking Linear-Time Properties of Probabilistic Systems, Dynamic pricing and replenishment: Optimality, bounds, and asymptotics, Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning, What is the value of the cross-sectional approach to deep reinforcement learning?, Heuristic Solution for the Optimal Thresholds in a Controllable Multi-server Heterogeneous Queueing System Without Preemption, Robust Markov Decision Processes with Data-Driven, Distance-Based Ambiguity Sets, Simple and Optimal Methods for Stochastic Variational Inequalities, II: Markovian Noise and Policy Evaluation in Reinforcement Learning, Incentive Stackelberg Mean-Payoff Games, On Linear Programming for Constrained and Unconstrained Average-Cost Markov Decision Processes with Countable Action Spaces and Strictly Unbounded Costs, Ordinary Differential Equation Methods for Markov Decision Processes and Application to Kullback--Leibler Control Cost, Managing Patient Admissions in a Neurology Ward, The Joint Stock and Capacity Rationings of a Make-To-Stock System
with Flexible Demand, Learning the distribution with largest mean: two bandit frameworks, Regularized Decomposition of High-Dimensional Multistage Stochastic Programs with Markov Uncertainty, The Vanishing Discount Approach in a class of Zero-Sum Finite Games with Risk-Sensitive Average Criterion, Optimizing Long‐term Hydro‐power Production Using Markov Decision Processes, Applications of variable discounting dynamic programming to iterated function systems and related problems, Average-Price and Reachability-Price Games on Hybrid Automata with Strong Resets, TWO-CLASS ROUTING WITH ADMISSION CONTROL AND STRICT PRIORITIES, Risk-sensitive average continuous-time Markov decision processes with unbounded rates, Quantitative Automata under Probabilistic Semantics, Reachability Switching Games, Strategic capacity decision-making in a stochastic manufacturing environment using real-time approximate dynamic programming, An approximate dynamic programming approach to the development of heuristics for the scheduling of impatient jobs in a clearing system, Optimal control of a production-inventory system with both backorders and lost sales, APPROXIMATE DYNAMIC PROGRAMMING TECHNIQUES FOR SKILL-BASED ROUTING IN CALL CENTERS, Optimal Drift Rate Control and Impulse Control for a Stochastic Inventory/Production System, Dynamic coordination of production planning and sales admission control in the presence of a spot market, Monotone Policies and Indexability for Bidirectional Restless Bandits, Multigrid methods for two‐player zero‐sum stochastic games, THE SEQUENTIAL STOCHASTIC ASSIGNMENT PROBLEM WITH POSTPONEMENT OPTIONS, A hierarchical decomposition of decision process Petri nets for modeling complex systems, Asymptotic Normality of Discrete-Time Markov Control Processes, Anticipation in Dynamic Vehicle Routing, A Minimum-Cost Strategy for Cluster Recruitment, Optimal
Control of a Two-Server Heterogeneous Queueing System with Breakdowns and Constant Retrials, DYNAMIC ROUTING POLICIES FOR MULTISKILL CALL CENTERS, Symbolic Verification of Communicating Systems with Probabilistic Message Losses: Liveness and Fairness, Grid Brokering for Batch Allocation Using Indexes, Opportunistic Transmission over Randomly Varying Channels, Delayed Nondeterminism in Continuous-Time Markov Decision Processes, Zero-sum games for continuous-time Markov chains with unbounded transition and average payoff rates, Recent Developments in Algorithmic Teaching, Probabilistic Verification of Uncertain Systems Using Bounded-Parameter Markov Decision Processes, Design and control of agile automated CONWIP production lines, Designing and pricing menus of extended warranty contracts, What you should know about approximate dynamic programming, Sequential hypothesis tests under random horizon, Value and Policy Function Approximations in Infinite-Horizon Optimization Problems, Performance Guarantees and Optimal Purification Decisions for Engineered Proteins, Dynamic Distribution of Patients to Medical Facilities in the Aftermath of a Disaster, Weighted versus Probabilistic Logics, Challenges in Enterprise Wide Optimization for the Process Industries, On the structure of value functions for threshold policies in queueing models, Decision Problems for Nash Equilibria in Stochastic Games, OPTIMAL PRICING AND PRODUCTION POLICIES OF A MAKE-TO-STOCK SYSTEM WITH FLUCTUATING DEMAND, Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies, Quantitative Analysis under Fairness Constraints, Stochastic Games for Verification of Probabilistic Timed Automata, Stochastic Control Liaisons: Richard Sinkhorn Meets Gaspard Monge on a Schrödinger Bridge, Optimality of admission control in an M/M/1/N queue with varying services, Switching diffusion approximations for optimal power management in parallel processing systems, Points Gained in Football:
Using Markov Process-Based Value Functions to Assess Team Performance, “Controlled” Versions of the Collatz–Wielandt and Donsker–Varadhan Formulae, Discrete-time constrained stochastic games with the expected average payoff criteria, On the correctness of monadic backward induction, Task-Aware Verifiable RNN-Based Policies for Partially Observable Markov Decision Processes, Continuous-time zero-sum games for Markov decision processes with discounted risk-sensitive cost criterion on a general state space, Robust Control for Dynamical Systems with Non-Gaussian Noise via Formal Abstractions, Risk-Sensitive Average Optimality for Discrete-Time Markov Decision Processes, Strategy Complexity of Point Payoff, Mean Payoff and Total Payoff Objectives in Countable MDPs, Accelerated and Instance-Optimal Policy Evaluation with Linear Function Approximation, APPROXIMATE DYNAMIC PROGRAMMING TECHNIQUES FOR THE CONTROL OF TIME-VARYING QUEUING SYSTEMS APPLIED TO CALL CENTERS WITH ABANDONMENTS AND RETRIALS, Outsourcing warranty repairs: Dynamic allocation, Dynamic Learning and Decision Making via Basis Weight Vectors, Regular Policies in Abstract Dynamic Programming, COORDINATED PRICING AND INVENTORY CONTROL WITH BATCH PRODUCTION AND ERLANG LEADTIMES, Risk-Sensitive Reinforcement Learning via Policy Gradient Search, On the time discretization of stochastic optimal control problems: The dynamic programming approach, Instantaneous Control of Brownian Motion with a Positive Lead Time, Uniform Turnpike Theorems for Finite Markov Decision Processes, Bridging Commonsense Reasoning and Probabilistic Planning via a Probabilistic Action Language, OPTIMAL SWITCHING ON AND OFF THE ENTIRE SERVICE CAPACITY OF A PARALLEL QUEUE, THE BIAS OPTIMAL K IN THE M/M/1/K QUEUE: AN APPLICATION OF THE DEVIATION MATRIX, Delay-Based Service Differentiation with Many Servers and Time-Varying Arrival Rates, Periodical Multistage Stochastic Programs, A
Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits, SOLUTIONS AND DIAGNOSTICS OF SWITCHING PROBLEMS WITH LINEAR STATE DYNAMICS, Approximations to Stochastic Dynamic Programs via Information Relaxation Duality, Analysis of Markov Influence Graphs, Optimal Signaling Mechanisms in Unobservable Queues, Dynamic Electricity Pricing to Smart Homes, Easy Affine Markov Decision Processes, Dynamic distributed clustering in wireless sensor networks via Voronoi tessellation control, Risk-Sensitive Reinforcement Learning, Percentile queries in multi-dimensional Markov decision processes, Machine maintenance with workload considerations, Nonzero-sum games for continuous-time Markov chains with unbounded discounted payoffs, Production-inventory systems with imperfect advance demand information and updating, Anticipation of goals in automated planning, Some indexable families of restless bandit problems, An iterative method for multiple stopping: convergence and stability, Combinations of Qualitative Winning for Stochastic Parity Games, Stochastic Games, A two-level hierarchical Markov decision model with considering interaction between levels, First Passage Exponential Optimality Problem for Semi-Markov Decision Processes, Controlled Random Walk: Conjecture and Counter-Example, On Finite Approximations to Markov Decision Processes with Recursive and Nonlinear Discounting, Quadratic approximate dynamic programming for input‐affine systems, Energy-Utility Analysis for Resilient Systems Using Probabilistic Model Checking, On Markov policies for minimax decision processes, A CAPACITATED REPLENISHMENT-LIQUIDATION MODEL UNDER CONTRACTUAL AND SPOT MARKETS WITH STOCHASTIC DEMAND, Markov decision processes associated with two threshold probability criteria, Probabilistic Model Checking of Labelled Markov Processes via Finite Approximate Bisimulations, Probabilistic Model Checking for Energy-Utility Analysis, Bisimulation for Markov Decision Processes 
through Families of Functional Expressions, Policy Gradient Approach of Event‐Based Optimization and Its Online Implementation, Zero-sum games for continuous-time jump Markov processes in Polish spaces: discounted payoffs, Inventory models with minimal service level constraints, The Robot Routing Problem for Collecting Aggregate Stochastic Rewards, On the total reward variance for continuous-time Markov reward chains, Congestion-dependent pricing in a stochastic service system, Optimal Distributed Uplink Channel Allocation: A Constrained MDP Formulation, Context adaptation of the communication stack, Sensing as a Complexity Measure, On Decision Problems for Probabilistic Büchi Automata, Markov Decision Processes with Multiple Long-Run Average Objectives, A linear programming based approach for composite-action Markov decision processes, Vanishing discount approximations in controlled Markov chains with risk-sensitive average criterion, Risk-sensitive semi-Markov decision processes with general utilities and multiple criteria, A Sensitivity‐Based Construction Approach to Variance Minimization of Markov Decision Processes, On the Minimum Pair Approach for Average Cost Markov Decision Processes with Countable Discrete Action Spaces and Strictly Unbounded Costs, Dynamic admission control for loss systems with batch arrivals, Spinning plates and squad systems: policies for bi-directional restless bandits, Nonstationary value iteration in controlled Markov chains with risk-sensitive average criterion, The Effect of New Links on Google Pagerank, First passage risk probability optimality for continuous time Markov decision processes, Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs, Optimal control and performance analysis of an M^X/M/1 queue with batches of negative customers, Infinite Horizon Average Cost Dynamic Programming Subject to Total Variation
Distance Ambiguity, On Lifetime Optimization of Boolean Parallel Systems with Erlang Repair Distributions, On zero-sum two-person undiscounted semi-Markov games with a multichain structure, Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting, Numerical solution for the performance characteristics of the M/M/C/K retrial queue with negative customers and exponential abandonments by using value extrapolation method, DYNAMIC CONTROL OF A SINGLE-SERVER SYSTEM WHEN JOBS CHANGE STATUS, On the Existence of Optimal Policies for a Class of Static and Sequential Dynamic Teams, Measuring and Synthesizing Systems in Probabilistic Environments, On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes, Average optimality for continuous-time Markov decision processes under weak continuity conditions, Approximations of Countably Infinite Linear Programs over Bounded Measure Spaces, A Sufficient Statistic for Influence in Structured Multiagent Environments, Constrained Multiagent Markov Decision Processes: a Taxonomy of Problems and Algorithms, On the First Passage $g$-Mean-Variance Optimality for Discounted Continuous-Time Markov Decision Processes, Off-Policy Estimation of Long-Term Average Outcomes With Applications to Mobile Health, YMCA, Optimal Consumption and Investment with Fixed and Proportional Transaction Costs, Approximation algorithms for stochastic online matching with reusable resources, Contractive approximations in average Markov decision chains driven by a risk-seeking controller, Opaque selling of multiple substitutable products with finite inventories, Access control method for EV charging stations based on state aggregation and Q-learning, Admission control of hospitalization with patient gender by using Markov decision process, Optimal admission control under premature discharge decisions for operational effectiveness, Now decision theory, Balanced Q-learning: 
combining the influence of optimistic and pessimistic targets, Admission and pricing optimization of on-street parking with delivery bays, Kullback–Leibler-Quadratic Optimal Control, Embedding active learning in batch-to-batch optimization using reinforcement learning, Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework, Dynamic policy for idling time preservation, Average criteria in denumerable semi-Markov decision chains under risk-aversion, Zero-sum stochastic games with the average-value-at-risk criterion, Revenue maximization in two‐station tandem queueing systems, Markov decision processes with burstiness constraints, A reinforcement learning approach to distribution-free capacity allocation for sea cargo revenue management, The policy graph decomposition of multistage stochastic programming problems, GUBS criterion: arbitrary trade-offs between cost and probability-to-goal in stochastic planning based on expected utility theory, Inventory control with modulated demand and a partially observed modulation process, An automated quantitative information flow analysis for concurrent programs, An impossibility result in automata-theoretic reinforcement learning, Verifying Probabilistic Timed Automata Against Omega-Regular Dense-Time Properties, Model Checking for Safe Navigation Among Humans, Future memories are not needed for large classes of POMDPs, Dynamic assignment of a multi-skilled workforce in job shops: an approximate dynamic programming approach, Modelling the influence of returns for an omni-channel retailer, Testing indexability and computing Whittle and Gittins index in subcubic time, Managing production-inventory-maintenance systems with condition monitoring, Parameter synthesis in Markov models: a gentle survey, A reinforcement learning approach to the stochastic cutting stock problem, Dual Ascent and Primal-Dual Algorithms for Infinite-Horizon Nonstationary Markov Decision Processes, An intelligent intervention 
strategy for patients to prevent chronic complications based on reinforcement learning, On the sample complexity of actor-critic method for reinforcement learning with function approximation, Continuous Positional Payoffs, Reward Maximization Through Discrete Active Inference, A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets, A model for data transmission and its optimization, Perpetual American options with asset-dependent discounting, On probability-raising causality in Markov decision processes, A note on generalized second-order value iteration in Markov decision processes, Markov decision processes under risk sensitivity: a discount vanishing approach, A single server retrial queue with event-dependent arrival rates, High frequency market making: the role of speed, Foundations of probability-raising causality in Markov decision processes, On solutions of the distributional Bellman equation, Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons, Optimal deterministic controller synthesis from steady-state distributions, Metalearning of time series: an approximate dynamic programming approach, Turnpikes in Finite Markov Decision Processes and Random Walk, Alternating good-for-MDPs automata, Optimal repair for omega-regular properties, Learning key steps to attack deep reinforcement learning agents, A Stochastic Composite Augmented Lagrangian Method for Reinforcement Learning, Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence, A note on the existence of optimal stationary policies for average Markov decision processes with countable states, Exponential asymptotic optimality of Whittle index policy, Optimal decision-making of mutual fund temporary borrowing problem via approximate dynamic programming, The optimal dynamic rationing policy in the stock-rationing queue, Modeling and control of data transmission, 
Stochastic Fixed-Point Iterations for Nonexpansive Maps: Convergence and Error Bounds, Projected state-action balancing weights for offline reinforcement learning, Primal-Dual Regression Approach for Markov Decision Processes with General State and Action Spaces, Simulation-based search, Trajectory modeling via random utility inverse reinforcement learning, Characterization of the optimal average cost in Markov decision chains driven by a risk-seeking controller, Mixed nondeterministic-probabilistic automata: blending graphical probabilistic models with nondeterminism, Conditioning in probabilistic programming, The actor-critic algorithm as multi-time-scale stochastic approximation, Algorithms for optimization and stabilization of controlled Markov chains, Stochastic approximation algorithms: overview and recent trends, On decision-theoretic foundations for defaults, A sensitivity formula for risk-sensitive cost and the actor-critic algorithm, On-line policy gradient estimation with multi-step sampling, Admission control for a multi-server queue with abandonment, Asymptotically optimal control of parallel tandem queues with loss, New sufficient conditions for average optimality in continuous-time Markov decision processes