Publication:4315289

From MaRDI portal


zbMath 0829.90134, MaRDI QID Q4315289

Martin L. Puterman

Publication date: 6 December 1994



90C40: Markov and semi-Markov decision processes

90-02: Research exposition (monographs, survey articles) pertaining to operations research and mathematical programming


Related Items

Planning and acting in partially observable stochastic domains, A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases, On essential information in sequential decision processes, On mean reward variance in semi-Markov processes, On the optimality of a full-service policy for a queueing system with discounted costs, Solving factored MDPs using non-homogeneous partitions, A multigenerational game model to analyze sustainable development, Constraint solving in uncertain and dynamic environments: A survey, Sequential variable sampling plan for normal distribution, Optimization of a large-scale water reservoir network by stochastic dynamic programming with efficient state space discretization, On the optimality equation for average cost Markov control processes with Feller transition probabilities, Probabilistic planning with clear preferences on missing information, Practical solution techniques for first-order MDPs, Zero-sum stochastic games with average payoffs: new optimality conditions, Online stochastic reservation systems, Strategy optimization for controlled Markov process with descriptive complexity constraint, Stochastic constraint programming: A scenario-based approach, Zero-sum continuous-time Markov games with unbounded transition and discounted payoff rates, On ordinal comparison of policies in Markov reward processes, The dynamic shortest path problem with anticipation, A fuzzy approach to Markov decision processes with uncertain transition probabilities, Means-end relations and a measure of efficacy, Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue, Perfect information two-person zero-sum Markov games with imprecise transition probabilities, Keep or return? 
Managing ordering and return policies in start-up companies, Revenue management for a make-to-order company with limited inventory capacity, A formal mathematical framework for modeling probabilistic hybrid systems, A semimartingale characterization of average optimal stationary policies for Markov decision processes, Multi-objective optimization of water-using systems, Allocation of empty containers between multi-ports, Dynamic priority allocation via restless bandit marginal productivity indices, Continuous state dynamic programming via nonexpansive approximation, Learning agents in an artificial power exchange: Tacit collusion, market power and efficiency of two double-auction mechanisms, An index heuristic for transshipment decisions in multi-location inventory systems based on a pairwise decomposition, Saddle-point calculation for constrained finite Markov chains, Optimal integrated production and inventory control of an assemble-to-order system with multiple non-unitary demand classes, A note on negative dynamic programming for risk-sensitive control, An analysis of model-based interval estimation for Markov decision processes, Policy iteration for continuous-time average reward Markov decision processes in Polish spaces, Continuous-time Markov decision processes with \(n\)th-bias optimality criteria, Policy iteration for customer-average performance optimization of closed queueing systems, Conformant plans and beyond: principles and complexity, Asymptotically optimal parallel resource assignment with interference, Analyzing the dynamics of stigmergetic interactions through pheromone games, Average optimality for continuous-time Markov decision processes in Polish spaces, Characterizing extreme points as basic feasible solutions in infinite linear programs, Markov decision processes with exponentially representable discounting, Neighbourhood search for constructing Pareto sets, Transfer in variable-reward hierarchical reinforcement learning, Projected 
equation methods for approximate solution of large linear systems, Fast convergence to state-action frequency polytopes for MDPs, Advance demand information and a restricted production capacity: on the optimality of order base-stock policies, Simulation-based designs for multiperiod control, The value of information in a capacitated closed loop supply chain, Probabilistic weak simulation is decidable in polynomial time, Effects of system parameters on the optimal policy structure in a class of queueing control problems, Random walk, birth-and-death process and their fluid approximations: Absorbing case, Resource-constrained management of heterogeneous assets with stochastic deterioration, A tutorial on partially observable Markov decision processes, Theoretical tools for understanding and aiding dynamic decision making, Solutions of the average cost optimality equation for finite Markov decision chains: Risk-sensitive and risk-neutral criteria, Modeling secrecy and deception in a multiple-period attacker-defender signaling game, Natural actor-critic algorithms, A pause control approach to the value iteration scheme in average Markov decision processes, A note on the convergence rate of the value iteration scheme in controlled Markov chains, Model-based average reward reinforcement learning, Maximizing the probability of pest extinction on a stochastic pest-predator model, Utility-based on-line exploration for repeated navigation in an embedded graph, Lower bounding aggregation and direct computation for an infinite horizon one-reservoir model, Conditional decision processes with recursive function, Single sample path-based optimization of Markov chains, The value iteration method for countable state Markov decision processes, Scheduling in a multi-class series of queues with deterministic service times, On computing average cost optimal policies with application to routing to parallel queues, Abstraction and approximate decision-theoretic planning., Approximate 
receding horizon approach for Markov decision processes: average reward case, A weakly monotonic backward induction algorithm on finite bounded subsets of vector lattices., Stochastic dynamic programming with factored representations, Bounded-parameter Markov decision processes, Generalizing Markov decision processes to imprecise probabilities, Notes on average Markov decision processes with a minimum-variance criterion, Hoeffding's inequality for uniformly ergodic Markov chains, Multi-policy improvement in stochastic optimization with forward recursive function criteria, Basic ideas for event-based optimization of Markov systems, A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains, Index policies for the maintenance of a collection of machines by a set of repairmen, Question selection for multi-attribute decision-aiding., Adaptive optimization and the harvest of biological populations, Optimal policy for minimizing risk models in Markov decision processes, A class of dual fuzzy dynamic programs, Controlled Markov set-chains under average criteria, Continuous-time controlled Markov chains., Policy iteration type algorithms for recurrent state Markov decision processes, An empirical study of policy convergence in Markov decision process value iteration, Analysis of optimal and nearly optimal sequencing policies for a closed queueing network, Markov-achievable payoffs for finite-horizon decision models., A unified approach to Markov decision problems and performance sensitivity analysis, A simple heuristic for load balancing in parallel processing networks with highly variable service time distributions, A theoretic and practical framework for scheduling in a stochastic environment, Optimal risk probability for first passage models in semi-Markov decision processes, Optimizing Long‐term Hydro‐power Production Using Markov Decision Processes, Unnamed Item, Zero-sum games for continuous-time Markov chains with unbounded 
transition and average payoff rates, On the structure of value functions for threshold policies in queueing models, Non-ergodic Markov decision processes with a constraint on the asymptotic failure rate: general class of policies, Linear waste of best fit bin packing on skewed distributions, On the optimal allocation of service to impatient tasks, Dynamic pricing of multiple home delivery options, Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains, Dynamic multiagent probabilistic inference, Central limit theorem for the estimator of the value of an optimal stopping problem, Comparative branching-time semantics for Markov chains, Restricted gradient-descent algorithm for value-function approximation in reinforcement learning, Sequential Monte Carlo in reachability heuristics for probabilistic planning, Reachability analysis of uncertain systems using bounded-parameter Markov decision processes, Optimization of a special case of continuous-time Markov decision processes with compact action set, Dynamic programming and minimum risk paths, Finding the \(K\) best policies in a finite-horizon Markov decision process, Fuzzy optimality relation for perceptive MDPs-the average case, Optimal empty vehicle repositioning and fleet-sizing for two-depot service systems, A multi-period TSP with stochastic regular and urgent demands, A policy improvement method for constrained average Markov decision processes, Converging marriage in honey-bees optimization and application to stochastic dynamic programming, A structured pattern matrix algorithm for multichain Markov decision processes, Markov control processes with randomized discounted cost, Reading policies for joins: an asymptotic analysis, NP-hardness of checking the unichain condition in average cost MDPs, Adaptive stepsize selection for tracking in a regime-switching environment, Stability and optimality of a multi-product production and storage system under demand 
uncertainty, A class of algorithms for collision resolution with multiplicity estimation, Structural results on a batch acceptance problem for capacitated queues, Constrained continuous-time Markov decision processes with average criteria, A tutorial on the cross-entropy method, Basis function adaptation in temporal difference reinforcement learning, Monotonic robust optimal control policies for the time-quality trade-offs in concurrent new product development (NPD), On the optimal control of a two-queue polling model, Bisimulation and cocongruence for probabilistic systems, Using adaptive learning in credit scoring to estimate take-up probability distribution, Mum, why do you keep on growing? Impacts of environmental variability on optimal growth and reproduction allocation strategies of annual plants, Time consistent dynamic risk measures, An actor-critic algorithm for constrained Markov decision processes, Approximating infinite horizon stochastic optimal control in discrete time with constraints, Sensitivity of optimal prices to system parameters in a steady-state service facility, Credibilistic Markov decision processes: The average case, Dynamic load balancing in parallel queueing systems: stability and optimal control, Hierarchical testing designs for pattern recognition, Perishable inventory management and dynamic pricing using RFID technology, Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes, Unnamed Item, Dynamic routing to heterogeneous collections of unreliable servers, Reinforcement learning based algorithms for average cost Markov decision processes, An elective surgery scheduling problem considering patient priority, STRONG AVERAGE OPTIMALITY FOR CONTROLLED NONHOMOGENEOUS MARKOV CHAINS*, SOJOURN TIMES IN NON-HOMOGENEOUS QBD PROCESSES WITH PROCESSOR SHARING, Stopped decision processes in conjunction with general utility, An application of yield management for Internet Service 
Providers, Markov Decision Processes with Asymptotic Average Failure Rate Constraint, Index policies for the routing of background jobs, Concavely-Priced Probabilistic Timed Automata, Successive approximations in partially observable controlled Markov chains with risk-sensitive average criterion, Average optimality for Markov decision processes in borel spaces: a new condition and approach, On the value function of the M/Cox(r)/1 queue, Influence of modeling structure in probabilistic sequential decision problems, Modeling and analysis of uncertain time-critical tasking problems, Play to Test, MAXIMIZING THE THROUGHPUT OF TANDEM LINES WITH FLEXIBLE FAILURE-PRONE SERVERS AND FINITE BUFFERS, Computing Game Values for Crash Games, Deciding Simulations on Probabilistic Automata, Probabilistic CEGAR, Bias and Overtaking Optimality for Continuous-Time Jump Markov Decision Processes in Polish Spaces, Controller Synthesis and Verification for Markov Decision Processes with Qualitative Branching Time Objectives, On optimality gaps for fuzzification in finite Markov decision processes, A Cost-Based Model and Algorithms for Interleaving Solving and Elicitation of CSPs, Online Regret Bounds for Markov Decision Processes with Deterministic Transitions, Simultaneous determination of production and maintenance schedules using in‐line equipment condition and yield information, Optimal empty vehicle redistribution for hub‐and‐spoke transportation systems, Average-Price and Reachability-Price Games on Hybrid Automata with Strong Resets, Strategic capacity decision-making in a stochastic manufacturing environment using real-time approximate dynamic programming, An approximate dynamic programing approach to the development of heuristics for the scheduling of impatient jobs in a clearing system, Optimal control of a production-inventory system with both backorders and lost sales, Dynamic coordination of production planning and sales admission control in the presence of a spot market, 
DYNAMIC ROUTING POLICIES FOR MULTISKILL CALL CENTERS, Symbolic Verification of Communicating Systems with Probabilistic Message Losses: Liveness and Fairness, Grid Brokering for Batch Allocation Using Indexes, Opportunistic Transmission over Randomly Varying Channels, Delayed Nondeterminism in Continuous-Time Markov Decision Processes, Recent Developments in Algorithmic Teaching, Probabilistic Verification of Uncertain Systems Using Bounded-Parameter Markov Decision Processes, Design and control of agile automated CONWIP production lines, Designing and pricing menus of extended warranty contracts, What you should know about approximate dynamic programming, Value and Policy Function Approximations in Infinite-Horizon Optimization Problems, Weighted versus Probabilistic Logics, Challenges in Enterprise Wide Optimization for the Process Industries, Decision Problems for Nash Equilibria in Stochastic Games, OPTIMAL PRICING AND PRODUCTION POLICIES OF A MAKE-TO-STOCK SYSTEM WITH FLUCTUATING DEMAND, Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies, Quantitative Analysis under Fairness Constraints, Stochastic Games for Verification of Probabilistic Timed Automata, Unnamed Item, Approximation solution and suboptimality for discounted semi-markov decision problems with countable state space, Context adaptation of the communication stack, APPROXIMATE DYNAMIC PROGRAMMING TECHNIQUES FOR THE CONTROL OF TIME-VARYING QUEUING SYSTEMS APPLIED TO CALL CENTERS WITH ABANDONMENTS AND RETRIALS, Outsourcing warranty repairs: Dynamic allocation, Machine maintenance with workload considerations, Some indexable families of restless bandit problems, An iterative method for multiple stopping: convergence and stability, Zero-sum games for continuous-time jump Markov processes in Polish spaces: discounted payoffs, On the total reward variance for continuous-time Markov reward chains, Congestion-dependent pricing in a stochastic service system, On Decision Problems 
for Probabilistic Büchi Automata, Markov Decision Processes with Multiple Long-Run Average Objectives, Dynamic admission control for loss systems with batch arrivals, Spinning plates and squad systems: policies for bi-directional restless bandits, Nonstationary value iteration in controlled Markov chains with risk-sensitive average criterion, The Effect of New Links on Google Pagerank, Optimal control and performance analysis of an \(M^X/M/1\) queue with batches of negative customers, Nonzero-sum games for continuous-time Markov chains with unbounded discounted payoffs, On Markov policies for minimax decision processes, Inventory models with minimal service level constraints, The actor-critic algorithm as multi-time-scale stochastic approximation., Algorithms for optimization and stabilization of controlled Markov chains., Stochastic approximation algorithms: overview and recent trends., On decision-theoretic foundations for defaults, A sensitivity formula for risk-sensitive cost and the actor-critic algorithm