scientific article; zbMATH DE number 1095138

From MaRDI portal

Publication:4368722


zbMath: 0904.90170; MaRDI QID: Q4368722

Dimitri P. Bertsekas

Publication date: 7 December 1997

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

uncertainty; optimal control; dynamic programming; stochastic control; combinatorial optimization; sequential decision making; Markovian decision

Mathematics Subject Classification ID

Dynamic programming in optimal control and differential games (49L20); Dynamic programming (90C39); Optimal stochastic control (93E20); Markov and semi-Markov decision processes (90C40); Introductory exposition (textbooks, tutorial papers, etc.) pertaining to operations research and mathematical programming (90-01); Introductory exposition (textbooks, tutorial papers, etc.) pertaining to calculus of variations and optimal control (49-01)

Related Items

Generalized maximum entropy estimation ⋮ Stabilising quasi-time-optimal nonlinear model predictive control with variable discretisation ⋮ STOCHASTIC MODEL PREDICTIVE CONTROL AND PORTFOLIO OPTIMIZATION ⋮ A MEAN FIELD GAME ANALYSIS OF SIR DYNAMICS WITH VACCINATION ⋮ Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning ⋮ What is the value of the cross-sectional approach to deep reinforcement learning? ⋮ State-Variable Modeling for a Class of Two-Stage Stochastic Optimization Problems ⋮ Quantile Markov Decision Processes ⋮ Dual Control and Online Optimal Experimental Design ⋮ Analysis of the optimization landscape of Linear Quadratic Gaussian (LQG) control ⋮ Optimal operation of a grid‐connected battery energy storage system over its lifetime ⋮ A Lyapunov characterization of robust policy optimization ⋮ A variable projection method for large-scale inverse problems with \(\ell^1\) regularization ⋮ ENDOGENOUS SOCIAL NETWORKS AND INEQUALITY IN AN INTERGENERATIONAL SETTING ⋮ Long-term dynamic asset allocation under asymmetric risk preferences ⋮ Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning ⋮ Learning Markov Models Via Low-Rank Optimization ⋮ GUBS criterion: arbitrary trade-offs between cost and probability-to-goal in stochastic planning based on expected utility theory ⋮ On the optimization of pit stop strategies via dynamic programming ⋮ Solving nonlinear and dynamic programming equations on extended \(b\)-metric spaces with the fixed-point technique ⋮ An optimal control approach to particle filtering ⋮ Distributed output data-driven optimal robust synchronization of heterogeneous multi-agent systems ⋮ Multi-sourcing under supply uncertainty and buyer's risk aversion ⋮ On the sample complexity of actor-critic method for reinforcement learning with function approximation ⋮ Robust regulation of discrete-time systems subject to parameter uncertainties and state delay ⋮ Adaptive event-triggered 
actor-critic algorithm for optimal 3D formation circumnavigation with relative measurement and an unknown moving target ⋮ Metalearning of time series: an approximate dynamic programming approach ⋮ Multi-agent natural actor-critic reinforcement learning algorithms ⋮ Optimal transmission scheduling for remote state estimation in CPSs with energy harvesting two-hop relay networks ⋮ A Stochastic Composite Augmented Lagrangian Method for Reinforcement Learning ⋮ Quickest detection of deception attacks on cyber-physical systems with a parsimonious watermarking policy ⋮ Unnamed Item ⋮ Model‐free optimal tracking over finite horizon using adaptive dynamic programming ⋮ Stealthy switching attacks on sensors against state estimation in cyber‐physical systems ⋮ Reinforcement learning based optimal synchronization control for multi-agent systems with input constraints using vanishing viscosity method ⋮ Randomized Linear Programming Solves the Markov Decision Problem in Nearly Linear (Sometimes Sublinear) Time ⋮ Easy Affine Markov Decision Processes ⋮ Deadlines, Offer Timing, and the Search for Alternatives ⋮ An average optimal control approach to the set stabilization problem for boolean control networks ⋮ LQG Online Learning ⋮ A New Approach to Real-Time Bidding in Online Advertisements: Auto Pricing Strategy ⋮ An Approximation Approach for Response-Adaptive Clinical Trial Design ⋮ Some operations research methods for analyzing protein sequences and structures ⋮ Deterministic mean-variance-optimal consumption and investment ⋮ Variance-penalized Markov decision processes: dynamic programming and reinforcement learning techniques ⋮ LAO*: A heuristic search algorithm that finds solutions with loops ⋮ Multiscale analysis and control of networks with fractal traffic ⋮ Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning ⋮ Improvements and Generalizations of Stochastic Knapsack and Markovian Bandits Approximation Algorithms ⋮ Robust Dynamic 
Pricing with Strategic Customers ⋮ Adaptive Low-Nonnegative-Rank Approximation for State Aggregation of Markov Chains ⋮ Connected cruise control with delayed feedback and disturbance: An adaptive dynamic programming approach ⋮ Unnamed Item ⋮ Riemannian Fast-Marching on Cartesian Grids, Using Voronoi's First Reduction of Quadratic Forms ⋮ Unnamed Item ⋮ Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning ⋮ Unnamed Item ⋮ A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation ⋮ Automatic Generation of FPTASes for Stochastic Monotone Dynamic Programs Made Easier ⋮ On the correctness of monadic backward induction ⋮ Impedance adaptation for optimal robot–environment interaction ⋮ Optimal Energy Shaping via Neural Approximators ⋮ Model-based Reinforcement Learning: A Survey ⋮ Discrete-review policies for scheduling stochastic networks: trajectory tracking and fluid-scale asymptotic optimality. ⋮ Approximation properties of receding horizon optimal control ⋮ Some applications of polynomial optimization in operations research and real-time decision making ⋮ Riemannian optimization for registration of curves in elastic shape analysis ⋮ An incremental off-policy search in a model-free Markov decision process using a single sample path ⋮ Finite-horizon LQR controller for partially-observed Boolean dynamical systems ⋮ Policy iteration type algorithms for recurrent state Markov decision processes ⋮ Multi-objective evolutionary optimization of biological pest control with impulsive dynamics in soybean crops ⋮ Continuous lunches are free plus the design of optimal optimization algorithms ⋮ Response-adaptive designs for clinical trials: simultaneous learning from multiple patients ⋮ Parameter uncertainty and policy intensity: some extensions and suggestions for further work ⋮ Decentralized stochastic control ⋮ Revisiting dynamic programming for finding optimal subtrees in trees ⋮ A 
quantity flexibility contract model for a system with heterogeneous suppliers ⋮ Numerical methods for the pricing of swing options: a stochastic control approach ⋮ Multi-period mean-variance portfolio optimization based on Monte-Carlo simulation ⋮ Strategy improvement for concurrent reachability and turn-based stochastic safety games ⋮ General value iteration based single network approach for constrained optimal controller design of partially-unknown continuous-time nonlinear systems ⋮ The \((S,s)\) policy is an optimal trading strategy in a class of commodity price speculation problems ⋮ Efficient output solution for nonlinear stochastic optimal control problem with model-reality differences ⋮ Nonlinear protocols for optimal distributed consensus in networks of dynamic agents ⋮ Approximate robust dynamic programming and robustly stable MPC ⋮ Optimal placement of UV-based communications relay nodes ⋮ Symmetry and antisymmetry properties of optimal solutions to regression problems ⋮ Integrated topology optimization and optimal control for vibration suppression in structural design ⋮ Meta-control of an interacting-particle algorithm for global optimization ⋮ Safety verification for probabilistic hybrid systems ⋮ Immediate return preference emerged from a synaptic learning rule for return maximization ⋮ Optimal control of a two-server flow-shop network ⋮ Reputation in the long-run with imperfect monitoring ⋮ Variance-constrained actor-critic algorithms for discounted and average reward MDPs ⋮ Real-time dynamic programming for Markov decision processes with imprecise probabilities ⋮ A semi-Lagrangian scheme for a modified version of the Hughes' model for Pedestrian flow ⋮ Joint routing and scheduling control in a two-class network with a flexible server ⋮ The single-server scheduling problem with convex costs ⋮ Dynamic programming and viscosity solutions for the optimal control of quantum spin systems ⋮ Finite time identification in unstable linear systems ⋮ 
Reinforcement learning for a class of continuous-time input constrained optimal control problems ⋮ Depth-based short-sighted stochastic shortest path problems ⋮ The joint transshipment and production control policies for multi-location production/inventory systems ⋮ Risk pooling strategy in a multi-echelon supply chain with price-sensitive demand ⋮ Infinite horizon optimal policy for an inventory system with two types of product sharing common hardware platforms ⋮ A unified approach to Markov decision problems and performance sensitivity analysis ⋮ Multi-sensor transmission power control for remote estimation through a SINR-based communication channel ⋮ Dynamic mechanism design with interdependent valuations ⋮ Analyzing anonymity attacks through noisy channels ⋮ Multi-period risk sharing under financial fairness ⋮ Pareto efficiency of finite horizon switched linear quadratic differential games ⋮ Optimal energy allocation for linear control with packet loss under energy harvesting constraints ⋮ Stochastic scheduling in an in-forest ⋮ Finite-horizon inverse optimal control for discrete-time nonlinear systems ⋮ Batch repair actions for automated troubleshooting ⋮ Symbolic optimal expected time reachability computation and controller synthesis for probabilistic timed automata ⋮ Delay-optimal scheduling for two-hop relay networks with randomly varying connectivity: join the shortest queue-longest connected queue policy ⋮ A network flow approach in finding maximum likelihood estimate of high concentration regions ⋮ A linear-quadratic Gaussian approach to dynamic information acquisition ⋮ Assortment planning with nested preferences: dynamic programming with distributions as states? 
⋮ Set-membership estimations for the evolution of infectious diseases in heterogeneous populations ⋮ Discovering hidden structure in factored MDPs ⋮ Planning and acting in partially observable stochastic domains ⋮ Generative models for functional data using phase and amplitude separation ⋮ Conformant plans and beyond: principles and complexity ⋮ Beam-ACO--hybridizing ant colony optimization with beam search: an application to open shop scheduling ⋮ Sampled fictitious play for approximate dynamic programming ⋮ A numerical method for hybrid optimal control based on dynamic programming ⋮ Error estimation and adaptive discretization for the discrete stochastic Hamilton-Jacobi-Bellman equation ⋮ On pricing of multiple bundles of products and services ⋮ Basic ideas for event-based optimization of Markov systems ⋮ On infinite horizon active fault diagnosis for a class of non-linear non-Gaussian systems ⋮ Optimal capture trajectories using multiple gravity assists ⋮ Accelerating Benders decomposition for short-term hydropower maintenance scheduling ⋮ Sensitivity analysis and optimal ultimately stationary deterministic policies in some constrained discounted cost models ⋮ Optimal search from multiple distributions with infinite horizon ⋮ Coupling based estimation approaches for the average reward performance potential in Markov chains ⋮ Stabilization of strictly dissipative discrete time systems with discounted optimal control ⋮ Stochastic output-feedback model predictive control ⋮ Finding a simple polytope from its graph in polynomial time ⋮ Efficient blind search: optimal power of detection under computational cost constraints ⋮ Optimal synchronization control of multiple Euler-Lagrange systems via event-triggered reinforcement learning ⋮ A survey on metaheuristics for stochastic combinatorial optimization ⋮ Dynamic coordination games with activation costs ⋮ A benders squared \((B^2)\) framework for infinite-horizon stochastic linear programs ⋮ Bias-policy iteration based 
adaptive dynamic programming for unknown continuous-time linear systems ⋮ Model-free \(H_\infty\) tracking control for de-oiling hydrocyclone systems via off-policy reinforcement learning ⋮ The interacting-particle algorithm with dynamic heating and cooling ⋮ Low earth orbit satellite based communication systems -- research opportunities ⋮ Single sample path-based optimization of Markov chains ⋮ Optimizing Bernoulli routing policies for balancing loads on call centers and minimizing transmission costs ⋮ Strongly polynomial FPTASes for monotone dynamic programs ⋮ A dynamic game formulation for control of opinion dynamics over social networks ⋮ Optimal control of chaotic systems via peak-to-peak maps ⋮ Simplified risk-aware decision making with belief-dependent rewards in partially observable domains ⋮ Reinforcement learning: an industrial perspective ⋮ Adaptive optimal output tracking of continuous-time systems via output-feedback-based reinforcement learning ⋮ Amplitude mean of functional data on \(\mathbb{S}^2\) and its accurate computation ⋮ Optimal cost almost-sure reachability in POMDPs ⋮ Robustness of performance and stability for multistep and updated multistep MPC schemes ⋮ Tool path optimization of selective laser sintering processes using deep learning ⋮ Peril, prudence and planning as risk, avoidance and worry ⋮ Striped parameterized tube model predictive control ⋮ Optimal inventory control with fixed ordering cost for selling by Internet auctions ⋮ Robust and reliable portfolio optimization formulation of a chance constrained problem ⋮ Homotopic policy iteration-based learning design for unknown linear continuous-time systems ⋮ Stochastic event-based LQG control: an analysis on strict consistency ⋮ Levenberg-Marquardt method for identifying Young's modulus of the elasticity imaging inverse problem ⋮ A partial history of the early development of continuous-time nonlinear stochastic systems theory ⋮ Revenue management for operations with urgent orders ⋮ 
Stochastic output feedback MPC with intermittent observations ⋮ Age-based maintenance under population heterogeneity: optimal exploration and exploitation ⋮ Learning classifier systems: a survey ⋮ Heuristics for planning with penalties and rewards formulated in logic and computed through circuits ⋮ Optimizing Image Quality ⋮ Optimal allocation of heterogeneous resources in cooperative control scenarios ⋮ Self-triggered control of probabilistic Boolean control networks: a reinforcement learning approach ⋮ Optimal control of a queue under a quality-of-service constraint with bounded and unbounded rates ⋮ Remote state estimation with usage-dependent Markovian packet losses ⋮ The linear quadratic regulator for periodic hybrid systems ⋮ Optimizing DoS attack energy with imperfect acknowledgments and energy harvesting constraints in cyber-physical systems ⋮ Markov Reward Models and Markov Decision Processes in Discrete and Continuous Time: Performance Evaluation and Optimization ⋮ Primal-dual method for solving a linear-quadratic multi-input optimal control problem ⋮ Optimization of stock trading with additional information by limit order book ⋮ A generalization of Bellman's equation with application to path planning, obstacle avoidance and invariant set estimation ⋮ On the convergence of reinforcement learning with Monte Carlo exploring starts ⋮ On the usefulness of set-membership estimation in the epidemiology of infectious diseases ⋮ Dynamic marketing policies with rating-sensitive consumers: a mean-field games approach ⋮ Symbolic Minimum Expected Time Controller Synthesis for Probabilistic Timed Automata ⋮ Neural circuits for learning context-dependent associations of stimuli ⋮ Bias optimality of admission control in a non-stationary repairable queue ⋮ A survey of numerical solutions for stochastic control problems: some recent progress ⋮ Input perturbations for adaptive control and learning ⋮ On adaptive linear-quadratic regulators ⋮ Differential-game for resource 
aware approximate optimal control of large-scale nonlinear systems with multiple players ⋮ Improved value iteration for neural-network-based stochastic optimal control design ⋮ Robust min-max optimal control design for systems with uncertain models: a neural dynamic programming approach ⋮ Differential stability of discrete optimal control problems with possibly nondifferentiable costs ⋮ A complete characterization of optimal dictionaries for least squares representation ⋮ Designing higher value roads to preserve species at risk by optimally controlling traffic flow ⋮ Action selection in growing state spaces: control of network structure growth ⋮ The Joint Stock and Capacity Rationings of a Make-To-Stock System with Flexible Demand ⋮ Online inverse optimal control for control-constrained discrete-time systems on finite and infinite horizons ⋮ Dynamic games with strategic complements and large number of players ⋮ Off-policy learning for adaptive optimal output synchronization of heterogeneous multi-agent systems ⋮ A penalty function-based greedy diffusion search algorithm for the optimization of constrained nonlinear dynamical processes with discrete-valued input ⋮ Detection-averse optimal and receding-horizon control for Markov decision processes ⋮ Investment Decisions Under Uncertainty Using Stochastic Dynamic Programming: A Case Study of Wind Power ⋮ Policy iteration based feedback control ⋮ Optimal stopping in infinite horizon: an eigenfunction expansion approach ⋮ Control and Systems Theory for Advanced Manufacturing ⋮ Toward Breaking the Curse of Dimensionality: An FPTAS for Stochastic Dynamic Programs with Multidimensional Actions and Scalar States ⋮ Reducing the Bullwhip effect in a supply chain network by application of optimal control theory ⋮ Discrete time dynamic multi-leader-follower games with stage-depending leaders under feedback information ⋮ Ellipsoidal methods for dynamics and control. 
I ⋮ Joint source-channel coding via model predictive control ⋮ An Overview for Markov Decision Processes in Queues and Networks ⋮ INTEGRATED DECISION ON PRICING, PROMOTION AND INVENTORY MANAGEMENT ⋮ Markov control processes with randomized discounted cost ⋮ Robust Optimizers for Nonlinear Programming in Approximate Dynamic Programming ⋮ Statistical Modeling of Curves Using Shapes and Related Features ⋮ Risk-Constrained Reinforcement Learning with Percentile Risk Criteria ⋮ Dynamic journeying under uncertainty ⋮ A tutorial on the cross-entropy method ⋮ Basis function adaptation in temporal difference reinforcement learning ⋮ Nonlinear optimal control of population systems: applications in ecosystems ⋮ The Impact of Noise and Sampling Frequency on the Control of Peak-to-Peak Dynamics ⋮ Trajectory Generation for Relative Guidance of Merging Aircraft ⋮ On the introduction of an agile, temporary workforce into a tandem queueing system ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Randomized algorithms for the synthesis of cautious adaptive controllers ⋮ A set oriented approach to optimal feedback stabilization ⋮ Exponentially Accurate Temporal Decomposition for Long-Horizon Linear-Quadratic Dynamic Optimization ⋮ Convergence of the standard RLS method and \(UDU^T\) factorisation of covariance matrix for solving the algebraic Riccati equation of the DLQR via heuristic approximate dynamic programming ⋮ Output regulation of unknown linear systems using average cost reinforcement learning ⋮ Scheduling networked state estimators based on value of information ⋮ An optimal stopping approach for the end-of-life inventory problem ⋮ The How and Why of Interactive Markov Chains ⋮ A moment and sum-of-squares extension of dual dynamic programming with application to nonlinear energy storage problems ⋮ Portfolio optimization under Solvency II ⋮ Dynamic procurement management by reverse auctions with fixed setup costs and sales levers ⋮ Optimization Based Stabilization of Nonlinear Control
Systems ⋮ An iterative approach to the optimal co-design of linear control systems ⋮ Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes ⋮ Exponentially convergent receding horizon strategy for constrained optimal control ⋮ A Riccati-based primal interior point solver for multistage stochastic programming ‐ extensions ⋮ Computational aspects of optimal strategic network diffusion ⋮ Phase-Amplitude Separation and Modeling of Spherical Trajectories ⋮ Optimal battery purchasing and charging strategy at electric vehicle battery swap stations ⋮ Optimal dictionary for least squares representation ⋮ Unnamed Item ⋮ Stochastic Control Liaisons: Richard Sinkhorn Meets Gaspard Monge on a Schrödinger Bridge ⋮ CONTROL OF COMPLEX PEAK-TO-PEAK DYNAMICS ⋮ Optimality of admission control in an M∕M∕1∕N queue with varying services ⋮ Perishable inventory management and dynamic pricing using RFID technology ⋮ Efficient computation of time-bounded reachability probabilities in uniform continuous-time Markov decision processes ⋮ Model checking discounted temporal properties ⋮ Accelerating the convergence of value iteration by using partial transition functions ⋮ Algorithmic aspects of mean-variance optimization in Markov decision processes ⋮ On the computational efficiency of catalyst accelerated coordinate descent ⋮ Optimal sensor scheduling for hidden Markov model state estimation
