The following pages link to (Q4257216):
Displayed 39 items.
- QUANTUM COMPUTATION FOR ACTION SELECTION USING REINFORCEMENT LEARNING (Q3427060) (← links)
- Optimal empty vehicle redistribution for hub‐and‐spoke transportation systems (Q3539892) (← links)
- Robust Optimizers for Nonlinear Programming in Approximate Dynamic Programming (Q3564534) (← links)
- Reward-Modulated Hebbian Learning of Decision Making (Q3568365) (← links)
- A Spiking Neural Network Model of an Actor-Critic Learning Agent (Q3612121) (← links)
- Opportunistic Transmission over Randomly Varying Channels (Q3616977) (← links)
- Simultaneous Optimal Control and Discrete Stochastic Sensor Selection (Q3624562) (← links)
- Value and Policy Function Approximations in Infinite-Horizon Optimization Problems (Q3626047) (← links)
- Challenges in Enterprise Wide Optimization for the Process Industries (Q3638498) (← links)
- Bounds for Multistage Stochastic Programs Using Supervised Learning Strategies (Q3646118) (← links)
- Optimal control of a class of nonlinear stochastic systems (Q3931287) (← links)
- On the structure of value functions for threshold policies in queueing models (Q4462692) (← links)
- From Infinite to Finite Programs: Explicit Error Bounds with Applications to Approximate Dynamic Programming (Q4571046) (← links)
- Ordinary Differential Equation Methods for Markov Decision Processes and Application to Kullback-Leibler Control Cost (Q4602532) (← links)
- (Q4636981) (← links)
- Decomposition Methods for Computing Directional Stationary Solutions of a Class of Nonsmooth Nonconvex Optimization Problems (Q4641680) (← links)
- Optimal Dynamic Treatment Regimes (Q4665861) (← links)
- Computable approximations for average Markov decision processes in continuous time (Q4684960) (← links)
- New Rollout Algorithms for Combinatorial Optimization Problems (Q4709749) (← links)
- Suboptimal Policies for Stochastic $N$-Stage Optimization: Accuracy Analysis and a Case Study from Optimal Consumption (Q4979399) (← links)
- Variance-penalized Markov decision processes: dynamic programming and reinforcement learning techniques (Q5166474) (← links)
- Finite horizon optimal control of non-linear discrete-time switched systems using adaptive dynamic programming with ε-error bound (Q5168016) (← links)
- (Q5168862) (← links)
- (Q5168869) (← links)
- Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning (Q5189863) (← links)
- Convergence Rates and Decoupling in Linear Stochastic Approximation Algorithms (Q5254881) (← links)
- Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints (Q5265704) (← links)
- Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities (Q5265786) (← links)
- Randomized Shortest-Path Problems: Two Related Models (Q5323768) (← links)
- A Relational Hierarchical Model for Decision-Theoretic Assistance (Q5452090) (← links)
- Power and delay optimisation in multi-hop wireless networks (Q5494536) (← links)
- On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes (Q5502179) (← links)
- REINFORCEMENT LEARNING WITH GOAL-DIRECTED ELIGIBILITY TRACES (Q5699354) (← links)
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes (Q5898263) (← links)
- Some operations research methods for analyzing protein sequences and structures (Q5900888) (← links)
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes (Q5920615) (← links)
- Mathematical programming for network revenue management revisited (Q5956205) (← links)
- A sensitivity formula for risk-sensitive cost and the actor-critic algorithm (Q5958425) (← links)
- Minimising average passenger waiting time in personal rapid transit systems (Q5963104) (← links)