The following pages link to (Q4257216):
- A generic architecture for adaptive agents based on reinforcement learning (Q707320) (← links)
- Parallelization strategies for rollout algorithms (Q812417) (← links)
- Solving factored MDPs using non-homogeneous partitions (Q814475) (← links)
- Optimization of a large-scale water reservoir network by stochastic dynamic programming with efficient state space discretization (Q819117) (← links)
- The factored policy-gradient planner (Q835832) (← links)
- Practical solution techniques for first-order MDPs (Q835833) (← links)
- Strategy optimization for controlled Markov process with descriptive complexity constraint (Q848403) (← links)
- On solving the Lagrangian dual of integer programs via an incremental approach (Q849073) (← links)
- Neural network and regression spline value function approximations for stochastic dynamic programming (Q850310) (← links)
- Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming (Q851872) (← links)
- Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in stimulus-action-reward association learning (Q853317) (← links)
- Approximate policy optimization and adaptive control in regression models (Q853656) (← links)
- Actor-critic algorithms for hierarchical Markov decision processes (Q856510) (← links)
- A policy gradient method for semi-Markov decision processes with application to call admission control (Q859693) (← links)
- Symmetric approximate linear programming for factored MDPs with application to constrained problems (Q870814) (← links)
- The emergence of goals in a self-organizing network: a non-mentalist model of intentional actions (Q872388) (← links)
- Model-free \(Q\)-learning designs for linear discrete-time zero-sum games with application to \(H^\infty\) control (Q875484) (← links)
- Approximate dynamic programming for link scheduling in wireless mesh networks (Q925829) (← links)
- Synergies of operations research and data mining (Q976388) (← links)
- Optimally maintaining a Markovian deteriorating system with limited imperfect repairs (Q976454) (← links)
- Heterogeneous trading strategies with adaptive fuzzy actor-critic reinforcement learning: a behavioral approach (Q976531) (← links)
- Stochastic dynamic programming applied to hydrothermal power systems operation planning based on the convex hull algorithm (Q980598) (← links)
- Approximate dynamic programming with a fuzzy parameterization (Q980910) (← links)
- Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem (Q980921) (← links)
- On solving integral equations using Markov chain Monte Carlo methods (Q984311) (← links)
- Multi-period portfolio optimization with linear control policies (Q1004108) (← links)
- A formal framework and extensions for function approximation in learning classifier systems (Q1009226) (← links)
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (Q1009248) (← links)
- Convergence analysis of batch gradient algorithm for three classes of sigma-pi neural networks (Q1009343) (← links)
- Projected equation methods for approximate solution of large linear systems (Q1012492) (← links)
- Adaptive optimal control for continuous-time linear systems based on policy iteration (Q1012874) (← links)
- A stochastic gradient type algorithm for closed-loop problems (Q1013967) (← links)
- An approximate dynamic programming approach for the vehicle routing problem with stochastic demands (Q1027533) (← links)
- Reinforcement distribution in fuzzy Q-learning (Q1037957) (← links)
- Pricing substitutable flights in airline revenue management (Q1041999) (← links)
- Resource-constrained management of heterogeneous assets with stochastic deterioration (Q1042122) (← links)
- Theoretical tools for understanding and aiding dynamic decision making (Q1042309) (← links)
- Reinforcement learning in the brain (Q1042310) (← links)
- Limitations of learning in automata-based systems (Q1046075) (← links)
- Natural actor-critic algorithms (Q1049136) (← links)
- Application of orthogonal arrays and MARS to inventory forecasting stochastic dynamic programs (Q1285809) (← links)
- A maxmin policy for bond management (Q1296370) (← links)
- Approximate receding horizon approach for Markov decision processes: average reward case (Q1414220) (← links)
- On finding global optima for the hinge fitting problem (Q1422384) (← links)
- Reinforcement learning for long-run average cost (Q1427588) (← links)
- Convergent multiple-timescales reinforcement learning algorithms in normal form games (Q1429103) (← links)
- Bond management and max-min optimal control (Q1569221) (← links)
- Comparing neuro-dynamic programming algorithms for the vehicle routing problem with stochastic demands (Q1579026) (← links)
- Stochastic dynamic programming with factored representations (Q1583230) (← links)
- Bounded-parameter Markov decision processes (Q1583513) (← links)