Stochastic recursive algorithms for optimization. Simultaneous perturbation methods


DOI: 10.1007/978-1-4471-4285-0
zbMath: 1260.90002
OpenAlex: W2493209382
MaRDI QID: Q441138

H. L. Prasad, L. A. Prashanth, Shalabh Bhatnagar

Publication date: 20 August 2012

Published in: Lecture Notes in Control and Information Sciences

Full work available at URL: https://doi.org/10.1007/978-1-4471-4285-0



Related Items

Multiscale Q-learning with linear function approximation
Risk-Sensitive Reinforcement Learning via Policy Gradient Search
Variance-constrained actor-critic algorithms for discounted and average reward MDPs
Variable ansatz applied to spectral operator decomposition in a physical superconducting quantum device
A model for data transmission and its optimization
Stochastic approximation procedures for Lévy-driven SDEs
Modeling and control of data transmission
Learning equilibrium mean‐variance strategy
Nested kriging predictions for datasets with a large number of observations
Generalization of a result of Fabian on the asymptotic normality of stochastic approximation
Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space
Newton-based stochastic optimization using \(q\)-Gaussian smoothed functional algorithms
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: a non-asymptotic viewpoint
Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Simultaneous perturbation Newton algorithms for simulation optimization
A simulation‐based algorithm for optimal pricing policy under demand uncertainty
Recurrent neural networks as optimal mesh refinement strategies
Derivative-free optimization methods
Simulation methods for robust risk assessment and the distorted mix approach
Quantum simulation of the ground-state Stark effect in small molecules: a case study using IBM Q
Iterative learning control using faded measurements without system information: a gradient estimation approach
Actor-Critic Algorithms with Online Feature Adaptation