Shalabh Bhatnagar

From MaRDI portal
Person:230108

Available identifiers

zbMath Open | bhatnagar.shalabh
MaRDI QID | Q230108

List of research outcomes

Publication | Date of Publication | Type
A Generalized Minimax Q-Learning Algorithm for Two-Player Zero-Sum Stochastic Games | 2023-09-21 | Paper
An Incremental Fast Policy Search Using a Single Sample Path | 2022-11-04 | Paper
Generalized Second-Order Value Iteration in Markov Decision Processes | 2022-10-11 | Paper
Analyzing Approximate Value Iteration Algorithms | 2022-09-26 | Paper
Stochastic recursive inclusions with non-additive iterate-dependent Markov noise | 2022-06-30 | Paper
Stochastic Approximation With Iterate-Dependent Markov Noise Under Verifiable Conditions in Compact State Space With the Stability of Iterates Not Ensured | 2022-02-24 | Paper
On tight bounds for function approximation error in risk-sensitive reinforcement learning | 2021-11-10 | Paper
Asynchronous Stochastic Approximations With Asymptotically Biased Errors and Deep Multiagent Learning | 2021-09-09 | Paper
Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise | 2021-01-08 | Paper
Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space | 2020-11-09 | Paper
Analysis of Stochastic Approximation Schemes With Set-Valued Maps in the Absence of a Stability Guarantee and Their Stabilization | 2020-10-07 | Paper
Random Directions Stochastic Approximation With Deterministic Perturbations | 2020-10-07 | Paper
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning | 2020-03-11 | Paper
Stability of Stochastic Approximations With “Controlled Markov” Noise and Temporal Difference Learning | 2019-07-18 | Paper
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method | 2018-12-07 | Paper
An incremental off-policy search in a model-free Markov decision process using a single sample path | 2018-11-12 | Paper
https://portal.mardi4nfdi.de/entity/Q5375231 | 2018-09-14 | Paper
Random directions stochastic approximation with deterministic perturbations | 2018-08-08 | Paper
https://portal.mardi4nfdi.de/entity/Q4576234 | 2018-07-12 | Paper
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes | 2018-06-27 | Paper
Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences | 2018-06-12 | Paper
Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization | 2018-06-12 | Paper
Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization | 2018-06-12 | Paper
Optimal parameter trajectory estimation in parameterized SDEs | 2018-06-12 | Paper
Stochastic approximation algorithms for constrained optimization via simulation | 2018-04-16 | Paper
A stability criterion for two timescale stochastic approximation schemes | 2017-10-11 | Paper
A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions | 2017-09-22 | Paper
Adaptive System Optimization Using Random Directions Stochastic Approximation | 2017-07-27 | Paper
A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes | 2017-07-12 | Paper
Actor-Critic Algorithms with Online Feature Adaptation | 2017-06-30 | Paper
Smoothed Functional Algorithms for Stochastic Optimization Using q-Gaussian Distributions | 2017-06-30 | Paper
Multi-armed bandits based on a variant of simulated annealing | 2016-12-13 | Paper
Stochastic recursive inclusion in two timescales with an application to the Lagrangian dual problem | 2016-11-25 | Paper
Dynamics of stochastic approximation with iterate-dependent Markov noise under verifiable conditions in compact state space with the stability of iterates not ensured | 2016-01-10 | Paper
Necessary and sufficient conditions for optimality in constrained general sum stochastic games | 2015-11-02 | Paper
Simultaneous perturbation Newton algorithms for simulation optimization | 2015-03-11 | Paper
A simulation‐based algorithm for optimal pricing policy under demand uncertainty | 2015-02-25 | Paper
Newton-based stochastic optimization using \(q\)-Gaussian smoothed functional algorithms | 2014-11-19 | Paper
New algorithms of the Q-learning type | 2014-03-19 | Paper
General-sum stochastic games: verifiability conditions for Nash equilibria | 2012-12-13 | Paper
Stochastic recursive algorithms for optimization. Simultaneous perturbation methods | 2012-08-20 | Paper
https://portal.mardi4nfdi.de/entity/Q3174029 | 2011-10-12 | Paper
Monte-Carlo estimation of time-dependent statistical characteristics of random dynamical systems | 2011-08-28 | Paper
The Borkar-Meyn theorem for asynchronous stochastic approximations | 2011-07-27 | Paper
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes | 2011-01-12 | Paper
Natural actor-critic algorithms | 2010-01-08 | Paper
Ant Colony Optimization Algorithms for Shortest Path Problems | 2009-03-26 | Paper
An extension of Wick's theorem | 2008-10-30 | Paper
Gelfand-Yaglom-Perez theorem for generalized relative entropy functionals | 2008-01-03 | Paper
Reinforcement learning based algorithms for average cost Markov decision processes | 2007-08-27 | Paper
Actor-critic algorithms for hierarchical Markov decision processes | 2006-12-07 | Paper
Multiscale Stochastic Approximation for Parametric Optimization of Hidden Markov Models | 2006-09-22 | Paper
Nongeneralizability of Tsallis Entropy by means of Kolmogorov-Nagumo averages under pseudo-additivity | 2005-05-30 | Paper
Nonextensive triangle equality and other properties of Tsallis relative-entropy minimization | 2005-01-11 | Paper
A time aggregation approach to Markov decision processes | 2002-09-05 | Paper
An optimal fuel-injection policy for performance enhancement in internal combustion engines. | 2002-02-18 | Paper
https://portal.mardi4nfdi.de/entity/Q2724383 | 2001-12-17 | Paper
A two Timescale Stochastic Approximation Scheme for Simulation-Based Parametric Optimization | 2000-12-12 | Paper
A Convex Analytic Framework for Ergodic Control of Semi-Markov Processes | 1996-07-15 | Paper


Known relations from the MaRDI Knowledge Graph

Property | Value
MaRDI profile type | MaRDI person profile
instance of | human

