The following pages link to OpenAI Gym (Q27219):
Displaying 50 items.
- Model-free reinforcement learning for branching Markov decision processes (Q832301) (← links)
- Deep active inference (Q1627054) (← links)
- Importance sampling in reinforcement learning with an estimated behavior policy (Q2051319) (← links)
- Accelerating reinforcement learning with a directional-Gaussian-smoothing evolution strategy (Q2055215) (← links)
- Convex optimization with an interpolation-based projection and its application to deep learning (Q2071365) (← links)
- Air learning: a deep reinforcement learning gym for autonomous aerial robot visual navigation (Q2071395) (← links)
- Permutation flow shop scheduling with multiple lines and demand plans using reinforcement learning (Q2077963) (← links)
- How does momentum benefit deep neural networks architecture design? A few case studies (Q2079522) (← links)
- Neural network repair with reachability analysis (Q2112124) (← links)
- Recruitment-imitation mechanism for evolutionary reinforcement learning (Q2123550) (← links)
- SAMBA: safe model-based & active reinforcement learning (Q2127227) (← links)
- Reinforcement learning for robotic manipulation using simulated locomotion demonstrations (Q2127242) (← links)
- Deep reinforcement learning for the control of conjugate heat transfer (Q2131088) (← links)
- Quantum-enhanced reinforcement learning for control: a preliminary study (Q2138934) (← links)
- Dynamic metasurface control using deep reinforcement learning (Q2139890) (← links)
- Towards finding longer proofs (Q2142073) (← links)
- End-to-end learning for off-road terrain navigation using the chrono open-source simulation platform (Q2142333) (← links)
- A theoretical and empirical comparison of gradient approximations in derivative-free optimization (Q2143221) (← links)
- Lipschitzness is all you need to tame off-policy generative adversarial imitation learning (Q2163202) (← links)
- Laplacian smoothing gradient descent (Q2168883) (← links)
- Data science applications to string theory (Q2187812) (← links)
- Deep active inference as variational policy gradients (Q2197091) (← links)
- Active deep Q-learning with demonstration (Q2217431) (← links)
- Counterfactual state explanations for reinforcement learning agents via generative deep learning (Q2238641) (← links)
- A review on deep reinforcement learning for fluid mechanics (Q2245392) (← links)
- The Hanabi challenge: a new frontier for AI research (Q2302288) (← links)
- Branes with brains: exploring string vacua with deep reinforcement learning (Q2314876) (← links)
- TD-regularized actor-critic methods (Q2320580) (← links)
- Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee (Q2665179) (← links)
- Preparation of three-atom GHZ states based on deep reinforcement learning (Q2690543) (← links)
- You only Lie Twice: A Multi-round Cyber Deception Game of Questionable Veracity (Q3297655) (← links)
- ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions (Q3381944) (← links)
- Neural Networks and Deep Learning (Q4569250) (← links)
- MADRaS : Multi Agent Driving Simulator (Q4989338) (← links)
- Bellman's principle of optimality and deep reinforcement learning for time-varying tasks (Q5043501) (← links)
- (Q5053314) (← links)
- (Q5054599) (← links)
- Dependable learning-enabled multiagent systems (Q5054961) (← links)
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning (Q5060503) (← links)
- Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning (Q5076329) (← links)
- Constrained, Global Optimization of Unknown Functions with Lipschitz Continuous Gradients (Q5081778) (← links)
- Reproducible Hyperparameter Optimization (Q5083358) (← links)
- Automated Reinforcement Learning (AutoRL): A Survey and Open Problems (Q5094025) (← links)
- A Stochastic Trust-Region Framework for Policy Optimization (Q5096136) (← links)
- (Q5149235) (← links)
- EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models (Q5154728) (← links)
- (Q5159396) (← links)
- Robust flow control and optimal sensor placement using deep reinforcement learning (Q5853740) (← links)
- Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning (Q5870485) (← links)
- Model-based Reinforcement Learning: A Survey (Q5870792) (← links)