OpenAI Gym
From MaRDI portal
Software:27219
swMATH: 15330; MaRDI QID: Q27219; FDO: Q27219
Author name not available
Cited In (50)
- Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning
- Counterfactual state explanations for reinforcement learning agents via generative deep learning
- Branes with brains: exploring string vacua with deep reinforcement learning
- Convex optimization with an interpolation-based projection and its application to deep learning
- Active deep Q-learning with demonstration
- Reinforcement learning for robotic manipulation using simulated locomotion demonstrations
- Model-free reinforcement learning for branching Markov decision processes
- TD-regularized actor-critic methods
- The Hanabi challenge: a new frontier for AI research
- Neural network repair with reachability analysis
- Robust flow control and optimal sensor placement using deep reinforcement learning
- Neural Networks and Deep Learning
- Recruitment-imitation mechanism for evolutionary reinforcement learning
- You only Lie Twice: A Multi-round Cyber Deception Game of Questionable Veracity
- Deep active inference
- Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee
- SAMBA: safe model-based & active reinforcement learning
- Deep reinforcement learning for the control of conjugate heat transfer
- Constrained, Global Optimization of Unknown Functions with Lipschitz Continuous Gradients
- Quantum-enhanced reinforcement learning for control: a preliminary study
- Dynamic metasurface control using deep reinforcement learning
- End-to-end learning for off-road terrain navigation using the chrono open-source simulation platform
- Towards finding longer proofs
- A review on deep reinforcement learning for fluid mechanics
- Title not available
- Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
- Lipschitzness is all you need to tame off-policy generative adversarial imitation learning
- MADRaS: Multi Agent Driving Simulator
- Importance sampling in reinforcement learning with an estimated behavior policy
- Accelerating reinforcement learning with a directional-Gaussian-smoothing evolution strategy
- EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models
- A theoretical and empirical comparison of gradient approximations in derivative-free optimization
- ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions
- Bellman's principle of optimality and deep reinforcement learning for time-varying tasks
- Title not available
- Title not available
- Model-based Reinforcement Learning: A Survey
- Dependable learning-enabled multiagent systems
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
- Title not available
- Air learning: a deep reinforcement learning gym for autonomous aerial robot visual navigation
- Preparation of three-atom GHZ states based on deep reinforcement learning
- A Stochastic Trust-Region Framework for Policy Optimization
- Permutation flow shop scheduling with multiple lines and demand plans using reinforcement learning
- How does momentum benefit deep neural networks architecture design? A few case studies
- Deep active inference as variational policy gradients
- Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning
- Laplacian smoothing gradient descent
- Reproducible Hyperparameter Optimization
- Data science applications to string theory