Cited in (18)
- Fundamental design principles for reinforcement learning algorithms
- scientific article; zbMATH DE number 7415112
- Policy space identification in configurable environments
- Fast global convergence of natural policy gradient methods with entropy regularization
- An efficient algorithm for nonconvex-linear minimax optimization problem and its application in solving weighted maximin dispersion problem
- Sample complexity of sample average approximation for conditional stochastic optimization
- Efficient search of first-order Nash equilibria in nonconvex-concave smooth min-max problems
- TernGrad
- DSCOVR
- ckn_kernel
- Baselines
- IQC-Game
- NC-OPT
- DualDICE
- IterNet
- IMPALA
- A backward SDE method for uncertainty quantification in deep learning
- scientific article; zbMATH DE number 7370620
This page was built for software: SBEED