Policy-based optimization: single-step policy gradient method seen as an evolution strategy

MaRDI portal, Publication:6365194