Two steps at a time -- taking GAN training in stride with Tseng's method

Publication:5089720

DOI: 10.1137/21M1420939
zbMATH Open: 1492.65175
arXiv: 2006.09033
OpenAlex: W3035573799
MaRDI QID: Q5089720
FDO: Q5089720

Axel Böhm, Ernö Robert Csetnek, Michael Sedlmayer, Radu I. Boţ

Publication date: 15 July 2022

Published in: SIAM Journal on Mathematics of Data Science

Abstract: Motivated by the training of Generative Adversarial Networks (GANs), we study methods for solving minimax problems with additional nonsmooth regularizers. We do so by employing \emph{monotone operator} theory, in particular the \emph{Forward-Backward-Forward (FBF)} method, which avoids the known issue of limit cycling by correcting each update by a second gradient evaluation. Furthermore, we propose a seemingly new scheme which recycles old gradients to mitigate the additional computational cost. In doing so we rediscover a known method, related to \emph{Optimistic Gradient Descent Ascent (OGDA)}. For both schemes we prove novel convergence rates for convex-concave minimax problems via a unifying approach. The derived error bounds are in terms of the gap function for the ergodic iterates. For the deterministic and the stochastic problem we show a convergence rate of $\mathcal{O}(1/k)$ and $\mathcal{O}(1/\sqrt{k})$, respectively. We complement our theoretical results with empirical improvements in the training of Wasserstein GANs on the CIFAR10 dataset.
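
The two update rules mentioned in the abstract can be made concrete with a short sketch. The following Python snippet is not the authors' code: the toy problem, step size, and all names are illustrative assumptions. It applies Tseng's Forward-Backward-Forward method and an OGDA-style variant that recycles the previous gradient to a regularized bilinear saddle-point problem min_x max_y x^T A y + lam*||x||_1 - lam*||y||_1, where the nonsmooth l1 terms are handled by their proximal map (soft-thresholding) and the smooth part enters through the monotone operator F(x, y) = (A y, -A^T x).

# Minimal sketch (assumed setup, not the paper's implementation) of Tseng's FBF
# method and an OGDA-style scheme that recycles the previous gradient, applied to
#     min_x max_y  x^T A y + lam*||x||_1 - lam*||y||_1 .
import numpy as np

rng = np.random.default_rng(0)
n, lam, tau = 20, 0.01, 0.1                  # problem size, regularization, step size (illustrative)
A = rng.standard_normal((n, n)) / np.sqrt(n)

def F(z):
    """Monotone operator of the smooth coupling term: F(x, y) = (A y, -A^T x)."""
    x, y = z[:n], z[n:]
    return np.concatenate([A @ y, -A.T @ x])

def prox(z, t):
    """Proximal map of t * lam * ||.||_1 (soft-thresholding), applied blockwise."""
    return np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)

def fbf(z0, iters=1000):
    """Tseng's FBF: forward-backward step plus a correcting second forward step."""
    z = z0.copy()
    for _ in range(iters):
        Fz = F(z)
        z_half = prox(z - tau * Fz, tau)        # forward-backward step
        z = z_half + tau * (Fz - F(z_half))     # second forward (correction) step
    return z

def ogda_like(z0, iters=1000):
    """OGDA-style scheme: reuse the previous gradient instead of re-evaluating F."""
    z, F_prev = z0.copy(), F(z0)
    for _ in range(iters):
        Fz = F(z)
        z = prox(z - tau * (2.0 * Fz - F_prev), tau)   # extrapolation with recycled gradient
        F_prev = Fz
    return z

z0 = rng.standard_normal(2 * n)
print(np.linalg.norm(fbf(z0)), np.linalg.norm(ogda_like(z0)))

Note that FBF evaluates F twice per iteration (once for the correction step), while the OGDA-style update needs only one fresh evaluation because it reuses F(z_{k-1}); this is the computational saving the abstract alludes to. In both cases the step size is typically constrained by the Lipschitz constant L of F (roughly tau < 1/L for FBF and tau < 1/(2L) for the recycled-gradient scheme).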


Full work available at URL: https://arxiv.org/abs/2006.09033




Cited in: 10 documents
