GPipe
From MaRDI portal
Software:55166
swMATH: 39466
MaRDI QID: Q55166
FDO: Q55166
Author name not available
Official website: https://paperswithcode.com/paper/gpipe-efficient-training-of-giant-neural
Cited In (15)
- Binary quantized network training with sharpness-aware minimization
- A statistician teaches deep learning
- mT5
- EGC: entropy-based gradient compression for distributed deep learning
- GhostNet
- Title not available
- Europarl
- GShard
- M2M-100
- Mesh TensorFlow
- BiT
- On the convergence analysis of asynchronous SGD for solving consistent linear systems
- The stochastic delta rule: faster and more accurate deep learning through adaptive weight noise
- Associated learning: decomposing end-to-end backpropagation based on autoencoders and target propagation
- Deep double descent: where bigger models and more data hurt