DeepSpeed
From MaRDI portal
Cited in (13)
- ColBERT
- DCT-former
- Combiner
- FNet
- Fastformer
- Soft
- Scatterbrain
- SqueezeBERT
- Gradient methods for optimizing metaparameters in the knowledge distillation problem
- Parallel physics-informed neural networks via domain decomposition
- Co-evolution-based parameter learning for remote sensing scene classification
- FMMformer
- WinoGrande
This page was built for software: DeepSpeed