The following pages link to Levent Sagun (Q4605657):
Displayed 8 items.
- Universal halting times in optimization and machine learning (Q4605658)
- Triple descent and the two kinds of overfitting: where and why do they appear? (Q5020037)
- ConViT: improving vision transformers with soft convolutional inductive biases (Q5055416)
- Comparing dynamics: deep neural networks versus glassy systems (Q5854115)
- Entropy-SGD: biasing gradient descent into wide valleys (Q5854121)
- Scaling description of generalization with number of parameters in deep learning (Q5856249)
- A jamming transition from under- to over-parametrization affects generalization in deep learning (Q5872795)
- On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks (Q6330127)