Optimizing non-decomposable measures with deep networks
Abstract: We present a class of algorithms capable of directly training deep neural networks with respect to large families of task-specific performance measures, such as the F-measure and the Kullback-Leibler divergence, that are structured and non-decomposable. This is a departure from standard deep learning techniques, which typically train neural networks with squared-error or cross-entropy loss functions (both of which are decomposable). We demonstrate that directly training with task-specific loss functions yields much faster and more stable convergence across problems and datasets. Our proposed algorithms and implementations have several novel features, including (i) convergence to first-order stationary points despite optimizing complex objective functions; (ii) use of fewer training samples to achieve a desired level of convergence; (iii) a substantial reduction in training time; and (iv) seamless integration of our implementation into existing symbolic gradient frameworks. We implement our techniques on a variety of deep architectures, including multi-layer perceptrons and recurrent neural networks, and show that on a variety of benchmark and real-world datasets, our algorithms outperform traditional approaches to training deep networks, as well as some recent approaches to task-specific training of neural networks.
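To illustrate what "non-decomposable" means here: a measure such as the F-measure cannot be written as a sum of per-example losses, so the usual per-sample gradient of cross-entropy does not apply directly. The following is a minimal sketch (not the paper's algorithm) of one common workaround: replacing the hard F1 counts with a smooth "soft F1" surrogate computed jointly over a minibatch, then descending its gradient. The logistic model, finite-difference gradients, and all data below are hypothetical, chosen only to keep the example self-contained.

```python
import numpy as np

def soft_f1_loss(p, y, eps=1e-8):
    """Differentiable surrogate for 1 - F1 on a batch of predictions.

    Non-decomposable: the value depends jointly on ALL examples in the
    batch (via the shared TP/FP/FN counts), unlike per-example losses
    such as cross-entropy, which sum independent terms.
    """
    tp = np.sum(p * y)            # soft true positives
    fp = np.sum(p * (1.0 - y))    # soft false positives
    fn = np.sum((1.0 - p) * y)    # soft false negatives
    f1 = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return 1.0 - f1

# Toy linearly separable data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = (X @ w_true > 0).astype(float)

def loss(w):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # logistic "network"
    return soft_f1_loss(p, y)

# Plain gradient descent on the surrogate, with finite-difference
# gradients to avoid any autodiff dependency in this sketch.
w, lr, h = np.zeros(5), 0.5, 1e-5
for _ in range(200):
    grad = np.array([(loss(w + h * e) - loss(w - h * e)) / (2 * h)
                     for e in np.eye(5)])
    w -= lr * grad
```

In a real deep-learning setting the finite-difference step would be replaced by backpropagation through the soft counts, which is what makes such surrogates compatible with the symbolic gradient frameworks the abstract mentions.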
Recommendations
- Efficient optimization of \(F\)-measure with cost-sensitive SVM
- Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks
- Multicomposite nonconvex optimization for training deep neural networks
- A survey of deep network techniques all classifiers can adopt
- Block layer decomposition schemes for training deep neural networks
Cites work
- scientific article; zbMATH DE number 700090
- Cutting-plane training of structural SVMs
- Dyad ranking using Plackett-Luce models based on joint feature representations
- Introduction to Information Retrieval
- Large margin methods for structured and interdependent output variables
- Quantification-oriented learning based on reliable classifiers
- Regularization techniques for learning with matrices
Cited in (2)
This page was built for publication: Optimizing non-decomposable measures with deep networks