Statistical performance of support vector machines
- Nonparametric estimation (62G05)
- Asymptotic properties of nonparametric inference (62G20)
- Classification and discrimination; cluster analysis (statistical aspects) (62H30)
- Learning and adaptive systems in artificial intelligence (68T05)
- Neural nets and related approaches to inference from stochastic processes (62M45)
- Applications of operator theory in probability theory and statistics (47N30)
Abstract: The support vector machine (SVM) algorithm is well known to the computer learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of concentration theory and empirical processes. Our main result builds on the observation made by other authors that the SVM can be viewed as a statistical regularization procedure. From this point of view, it can also be interpreted as a model selection principle using a penalized criterion. It is then possible to adapt general methods related to model selection in this framework to study two important points: (1) what is the minimum penalty and how does it compare to the penalty actually used in the SVM algorithm; (2) is it possible to obtain "oracle inequalities" in that setting, for the specific loss function used in the SVM algorithm? We show that the answer to the latter question is positive and provides relevant insight to the former. Our result shows that it is possible to obtain fast rates of convergence for SVMs.
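The "penalized criterion" reading of the SVM mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes a linear decision function f(x) = <w, x> + b instead of a general kernel, and the names `hinge_loss`, `penalized_hinge_risk` and the regularization weight `lam` are hypothetical, chosen only for illustration. The criterion computed is the empirical hinge risk plus a squared-norm penalty.

```python
def hinge_loss(margin):
    """Hinge loss (1 - margin)_+ evaluated at the margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def penalized_hinge_risk(w, b, xs, ys, lam):
    """Empirical hinge risk of f(x) = <w, x> + b plus the penalty lam * ||w||^2.

    Assumption: a linear f stands in for a general kernel expansion,
    so the penalty is the squared Euclidean norm of w.
    """
    n = len(xs)
    empirical = sum(
        hinge_loss(y * (sum(wj * xj for wj, xj in zip(w, x)) + b))
        for x, y in zip(xs, ys)
    ) / n
    penalty = lam * sum(wj * wj for wj in w)
    return empirical + penalty


# Toy data: two points classified with margin 2, one misclassified point.
xs = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
ys = [1, -1, -1]
risk = penalized_hinge_risk((1.0, 0.0), 0.0, xs, ys, lam=0.1)
print(risk)
```

In this toy example only the third point contributes to the hinge term (loss 1.5, averaged over three points), and the penalty adds 0.1 times the squared norm of w. Minimizing such a criterion over w and b, for a given `lam`, is the regularization view of the SVM; how large the penalty must be is the question the abstract raises.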
Cites work
- scientific article; zbMATH DE number 1332320
- scientific article; zbMATH DE number 1950576
- scientific article; zbMATH DE number 962825
- 10.1162/153244302760200704
- 10.1162/153244303321897690
- 10.1162/1532443041424319
- 10.1162/1532443041424337
- A Bennett concentration inequality and its application to suprema of empirical processes
- A new concentration result for regularized risk minimizers
- About the constants in Talagrand's concentration inequalities for empirical processes
- An introduction to support vector machines and other kernel-based learning methods
- Capacity of reproducing kernel spaces in learning theory
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Complexity regularization via localized random penalties
- Convexity, Classification, and Risk Bounds
- Empirical margin distributions and bounding the generalization error of combined classifiers
- Empirical minimization
- Fast rates for support vector machines using Gaussian kernels
- Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators
- Learning Theory
- Learning from examples as an inverse problem
- Local Rademacher complexities
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Minimax nonparametric classification. I. Rates of convergence
- On the Eigenspectrum of the Gram Matrix and the Generalization Error of Kernel-PCA
- On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
- Optimal aggregation of classifiers in statistical learning
- PIECEWISE-POLYNOMIAL APPROXIMATIONS OF FUNCTIONS OF THE CLASSES \(W_{p}^{\alpha}\)
- Regularization networks and support vector machines
- Risk bounds for statistical learning
- Scale-sensitive dimensions, uniform convergence, and learnability
- Simultaneous adaptation to the margin and to complexity in classification
- Some applications of concentration inequalities to statistics
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Statistical behavior and consistency of classification methods based on convex risk minimization
- Support vector machine soft margin classifiers: error analysis
- Support vector machines are universally consistent
- The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network
Cited in (88)
- Learning Theory
- Consistency of Support Vector Machines and Other Regularized Kernel Classifiers
- Support vector machines with a reject option
- Universally consistent vertex classification for latent positions graphs
- A penalized criterion for variable selection in classification
- Penalized empirical risk minimization over Besov spaces
- An Adaptive Transfer Learning Framework for Functional Classification
- Optimality of SVM: novel proofs and tighter bounds
- Statistical inference in classification of high-dimensional Gaussian mixture
- Estimating individualized treatment rules using outcome weighted learning
- Learning with rigorous support vector machines
- Covering numbers for support vector machines
- Analysis of support vector machine classification
- A compression approach to support vector model selection
- Learning rates for classification with Gaussian kernels
- Comment
- Adaptive metric dimensionality reduction
- Concentration estimates for the moving least-square method in learning theory
- Convolution smoothing and online updating estimation for support vector machine
- Some theoretical results regarding the polygonal distribution
- scientific article; zbMATH DE number 2018622
- Measuring the capacity of sets of functions in the analysis of ERM
- Consistency and convergence rates of one-class SVMs and related algorithms
- Oracle properties of SCAD-penalized support vector machine
- Convergence rates of generalization errors for margin-based classification
- Byzantine-robust distributed support vector machine
- The benefits of modeling slack variables in SVMs
- Spatio-temporal convolution kernels
- Robust machine learning by median-of-means: theory and practice
- Asymptotic behavior of support vector machine for spiked population model
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Improved classification rates for localized algorithms under margin conditions
- Nonasymptotic bounds for vector quantization in Hilbert spaces
- Fast learning rates for plug-in classifiers
- Estimating conditional quantiles with the help of the pinball loss
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Multi-kernel regularized classifiers
- Support Vector Machines
- When can support vector machine achieve fast rates of convergence?
- A Bahadur representation of the linear support vector machine
- Some Remarks on the Statistical Analysis of SVMs and Related Methods
- Statistical properties of support vector machines with forgetting factor
- A Reproducing Kernel Hilbert Space Framework for Functional Classification
- 10.1162/1532443041827925
- An asymptotic statistical analysis of support vector machines with soft margins
- scientific article; zbMATH DE number 7370542
- Statistical properties and adaptive tuning of support vector machines
- Consistency of support vector machines using additive kernels for additive models
- Consistency and convergence rate for nearest subspace classifier
- scientific article; zbMATH DE number 7370646
- Sparse kernel regression with coefficient-based \(\ell_q\)-regularization
- Asymptotic normality of support vector machine variants and other regularized kernel methods
- Large‐margin classification with multiple decision rules
- Fast generalization error bound of deep learning without scale invariance of activation functions
- Distributed inference for linear support vector machine
- The statistical rate for support matrix machines under low rankness and row (column) sparsity
- Support vector machine in big data: smoothing strategy and adaptive distributed inference
- Fast convergence rates of deep neural networks for classification
- Fast rates for empirical vector quantization
- Aggregation of SVM classifiers using Sobolev spaces
- Oracle inequalities for support vector machines that are based on random entropy numbers
- Support Vector Machines for Dyadic Data
- Optimal dyadic decision trees
- Inverse statistical learning
- Bandwidth selection in kernel empirical risk minimization via the gradient
- SVM-Maj: a majorization approach to linear support vector machines with different hinge errors
- Comment
- Cox process functional learning
- Geometric insights into support vector machine behavior using the KKT conditions
- Simultaneous adaptation to the margin and to complexity in classification
- Optimal regression rates for SVMs using Gaussian kernels
- An oracle inequality for regularized risk minimizers with strongly mixing observations
- Optimal rates of aggregation in classification under low noise assumption
- Optimal learning rates of \(l^p\)-type multiple kernel learning under general conditions
- Learning Rates of \(l^q\) Coefficient Regularization Learning with Gaussian Kernel
- The statistical learning element of support vector ordinal regression machines
- Classification with non-i.i.d. sampling
- On reject and refine options in multicategory classification
- Sparsity in multiple kernel learning
- Learning rates for kernel-based expectile regression
- Optimal weighted nearest neighbour classifiers
- Support vector regression for right censored data
- Regularization in kernel learning
- The new interpretation of support vector machines on statistical learning theory
- Divide-and-conquer for debiased \(l_1\)-norm support vector machine in ultra-high dimensions
- Learning from dependent observations
- Gaps in Support Vector Optimization
- Theory of Classification: a Survey of Some Recent Advances