Statistical performance of support vector machines
Mathematics Subject Classification:
- Nonparametric estimation (62G05)
- Asymptotic properties of nonparametric inference (62G20)
- Classification and discrimination; cluster analysis (statistical aspects) (62H30)
- Learning and adaptive systems in artificial intelligence (68T05)
- Neural nets and related approaches to inference from stochastic processes (62M45)
- Applications of operator theory in probability theory and statistics (47N30)
Abstract: The support vector machine (SVM) algorithm is well known to the machine learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of concentration theory and empirical processes. Our main result builds on the observation, made by other authors, that the SVM can be viewed as a statistical regularization procedure. From this point of view, it can also be interpreted as a model selection principle based on a penalized criterion. It is then possible to adapt general methods of model selection to this framework in order to study two important points: (1) what is the minimum penalty, and how does it compare to the penalty actually used in the SVM algorithm; (2) is it possible to obtain "oracle inequalities" in this setting, for the specific loss function used in the SVM algorithm? We show that the answer to the latter question is positive, and that it provides relevant insight into the former. Our result shows that it is possible to obtain fast rates of convergence for SVMs.
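For orientation (a standard textbook formulation, not quoted from the paper itself), the regularization viewpoint described in the abstract writes the SVM as a penalized empirical risk minimizer over a reproducing kernel Hilbert space \(\mathcal{H}\):
\[
\hat{f} \in \operatorname*{arg\,min}_{f \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} \bigl(1 - y_i f(x_i)\bigr)_+ \; + \; \lambda \|f\|_{\mathcal{H}}^2 ,
\]
where \((x_i, y_i)\) are training examples with labels \(y_i \in \{-1, +1\}\), \((u)_+ = \max(u, 0)\) gives the hinge loss, and \(\lambda \|f\|_{\mathcal{H}}^2\) is the penalty referred to in point (1). An oracle inequality in the sense of point (2) bounds the excess hinge risk of \(\hat{f}\) by the best penalized trade-off achievable within \(\mathcal{H}\), up to a remainder term decaying at the stated fast rate; the paper's precise penalty and conditions may differ from this sketch.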
Cites work
- scientific article (zbMATH DE number 1332320; title unavailable)
- scientific article (zbMATH DE number 1950576; title unavailable)
- scientific article (zbMATH DE number 962825; title unavailable)
- DOI: 10.1162/153244302760200704
- DOI: 10.1162/153244303321897690
- DOI: 10.1162/1532443041424319
- DOI: 10.1162/1532443041424337
- A Bennett concentration inequality and its application to suprema of empirical processes
- A new concentration result for regularized risk minimizers
- About the constants in Talagrand's concentration inequalities for empirical processes
- An introduction to support vector machines and other kernel-based learning methods
- Capacity of reproducing kernel spaces in learning theory
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Complexity regularization via localized random penalties
- Convexity, Classification, and Risk Bounds
- Empirical margin distributions and bounding the generalization error of combined classifiers
- Empirical minimization
- Fast rates for support vector machines using Gaussian kernels
- Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators
- Learning Theory
- Learning from examples as an inverse problem
- Local Rademacher complexities
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Minimax nonparametric classification. I: Rates of convergence
- On the Eigenspectrum of the Gram Matrix and the Generalization Error of Kernel-PCA
- On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
- Optimal aggregation of classifiers in statistical learning
- Piecewise-polynomial approximations of functions of the classes \(W_p^{\alpha}\)
- Regularization networks and support vector machines
- Risk bounds for statistical learning
- Scale-sensitive dimensions, uniform convergence, and learnability
- Simultaneous adaptation to the margin and to complexity in classification
- Some applications of concentration inequalities to statistics
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Statistical behavior and consistency of classification methods based on convex risk minimization
- Support vector machine soft margin classifiers: error analysis
- Support vector machines are universally consistent
- The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network
Cited in (84)
- Consistency and convergence rate for nearest subspace classifier
- A Reproducing Kernel Hilbert Space Framework for Functional Classification
- Bandwidth selection in kernel empirical risk minimization via the gradient
- scientific article (zbMATH DE number 2018622; title unavailable)
- Measuring the capacity of sets of functions in the analysis of ERM
- Improved classification rates for localized algorithms under margin conditions
- Comment
- The statistical rate for support matrix machines under low rankness and row (column) sparsity
- Learning rates for classification with Gaussian kernels
- Some theoretical results regarding the polygonal distribution
- Support vector machine in big data: smoothing strategy and adaptive distributed inference
- When can support vector machine achieve fast rates of convergence?
- Comment
- The benefits of modeling slack variables in SVMs
- Adaptive metric dimensionality reduction
- Penalized empirical risk minimization over Besov spaces
- Nonasymptotic bounds for vector quantization in Hilbert spaces
- Distributed inference for linear support vector machine
- A Bahadur representation of the linear support vector machine
- Oracle properties of SCAD-penalized support vector machine
- Multi-kernel regularized classifiers
- Asymptotic normality of support vector machine variants and other regularized kernel methods
- Learning with rigorous support vector machines
- Learning Theory
- Fast convergence rates of deep neural networks for classification
- A penalized criterion for variable selection in classification
- Sparse kernel regression with coefficient-based \(\ell_q\)-regularization
- scientific article (zbMATH DE number 7370542; title unavailable)
- Analysis of support vector machine classification
- Consistency and convergence rates of one-class SVMs and related algorithms
- Fast generalization error bound of deep learning without scale invariance of activation functions
- Concentration estimates for the moving least-square method in learning theory
- Support Vector Machines
- Statistical properties and adaptive tuning of support vector machines
- SVM-Maj: a majorization approach to linear support vector machines with different hinge errors
- Learning rates of \(l_q\) coefficient regularization learning with Gaussian kernel
- Convergence rates of generalization errors for margin-based classification
- Asymptotic behavior of support vector machine for spiked population model
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Fast rates for empirical vector quantization
- Fast learning rates for plug-in classifiers
- Covering numbers for support vector machines
- Inverse statistical learning
- Classification with non-i.i.d. sampling
- An oracle inequality for regularized risk minimizers with strongly mixing observations
- Estimating individualized treatment rules using outcome weighted learning
- Learning rates for kernel-based expectile regression
- The new interpretation of support vector machines on statistical learning theory
- Large‐margin classification with multiple decision rules
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Divide-and-conquer for debiased \(l_1\)-norm support vector machine in ultra-high dimensions
- Estimating conditional quantiles with the help of the pinball loss
- Optimal weighted nearest neighbour classifiers
- Consistency of support vector machines using additive kernels for additive models
- Simultaneous adaptation to the margin and to complexity in classification
- Oracle inequalities for support vector machines that are based on random entropy numbers
- Learning from dependent observations
- Optimal learning rates of \(l^p\)-type multiple kernel learning under general conditions
- Geometric insights into support vector machine behavior using the KKT conditions
- Universally consistent vertex classification for latent positions graphs
- An asymptotic statistical analysis of support vector machines with soft margins
- On reject and refine options in multicategory classification
- Sparsity in multiple kernel learning
- DOI: 10.1162/1532443041827925
- Support vector regression for right censored data
- Support Vector Machines for Dyadic Data
- Robust machine learning by median-of-means: theory and practice
- Gaps in Support Vector Optimization
- Cox process functional learning
- A compression approach to support vector model selection
- Regularization in kernel learning
- Optimality of SVM: novel proofs and tighter bounds
- Optimal regression rates for SVMs using Gaussian kernels
- Consistency of Support Vector Machines and Other Regularized Kernel Classifiers
- Some Remarks on the Statistical Analysis of SVMs and Related Methods
- Aggregation of SVM classifiers using Sobolev spaces
- Theory of Classification: a Survey of Some Recent Advances
- Optimal rates of aggregation in classification under low noise assumption
- Spatio-temporal convolution kernels
- Statistical properties of support vector machines with forgetting factor
- Support vector machines with a reject option
- scientific article (zbMATH DE number 7370646; title unavailable)
- The statistical learning element of support vector ordinal regression machines
- Optimal dyadic decision trees