Theory of Classification: a Survey of Some Recent Advances
Publication: 3373749
DOI: 10.1051/ps:2005018 · zbMath: 1136.62355 · OpenAlex: W2014902932 · Wikidata: Q58374465 · Scholia: Q58374465 · MaRDI QID: Q3373749
Stéphane Boucheron, Olivier Bousquet, Gábor Lugosi
Publication date: 9 March 2006
Published in: ESAIM: Probability and Statistics
Full work available at URL: http://www.numdam.org/item?id=PS_2005__9__323_0
Keywords: model selection; empirical processes; statistical learning theory; pattern recognition; concentration inequalities
MSC classification: Classification and discrimination; cluster analysis (statistical aspects) (62H30); Pattern recognition, speech recognition (68T10)
Related Items (91)
Sampling and empirical risk minimization ⋮ Neural network approximation ⋮ An empirical classification procedure for nonparametric mixture models ⋮ Classification in general finite dimensional spaces with the \(k\)-nearest neighbor rule ⋮ Classification with reject option ⋮ Efficiency of classification methods based on empirical risk minimization ⋮ PAC-Bayesian high dimensional bipartite ranking ⋮ Model selection by bootstrap penalization for classification ⋮ Guest editorial: Learning theory ⋮ On the kernel rule for function classification ⋮ Fast learning rates in statistical inference through aggregation ⋮ Stability and minimax optimality of tangential Delaunay complexes for manifold reconstruction ⋮ Localization of VC classes: beyond local Rademacher complexities ⋮ On regularization algorithms in learning theory ⋮ Multi-kernel regularized classifiers ⋮ How can we identify the sparsity structure pattern of high-dimensional data: an elementary statistical analysis to interpretable machine learning ⋮ Mathematical methods of randomized machine learning ⋮ Robust statistical learning with Lipschitz and convex loss functions ⋮ Optimal functional supervised classification with separation condition ⋮ Overlaying classifiers: A practical approach to optimal scoring ⋮ A partial overview of the theory of statistics with functional data ⋮ A statistical view of clustering performance through the theory of \(U\)-processes ⋮ Consistency of learning algorithms using Attouch–Wets convergence ⋮ Hold-out estimates of prediction models for Markov processes ⋮ Empirical risk minimization for heavy-tailed losses ⋮ SVRG meets AdaGrad: painless variance reduction ⋮ Adaptive partitioning schemes for bipartite ranking ⋮ Learning noisy linear classifiers via adaptive and selective sampling ⋮ Learning bounds for quantum circuits in the agnostic setting ⋮ Ranking and empirical minimization of \(U\)-statistics ⋮ Local nearest neighbour classification with applications to semi-supervised learning ⋮ Properties of convergence of a fuzzy set estimator of the density function ⋮ Robustness and generalization ⋮ Strong \(L_p\) convergence of wavelet deconvolution density estimators ⋮ Robust classification via MOM minimization ⋮ Risk bounds for CART classifiers under a margin condition ⋮ Upper bounds and aggregation in bipartite ranking ⋮ Concentration Inequalities for Samples without Replacement ⋮ Classification with minimax fast rates for classes of Bayes rules with sparse representation ⋮ Plugin procedure in segmentation and application to hyperspectral image segmentation ⋮ Kullback-Leibler aggregation and misspecified generalized linear models ⋮ Adaptive kernel methods using the balancing principle ⋮ Relative deviation learning bounds and generalization with unbounded loss functions ⋮ An empirical comparison of learning algorithms for nonparametric scoring: the \textsc{TreeRank} algorithm and other methods ⋮ Cover-based combinatorial bounds on probability of overfitting ⋮ Combinatorial bounds of overfitting for threshold classifiers ⋮ Kernel methods in machine learning ⋮ Structure from Randomness in Halfspace Learning with the Zero-One Loss ⋮ Obtaining fast error rates in nonconvex situations ⋮ Simultaneous adaptation to the margin and to complexity in classification ⋮ Classification algorithms using adaptive partitioning ⋮ Concentration inequalities for two-sample rank processes with application to bipartite ranking ⋮ Convergence conditions for the observed mean method in stochastic programming ⋮ Sample Complexity of Classifiers Taking Values in \(\mathbb{R}^Q\), Application to Multi-Class SVMs ⋮ Optimal rates of aggregation in classification under low noise assumption ⋮ Learning by mirror averaging ⋮ Constructing processes with prescribed mixing coefficients ⋮ Reducing mechanism design to algorithm design via machine learning ⋮ Learning sets with separating kernels ⋮ Classification with many classes: challenges and pluses ⋮ A survey of cross-validation procedures for model selection ⋮ Maxisets for model selection ⋮ A high-dimensional Wilks phenomenon ⋮ Supervised Learning by Support Vector Machines ⋮ Testing piecewise functions ⋮ Fast learning rates for plug-in classifiers ⋮ Variance-based regularization with convex objectives ⋮ Sharpness estimation of combinatorial generalization ability bounds for threshold decision rules ⋮ Some properties of Gaussian reproducing kernel Hilbert spaces and their implications for function approximation and learning theory ⋮ Generalized mirror averaging and \(D\)-convex aggregation ⋮ PAC-Bayesian bounds for randomized empirical risk minimizers ⋮ Bandwidth selection in kernel empirical risk minimization via the gradient ⋮ Measuring the Capacity of Sets of Functions in the Analysis of ERM ⋮ Agnostic active learning ⋮ Bayesian approach, theory of empirical risk minimization. Comparative analysis ⋮ Optimal weighted nearest neighbour classifiers ⋮ Optimal survey schemes for stochastic gradient descent with applications to M-estimation ⋮ Robust \(k\)-means clustering for distributions with two moments ⋮ When are epsilon-nets small? ⋮ On signal representations within the Bayes decision framework ⋮ Adaptive Estimation of the Optimal ROC Curve and a Bipartite Ranking Algorithm ⋮ Permutational Rademacher Complexity ⋮ Instability, complexity, and evolution ⋮ Statistical analysis of Mapper for stochastic and multivariate filters ⋮ Percolation centrality via Rademacher Complexity ⋮ Minimax fast rates for discriminant analysis with errors in variables ⋮ Statistical active learning algorithms for noise tolerance and differential privacy ⋮ Statistical learning from biased training samples ⋮ Stochastic Difference-of-Convex-Functions Algorithms for Nonconvex Programming ⋮ For interpolating kernel machines, minimizing the norm of the ERM solution maximizes stability ⋮ Depth separations in neural networks: what is actually being separated?
Cites Work
- On the mathematical foundations of learning
- DOI: 10.1162/153244302760185252
- DOI: 10.1162/153244303765208368
- PAC-Bayesian Generalisation Error Bounds for Gaussian Process Classification
- DOI: 10.1162/153244303768966111
- On Talagrand's deviation inequalities for product measures
- Concentration for Independent Permutations
- Probably Approximate Learning of Sets and Functions
- Probability Inequalities for the Sum of Independent Random Variables
- Learnability and the Vapnik-Chervonenkis dimension
- Density estimation via exponential model selection
- Consistency of Support Vector Machines and Other Regularized Kernel Classifiers
- Capacity of reproducing kernel spaces in learning theory
- A simple proof of the blowing-up lemma (Corresp.)
- Automatic pattern recognition: a study of the probability of error
- Distribution-free performance bounds for potential function rules
- Necessary and Sufficient Conditions for the Uniform Convergence of Means to their Expectations
- Minimum complexity density estimation
- Exponential Bounds for Large Deviations
- Correction to bounds on conditional probabilities with applications
- Distribution-free inequalities for the deleted and holdout error estimates
- Uniform Central Limit Theorems
- Strongly consistent code-based identification and order estimation for constrained finite-state model classes
- Ideal spatial adaptation by wavelet shrinkage
- A Remark on the Szarek–Talagrand Theorem
- Scale-sensitive dimensions, uniform convergence, and learnability
- Boosting With the \(L_2\) Loss
- A sharp concentration inequality with applications
- Minimax nonparametric classification. II. Model selection for adaptation
- Minimax nonparametric classification. I. Rates of convergence
- Rademacher penalties and structural risk minimization
- The Generic Chaining
- Large-scale typicality of Markov sample paths and consistency of MDL order estimators
- Improving the sample complexity using global data
- Learning Theory
- Structural risk minimization over data-dependent hierarchies
- The importance of convexity in learning with squared loss
- Learning pattern classification-a survey
- On the infeasibility of training neural networks with small mean-squared error
- DOI: 10.1162/153244302760200650
- DOI: 10.1162/153244302760200704
- DOI: 10.1162/1532443041424319
- DOI: 10.1162/153244303321897681
- DOI: 10.1162/153244303321897690
- DOI: 10.1162/1532443041827925
- Concept learning using complexity regularization
- Neural Network Learning
- Finiteness results for sigmoidal “neural” networks
- Probability Inequalities for Sums of Bounded Random Variables
- An alternative point of view on Lepski's method
- Learning Theory
- Enumeration of Seven-Argument Threshold Functions
- A Correspondence Between Bayesian Estimation on Stochastic Processes and Smoothing by Splines
- On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
- Some Comments on \(C_p\)
- Convexity, Classification, and Risk Bounds
- Convergence of stochastic processes
- Introduction to nonparametric estimation
- Some applications of concentration inequalities to statistics
- Concentration inequalities for set-indexed empirical processes
- A note on Talagrand's concentration inequality
- Improved bounds on the sample complexity of learning
- The elements of statistical learning. Data mining, inference, and prediction
- Model selection and error estimation
- Logistic regression, AdaBoost and Bregman distances
- New concentration inequalities in product spaces
- A new look at the statistical model identification
- A new approach to least-squares estimation, with applications
- Polynomial bounds for VC dimension of sigmoidal and general Pfaffian neural networks
- Neural networks with quadratic VC dimension
- The Glivenko-Cantelli problem, ten years later
- Poincaré's inequalities and Talagrand's concentration phenomenon for the exponential distribution
- Some limit theorems for empirical processes (with discussion)
- Uniform and universal Glivenko-Cantelli classes
- Combinatorics of random processes and sections of convex bodies
- Risk bounds for statistical learning
- Concentration inequalities and model selection. Ecole d'Eté de Probabilités de Saint-Flour XXXIII -- 2003.
- Estimating a regression function
- Large sample optimality of least squares cross-validation in density estimation
- Universal Donsker classes and metric entropy
- The Glivenko-Cantelli problem
- Some special Vapnik-Chervonenkis classes
- Estimation of dependences based on empirical data. Transl. from the Russian by Samuel Kotz
- Decision theoretic generalizations of the PAC model for neural net and other learning applications
- Exponential inequalities for sums of random vectors
- The densest hemisphere problem
- Existence of submatrices with all possible columns
- Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation
- A finite sample distribution-free performance bound for local discrimination rules
- Central limit theorems for empirical measures
- Bootstrap methods: another look at the jackknife
- Balls in \(\mathbb{R}^k\) do not cut all subsets of \(k+2\) points
- A graph-theoretic generalization of the Sauer-Shelah lemma
- Minimum contrast estimators on sieves: Exponential bounds and rates of convergence
- Risk bounds for model selection via penalization
- A result of Vapnik with applications
- Sharper bounds for Gaussian and empirical processes
- Rates of convergence for minimum contrast estimators
- Predicting \(\{ 0,1\}\)-functions on randomly drawn points
- Sphere packing numbers for subsets of the Boolean \(n\)-cube with bounded Vapnik-Chervonenkis dimension
- On the convexified Sauer-Shelah theorem
- Information inequalities and concentration of measure
- Optimal spatial adaptation to inhomogeneous smoothness: An approach based on kernel estimates with variable bandwidth selectors
- On nonparametric estimation of density level sets
- Erratum to: ``A measure concentration inequality for contracting Markov chains''
- A decision-theoretic generalization of on-line learning and an application to boosting
- Strong minimax lower bounds for learning
- Model selection in nonparametric regression
- Entropy and the combinatorial dimension
- PAC-Bayesian stochastic model selection
- Vapnik-Chervonenkis type conditions and uniform Donsker classes of functions
- Concentration inequalities using the entropy method
- Symmetrization approach to concentration inequalities for empirical processes.
- Smooth discrimination analysis
- Adaptive model selection using empirical complexities
- Model selection for regression on a fixed design
- A Bennett concentration inequality and its application to suprema of empirical processes
- Hardness results for neural network approximation problems
- A note on margin-based loss functions in classification
- Moment inequalities for functions of independent random variables
- Arcing classifiers. (With discussion)
- Boosting the margin: a new explanation for the effectiveness of voting methods
- Behavioral and prescriptive explanations of a reverse sunk cost effect
- A general lower bound on the number of examples needed for learning
- On the trace of finite sets
- Density and dimension
- Additive logistic regression: a statistical view of boosting. (With discussion and a rejoinder by the authors)
- The consistency of the BIC Markov order estimator.
- Empirical margin distributions and bounding the generalization error of combined classifiers
- Some extensions of an inequality of Vapnik and Chervonenkis
- Une inégalité de Bennett pour les maxima de processus empiriques. (A Bennett-type inequality for maxima of empirical processes)
- About the constants in Talagrand's concentration inequalities for empirical processes.
- Support vector machines are universally consistent
- Concentration for locally acting permutations
- Complexity regularization via localized random penalties
- Generalization bounds for averaged classifiers
- Statistical learning theory and stochastic optimization. Ecole d'Eté de Probabilités de Saint-Flour XXXI -- 2001.
- Population theory for boosting ensembles.
- Process consistency for AdaBoost.
- On the Bayes-risk consistency of regularized boosting methods.
- Statistical behavior and consistency of classification methods based on convex risk minimization.
- Optimal aggregation of classifiers in statistical learning.
- Boosting a weak learning algorithm by majority
- Bounding the Vapnik-Chervonenkis dimension of concept classes parameterized by real numbers
- Support-vector networks
- Measuring mass concentrations and estimating density contour clusters -- An excess mass approach
- Concentration of measure and isoperimetric inequalities in product spaces
- Weak convergence and empirical processes. With applications to statistics
- Empirical processes and applications: An overview. (With discussion)
- A new look at independence
- Regularization networks and support vector machines
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Statistical performance of support vector machines
- The probability problem of pattern recognition learning and the method of potential functions
- The method of potential functions for the problem of restoring the characteristic of a function converter from randomly observed points
- Theoretical foundations of the potential function method in pattern recognition learning
- Weighted sums of certain dependent random variables
- Potential function algorithms for pattern recognition learning machines
- A combinatorial problem; stability and order for models and theories in infinitary languages
- On the density of families of sets
- Bounding \(\bar d\)-distance by informational divergence: A method to prove measure concentration
- Majorizing measures: The generic chaining
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Local Rademacher complexities