Blessing of dimensionality: mathematical foundations of the statistical physics of data
Publication:5154201
Abstract: The concentration of measure phenomena were discovered as the mathematical background of statistical mechanics at the end of the nineteenth and the beginning of the twentieth century, and were then explored in the mathematics of the twentieth and twenty-first centuries. At the beginning of the twenty-first century, it became clear that the proper utilisation of these phenomena in machine learning could transform the curse of dimensionality into the blessing of dimensionality. This paper summarises recently discovered phenomena of measure concentration that drastically simplify some machine learning problems in high dimension and allow us to correct legacy artificial intelligence systems. The classical concentration of measure theorems state that i.i.d. random points concentrate in a thin layer near a surface (a sphere or an equator of a sphere, an average or median level set of energy or of another Lipschitz function, etc.). The new stochastic separation theorems describe the fine structure of these thin layers: the random points are not only concentrated in a thin layer but are also all linearly separable from the rest of the set, even for exponentially large random sets. The linear functionals for the separation of points can be selected in the form of Fisher's linear discriminant. All artificial intelligence systems make errors. Non-destructive correction requires separating the situations (samples) on which errors occur from the samples corresponding to correct behaviour by means of a simple and robust classifier. The stochastic separation theorems provide such classifiers, together with a non-iterative (one-shot) learning procedure.
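The separability claim lends itself to a quick numerical check. The sketch below is a minimal illustration, not the paper's own experiment: it samples i.i.d. points uniformly from a high-dimensional unit ball and tests the Fisher-type condition <x, y> < alpha * <x, x> for every point x against all other points y. The dimension, sample size, and threshold alpha are assumed demonstration values, not parameters taken from the paper.

```python
# Minimal numerical sketch of stochastic separation (illustrative
# assumptions: i.i.d. points uniform in the unit ball, Fisher-type
# discriminant <x, y> < alpha * <x, x>; dim, n_points and alpha are
# arbitrary demo choices, not values from the paper).
import numpy as np

rng = np.random.default_rng(0)
dim, n_points, alpha = 100, 2000, 0.8

# Uniform sample from the unit ball in R^dim: Gaussian directions
# normalised to the sphere, radii distributed as U^(1/dim).
g = rng.standard_normal((n_points, dim))
g /= np.linalg.norm(g, axis=1, keepdims=True)
points = g * rng.uniform(size=(n_points, 1)) ** (1.0 / dim)

# Point x_i is Fisher-separable from x_j when <x_i, x_j> < alpha * <x_i, x_i>.
gram = points @ points.T
threshold = alpha * np.diag(gram)[:, None]
mask = gram < threshold
np.fill_diagonal(mask, True)  # a point need not be separated from itself
separable = mask.all(axis=1)

print(f"{separable.mean():.1%} of {n_points} points in dimension {dim} "
      f"are linearly separable from all the others (alpha = {alpha})")
```

In a typical run nearly 100% of the points pass the test in dimension 100, while rerunning with, say, dim = 3 makes the fraction collapse; this is the curse-to-blessing contrast the abstract describes.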
Cites work
- scientific article; zbMATH DE number 1666100
- scientific article; zbMATH DE number 3123490
- scientific article; zbMATH DE number 3530497
- scientific article; zbMATH DE number 1149836
- scientific article; zbMATH DE number 1182755
- scientific article; zbMATH DE number 1391397
- scientific article; zbMATH DE number 3231758
- scientific article; zbMATH DE number 3057307
- scientific article; zbMATH DE number 3067044
- scientific article; zbMATH DE number 3109251
- A simplified neuron model as a principal component analyzer
- An elementary proof of a theorem of Johnson and Lindenstrauss
- Approximation with random bases: pro et contra
- Clustering. A data recovery approach
- Concentration of measure and isoperimetric inequalities in product spaces
- Concentration property on probability spaces
- Data complexity measured by principal graphs
- Data mining. The textbook
- David Hilbert and the axiomatization of physics (1894--1905)
- Extensions of Lipschitz mappings into a Hilbert space
- Generic Hamiltonian dynamical systems are neither integrable nor ergodic
- High-dimensional brain: a tool for encoding and rapid learning of memories by single neurons
- Is the \(k\)-NN classifier in high dimensions affected by the curse of dimensionality?
- Learning deep architectures for AI
- Measure-preserving homeomorphisms and metrical transitivity
- Oded Schramm 1961--2008
- On the mathematical foundations of learning
- One-trial correction of legacy AI systems and stochastic separation theorems
- Pattern classification
- Piece-wise quadratic approximations of arbitrary error functions for fast and robust machine learning
- Principal manifolds for data visualization and dimension reduction. Reviews and original papers presented partially at the workshop `Principal manifolds for data cartography and dimension reduction', Leicester, UK, August 24--26, 2006.
- Probabilistic lower bounds for approximation by shallow perceptron networks
- Probability Inequalities for Sums of Bounded Random Variables
- Quasiorthogonal dimension of Euclidean spaces
- Statistical mechanics of learning
- Stochastic separation theorems
- The concentration of measure phenomenon
- Training a Support Vector Machine in the Primal
Cited in (19)
- Tensor train based isogeometric analysis for PDE approximation on parameter dependent geometries
- On a posteriori error estimation using distances between numerical solutions and angles between truncation errors
- On a posteriori estimation of the approximation error norm for an ensemble of independent solutions
- Fast construction of correcting ensembles for legacy artificial intelligence systems: algorithms and a case study
- Approximation of classifiers by deep perceptron networks
- Ignorance is a bliss: Mathematical structure of many-box models
- Revisiting `survival of the fittest' principle in global stochastic optimisation: incorporating anisotropic mutations
- The independent component analysis with the linear regression – predicting the energy costs of the public sector buildings in Croatia
- General stochastic separation theorems with optimal bounds
- Coping with AI errors with provable guarantees
- Correction of AI systems by linear discriminants: probabilistic foundations
- Blessing of dimensionality at the edge and geometry of few-shot learning
- Hilbert’s sixth problem: the endless road to rigour
- Replica analysis of Bayesian data clustering
- Generalised Watson distribution on the hypersphere with applications to clustering
- Global optimisation in Hilbert spaces using the survival of the fittest algorithm
- Stochastic separation theorems
- High-dimensional brain: a tool for encoding and rapid learning of memories by single neurons
- Modelling biological evolution: developing novel approaches