Randomized near-neighbor graphs, giant components and applications in data science
From MaRDI portal
Publication:3299443
Abstract: If we pick \(n\) random points uniformly in \([0,1]^d\) and connect each point to its \(k\) nearest neighbors, then it is well known that there exists a giant connected component with high probability. We prove that in \([0,1]^d\) it suffices to connect every point to a constant number of points chosen randomly among its \(\mathcal{O}(\log n)\) nearest neighbors to ensure a giant component of size \(n - o(n)\) with high probability. This construction yields a much sparser random graph with \(\sim n\) instead of \(\sim n \log n\) edges that has comparable connectivity properties. This result has nontrivial implications for problems in data science where an affinity matrix is constructed: instead of picking the \(k\) nearest neighbors, one can often pick a small number of random points out of the \(k\) nearest neighbors without sacrificing efficiency. This can massively simplify and accelerate computation; we illustrate this with several numerical examples.
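A minimal sketch of the construction described in the abstract (not the authors' code): sample points uniformly in the unit square, connect each point to a few neighbors chosen at random among its \(K\) nearest neighbors, symmetrize, and measure the largest connected component. The parameter values (n = 2000, K = 20, k = 3) are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree

def randomized_knn_graph(points, K, k, rng):
    """Undirected graph linking each point to k random picks
    out of its K nearest neighbors (then symmetrized)."""
    n = len(points)
    tree = cKDTree(points)
    # query K + 1 neighbors because each point's nearest neighbor is itself
    _, nbrs = tree.query(points, k=K + 1)
    rows, cols = [], []
    for i in range(n):
        picks = rng.choice(nbrs[i][1:], size=k, replace=False)
        rows.extend([i] * k)
        cols.extend(picks)
    A = coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
    return ((A + A.T) > 0).astype(int)   # symmetrize, drop duplicate edges

rng = np.random.default_rng(0)
pts = rng.random((2000, 2))              # n random points in [0,1]^2
A = randomized_knn_graph(pts, K=20, k=3, rng=rng)
_, labels = connected_components(A, directed=False)
giant = np.bincount(labels).max()        # size of the largest component
print(giant / len(pts))                  # close to 1 for these parameters
```

The resulting graph has roughly \(kn\) edges rather than the \(Kn\) edges of the full \(K\)-nearest-neighbor graph, yet (for suitable parameters) still contains a giant component covering almost all points.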
Cites work
- scientific article; zbMATH DE number 6378173
- scientific article; zbMATH DE number 5942363
- scientific article; zbMATH DE number 428352
- scientific article; zbMATH DE number 3150484
- scientific article; zbMATH DE number 3193293
- A critical constant for the \(k\)-nearest-neighbour model
- A new lower bound for the geometric traveling salesman problem in terms of discrepancy
- A randomized algorithm for principal component analysis
- Algorithm 971
- Algorithms for the Assignment and Transportation Problems
- Clustering with t-SNE, provably
- Comments on the "Core vector machines: fast SVM training on very large data sets"
- Connectivity of random \(k\)-nearest-neighbour graphs
- Connectivity threshold of Bluetooth graphs
- Diffusion maps
- From graph to manifold Laplacian: the convergence rate
- Highly connected random geometric graphs
- How the result of graph clustering methods depends on the construction of the graph
- Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
- Large deviations for discrete and continuous percolation
- Learning Theory
- New Bounds for the Traveling Salesman Constant
- Probability Inequalities for Sums of Bounded Random Variables
- Probability theory of classical Euclidean optimization problems
- Random Geometric Graphs
- Sharpness in the \(k\)-nearest-neighbours random geometric graph model
- Shortest Paths Through Pseudo-Random Points in the d-Cube
- Small components in \(k\)-nearest neighbour graphs
- Subadditive Euclidean functionals and nonlinear growth in geometric probability
- The Scottish Book. Mathematics from the Scottish Café. With selected problems from the New Scottish Book
- The shortest path and the shortest road through n points
- Visualizing data using t-SNE
- Weak laws of large numbers in geometric probability
- \(k\)-nearest-neighbor clustering and percolation theory