Randomized near-neighbor graphs, giant components and applications in data science

From MaRDI portal
Publication:3299443

DOI10.1017/JPR.2020.21zbMATH Open1446.60013arXiv1711.04712OpenAlexW3042432370WikidataQ99350568 ScholiaQ99350568MaRDI QIDQ3299443FDOQ3299443


Authors: Ariel Jaffe, Yuval Kluger, George C. Linderman, Gal Mishne, Stefan Steinerberger Edit this on Wikidata


Publication date: 22 July 2020

Published in: Journal of Applied Probability (Search for Journal in Brave)

Abstract: If we pick n random points uniformly in [0,1]d and connect each point to its knearest neighbors, then it is well known that there exists a giant connected component with high probability. We prove that in [0,1]d it suffices to connect every point to cd,1loglogn points chosen randomly among its cd,2lognnearest neighbors to ensure a giant component of size no(n) with high probability. This construction yields a much sparser random graph with simnloglogn instead of simnlogn edges that has comparable connectivity properties. This result has nontrivial implications for problems in data science where an affinity matrix is constructed: instead of picking the knearest neighbors, one can often pick kllk random points out of the knearest neighbors without sacrificing efficiency. This can massively simplify and accelerate computation, we illustrate this with several numerical examples.


Full work available at URL: https://arxiv.org/abs/1711.04712




Recommendations




Cites Work


Uses Software





This page was built for publication: Randomized near-neighbor graphs, giant components and applications in data science

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q3299443)