New instability results for high-dimensional nearest neighbor search
From MaRDI portal
Publication:990934
DOI10.1016/J.IPL.2009.07.012zbMATH Open1209.68204arXiv0906.0684OpenAlexW2962982486MaRDI QIDQ990934FDOQ990934
Authors: J. Martínez
Publication date: 1 September 2010
Published in: Information Processing Letters (Search for Journal in Brave)
Abstract: Consider a dataset of n(d) points generated independently from R^d according to a common p.d.f. f_d with support(f_d) = [0,1]^d and sup{f_d([0,1]^d)} growing sub-exponentially in d. We prove that: (i) if n(d) grows sub-exponentially in d, then, for any query point q^d in [0,1]^d and any epsilon>0, the ratio of the distance between any two dataset points and q^d is less that 1+epsilon with probability -->1 as d-->infinity; (ii) if n(d)>[4(1+epsilon)]^d for large d, then for all q^d in [0,1]^d (except a small subset) and any epsilon>0, the distance ratio is less than 1+epsilon with limiting probability strictly bounded away from one. Moreover, we provide preliminary results along the lines of (i) when f_d=N(mu_d,Sigma_d).
Full work available at URL: https://arxiv.org/abs/0906.0684
Recommendations
- Instability results for Euclidean distance, nearest neighbor search on high dimensional Gaussian data
- Exact \(L_{\infty}\) nearest neighbor search in high dimensions
- scientific article; zbMATH DE number 2065612
- When is `nearest neighbour' meaningful: A converse theorem and implications
- scientific article; zbMATH DE number 1559575
Cites Work
- Title not available (Why is that?)
- Probability Inequalities for Sums of Bounded Random Variables
- Title not available (Why is that?)
- A Strong Law for the Largest Nearest-Neighbour Link between Random Points
- On the geometry of similarity search: dimensionality curse and concentration of measure
- Bounds on the mean power-weighted nearest neighbour distance
- Concentration of measure and cluster analysis.
- Volume of unit ball in an n-dimensional normed space and its asymptotic properties
Cited In (5)
- Instability results for Euclidean distance, nearest neighbor search on high dimensional Gaussian data
- On the distance concentration awareness of certain data reduction techniques
- Non-parametric detection of meaningless distances in high dimensional data
- Intrinsic dimension of geometric data sets
- Title not available (Why is that?)
This page was built for publication: New instability results for high-dimensional nearest neighbor search
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q990934)