Proximity in the age of distraction: robust approximate nearest neighbor search
From MaRDI portal
Publication:4575733
Abstract: We introduce a new variant of the nearest neighbor search problem, which allows for some coordinates of the dataset to be arbitrarily corrupted or unknown. Formally, given a dataset of points in high-dimensions, and a parameter , the goal is to preprocess the dataset, such that given a query point , one can compute quickly a point , such that the distance of the query to the point is minimized, when ignoring the "optimal" coordinates. Note, that the coordinates being ignored are a function of both the query point and the point returned. We present a general reduction from this problem to answering ANN queries, which is similar in spirit to LSH (locality sensitive hashing) [IM98]. Specifically, we give a sampling technique which achieves a bi-criterion approximation for this problem. If the distance to the nearest neighbor after ignoring coordinates is , the data-structure returns a point that is within a distance of after ignoring coordinates. We also present other applications and further extensions and refinements of the above result. The new data-structures are simple and (arguably) elegant, and should be practical -- specifically, all bounds are polynomial in all relevant parameters (including the dimension of the space, and the robustness parameter ).
Recommendations
Cited in
(3)
This page was built for publication: Proximity in the age of distraction: robust approximate nearest neighbor search
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4575733)