Weighted sampling without replacement from data streams
From MaRDI portal
Publication:495669
DOI10.1016/J.IPL.2015.07.007zbMATH Open1338.68295arXiv1506.01747OpenAlexW1963394116MaRDI QIDQ495669FDOQ495669
Vladimir Braverman, Gregory Vorsanger, Rafail Ostrovsky
Publication date: 15 September 2015
Published in: Information Processing Letters (Search for Journal in Brave)
Abstract: Weighted sampling without replacement has proved to be a very important tool in designing new algorithms. Efraimidis and Spirakis (IPL 2006) presented an algorithm for weighted sampling without replacement from data streams. Their algorithm works under the assumption of precise computations over the interval [0,1]. Cohen and Kaplan (VLDB 2008) used similar methods for their bottom-k sketches. Efraimidis and Spirakis ask as an open question whether using finite precision arithmetic impacts the accuracy of their algorithm. In this paper we show a method to avoid this problem by providing a precise reduction from k-sampling without replacement to k-sampling with replacement. We call the resulting method Cascade Sampling.
Full work available at URL: https://arxiv.org/abs/1506.01747
Recommendations
Online algorithms; streaming algorithms (68W27) Sampling theory in information and communication theory (94A20)
Cites Work
- Title not available (Why is that?)
- Weighted random sampling with a reservoir
- The DLT priority sampling is essentially optimal
- Random sampling with a reservoir
- Sampling in dynamic data streams and applications
- Title not available (Why is that?)
- Sampling algorithms.
- Sequential reservoir sampling with a nonuniform distribution
- Reservoir-sampling algorithms of time complexity O ( n (1 + log( N / n )))
- An Efficient Method for Weighted Sampling without Replacement
This page was built for publication: Weighted sampling without replacement from data streams
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q495669)