Weighted sampling without replacement from data streams
Abstract: Weighted sampling without replacement has proved to be a very important tool in designing new algorithms. Efraimidis and Spirakis (IPL 2006) presented an algorithm for weighted sampling without replacement from data streams. Their algorithm works under the assumption of precise computations over the interval [0,1]. Cohen and Kaplan (VLDB 2008) used similar methods for their bottom-k sketches. Efraimidis and Spirakis posed as an open question whether the use of finite-precision arithmetic affects the accuracy of their algorithm. In this paper we show how to avoid this problem by providing a precise reduction from k-sampling without replacement to k-sampling with replacement. We call the resulting method Cascade Sampling.
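For orientation, the following is a minimal sketch of the key-based reservoir scheme commonly attributed to Efraimidis and Spirakis, the algorithm whose reliance on exact arithmetic over [0,1] the abstract discusses. It is not the Cascade Sampling method of this paper, which the abstract does not spell out. The function name `weighted_reservoir_sample`, the `(item, weight)` stream format, and the choice to skip non-positive weights are assumptions made for illustration only.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Return up to k items sampled without replacement from (item, weight) pairs.

    Sketch of the key-based scheme: each item draws u uniformly from [0, 1)
    and receives the key u ** (1 / weight); the k largest keys are kept.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival_index, item); heap[0] holds the smallest key
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # assumption: items with non-positive weight are never sampled
        u = rng.random()
        key = u ** (1.0 / weight)
        entry = (key, index, item)  # arrival index breaks ties between equal keys
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # New key beats the smallest retained key: replace it.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


if __name__ == "__main__":
    data = [("a", 1.0), ("b", 2.0), ("c", 3.0), ("d", 4.0)]
    print(weighted_reservoir_sample(data, 2, random.Random(0)))
```

With double-precision floats, a key such as `0.5 ** (1.0 / 1e-5)` underflows to `0.0`, so items with very small weights can become indistinguishable from one another. This is one concrete way in which finite precision can interfere with the exact-arithmetic assumption mentioned in the abstract.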
Cites work
- scientific article; zbMATH DE number 1099195
- scientific article; zbMATH DE number 819814
- An Efficient Method for Weighted Sampling without Replacement
- Random sampling with a reservoir
- Reservoir-sampling algorithms of time complexity O(n(1 + log(N/n)))
- Sampling algorithms
- Sampling in dynamic data streams and applications
- Sequential reservoir sampling with a nonuniform distribution
- The DLT priority sampling is essentially optimal
- Weighted random sampling with a reservoir