Frugal Streaming for Estimating Quantiles
Publication:2848969
DOI: 10.1007/978-3-642-40273-9_7; zbMATH Open: 1394.68454; arXiv: 1407.1121; OpenAlex: W2183565309; MaRDI QID: Q2848969; FDO: Q2848969
Authors: Qiang Ma, S. Muthukrishnan, Mark Sandler
Publication date: 13 September 2013
Published in: Lecture Notes in Computer Science
Abstract: Modern applications require processing streams of data to estimate statistical quantities such as quantiles with a small amount of memory. In many such applications, in fact, one needs to compute such statistical quantities for each of a large number of groups, which additionally restricts the amount of memory available for the stream of any particular group. We address this challenge and introduce frugal streaming, that is, algorithms that work with a tiny (typically sub-streaming) amount of memory per group. We design a frugal algorithm that uses only one unit of memory per group to compute a quantile for each group. For stochastic streams, where data items are drawn independently from a distribution, we analyze the algorithm and show that it finds an approximation to the quantile rapidly and remains stably close to it. We also propose an extension of this algorithm that uses two units of memory per group. We show, through extensive experiments with real-world data from an HTTP trace and Twitter, that our frugal algorithms are comparable to existing streaming algorithms for estimating any quantile, although these existing algorithms use far more space per group and are unrealistic in frugal applications; further, the two-memory frugal algorithm converges significantly faster than the one-memory algorithm.
Full work available at URL: https://arxiv.org/abs/1407.1121
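To make the one-memory idea in the abstract concrete, here is a minimal Python sketch of a randomized one-unit update for a single group. It is an illustration only, not code from this record: the function name `frugal_quantile`, the unit step size, the starting value, and the seeded generator are all assumptions, and the stream is assumed to hold integers. The only state carried from item to item is `estimate`, which moves up by one with probability q when an item is larger and down by one with probability 1 - q when an item is smaller, so it tends to settle where a fraction q of the items fall below it.

```python
import random


def frugal_quantile(stream, q, start=0, rng=None):
    """Estimate the q-quantile of an integer stream with one stored value.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if rng is None:
        rng = random.Random()
    estimate = start  # the single unit of memory kept for this group
    for item in stream:
        r = rng.random()
        if item > estimate and r > 1 - q:
            # Item is above the estimate: step up with probability q.
            estimate += 1
        elif item < estimate and r > q:
            # Item is below the estimate: step down with probability 1 - q.
            estimate -= 1
    return estimate


if __name__ == "__main__":
    source = random.Random(0)
    data = (source.randrange(1000) for _ in range(50000))
    print(frugal_quantile(data, 0.5, rng=random.Random(1)))
```

Because each update moves the estimate by at most one, a sketch like this needs many items to travel from a poor starting value to the target quantile, which is consistent with the abstract's remark that the two-memory variant converges faster.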
Recommendations
- Lower Bounds for Quantile Estimation in Random-Order and Multi-pass Streaming
- Low-storage quantile estimation
- Stream Order and Order Statistics: Quantile Estimation in Random-Order Streams
- Optimal approximations of the frequency moments of data streams
- Efficient stream sampling for variance-optimal estimation of subset sums
- On Estimating Frequency Moments of Data Streams
- Space-efficient estimation of statistics over sub-sampled streams
Cited In (3)