Sublinear-time algorithms for counting star subgraphs via edge sampling

Publication:1709591

DOI: 10.1007/S00453-017-0287-3; zbMATH Open: 1391.68120; arXiv: 1601.04233; OpenAlex: W2586277680; MaRDI QID: Q1709591; FDO: Q1709591


Authors: M. Aliakbarpour, Amartya Shankha Biswas, Themis Gouleakis, John Peebles, Ronitt Rubinfeld, Anak Yodpinyanee


Publication date: 6 April 2018

Published in: Algorithmica

Abstract: We study the problem of estimating the value of sums of the form $S_p \triangleq \sum \binom{x_i}{p}$ when one has the ability to sample $x_i \geq 0$ with probability proportional to its magnitude. When $p=2$, this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when $\{x_i\}$ is the degree sequence of a graph, which corresponds to counting the number of $p$-stars in a graph when one has the ability to sample edges randomly. Our algorithm for a $(1 \pm \varepsilon)$-multiplicative approximation of $S_p$ has query and time complexities $O\left(\frac{m \log \log n}{\varepsilon^2 S_p^{1/p}}\right)$. Here, $m = \sum x_i / 2$ is the number of edges in the graph, or equivalently, half the number of records in the database table. Similarly, $n$ is the number of vertices in the graph and the number of unique values in the database table. We also provide tight lower bounds (up to polylogarithmic factors) in almost all cases, even when $\{x_i\}$ is a degree sequence and one is allowed to use the structure of the graph to try to get a better estimate. We are not aware of any prior lower bounds on the problem of join selectivity estimation. For the graph problem, prior work which assumed the ability to sample only vertices uniformly gave algorithms with matching lower bounds [Gonen, Ron, and Shavitt. SIAM J. Comput., 25 (2011), pp. 1365-1411]. With the ability to sample edges randomly, we show that one can achieve faster algorithms for approximating the number of star subgraphs, bypassing the lower bounds in this prior work. For example, in the regime where $S_p \leq n$ and $p=2$, our upper bound is $\tilde{O}(n/S_p^{1/2})$, in contrast to their $\Omega(n/S_p^{1/3})$ lower bound when no random edge queries are available.
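Illustration (not taken from the paper): the quantity in the abstract is the number of p-stars, S_p = sum over vertices of C(x_i, p), where x_i is the degree of vertex i. The minimal Python sketch below computes S_p exactly and also shows a naive importance-sampling estimator that relies on random edge samples. It is an assumption-laden toy, not the authors' algorithm: the function names are hypothetical, the degree table is precomputed in linear time purely for demonstration (a sublinear algorithm would use degree queries instead), and no variance control of the kind needed for the stated complexity bound is attempted. The idea is that a uniformly random endpoint of a uniformly random edge is vertex v with probability d_v / (2m), so (2m / d_v) * C(d_v, p) has expectation S_p.

```python
import math
import random
from collections import Counter


def exact_star_count(degrees, p):
    """Return S_p = sum_i C(x_i, p) for a degree sequence (exact, linear time)."""
    return sum(math.comb(d, p) for d in degrees)


def naive_star_estimate(edges, p, num_samples, rng):
    """Naive unbiased estimate of S_p from uniform edge samples.

    Hypothetical helper for illustration only. Each round samples an edge
    uniformly, picks one endpoint w at random (so Pr[w] = d_w / (2m)), and
    averages the importance-weighted value (2m / d_w) * C(d_w, p).
    """
    # Assumption: degrees are tabulated up front to keep the sketch short;
    # this stands in for a degree-query oracle.
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    m = len(edges)
    total = 0.0
    for _ in range(num_samples):
        u, v = rng.choice(edges)              # uniform random edge
        w = u if rng.random() < 0.5 else v    # uniform random endpoint
        d = degree[w]
        total += (2 * m / d) * math.comb(d, p)
    return total / num_samples


# Usage: a star with one center and four leaves has C(4, 2) = 6 two-stars.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
exact = exact_star_count([4, 1, 1, 1, 1], 2)
approx = naive_star_estimate(star_edges, 2, 20000, random.Random(0))
```

On the star graph each sample contributes either 12 (center chosen) or 0 (leaf chosen), each with probability 1/2, so the average concentrates around the exact value 6. On a regular graph every sample contributes the same value, so the estimate is exact after a single sample.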


Full work available at URL: https://arxiv.org/abs/1601.04233




Cited In (9)





