Sublinear-time algorithms for counting star subgraphs via edge sampling
From MaRDI portal
(Redirected from Publication:1709591)
Abstract: We study the problem of estimating the value of sums of the form when one has the ability to sample with probability proportional to its magnitude. When , this problem is equivalent to estimating the selectivity of a self-join query in database systems when one can sample rows randomly. We also study the special case when is the degree sequence of a graph, which corresponds to counting the number of -stars in a graph when one has the ability to sample edges randomly. Our algorithm for a -multiplicative approximation of has query and time complexities . Here, is the number of edges in the graph, or equivalently, half the number of records in the database table. Similarly, is the number of vertices in the graph and the number of unique values in the database table. We also provide tight lower bounds (up to polylogarithmic factors) in almost all cases, even when is a degree sequence and one is allowed to use the structure of the graph to try to get a better estimate. We are not aware of any prior lower bounds on the problem of join selectivity estimation. For the graph problem, prior work which assumed the ability to sample only emph{vertices} uniformly gave algorithms with matching lower bounds [Gonen, Ron, and Shavitt. extit{SIAM J. Comput.}, 25 (2011), pp. 1365-1411]. With the ability to sample edges randomly, we show that one can achieve faster algorithms for approximating the number of star subgraphs, bypassing the lower bounds in this prior work. For example, in the regime where , and , our upper bound is , in contrast to their lower bound when no random edge queries are available.
Recommendations
- A Simple Sublinear-Time Algorithm for Counting Arbitrary Subgraphs via Edge Sampling
- Counting stars and other small subgraphs in sublinear-time
- Counting stars and other small subgraphs in sublinear time
- Efficient and near-optimal algorithms for sampling connected subgraphs
- Efficient and Near-optimal Algorithms for Sampling Small Connected Subgraphs
- Faster algorithms for counting subgraphs in sparse graphs
- Faster Subgraph Counting in Sparse Graphs
- On counting and approximation variants of the subgraph connecting problem
- Estimating the number of connected components in a graph via subgraph sampling
- Graph characterization by counting sink star subgraphs
Cites work
- scientific article; zbMATH DE number 2119719 (Why is no real title available?)
- scientific article; zbMATH DE number 819814 (Why is no real title available?)
- A Fast Approximation Algorithm for Computing the Frequencies of Subgraphs in a Given Graph
- A near-optimal sublinear-time algorithm for approximating the minimum vertex cover size
- A sublinear-time approximation scheme for bin packing
- An improved data stream algorithm for frequency moments
- Approximate counting of cycles in streams
- Approximately counting triangles in sublinear time
- Approximating average parameters of graphs
- Approximating the minimum vertex cover in sublinear time and a connection to distributed algorithms
- Approximating the number of network motifs
- Balanced hashing, color coding and approximate counting
- Counting arbitrary subgraphs in data streams
- Counting stars and other small subgraphs in sublinear-time
- Counting subgraphs via homomorphisms
- Discovering and Exploiting Statistical Properties for Query Optimization in Relational Databases: A Survey
- Efficient triangle counting in large graphs via degree-based vertex partitioning
- Estimating Sum by Weighted Sampling
- Faster algorithms for finding and counting subgraphs
- Finding and counting given length cycles
- Finding paths of length \(k\) in \(O^{*}(2^k)\) time
- Finding, minimizing, and counting weighted subgraphs
- Improved constant-time approximation algorithms for maximum matchings and other optimization problems
- Local Graph Partitions for Approximation and Testing
- Motifs in evolving cooperative networks look like protein structure networks
- On Sums of Independent Random Variables with Unbounded Variance and Estimating the Average Degree in a Graph
- On the Communication Complexity Methodology for Proving Lower Bounds on the Query Complexity of Property Testing
- Optimal approximations of the frequency moments of data streams
- Property testing lower bounds via communication complexity
- Research in Computational Molecular Biology
- Selectivity and cost estimation for joins based on random sampling
- Simpler algorithm for estimating frequency moments of data streams
- Testing probability distributions underlying aggregated data
- The Parameterized Complexity of Counting Problems
- The space complexity of approximating the frequency moments
- Tracking join and self-join sizes in limited storage
Cited in
(9)- Sublinear time estimation of degree distribution moments: the arboricity connection
- A Simple Sublinear-Time Algorithm for Counting Arbitrary Subgraphs via Edge Sampling
- Lower bounds for approximating graph parameters via communication complexity
- Vector-Matrix-Vector Queries for Solving Linear Algebra, Statistics, and Graph Problems
- Counting stars and other small subgraphs in sublinear-time
- Counting stars and other small subgraphs in sublinear time
- The arboricity captures the complexity of sampling edges
- Motif estimation via subgraph sampling: the fourth-moment phenomenon
- Estimating the number of connected components in a graph via subgraph sampling
This page was built for publication: Sublinear-time algorithms for counting star subgraphs via edge sampling
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q1709591)