Multiscale matrix sampling and sublinear-time PageRank computation
From MaRDI portal
Publication:4985347
Abstract: A fundamental problem arising in many applications in Web science and social network analysis is, given an arbitrary approximation factor , to output a set of nodes that with high probability contains all nodes of PageRank at least , and no node of PageRank smaller than . We call this problem {sc SignificantPageRanks}. We develop a nearly optimal, local algorithm for the problem with runtime complexity on networks with nodes. We show that any algorithm for solving this problem must have runtime of , rendering our algorithm optimal up to logarithmic factors. Our algorithm comes with two main technical contributions. The first is a multi-scale sampling scheme for a basic matrix problem that could be of interest on its own. In the abstract matrix problem it is assumed that one can access an unknown {em right-stochastic matrix} by querying its rows, where the cost of a query and the accuracy of the answers depend on a precision parameter . At a cost propositional to , the query will return a list of entries and their indices that provide an -precision approximation of the row. Our task is to find a set that contains all columns whose sum is at least , and omits any column whose sum is less than . Our multi-scale sampling scheme solves this problem with cost , while traditional sampling algorithms would take time . Our second main technical contribution is a new local algorithm for approximating personalized PageRank, which is more robust than the earlier ones developed in cite{JehW03,AndersenCL06} and is highly efficient particularly for networks with large in-degrees or out-degrees. Together with our multiscale sampling scheme we are able to optimally solve the {sc SignificantPageRanks} problem.
Recommendations
Cites work
- A Survey on PageRank Computing
- A local clustering algorithm for massive graphs and its application to nearly linear time graph partitioning
- An improved data stream summary: the count-min sketch and its applications
- Introduction to testing graph properties
- Local Computation of PageRank Contributions
- Monte Carlo Methods in PageRank Computation: When One Iteration is Sufficient
- Sparse Matrices in MATLAB: Design and Implementation
- Spectral algorithms
- Spectral methods for matrices and tensors
- Sublinear time algorithms
- Using PageRank to Characterize Web Structure
Cited in
(9)- Sublinear column-wise actions of the matrix exponential on social networks
- scientific article; zbMATH DE number 7559046 (Why is no real title available?)
- On approximating the stationary distribution of time-reversible Markov chains
- Deterministic coresets for stochastic matrices with applications to scalable sparse PageRank
- A sublinear time algorithm for PageRank computations
- A nearly-sublinear method for approximating a column of the matrix exponential for matrices from large, sparse networks
- Seeded PageRank solution paths
- Sublinear Algorithms for Local Graph-Centrality Estimation
- On approximating the stationary distribution of time-reversible Markov chains
This page was built for publication: Multiscale matrix sampling and sublinear-time PageRank computation
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4985347)