Distributed balanced partitioning via linear embedding
From MaRDI portal
Programming involving graphs or networks (90C35) Graph algorithms (graph-theoretic aspects) (05C85) Graph theory (including graph drawing) in computer science (68R10) Social networks; opinion dynamics (91D30) Edge subsets with special properties (factorization, matching, partitioning, covering and packing, etc.) (05C70) Distributed algorithms (68W15)
Abstract: Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems, e.g., in some cases, a big graph can be chopped into pieces that fit on one machine to be processed independently before stitching the results together. In other cases, links between different parts may show up in the running time and/or network communications cost. We study a distributed balanced partitioning problem where the goal is to partition the vertices of a given graph into k pieces so as to minimize the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks. The algorithm first embeds nodes of the graph onto a line, and then processes nodes in a distributed manner guided by the linear embedding order. We examine various ways to find the first embedding, e.g., via a hierarchical clustering or Hilbert curves. Then we apply four different techniques including local swaps, minimum cuts on the boundaries of partitions, as well as contraction and dynamic programming. As our empirical study, we compare the above techniques with each other, and also to previous work in distributed graph algorithms, e.g., a label propagation method, FENNEL and Spinner. We report our results both on a private map graph and several public social networks, and show that our results beat previous distributed algorithms: e.g., compared to the label propagation algorithm, we report an improvement of 15-25% in the cut value. We also observe that our algorithms allow for scalable distributed implementation for any number of partitions. Finally, we apply our techniques for the Google Maps Driving Directions to minimize the number of multi-shard queries with the goal of saving in CPU usage. During live experiments, we observe an ~40% drop in the number of multi-shard queries when comparing our method with a standard geography-based method.
Recommendations
- Graph partitioning for scalable distributed graph computations
- Streaming balanced graph partitioning algorithms for random graphs
- Algorithms for the Balanced Edge Partitioning Problem
- Impact of minimum-cut density-balanced partitioning solutions in distributed webpage ranking
- Scalable edge partitioning
Cites work
- scientific article; zbMATH DE number 3639144 (Why is no real title available?)
- A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
- A polylogarithmic approximation of the minimum bisection
- An improved approximation ratio for the minimum linear arrangement problem
- Balanced graph partitioning
- Exact combinatorial branch-and-bound for graph bisection
- Global optimization with non-convex constraints. Sequential and parallel algorithms
- New Approximation Techniques for Some Linear Ordering Problems
- Optimal linear arrangement of a rectangular grid
- Streaming balanced graph partitioning algorithms for random graphs
- Towards optimal locality in mesh-indexings
- \(\ell ^2_2\) spreading metrics for vertex ordering problems
Cited in
(10)- On parameterized approximation algorithms for balanced clustering
- Streaming balanced graph partitioning algorithms for random graphs
- Optimizing streaming graph partitioning via a heuristic greedy method and caching strategy
- Design and analysis of bipartite experiments under a linear exposure-response model
- Balanced graph partitioning based on mixed 0-1 linear programming and iteration vertex relocation algorithm
- Scalable edge partitioning
- Algorithms for the Balanced Edge Partitioning Problem
- Graph partitioning for scalable distributed graph computations
- Improved parameterized approximation for balanced \(k\)-median
- Approximation algorithms for the load-balanced capacitated vehicle routing problem
This page was built for publication: Distributed balanced partitioning via linear embedding
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2005567)