Distributed balanced partitioning via linear embedding (Q2005567): Difference between revisions

Summary: Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems, for example, in some cases, a big graph can be chopped into pieces that fit on one machine to be processed independently before stitching the results together, leading to certain suboptimality from the interaction among different pieces. In other cases, links between different parts may show up in the running time and/or network communications cost, hence the desire to have small cut size. We study a distributed balanced-partitioning problem where the goal is to partition the vertices of a given graph into \(k\) pieces so as to minimize the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks such as MapReduce. The algorithm first embeds nodes of the graph onto a line, and then processes nodes in a distributed manner guided by the \textit{linear embedding} order. We examine various ways to find the first embedding, for example, via a hierarchical clustering or Hilbert curves. Then we apply four different techniques including local swaps, and minimum cuts on the boundaries of partitions, as well as contraction and dynamic programming. As our empirical study, we compare the above techniques with each other, and also to previous work in distributed graph algorithms, for example, a label-propagation method, FENNEL and Spinner. We report our results both on a private map graph and several public social networks, and show that our results beat previous distributed algorithms: For instance, compared to the label-propagation algorithm, we report an improvement of 15--25\% in the cut value. We also observe that our algorithms admit scalable distributed implementation for any number of partitions. Finally, we explain three applications of this work at Google: (1) Balanced partitioning is used to route multi-term queries to different replicas in Google Search backend in a way that reduces the cache miss rates by \(\approx 0.5\%\), which leads to a double-digit gain in throughput of production clusters. (2) Applied to the Google Maps Driving Directions, balanced partitioning minimizes the number of cross-\textit{shard} queries with the goal of saving in CPU usage. This system achieves load balancing by dividing the \textit{world graph} into several ``shards''. Live experiments demonstrate an \(\approx 40\%\) drop in the number of cross-shard queries when compared to a standard geography-based method. (3) In a job scheduling problem for our data centers, we use balanced partitioning to evenly distribute the work while minimizing the amount of communication across geographically distant servers. In fact, the hierarchical nature of our solution goes well with the layering of data center servers, where certain machines are closer to each other and have faster links to one another.

0 references

zbMATH Keywords

cut minimization

0 references

embedding to line

0 references

imbalance

0 references

local improvement

0 references

MapReduce

0 references

maps

0 references

partitioning

0 references

social networks

0 references

describes a project that uses

0 references

0 references

0 references

0 references

MaRDI publication profile

0 references

cites work

Balanced graph partitioning

0 references

Q4198056

0 references

Exact Combinatorial Branch-and-Bound for Graph Bisection

0 references

Streaming Balanced Graph Partitioning Algorithms for Random Graphs

0 references

A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs

0 references

A Polylogarithmic Approximation of the Minimum Bisection

0 references

Global optimization with non-convex constraints. Sequential and parallel algorithms

0 references

Towards optimal locality in mesh-indexings

0 references

Optimal linear arrangement of a rectangular grid

0 references

New Approximation Techniques for Some Linear Ordering Problems

0 references

An improved approximation ratio for the minimum linear arrangement problem

0 references

\(\ell ^2_2\) spreading metrics for vertex ordering problems

0 references

Identifiers

zbMATH Open document ID

1461.68141

0 references

DOI

10.3390/a12080162

0 references

Mathematics Subject Classification ID

0 references

0 references

0 references

0 references

0 references

0 references

0 references

0 references

0 references

Sitelinks

Mathematics(1 entry)

mardi Publication:2005567

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2966909581
@@ Property / OpenAlex ID: W2966909581 / rank @@
+Normal rank
@@ Property / arXiv ID @@
+.02727
@@ Property / arXiv ID: 1512.02727 / rank @@
+Normal rank
@@ Property / cites work @@
+Balanced graph partitioning
@@ Property / cites work: Balanced graph partitioning / rank @@
+Normal rank
@@ Property / cites work @@
+Q4198056
@@ Property / cites work: Q4198056 / rank @@
+Normal rank
@@ Property / cites work @@
+Exact Combinatorial Branch-and-Bound for Graph Bisection
+Normal rank
@@ Property / cites work @@
+Streaming Balanced Graph Partitioning Algorithms for Random Graphs
+Normal rank
@@ Property / cites work @@
+A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
+Normal rank
@@ Property / cites work @@
+A Polylogarithmic Approximation of the Minimum Bisection
+Normal rank
@@ Property / cites work @@
+Global optimization with non-convex constraints. Sequential and parallel algorithms
+Normal rank
@@ Property / cites work @@
+Towards optimal locality in mesh-indexings
@@ Property / cites work: Towards optimal locality in mesh-indexings / rank @@
+Normal rank
@@ Property / cites work @@
+Optimal linear arrangement of a rectangular grid
@@ Property / cites work: Optimal linear arrangement of a rectangular grid / rank @@
+Normal rank
@@ Property / cites work @@
+New Approximation Techniques for Some Linear Ordering Problems
+Normal rank
@@ Property / cites work @@
+An improved approximation ratio for the minimum linear arrangement problem
+Normal rank
@@ Property / cites work @@
+\(\ell ^2_2\) spreading metrics for vertex ordering problems
+Normal rank