Distributed balanced partitioning via linear embedding (Q2005567): Difference between revisions

From MaRDI portal
Changed an Item
ReferenceBot (talk | contribs)
Changed an Item
 
(6 intermediate revisions by 4 users not shown)
Property / describes a project that uses
 
Property / describes a project that uses: MapReduce / rank
 
Normal rank
Property / describes a project that uses
 
Property / describes a project that uses: Pregel / rank
 
Normal rank
Property / describes a project that uses
 
Property / describes a project that uses: PEGASUS / rank
 
Normal rank
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2966909581 / rank
 
Normal rank
Property / arXiv ID
 
Property / arXiv ID: 1512.02727 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Balanced graph partitioning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4198056 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Exact Combinatorial Branch-and-Bound for Graph Bisection / rank
 
Normal rank
Property / cites work
 
Property / cites work: Streaming Balanced Graph Partitioning Algorithms for Random Graphs / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Polylogarithmic Approximation of the Minimum Bisection / rank
 
Normal rank
Property / cites work
 
Property / cites work: Global optimization with non-convex constraints. Sequential and parallel algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Towards optimal locality in mesh-indexings / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal linear arrangement of a rectangular grid / rank
 
Normal rank
Property / cites work
 
Property / cites work: New Approximation Techniques for Some Linear Ordering Problems / rank
 
Normal rank
Property / cites work
 
Property / cites work: An improved approximation ratio for the minimum linear arrangement problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: \(\ell ^2_2\) spreading metrics for vertex ordering problems / rank
 
Normal rank

Latest revision as of 18:14, 23 July 2024

scientific article
Language Label Description Also known as
English
Distributed balanced partitioning via linear embedding
scientific article

    Statements

    Distributed balanced partitioning via linear embedding (English)
    0 references
    0 references
    0 references
    0 references
    0 references
    8 October 2020
    0 references
    Summary: Balanced partitioning is often a crucial first step in solving large-scale graph optimization problems, for example, in some cases, a big graph can be chopped into pieces that fit on one machine to be processed independently before stitching the results together, leading to certain suboptimality from the interaction among different pieces. In other cases, links between different parts may show up in the running time and/or network communications cost, hence the desire to have small cut size. We study a distributed balanced-partitioning problem where the goal is to partition the vertices of a given graph into \(k\) pieces so as to minimize the total cut size. Our algorithm is composed of a few steps that are easily implementable in distributed computation frameworks such as MapReduce. The algorithm first embeds nodes of the graph onto a line, and then processes nodes in a distributed manner guided by the \textit{linear embedding} order. We examine various ways to find the first embedding, for example, via a hierarchical clustering or Hilbert curves. Then we apply four different techniques including local swaps, and minimum cuts on the boundaries of partitions, as well as contraction and dynamic programming. As our empirical study, we compare the above techniques with each other, and also to previous work in distributed graph algorithms, for example, a label-propagation method, FENNEL and Spinner. We report our results both on a private map graph and several public social networks, and show that our results beat previous distributed algorithms: For instance, compared to the label-propagation algorithm, we report an improvement of 15--25\% in the cut value. We also observe that our algorithms admit scalable distributed implementation for any number of partitions. Finally, we explain three applications of this work at Google: (1) Balanced partitioning is used to route multi-term queries to different replicas in Google Search backend in a way that reduces the cache miss rates by \(\approx 0.5\%\), which leads to a double-digit gain in throughput of production clusters. (2) Applied to the Google Maps Driving Directions, balanced partitioning minimizes the number of cross-\textit{shard} queries with the goal of saving in CPU usage. This system achieves load balancing by dividing the \textit{world graph} into several ``shards''. Live experiments demonstrate an \(\approx 40\%\) drop in the number of cross-shard queries when compared to a standard geography-based method. (3) In a job scheduling problem for our data centers, we use balanced partitioning to evenly distribute the work while minimizing the amount of communication across geographically distant servers. In fact, the hierarchical nature of our solution goes well with the layering of data center servers, where certain machines are closer to each other and have faster links to one another.
    0 references
    cut minimization
    0 references
    embedding to line
    0 references
    imbalance
    0 references
    local improvement
    0 references
    MapReduce
    0 references
    maps
    0 references
    partitioning
    0 references
    social networks
    0 references
    0 references
    0 references
    0 references
    0 references

    Identifiers

    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references