Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms

Ruoming Jin; Zhen Peng; Wendell Wu; Feodor Dragan; Gagan Agrawal; Bin Ren

doi:10.1145/3392717.3392745

Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms

Ruoming Jin, Zhen Peng, Wendell Wu, Feodor Dragan, Gagan Agrawal, Bin Ren

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

To help compute shortest path distances over large graphs efficiently, 2-hop labeling has emerged as a major tool, with Pruned Landmark Labeling (PPL) as a popular algorithm. This paper demonstrates the first scalable parallel implementation of the PPL algorithm that produces the same results as the sequential algorithm. Based on theoretical analysis, we show how computations on each vertex can be performed in parallel while maintaining correctness, resulting in the Vertex-Centrix PLL (VC-PLL) algorithm. We also show a formulation of this algorithm based on linear algebra and argue why the use of a library based on linear algebra operations will not produce an efficient implementation. Next, we introduce a batched VC-PLL (BVC-PLL) algorithm to reduce the computational inefficiency in VC-PLL. We have carried out a parallel implementation of this method for modern clusters, combining shared memory and distributed memory parallelism, that can efficiently execute on graphs with more than a billion edges. We also demonstrate how BVC-PLL algorithm can be extended to handle directed graphs and weighted graphs and how the version for weighted graphs can benefit from SIMD parallelization.

Original language	English (US)
Title of host publication	Proceedings of the 34th ACM International Conference on Supercomputing, ICS 2020
Publisher	Association for Computing Machinery
ISBN (Electronic)	9781450379830
DOIs	https://doi.org/10.1145/3392717.3392745
State	Published - Jun 29 2020
Event	34th ACM International Conference on Supercomputing, ICS 2020 - Barcelona, Spain Duration: Jun 29 2020 → Jul 2 2020

Publication series

Name	Proceedings of the International Conference on Supercomputing

Conference

Conference	34th ACM International Conference on Supercomputing, ICS 2020
Country/Territory	Spain
City	Barcelona
Period	6/29/20 → 7/2/20

Keywords

dependency resolving
multi-level parallelization
parallel graph algorithms

ASJC Scopus subject areas

General Computer Science

Access to Document

10.1145/3392717.3392745

Cite this

Jin, R., Peng, Z., Wu, W., Dragan, F., Agrawal, G., & Ren, B. (2020). Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms. In Proceedings of the 34th ACM International Conference on Supercomputing, ICS 2020 (Proceedings of the International Conference on Supercomputing). Association for Computing Machinery. https://doi.org/10.1145/3392717.3392745

Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms. / Jin, Ruoming; Peng, Zhen; Wu, Wendell et al.
Proceedings of the 34th ACM International Conference on Supercomputing, ICS 2020. Association for Computing Machinery, 2020. (Proceedings of the International Conference on Supercomputing).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Jin, R, Peng, Z, Wu, W, Dragan, F, Agrawal, G & Ren, B 2020, Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms. in Proceedings of the 34th ACM International Conference on Supercomputing, ICS 2020. Proceedings of the International Conference on Supercomputing, Association for Computing Machinery, 34th ACM International Conference on Supercomputing, ICS 2020, Barcelona, Spain, 6/29/20. https://doi.org/10.1145/3392717.3392745

@inproceedings{4fd21bf3430a43688fe282879f9d784f,

title = "Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms",

abstract = "To help compute shortest path distances over large graphs efficiently, 2-hop labeling has emerged as a major tool, with Pruned Landmark Labeling (PPL) as a popular algorithm. This paper demonstrates the first scalable parallel implementation of the PPL algorithm that produces the same results as the sequential algorithm. Based on theoretical analysis, we show how computations on each vertex can be performed in parallel while maintaining correctness, resulting in the Vertex-Centrix PLL (VC-PLL) algorithm. We also show a formulation of this algorithm based on linear algebra and argue why the use of a library based on linear algebra operations will not produce an efficient implementation. Next, we introduce a batched VC-PLL (BVC-PLL) algorithm to reduce the computational inefficiency in VC-PLL. We have carried out a parallel implementation of this method for modern clusters, combining shared memory and distributed memory parallelism, that can efficiently execute on graphs with more than a billion edges. We also demonstrate how BVC-PLL algorithm can be extended to handle directed graphs and weighted graphs and how the version for weighted graphs can benefit from SIMD parallelization.",

keywords = "dependency resolving, multi-level parallelization, parallel graph algorithms",

author = "Ruoming Jin and Zhen Peng and Wendell Wu and Feodor Dragan and Gagan Agrawal and Bin Ren",

note = "Funding Information: and/or technical support that have contributed to the results reported within this paper. The work was also partially supported by the Ohio Supercomputer Center under Grant no. PGS0218. Publisher Copyright: {\textcopyright} 2020 ACM. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.; 34th ACM International Conference on Supercomputing, ICS 2020 ; Conference date: 29-06-2020 Through 02-07-2020",

year = "2020",

month = jun,

day = "29",

doi = "10.1145/3392717.3392745",

language = "English (US)",

series = "Proceedings of the International Conference on Supercomputing",

publisher = "Association for Computing Machinery",

booktitle = "Proceedings of the 34th ACM International Conference on Supercomputing, ICS 2020",

}

TY - GEN

T1 - Parallelizing pruned landmark labeling

T2 - 34th ACM International Conference on Supercomputing, ICS 2020

AU - Jin, Ruoming

AU - Peng, Zhen

AU - Wu, Wendell

AU - Dragan, Feodor

AU - Agrawal, Gagan

AU - Ren, Bin

N1 - Funding Information: and/or technical support that have contributed to the results reported within this paper. The work was also partially supported by the Ohio Supercomputer Center under Grant no. PGS0218. Publisher Copyright: © 2020 ACM. Copyright: Copyright 2020 Elsevier B.V., All rights reserved.

PY - 2020/6/29

Y1 - 2020/6/29

N2 - To help compute shortest path distances over large graphs efficiently, 2-hop labeling has emerged as a major tool, with Pruned Landmark Labeling (PPL) as a popular algorithm. This paper demonstrates the first scalable parallel implementation of the PPL algorithm that produces the same results as the sequential algorithm. Based on theoretical analysis, we show how computations on each vertex can be performed in parallel while maintaining correctness, resulting in the Vertex-Centrix PLL (VC-PLL) algorithm. We also show a formulation of this algorithm based on linear algebra and argue why the use of a library based on linear algebra operations will not produce an efficient implementation. Next, we introduce a batched VC-PLL (BVC-PLL) algorithm to reduce the computational inefficiency in VC-PLL. We have carried out a parallel implementation of this method for modern clusters, combining shared memory and distributed memory parallelism, that can efficiently execute on graphs with more than a billion edges. We also demonstrate how BVC-PLL algorithm can be extended to handle directed graphs and weighted graphs and how the version for weighted graphs can benefit from SIMD parallelization.

AB - To help compute shortest path distances over large graphs efficiently, 2-hop labeling has emerged as a major tool, with Pruned Landmark Labeling (PPL) as a popular algorithm. This paper demonstrates the first scalable parallel implementation of the PPL algorithm that produces the same results as the sequential algorithm. Based on theoretical analysis, we show how computations on each vertex can be performed in parallel while maintaining correctness, resulting in the Vertex-Centrix PLL (VC-PLL) algorithm. We also show a formulation of this algorithm based on linear algebra and argue why the use of a library based on linear algebra operations will not produce an efficient implementation. Next, we introduce a batched VC-PLL (BVC-PLL) algorithm to reduce the computational inefficiency in VC-PLL. We have carried out a parallel implementation of this method for modern clusters, combining shared memory and distributed memory parallelism, that can efficiently execute on graphs with more than a billion edges. We also demonstrate how BVC-PLL algorithm can be extended to handle directed graphs and weighted graphs and how the version for weighted graphs can benefit from SIMD parallelization.

KW - dependency resolving

KW - multi-level parallelization

KW - parallel graph algorithms

UR - http://www.scopus.com/inward/record.url?scp=85088499114&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85088499114&partnerID=8YFLogxK

U2 - 10.1145/3392717.3392745

DO - 10.1145/3392717.3392745

M3 - Conference contribution

AN - SCOPUS:85088499114

T3 - Proceedings of the International Conference on Supercomputing

BT - Proceedings of the 34th ACM International Conference on Supercomputing, ICS 2020

PB - Association for Computing Machinery

Y2 - 29 June 2020 through 2 July 2020

ER -

Parallelizing pruned landmark labeling: Dealing with dependencies in graph algorithms

Abstract

Publication series

Conference

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this