Scaling sparse matrix multiplication on CPU-GPU nodes

Yang Xia, Peng Jiang, Gagan Agrawal, Rajiv Ramnath

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

Multiplication of two sparse matrices (SpGEMM) is a popular kernel behind many numerical solvers and is also used to implement many common graph algorithms. Although many recent research efforts have focused on implementing SpGEMM efficiently on a single GPU, none of the existing work has considered the case where the memory requirement exceeds the size of GPU memory. Similarly, the use of the aggregate computing power of the CPU and GPU has also not been addressed for such large matrices. In this paper, we present a framework for scaling SpGEMM computations to matrices that do not fit into GPU memory. We address how the computation and data can be partitioned across kernel executions on the GPU. An important emphasis of our work is overlapping data movement with computation. We achieve this by addressing several challenges, such as avoiding dynamic memory allocations and rescheduling data transfers with the computation of chunks. We extend our framework to make efficient use of both the GPU and the CPU by developing an efficient work distribution strategy. Our evaluation on 9 large matrices shows that our out-of-core GPU implementation achieves 1.98-3.03x speedups over a state-of-the-art multi-core CPU implementation, that our hybrid implementation further achieves speedups of up to 3.74x, and that our design choices contribute directly to this performance.
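For illustration only, the general overlap pattern the abstract alludes to (device buffers allocated once up front, chunked transfers issued asynchronously, and kernels launched on the same stream as their input chunk) can be sketched with CUDA streams and double buffering. The chunk size, the process_chunk kernel, and the two-buffer scheme below are illustrative assumptions, not the partitioning or scheduling scheme described in the paper.

```cuda
// Minimal sketch: overlap host-device transfers with computation using
// two CUDA streams and double buffering. Chunk size, kernel body, and
// buffer layout are illustrative assumptions, not the paper's design.
#include <cuda_runtime.h>

__global__ void process_chunk(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;   // placeholder for per-chunk work
}

int main() {
    const int CHUNK = 1 << 20, NUM_CHUNKS = 8;
    float *h_in, *h_out;
    // Pinned host memory so cudaMemcpyAsync can actually overlap with kernels.
    cudaMallocHost(&h_in,  (size_t)CHUNK * NUM_CHUNKS * sizeof(float));
    cudaMallocHost(&h_out, (size_t)CHUNK * NUM_CHUNKS * sizeof(float));

    // Device buffers are allocated once before the loop, mirroring the goal
    // of avoiding dynamic memory allocation during chunked execution.
    float *d_in[2], *d_out[2];
    cudaStream_t stream[2];
    for (int b = 0; b < 2; ++b) {
        cudaMalloc(&d_in[b],  CHUNK * sizeof(float));
        cudaMalloc(&d_out[b], CHUNK * sizeof(float));
        cudaStreamCreate(&stream[b]);
    }

    for (int c = 0; c < NUM_CHUNKS; ++c) {
        int b = c % 2;  // alternate buffers/streams so the copy of chunk c+1
                        // can overlap with the computation of chunk c
        cudaMemcpyAsync(d_in[b], h_in + (size_t)c * CHUNK,
                        CHUNK * sizeof(float), cudaMemcpyHostToDevice, stream[b]);
        process_chunk<<<(CHUNK + 255) / 256, 256, 0, stream[b]>>>(d_in[b], d_out[b], CHUNK);
        cudaMemcpyAsync(h_out + (size_t)c * CHUNK, d_out[b],
                        CHUNK * sizeof(float), cudaMemcpyDeviceToHost, stream[b]);
    }
    cudaDeviceSynchronize();  // wait for all chunks before using h_out

    for (int b = 0; b < 2; ++b) {
        cudaFree(d_in[b]); cudaFree(d_out[b]); cudaStreamDestroy(stream[b]);
    }
    cudaFreeHost(h_in); cudaFreeHost(h_out);
    return 0;
}
```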

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE 35th International Parallel and Distributed Processing Symposium, IPDPS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 392-401
Number of pages: 10
ISBN (Electronic): 9781665440660
DOIs
State: Published - May 2021
Event: 35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021 - Virtual, Online
Duration: May 17, 2021 - May 21, 2021

Publication series

Name: Proceedings - 2021 IEEE 35th International Parallel and Distributed Processing Symposium, IPDPS 2021

Conference

Conference: 35th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2021
City: Virtual, Online
Period: 5/17/21 - 5/21/21

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
