TY - GEN
T1 - Sampling-based Sparse Format Selection on GPUs
AU - Zhu, Gangyi
AU - Agrawal, Gagan
N1 - Funding Information:
Acknowledgements: This work was partially supported by the following NSF grants: 1629392, 2007793, 2034850, 2131509, and 2018627.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Sparse Matrix-Vector Multiplication (SpMV) is an important kernel in numerous computational disciplines. The overall performance of SpMV is highly dependent on the storage format of the sparse matrix. This has led to much interest in recent years in automatically choosing the appropriate format, typically by applying machine learning techniques and training a model on a large number of matrices. However, these methods have limitations in practice: besides the dependency on obtaining a large number of sparse matrices for training and the expensive overhead of training, they usually have limited prediction ability across architectures. In this paper, we take a very different approach to the same problem. This approach involves obtaining samples from the original matrix, executing the kernel using these samples, and selecting the best format. However, our approach requires obtaining representative samples that can help understand the performance associated with using a specific format on the full matrix, which turns out to be challenging. Based on the storage properties and processing granularity associated with different formats, we develop three novel sampling schemes: Row Cropping sampling, Random Warp sampling, and Diagonal Aligning sampling. These sampling methods are designed by observing that certain factors tend to be critical for the performance associated with a particular format, and thus preserving each such factor through sampling. Experimental results using nearly 2000 matrices demonstrate that our approach delivers high efficiency without the expensive training process, and it is easy to migrate across architectures. At the same time, our approach achieves prediction accuracy comparable to state-of-the-art methodologies, and even outperforms them in certain cases (especially when predicting on some of the largest matrices we use). Through our work, we also offer new insights into the performance achieved using different formats on GPUs.
AB - Sparse Matrix-Vector Multiplication (SpMV) is an important kernel in numerous computational disciplines. The overall performance of SpMV is highly dependent on the storage format of the sparse matrix. This has led to much interest in recent years in automatically choosing the appropriate format, typically by applying machine learning techniques and training a model on a large number of matrices. However, these methods have limitations in practice: besides the dependency on obtaining a large number of sparse matrices for training and the expensive overhead of training, they usually have limited prediction ability across architectures. In this paper, we take a very different approach to the same problem. This approach involves obtaining samples from the original matrix, executing the kernel using these samples, and selecting the best format. However, our approach requires obtaining representative samples that can help understand the performance associated with using a specific format on the full matrix, which turns out to be challenging. Based on the storage properties and processing granularity associated with different formats, we develop three novel sampling schemes: Row Cropping sampling, Random Warp sampling, and Diagonal Aligning sampling. These sampling methods are designed by observing that certain factors tend to be critical for the performance associated with a particular format, and thus preserving each such factor through sampling. Experimental results using nearly 2000 matrices demonstrate that our approach delivers high efficiency without the expensive training process, and it is easy to migrate across architectures. At the same time, our approach achieves prediction accuracy comparable to state-of-the-art methodologies, and even outperforms them in certain cases (especially when predicting on some of the largest matrices we use). Through our work, we also offer new insights into the performance achieved using different formats on GPUs.
KW - GPUs
KW - Sampling
KW - Sparse Computations
UR - http://www.scopus.com/inward/record.url?scp=85124373821&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124373821&partnerID=8YFLogxK
U2 - 10.1109/SBAC-PAD53543.2021.00031
DO - 10.1109/SBAC-PAD53543.2021.00031
M3 - Conference contribution
AN - SCOPUS:85124373821
T3 - Proceedings - Symposium on Computer Architecture and High Performance Computing
SP - 198
EP - 208
BT - Proceedings - 2021 IEEE 33rd International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021
PB - IEEE Computer Society
T2 - 33rd IEEE International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2021
Y2 - 26 October 2021 through 29 October 2021
ER -