In-memory Blockchain: Toward Efficient and Trustworthy Data Provenance for HPC Systems

Abdullah Al-Mamun, Tonglin Li, Mohammad Sadoghi, Dongfang Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Scopus citations

Abstract

State-of-the-art approaches for tracking data provenance on high-performance computing (HPC) systems are supported by either file systems or relational databases. These techniques draw the same critique regarding the fidelity of the provenance data and the associated I/O overhead. This paper envisions tracking HPC data provenance using a distributed in-memory ledger - the core technique leveraged by blockchains and proven to be highly trustworthy by many large-scale applications. We pinpoint two system challenges - storage architecture and consensus protocol - for adopting blockchains in HPC and make the following contributions: (i) we design a new in-memory blockchain architecture for HPC systems, exploiting the high-performance InfiniBand network infrastructure and greatly reducing the I/O overhead; and (ii) we develop a new consensus protocol, namely proof-of-reproducibility (PoR), crafted for the new architecture, which takes into account both proof-of-work (PoW) and proof-of-stake (PoS) mechanisms. The correctness of PoR is both theoretically proven and experimentally verified. A prototype system is implemented and evaluated with more than one million transactions, showing a 32× speedup compared to the filesystem-based provenance service and a speedup of four orders of magnitude compared to the database-based provenance service.
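
To make the idea of an in-memory ledger for provenance records more concrete, the following is a minimal single-process sketch of a hash-chained ledger held entirely in memory. It is an illustration only and not the architecture or the PoR protocol described in the paper: the class names (`Block`, `InMemoryLedger`), the record fields, and the use of SHA-256 over JSON are all assumptions made for this example, and it omits distribution, networking, and consensus.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One ledger entry. All field names are hypothetical."""
    index: int
    prev_hash: str
    record: dict  # illustrative provenance record, e.g. {"file": ..., "op": ...}

    def digest(self) -> str:
        # Hash a canonical JSON encoding so the digest is deterministic.
        payload = json.dumps(
            {"index": self.index, "prev_hash": self.prev_hash, "record": self.record},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InMemoryLedger:
    """Append-only hash chain kept in a Python list (no disk I/O)."""

    GENESIS_HASH = "0" * 64

    def __init__(self) -> None:
        self.blocks = []

    def append(self, record: dict) -> Block:
        # Each new block stores the digest of its predecessor.
        prev = self.blocks[-1].digest() if self.blocks else self.GENESIS_HASH
        block = Block(len(self.blocks), prev, record)
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        # Recompute the chain; any change to an earlier block breaks the
        # prev_hash link of its successor. The last block has no successor,
        # so tampering with it alone is not detected by this simple check.
        prev = self.GENESIS_HASH
        for i, block in enumerate(self.blocks):
            if block.index != i or block.prev_hash != prev:
                return False
            prev = block.digest()
        return True
```

A short usage example under the same assumptions: append two records with `ledger.append({"file": "out.dat", "op": "write"})`, call `ledger.verify()` to confirm the chain is intact, then modify the first block's record and call `ledger.verify()` again to see the broken link detected.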

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
Editors: Yang Song, Bing Liu, Kisung Lee, Naoki Abe, Calton Pu, Mu Qiao, Nesreen Ahmed, Donald Kossmann, Jeffrey Saltz, Jiliang Tang, Jingrui He, Huan Liu, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3808-3813
Number of pages: 6
ISBN (Electronic): 9781538650356
State: Published - Jan 22 2019
Externally published: Yes
Event: 2018 IEEE International Conference on Big Data, Big Data 2018 - Seattle, United States
Duration: Dec 10 2018 - Dec 13 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018

Conference

Conference: 2018 IEEE International Conference on Big Data, Big Data 2018
Country/Territory: United States
City: Seattle
Period: 12/10/18 - 12/13/18

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
