SciChain: Blockchain-enabled lightweight and efficient data provenance for reproducible scientific computing

Abdullah Al-Mamun, Feng Yan, Dongfang Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations

Abstract

The state-of-the-art approach to auditing and reproducing scientific applications on high-performance computing (HPC) systems is a data provenance subsystem. While recent advances in data provenance focus on reducing performance overhead and improving the flexibility of user queries, the fidelity of data provenance is often overlooked: there is no way to ensure that the provenance data itself has not been fabricated or falsified. This paper advocates leveraging blockchains to deliver immutable and autonomous data provenance services so that scientific discoveries are trustworthy. The challenges of adopting blockchains in HPC include designing a new blockchain architecture compatible with HPC platforms and, more importantly, a set of new consensus protocols for scientific applications atop blockchains. To this end, we have designed the proof-of-scalable-traceability (POST) protocol and implemented it in a blockchain prototype, SciChain, the first practical blockchain system for provenance services on HPC. We evaluated SciChain by comparing it with multiple state-of-the-art systems; experimental results showed that SciChain guaranteed trustworthy data provenance while incurring orders of magnitude lower overhead than existing solutions.

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE 37th International Conference on Data Engineering, ICDE 2021
Publisher: IEEE Computer Society
Pages: 1853-1858
Number of pages: 6
ISBN (Electronic): 9781728191843
State: Published - Apr 2021
Externally published: Yes
Event: 37th IEEE International Conference on Data Engineering, ICDE 2021 - Virtual, Chania, Greece
Duration: Apr 19 2021 → Apr 22 2021

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2021-April
ISSN (Print): 1084-4627

Conference

Conference: 37th IEEE International Conference on Data Engineering, ICDE 2021
Country/Territory: Greece
City: Virtual, Chania
Period: 4/19/21 → 4/22/21

Keywords

  • Blockchain
  • Fault tolerance
  • HPC
  • Provenance

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
