Smart streaming: A high-throughput fault-tolerant online processing system

Jia Guo, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citations

Abstract

In recent years, there has been considerable interest in developing frameworks for processing streaming data. Like the precursor commercial systems for data-intensive processing, these systems have largely not used methods popular within the HPC community (for example, MPI for communication). In this paper, we demonstrate a system for stream processing that offers a high-level API to the users (similar to MapReduce), is fault-tolerant, and is also more efficient and scalable than current solutions. In particular, a cost-efficient MPI/OpenMP-based fault-tolerance scheme is incorporated so that the system can survive node failures with only a modest degradation of performance. We evaluate both the functionality and efficiency of Smart Streaming using four common applications in machine learning and data analytics. A comparison against state-of-the-art streaming frameworks shows that our system boosts the throughput of test cases by up to 10X and achieves desirable parallelism when scaled out. Additionally, the performance loss upon failures is only proportional to the share of failed resources.

Original language: English (US)
Title of host publication: Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 396-405
Number of pages: 10
ISBN (Electronic): 9781728174457
State: Published - May 2020
Event: 34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020 - New Orleans, United States
Duration: May 18 2020 → May 22 2020

Publication series

Name: Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020

Conference

Conference: 34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
Country/Territory: United States
City: New Orleans
Period: 5/18/20 → 5/22/20

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
