Coordinated cooperative work using undependable processors with unreliable broadcast

Seda Davtyan, Roberto De Prisco, Chryssis Georgiou, Alexander A. Shvartsman

Research output: Contribution to conference > Paper > peer-review

2 Scopus citations

Abstract

With the end of Moore's Law in sight, parallelism has become the main means for speeding up computationally intensive applications, especially in cases where large collections of tasks need to be performed. Network supercomputing, which takes advantage of very large numbers of computers in a distributed environment, is an effective approach to massive parallelism that harnesses the processing power inherent in large networked settings. In such settings, processor failures are no longer the exception but the norm, so any algorithm designed for realistic settings must be able to deal with failures. This paper presents a new message-passing algorithm for distributed cooperative work in synchronous settings where processors may crash, and where any broadcasts performed by crashing processors are unreliable. We specify the algorithm, prove that it is correct, and perform extensive simulations showing that its performance is close to that of similar algorithms that use reliable broadcast, and that its work compares favorably to the relevant lower bounds.

Original language: English (US)
Pages: 17-26
Number of pages: 10
State: Published - 2014
Externally published: Yes
Event: 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2014 - Turin, Italy
Duration: Feb 12 2014 - Feb 14 2014

Conference

Conference: 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2014
Country/Territory: Italy
City: Turin
Period: 2/12/14 - 2/14/14

Keywords

  • distributed algorithms
  • fault-tolerance
  • processor crashes
  • task computing
  • unreliable broadcast

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software
