Coordinated cooperative task computing using crash-prone processors with unreliable multicast

Seda Davtyan, Roberto De Prisco, Chryssis Georgiou, Theophanis Hadjistasi, Alexander Allister Schwarzmann

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a new message-passing algorithm, called Do-UM, for distributed cooperative task computing in synchronous settings where processors may crash, and where any multicasts (or broadcasts) performed by crashing processors are unreliable. We specify the algorithm, prove its correctness and analyse its complexity. We show that its worst-case number of available processor steps is S = Θ(t + n [Formula presented] + f(n − f)) and that the number of messages sent is less than n²t + [Formula presented], where n is the number of processors, t is the number of tasks to be executed and f is the number of failures. To assess the performance of the algorithm in practical scenarios, we perform an experimental evaluation on a planetary-scale distributed platform. This also allows us to compare our algorithm with the currently best algorithm, which is, however, explicitly designed to use reliable multicast; the results suggest that our algorithm does not lose much efficiency in order to cope with unreliable multicast.

Original language: English (US)
Pages (from-to): 272-285
Number of pages: 14
Journal: Journal of Parallel and Distributed Computing
Volume: 109
State: Published - 2017

Keywords

  • Crash faults
  • Fault-tolerant distributed algorithms
  • Task computing
  • Unreliable multicast

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Artificial Intelligence
