The Do-All problem with Byzantine processor failures

Antonio Fernández; Chryssis Georgiou; Alexander Russell; Alex A. Shvartsman

doi:10.1016/j.tcs.2004.06.034

The Do-All problem with Byzantine processor failures

Antonio Fernández, Chryssis Georgiou, Alexander Russell, Alex A. Shvartsman

Research output: Contribution to journal › Conference article › peer-review

10 Scopus citations

Abstract

Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models. This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified. The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.

Original language	English (US)
Pages (from-to)	433-454
Number of pages	22
Journal	Theoretical Computer Science
Volume	333
Issue number	3
DOIs	https://doi.org/10.1016/j.tcs.2004.06.034
State	Published - Mar 3 2005
Externally published	Yes
Event	Structural Information and Communication Complexity - Umea, Sweden Duration: Jun 18 2003 → Jun 20 2003

Keywords

Byzantine failures
Distributed cooperation
Independent tasks
Work complexity

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Access to Document

10.1016/j.tcs.2004.06.034

Cite this

@article{9e017b47d8a94296bf40309e8b094144,

title = "The Do-All problem with Byzantine processor failures",

abstract = "Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models. This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified. The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.",

keywords = "Byzantine failures, Distributed cooperation, Independent tasks, Work complexity",

author = "Antonio Fern{\'a}ndez and Chryssis Georgiou and Alexander Russell and Shvartsman, {Alex A.}",

note = "Funding Information: ∗Corresponding author. E-mail addresses: anto@gsyc.escet.urjc.es (A. Fern{\'a}ndez), chryssis@ucy.ac.cy (C. Georgiou), acr@cse.uconn.edu (A. Russell), aas@cse.uconn.edu (A.A. Shvartsman). 1Partially supported by the Spanish MCyT under grant TIC2001-1586-C03-01, the Comunidad de Madrid under grant 07T/0022/2003, and the Universidad Rey Juan Carlos under grant PPR-2003-37. Work done while at the University of Connecticut. 2Work done in part while at the University of Connecticut. 3The work of this author is supported in part by the NSF CAREER Award 0093065 and NSF grants 0220264, 0218443, 0121277, and 0311368. 4Partially supported by the NSF CAREER Award 9984778 and the NSF Grants 9988304, 0121277, and 0311368.; Structural Information and Communication Complexity ; Conference date: 18-06-2003 Through 20-06-2003",

year = "2005",

month = mar,

day = "3",

doi = "10.1016/j.tcs.2004.06.034",

language = "English (US)",

volume = "333",

pages = "433--454",

journal = "Theoretical Computer Science",

issn = "0304-3975",

publisher = "Elsevier",

number = "3",

}

TY - JOUR

T1 - The Do-All problem with Byzantine processor failures

AU - Fernández, Antonio

AU - Georgiou, Chryssis

AU - Russell, Alexander

AU - Shvartsman, Alex A.

N1 - Funding Information: ∗Corresponding author. E-mail addresses: anto@gsyc.escet.urjc.es (A. Fernández), chryssis@ucy.ac.cy (C. Georgiou), acr@cse.uconn.edu (A. Russell), aas@cse.uconn.edu (A.A. Shvartsman). 1Partially supported by the Spanish MCyT under grant TIC2001-1586-C03-01, the Comunidad de Madrid under grant 07T/0022/2003, and the Universidad Rey Juan Carlos under grant PPR-2003-37. Work done while at the University of Connecticut. 2Work done in part while at the University of Connecticut. 3The work of this author is supported in part by the NSF CAREER Award 0093065 and NSF grants 0220264, 0218443, 0121277, and 0311368. 4Partially supported by the NSF CAREER Award 9984778 and the NSF Grants 9988304, 0121277, and 0311368.

PY - 2005/3/3

Y1 - 2005/3/3

N2 - Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models. This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified. The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.

AB - Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models. This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified. The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.

KW - Byzantine failures

KW - Distributed cooperation

KW - Independent tasks

KW - Work complexity

UR - http://www.scopus.com/inward/record.url?scp=13844297012&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=13844297012&partnerID=8YFLogxK

U2 - 10.1016/j.tcs.2004.06.034

DO - 10.1016/j.tcs.2004.06.034

M3 - Conference article

AN - SCOPUS:13844297012

SN - 0304-3975

VL - 333

SP - 433

EP - 454

JO - Theoretical Computer Science

JF - Theoretical Computer Science

IS - 3

T2 - Structural Information and Communication Complexity

Y2 - 18 June 2003 through 20 June 2003

ER -

The Do-All problem with Byzantine processor failures

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this