TY - JOUR
T1 - The Do-All problem with Byzantine processor failures
AU - Fernández, Antonio
AU - Georgiou, Chryssis
AU - Russell, Alexander
AU - Shvartsman, Alex A.
N1 - Funding Information:
∗Corresponding author. E-mail addresses: anto@gsyc.escet.urjc.es (A. Fernández), chryssis@ucy.ac.cy (C. Georgiou), acr@cse.uconn.edu (A. Russell), aas@cse.uconn.edu (A.A. Shvartsman). 1Partially supported by the Spanish MCyT under grant TIC2001-1586-C03-01, the Comunidad de Madrid under grant 07T/0022/2003, and the Universidad Rey Juan Carlos under grant PPR-2003-37. Work done while at the University of Connecticut. 2Work done in part while at the University of Connecticut. 3The work of this author is supported in part by the NSF CAREER Award 0093065 and NSF grants 0220264, 0218443, 0121277, and 0311368. 4Partially supported by the NSF CAREER Award 9984778 and the NSF Grants 9988304, 0121277, and 0311368.
PY - 2005/3/3
Y1 - 2005/3/3
N2 - Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models. This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified. The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.
AB - Do-All is the abstract problem of using n processors to cooperatively perform m independent tasks in the presence of failures. This problem and its derivatives have been a centerpiece in the study of trade-offs between efficiency and fault-tolerance in cooperative computing environments. Many algorithms have been developed for Do-All in various models of computation, including message-passing, partitionable networks, and shared-memory models under a variety of failure models. This work initiates the study of the Do-All problem for synchronous message-passing processors prone to Byzantine failures. In particular, upper and lower bounds are given on the complexity of Do-All for several cases: (a) the case where the maximum number of faulty processors f is known a priori, (b) the case where f is not known, (c) the case where a task execution can be verified (without re-executing the task), and (d) the case where task executions cannot be verified. The efficiency of algorithms is evaluated in terms of work and message complexities. The work complexity accounts for all computational steps taken by the processors and the message complexity accounts for all messages sent by the processors during the computation. The work and messages of a faulty processor are counted only until the processor fails to follow the algorithm. It is shown that in some cases obtaining work Θ(mn) is the best one can do. It is also shown that in certain cases communication cannot help improve work efficiency.
KW - Byzantine failures
KW - Distributed cooperation
KW - Independent tasks
KW - Work complexity
UR - http://www.scopus.com/inward/record.url?scp=13844297012&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=13844297012&partnerID=8YFLogxK
U2 - 10.1016/j.tcs.2004.06.034
DO - 10.1016/j.tcs.2004.06.034
M3 - Conference article
AN - SCOPUS:13844297012
SN - 0304-3975
VL - 333
SP - 433
EP - 454
JO - Theoretical Computer Science
JF - Theoretical Computer Science
IS - 3
T2 - Structural Information and Communication Complexity
Y2 - 18 June 2003 through 20 June 2003
ER -