TY - GEN
T1 - Estimating reliability of workers for cooperative distributed computing
AU - Davtyan, Seda
AU - Konwar, Kishori M.
AU - Shvartsman, Alexander A.
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2013
Y1 - 2013
N2 - Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. Forthe problem of using network supercomputing to perform a large collection of independent tasks, prior work introduced a decentralized approach and provided randomized synchronousalgorithms that perform all tasks correctly with high probability, while dealing with misbehaving or crash-prone processors. The main weaknesses of existing algorithms is that they assume either that the average probability of a non-crashed processor returningincorrect results is inferior to 1/2, or that the probability of returning incorrect results is known to each processor. Here we present a randomized synchronous distributed algorithm that tightly estimates the probability of each processor returning correct results. Starting with the set P of n processors, let F be the set of processors that crash. Our algorithm estimates the probability p-i of returning a correct result for each processor i \in P - F, making the estimates available to all these processors. The estimation is based on the (e, d)-approximation, where each estimated probability pi of p-i obeys the bound Pr[pi(1 - e) = pi=pi(1+e)]>1 - d, for any constants d>0 and e>0 chosen by the user. An important aspect of this algorithm is that each processor terminates without global coordination. We assess the efficiency of the algorithm in three adversarial models as follows. For the model where the number of non-crashed processors |P - F | is linearly bounded the time complexity T(n) of the algorithm is O(log n), work complexity W(n) is O(n log n), and message complexity M(n) is O(n log2 n). For the model where |P - F | is bounded by a fractional polynomial we have T(n) = O(n1 - a logn loglogn), W(n) = O(nlogn loglogn), and M(n) = O(nlog2n loglogn). For the model where |P - F| is bounded by a poly-logarithm we have T(n) = O(n), W(n) = O(n polylog n), and M(n) = O(n log2 n polylog n). All bounds are shown to hold with high probability.
AB - Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. Forthe problem of using network supercomputing to perform a large collection of independent tasks, prior work introduced a decentralized approach and provided randomized synchronousalgorithms that perform all tasks correctly with high probability, while dealing with misbehaving or crash-prone processors. The main weaknesses of existing algorithms is that they assume either that the average probability of a non-crashed processor returningincorrect results is inferior to 1/2, or that the probability of returning incorrect results is known to each processor. Here we present a randomized synchronous distributed algorithm that tightly estimates the probability of each processor returning correct results. Starting with the set P of n processors, let F be the set of processors that crash. Our algorithm estimates the probability p-i of returning a correct result for each processor i \in P - F, making the estimates available to all these processors. The estimation is based on the (e, d)-approximation, where each estimated probability pi of p-i obeys the bound Pr[pi(1 - e) = pi=pi(1+e)]>1 - d, for any constants d>0 and e>0 chosen by the user. An important aspect of this algorithm is that each processor terminates without global coordination. We assess the efficiency of the algorithm in three adversarial models as follows. For the model where the number of non-crashed processors |P - F | is linearly bounded the time complexity T(n) of the algorithm is O(log n), work complexity W(n) is O(n log n), and message complexity M(n) is O(n log2 n). For the model where |P - F | is bounded by a fractional polynomial we have T(n) = O(n1 - a logn loglogn), W(n) = O(nlogn loglogn), and M(n) = O(nlog2n loglogn). For the model where |P - F| is bounded by a poly-logarithm we have T(n) = O(n), W(n) = O(n polylog n), and M(n) = O(n log2 n polylog n). All bounds are shown to hold with high probability.
KW - Distributed Computing
KW - Internet Supercomputing
KW - Worker Reliability Estimation
UR - http://www.scopus.com/inward/record.url?scp=84892768402&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84892768402&partnerID=8YFLogxK
U2 - 10.1109/ISPDC.2013.22
DO - 10.1109/ISPDC.2013.22
M3 - Conference contribution
AN - SCOPUS:84892768402
SN - 9780769550183
T3 - Proceedings - 2013 IEEE 12th International Symposium on Parallel and Distributed Computing, ISPDC 2013
SP - 101
EP - 108
BT - Proceedings - 2013 IEEE 12th International Symposium on Parallel and Distributed Computing, ISPDC 2013
T2 - 2013 IEEE 12th International Symposium on Parallel and Distributed Computing, ISPDC 2013
Y2 - 27 June 2013 through 30 June 2013
ER -