Abstract
We consider the problem of scheduling t similar and independent tasks to be performed in a synchronous distributed system of p stations communicating via multiple-access channels. Stations are prone to crashes whose patterns of occurrence are specified by adversarial models. Work, defined as the number of available processor steps, is the complexity measure. We consider only reliable algorithms, that is, algorithms that perform all the tasks as long as at least one station remains operational. We show that every reliable algorithm has to perform work Ω(t + p√t), even when no failures occur. We develop an optimal deterministic algorithm for the channel with collision detection, which performs work O(t + p√t). Another algorithm, for the channel without collision detection, performs work O(t + p√t + p min{f, t}), where f < p is the number of failures. This algorithm is proved to be optimal, provided that the adversary is restricted to failing no more than f stations. Finally, we consider the question of whether randomization helps against weaker adversaries for the channel without collision detection. We develop a randomized algorithm that performs the minimum amount O(t + p√t) of work in expectation, provided that the adversary may fail a constant fraction of the stations and has to select the failure-prone stations prior to the start of an execution of the algorithm.
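To make the relationship between the two work bounds concrete, the following minimal sketch evaluates the expressions inside the asymptotic notation for given t, p, and f. The function names `work_bound_cd` and `work_bound_no_cd` are hypothetical and not taken from the paper, and the constants hidden by the O and Ω notation are ignored, so the values indicate only how the terms grow relative to one another, not actual step counts.

```python
import math


def work_bound_cd(t: int, p: int) -> float:
    """Growth term t + p*sqrt(t) (channel with collision detection).

    Illustrative only: hidden constants are dropped.
    """
    if t < 0 or p < 1:
        raise ValueError("need t >= 0 and p >= 1")
    return t + p * math.sqrt(t)


def work_bound_no_cd(t: int, p: int, f: int) -> float:
    """Growth term t + p*sqrt(t) + p*min(f, t) (no collision detection).

    f is the number of failures and must satisfy 0 <= f < p.
    """
    if not 0 <= f < p:
        raise ValueError("need 0 <= f < p")
    return work_bound_cd(t, p) + p * min(f, t)


if __name__ == "__main__":
    # Example parameters chosen for illustration, not taken from the paper.
    t, p, f = 100, 10, 3
    print(work_bound_cd(t, p))        # 100 + 10*10 = 200.0
    print(work_bound_no_cd(t, p, f))  # 200 + 10*3  = 230.0
```

With f = 0 the two expressions coincide; the additional p min{f, t} term is the only part of the bound that depends on the number of failures.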
Original language | English (US) |
---|---|
Pages (from-to) | 435-451 |
Number of pages | 17 |
Journal | Distributed Computing |
Volume | 18 |
Issue number | 6 |
DOIs | |
State | Published - Jun 2006 |
Externally published | Yes |
Keywords
- Adversary
- Distributed algorithm
- Fail-stop failure
- Independent tasks
- Multiple-access channel
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Computational Theory and Mathematics