Runtime fault-handling for job-flow management in Grid environments

Gargi Dasgupta, Onyeka Ezenwoye, Liana Fong, Selim Kalayci, S. Masoud Sadjadi, Balaji Viswanathan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

The execution of job-flow applications is a reality today in academic and industrial domains. In this paper, we propose an approach to adding self-healing behavior to the execution of job flows without the need to modify the job-flow engines or redevelop the job flows themselves. We show the feasibility of our non-intrusive approach to self-healing by inserting a generic proxy into an existing two-level job-flow management system, which employs job-flow-based service orchestration at the upper level and service choreography at the lower level. The generic proxy is inserted transparently between these two layers so that it can intercept all their interactions. We developed a prototype of our approach in a real Grid environment to show how the proxy facilitates runtime handling for failure recovery.
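The interception idea in the abstract can be illustrated with a minimal sketch. It is not the authors' prototype: the class names, the `submit` operation, the `JobSubmissionError` exception, and the failover-to-another-target recovery policy are all assumptions made for illustration. The sketch only shows how a proxy placed between an upper-level caller and lower-level services can catch a fault and recover without either side being modified.

```python
class JobSubmissionError(Exception):
    """Hypothetical error raised when a lower-level service cannot run a job."""


class SelfHealingProxy:
    """Forwards every call to a lower-level target and fails over on error.

    Assumption: targets are interchangeable services exposing the same methods.
    """

    def __init__(self, targets, max_attempts=3):
        if not targets:
            raise ValueError("at least one target is required")
        self._targets = list(targets)
        self._max_attempts = max_attempts
        self.fault_log = []  # human-readable record of intercepted faults

    def __getattr__(self, name):
        # Invoked only for attributes the proxy itself lacks, so any operation
        # meant for the lower layer passes through this method.
        if name.startswith("_"):
            raise AttributeError(name)

        def intercepted(*args, **kwargs):
            last_error = None
            for attempt in range(self._max_attempts):
                # Rotate through the targets on successive attempts.
                target = self._targets[attempt % len(self._targets)]
                try:
                    return getattr(target, name)(*args, **kwargs)
                except JobSubmissionError as exc:
                    self.fault_log.append(f"{name} attempt {attempt + 1}: {exc}")
                    last_error = exc
            raise JobSubmissionError(
                f"{name} failed after {self._max_attempts} attempts"
            ) from last_error

        return intercepted


class FailingScheduler:
    """Hypothetical lower-level service that always reports a fault."""

    def submit(self, job_id):
        raise JobSubmissionError(f"resource unavailable for {job_id}")


class WorkingScheduler:
    """Hypothetical lower-level service that accepts every job."""

    def submit(self, job_id):
        return f"{job_id} accepted"


# The upper layer calls the proxy exactly as it would call a scheduler.
proxy = SelfHealingProxy([FailingScheduler(), WorkingScheduler()])
result = proxy.submit("job-42")
```

In this sketch the first attempt fails, the fault is logged, and the second attempt succeeds on the alternate target, so the caller never sees the failure. A real deployment would more likely intercept service messages than in-process method calls.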

Original language: English (US)
Title of host publication: 5th International Conference on Autonomic Computing, ICAC 2008
Pages: 201-202
Number of pages: 2
State: Published - 2008
Externally published: Yes
Event: 5th International Conference on Autonomic Computing, ICAC 2008 - Chicago, IL, United States
Duration: Jun 2 2008 - Jun 6 2008

Publication series

Name: 5th International Conference on Autonomic Computing, ICAC 2008

Other

Other: 5th International Conference on Autonomic Computing, ICAC 2008
Country/Territory: United States
City: Chicago, IL
Period: 6/2/08 - 6/6/08

Keywords

  • Fault-tolerance
  • Generic proxy
  • Job-flow management
  • Job-flows
  • Meta-scheduler

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
  • Control and Systems Engineering
