Autonomic runtime system for large scale parallel and distributed applications

Jingmei Yang, Huoping Chen, Byoung Uk Kim, Salim Hariri, Manish Parashar

Research output: Contribution to journal › Conference article

Abstract

The development of efficient parallel algorithms for large scale wildfire simulations is a challenging research problem because the factors that determine wildfire behavior are complex; they include fuel characteristics and configurations, chemical reactions, balances between different modes of heat transfer, topography, and fire/atmosphere interactions. These factors make static parallel algorithms inefficient, especially when a large number of processors is used, because we cannot accurately predict the propagation of the fire and its computational requirements at runtime. In this paper, we present an Autonomic Runtime Manager (ARM) that dynamically exploits the physical properties of the fire simulation and uses them as the basis of our self-optimization algorithm. At each step of the wildfire simulation, the ARM decomposes the computational domain into several natural regions (e.g., burning, unburned, burned), where each region has the same temporal and spatial characteristics. The numbers of burning, unburned and burned cells determine the current state of the fire simulation and can then be used to accurately predict the computational power required for each region. By regularly monitoring and analyzing the state of the simulation and using the result to drive runtime optimization, we can achieve significant performance gains because we can efficiently balance the computational load on each processor. Our experimental results show that the performance of the fire simulation improved by 45% compared with a static partitioning algorithm that does not take the state of the computation into consideration.
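The load-balancing idea in the abstract can be illustrated with a minimal sketch. It is not the ARM implementation: the one-dimensional cell layout, the function names, and the per-state cost weights in `COST` are all assumptions chosen for illustration. The sketch compares a static split into equal cell counts with a split that equalizes the estimated load derived from each cell's state.

```python
from itertools import accumulate

# Hypothetical per-cell cost weights (not from the paper): burning cells are
# assumed to need far more computation than unburned or burned cells.
COST = {"unburned": 1.0, "burning": 10.0, "burned": 0.5}


def static_partition(n_cells, n_procs):
    """Split cell indices into n_procs contiguous blocks of near-equal size."""
    bounds = [i * n_cells // n_procs for i in range(n_procs + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def load_aware_partition(states, n_procs, cost=COST):
    """Split cells into contiguous blocks of near-equal estimated load."""
    # prefix[j] is the estimated load of cells 0..j inclusive.
    prefix = list(accumulate(cost[s] for s in states))
    total = prefix[-1]
    bounds, j = [0], 0
    for p in range(1, n_procs):
        target = total * p / n_procs
        # Advance to the first cell whose cumulative load reaches the target.
        while prefix[j] < target:
            j += 1
        bounds.append(j + 1)  # cut right after that cell
    bounds.append(len(states))
    return list(zip(bounds[:-1], bounds[1:]))


def max_load(states, blocks, cost=COST):
    """Estimated load of the most heavily loaded block (half-open ranges)."""
    return max(sum(cost[s] for s in states[a:b]) for a, b in blocks)


# Toy domain: a burned area, an active fire front, and untouched fuel.
states = ["burned"] * 40 + ["burning"] * 20 + ["unburned"] * 40
static_blocks = static_partition(len(states), 4)
aware_blocks = load_aware_partition(states, 4)
print("static max load:    ", max_load(states, static_blocks))
print("load-aware max load:", max_load(states, aware_blocks))
```

With these made-up weights, the heaviest block's estimated load drops from 115 under the static split to 70 under the load-aware split. The figures only illustrate the mechanism and say nothing about the 45% improvement reported above.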

Original language: English (US)
Pages (from-to): 297-311
Number of pages: 15
Journal: Lecture Notes in Computer Science
Volume: 3566
State: Published - 2005
Event: International Workshop on Unconventional Programming Paradigms, UPP 2004 - Le Mont Saint Michel, France
Duration: Sep 15 2004 → Sep 17 2004

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
