Practical resource management in power-constrained, high performance computing

Tapasya Patki, David K Lowenthal, Anjana Sasidharan, Matthias Maiterth, Barry L. Rountree, Martin Schulz, Bronis R. De Supinski

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

41 Citations (Scopus)

Abstract

Power management is one of the key research challenges on the path to exascale. Supercomputers today are designed to be worst-case power provisioned, leading to two main problems: limited application performance and under-utilization of procured power. In this paper, we propose RMAP, a practical, low-overhead resource manager targeted at future power-constrained clusters. The goals for RMAP are to improve application performance as well as system power utilization, and thus minimize the average turnaround time for all jobs. Within RMAP, we design and analyze an adaptive policy, which derives job-level power bounds in a fair-share manner and supports overprovisioning and power-aware backfilling. Our results show that our new policy increases system power utilization while adhering to strict job-level power bounds and leads to 31% (19% on average) and 54% (36% on average) faster average turnaround time when compared to worst-case provisioning and naive overprovisioning, respectively.
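The abstract does not spell out how the fair-share power bounds or the power-aware backfilling are computed. As a rough illustration only, and not RMAP's actual policy, the sketch below assumes a simple proportional rule (a job requesting n of N nodes receives n/N of the system power budget) and a greedy pass that starts any queued job whose nodes and power bound both fit in what is left. The names `Job`, `fair_share_bound`, and `schedule`, the proportional rule, and the numbers in the usage note are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int  # number of nodes the job requests


def fair_share_bound(job_nodes: int, total_nodes: int, system_power_w: float) -> float:
    """Assumed rule: the job's power bound is its proportional share of the system budget."""
    if not 0 < job_nodes <= total_nodes:
        raise ValueError("job_nodes must be between 1 and total_nodes")
    return system_power_w * job_nodes / total_nodes


def schedule(queue: list[Job], total_nodes: int, system_power_w: float) -> list[tuple[str, float]]:
    """Greedy single pass over the queue in arrival order.

    A job starts only if both its nodes and its power bound fit in the
    remaining resources. Jobs that do not fit are skipped, so smaller jobs
    behind them can backfill. This simplified pass makes no reservation for
    the skipped job, unlike production backfilling schedulers.
    """
    free_nodes = total_nodes
    free_power_w = system_power_w
    started = []
    for job in queue:
        bound_w = fair_share_bound(job.nodes, total_nodes, system_power_w)
        # With strictly proportional bounds the power check is implied by the
        # node check; it is kept explicit because it becomes the binding
        # constraint as soon as bounds deviate from the proportional share.
        if job.nodes <= free_nodes and bound_w <= free_power_w:
            free_nodes -= job.nodes
            free_power_w -= bound_w
            started.append((job.name, bound_w))
    return started
```

For example, on a hypothetical 64-node cluster with a 3200 W budget and a queue of jobs A (32 nodes), B (48 nodes), and C (16 nodes), this sketch starts A with a 1600 W bound, skips B because only 32 nodes remain, and backfills C with an 800 W bound.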

Original language: English (US)
Title of host publication: HPDC 2015 - Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 121-132
Number of pages: 12
ISBN (Electronic): 9781450335508
DOI: https://doi.org/10.1145/2749246.2749262
State: Published - Jun 15 2015
Event: 24th ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2015 - Portland, United States
Duration: Jun 15 2015 to Jun 19 2015

Other

Other: 24th ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2015
Country: United States
City: Portland
Period: 6/15/15 to 6/19/15

Fingerprint

Turnaround time
Electric power utilization
Supercomputers
Managers
Power management

Keywords

  • Power-constrained HPC
  • Resource Management

ASJC Scopus subject areas

  • Computer Science Applications
  • Computational Theory and Mathematics
  • Software

Cite this

Patki, T., Lowenthal, D. K., Sasidharan, A., Maiterth, M., Rountree, B. L., Schulz, M., & De Supinski, B. R. (2015). Practical resource management in power-constrained, high performance computing. In HPDC 2015 - Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (pp. 121-132). Association for Computing Machinery, Inc. https://doi.org/10.1145/2749246.2749262

@inproceedings{84a062cb1fb64fc196031b5326b8124f,
title = "Practical resource management in power-constrained, high performance computing",
abstract = "Power management is one of the key research challenges on the path to exascale. Supercomputers today are designed to be worst-case power provisioned, leading to two main problems: limited application performance and under-utilization of procured power. In this paper, we propose RMAP, a practical, low-overhead resource manager targeted at future power-constrained clusters. The goals for RMAP are to improve application performance as well as system power utilization, and thus minimize the average turnaround time for all jobs. Within RMAP, we design and analyze an adaptive policy, which derives job-level power bounds in a fair-share manner and supports overprovisioning and power-aware backfilling. Our results show that our new policy increases system power utilization while adhering to strict job-level power bounds and leads to 31{\%} (19{\%} on average) and 54{\%} (36{\%} on average) faster average turnaround time when compared to worst-case provisioning and naive overprovisioning, respectively.",
keywords = "Power-constrained HPC, Resource Management",
author = "Tapasya Patki and Lowenthal, {David K} and Anjana Sasidharan and Matthias Maiterth and Rountree, {Barry L.} and Martin Schulz and {De Supinski}, {Bronis R.}",
year = "2015",
month = "6",
day = "15",
doi = "10.1145/2749246.2749262",
language = "English (US)",
pages = "121--132",
booktitle = "HPDC 2015 - Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - Practical resource management in power-constrained, high performance computing

AU - Patki, Tapasya

AU - Lowenthal, David K

AU - Sasidharan, Anjana

AU - Maiterth, Matthias

AU - Rountree, Barry L.

AU - Schulz, Martin

AU - De Supinski, Bronis R.

PY - 2015/6/15

Y1 - 2015/6/15

N2 - Power management is one of the key research challenges on the path to exascale. Supercomputers today are designed to be worst-case power provisioned, leading to two main problems: limited application performance and under-utilization of procured power. In this paper, we propose RMAP, a practical, low-overhead resource manager targeted at future power-constrained clusters. The goals for RMAP are to improve application performance as well as system power utilization, and thus minimize the average turnaround time for all jobs. Within RMAP, we design and analyze an adaptive policy, which derives job-level power bounds in a fair-share manner and supports overprovisioning and power-aware backfilling. Our results show that our new policy increases system power utilization while adhering to strict job-level power bounds and leads to 31% (19% on average) and 54% (36% on average) faster average turnaround time when compared to worst-case provisioning and naive overprovisioning, respectively.

KW - Power-constrained HPC

KW - Resource Management

UR - http://www.scopus.com/inward/record.url?scp=84987740923&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84987740923&partnerID=8YFLogxK

U2 - 10.1145/2749246.2749262

DO - 10.1145/2749246.2749262

M3 - Conference contribution

SP - 121

EP - 132

BT - HPDC 2015 - Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing

PB - Association for Computing Machinery, Inc

ER -