Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing

Yuichi Inadomi, Tapasya Patki, Koji Inoue, Mutsumi Aoyagi, Barry Rountree, Martin Schulz, David K Lowenthal, Yasutaka Wada, Keiichiro Fukazawa, Masatsugu Ueda, Masaaki Kondo, Ikuo Miyoshi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Scopus citations

Abstract

A key challenge in next-generation supercomputing is to effectively schedule limited power resources. Modern processors suffer from increasingly large power variations due to the chip manufacturing process. These variations lead to power inhomogeneity in current systems and manifest as performance inhomogeneity in power-constrained environments, drastically limiting supercomputing performance. We present a first-of-its-kind study on manufacturing variability on four production HPC systems spanning four microarchitectures, analyze its impact on HPC applications, and propose a novel variation-aware power budgeting scheme to maximize effective application performance. Our low-cost and scalable budgeting algorithm strives to achieve performance homogeneity under a power constraint by deriving application-specific, module-level power allocations. Experimental results using a 1,920-socket system show up to 5.4X speedup, with an average speedup of 1.8X across all benchmarks when compared to a variation-unaware power allocation scheme.
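The idea of trading power homogeneity for performance homogeneity can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes, purely for illustration, that each module's power is a linear function of frequency, and the `Module` class, its coefficients, and the function names are all hypothetical. Under that assumption, a uniform split of the budget leaves the least efficient module running slowest, while giving each module the power it needs to reach one common frequency uses the same budget without a straggler.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Module:
    """Hypothetical per-module power model: P(f) = static_w + dyn_w_per_ghz * f.

    The linear form and both coefficients are assumptions made for this
    sketch; they are not taken from the paper.
    """
    static_w: float        # frequency-independent power draw (watts)
    dyn_w_per_ghz: float   # extra watts per GHz (varies chip to chip)

    def power_at(self, freq_ghz: float) -> float:
        return self.static_w + self.dyn_w_per_ghz * freq_ghz

    def freq_at(self, cap_w: float) -> float:
        # Highest frequency sustainable under a power cap (never negative).
        return max(0.0, (cap_w - self.static_w) / self.dyn_w_per_ghz)


def uniform_allocation(modules: List[Module], budget_w: float) -> List[float]:
    """Variation-unaware baseline: every module gets the same power cap."""
    return [budget_w / len(modules)] * len(modules)


def variation_aware_allocation(
    modules: List[Module], budget_w: float, f_min: float, f_max: float
) -> Tuple[float, List[float]]:
    """Pick one common frequency whose total power equals the budget.

    Solves sum(static_i) + f * sum(slope_i) = budget for f, clamps f to
    f_max, and returns (f, per-module power caps).
    """
    total_static = sum(m.static_w for m in modules)
    total_slope = sum(m.dyn_w_per_ghz for m in modules)
    f = (budget_w - total_static) / total_slope
    if f < f_min:
        raise ValueError("budget too small to run every module at f_min")
    f = min(f, f_max)
    return f, [m.power_at(f) for m in modules]


def slowest_frequency(modules: List[Module], caps_w: List[float], f_max: float) -> float:
    """A bulk-synchronous job advances at the pace of its slowest module."""
    return min(min(m.freq_at(c), f_max) for m, c in zip(modules, caps_w))


if __name__ == "__main__":
    # Two modules with identical static power but different efficiency.
    mods = [Module(20.0, 20.0), Module(20.0, 30.0)]
    budget = 140.0
    naive = uniform_allocation(mods, budget)
    f, aware = variation_aware_allocation(mods, budget, f_min=1.0, f_max=3.0)
    print("uniform caps:", naive, "slowest GHz:", slowest_frequency(mods, naive, 3.0))
    print("aware caps:  ", aware, "slowest GHz:", slowest_frequency(mods, aware, 3.0))
```

In this toy setting both schemes spend the same total power; the variation-aware one simply shifts watts from the efficient module to the inefficient one so that neither holds the other back.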

Original language: English (US)
Title of host publication: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Publisher: IEEE Computer Society
Volume: 15-20-November-2015
ISBN (Print): 9781450337236
DOIs: https://doi.org/10.1145/2807591.2807638
Publication status: Published - Nov 15 2015
Event: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015 - Austin, United States
Duration: Nov 15 2015 to Nov 20 2015

Other

Other: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015
Country: United States
City: Austin
Period: 11/15/15 to 11/20/15

Keywords

  • performance modeling
  • power-constrained HPC

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

Cite this

Inadomi, Y., Patki, T., Inoue, K., Aoyagi, M., Rountree, B., Schulz, M., ... Miyoshi, I. (2015). Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC (Vol. 15-20-November-2015). [a78] IEEE Computer Society. https://doi.org/10.1145/2807591.2807638