A study of network quality of service in many-core MPI applications

Lee Savoie, David K Lowenthal, Bronis R. De Supinski, Kathryn Mohror

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Network contention in existing high performance computing (HPC) systems increases job execution time and reduces machine throughput. This problem is expected to become worse in future systems as core counts increase and networks become larger and more complicated. In this paper, we investigate the use of network Quality of Service (QoS) to mitigate the effects of network contention. QoS allocates bandwidth to individual jobs, thus limiting the impact that one job can have on another through network contention. We consider coarse-grained QoS, in which each job runs at a different priority level, by running a number of micro-benchmarks and applications in different QoS configurations on real hardware with QoS capabilities. Our results indicate that while network contention reduces job performance by as much as 70%, coarse-grained QoS is unlikely to improve throughput on HPC systems and may increase job execution times by more than 100%. Based on our analysis, finer-grained QoS is more likely to improve performance and throughput.

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1313-1322
Number of pages: 10
ISBN (Print): 9781538655559
DOI: https://doi.org/10.1109/IPDPSW.2018.00204
State: Published - Aug 3 2018
Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
Duration: May 21 2018 to May 25 2018

Fingerprint

Quality of service
Throughput
Hardware
Bandwidth
High performance

Keywords

  • Contention
  • High performance computing
  • Many core
  • Network
  • Network contention
  • Performance
  • Quality of service
  • Service level

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management

Cite this

Savoie, L., Lowenthal, D. K., De Supinski, B. R., & Mohror, K. (2018). A study of network quality of service in many-core MPI applications. In Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 (pp. 1313-1322). [8425571] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IPDPSW.2018.00204

@inproceedings{5d1085fcbf88483ea48f6079353a5f2d,
title = "A study of network quality of service in many-core MPI applications",
abstract = "Network contention in existing high performance computing (HPC) systems increases job execution time and reduces machine throughput. This problem is expected to become worse in future systems as core counts increase and networks become larger and more complicated. In this paper, we investigate the use of network Quality of Service (QoS) to mitigate the effects of network contention. QoS allocates bandwidth to individual jobs, thus limiting the impact that one job can have on another through network contention. We consider coarse-grained QoS, in which each job runs at a different priority level, by running a number of micro-benchmarks and applications in different QoS configurations on real hardware with QoS capabilities. Our results indicate that while network contention reduces job performance by as much as 70{\%}, coarse-grained QoS is unlikely to improve throughput on HPC systems and may increase job execution times by more than 100{\%}. Based on our analysis, finer-grained QoS is more likely to improve performance and throughput.",
keywords = "Contention, High performance computing, Many core, Network, Network contention, Performance, Quality of service, Service level",
author = "Lee Savoie and Lowenthal, {David K} and {De Supinski}, {Bronis R.} and Kathryn Mohror",
year = "2018",
month = "8",
day = "3",
doi = "10.1109/IPDPSW.2018.00204",
language = "English (US)",
isbn = "9781538655559",
pages = "1313--1322",
booktitle = "Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - GEN

T1 - A study of network quality of service in many-core MPI applications

AU - Savoie, Lee

AU - Lowenthal, David K

AU - De Supinski, Bronis R.

AU - Mohror, Kathryn

PY - 2018/8/3

Y1 - 2018/8/3

N2 - Network contention in existing high performance computing (HPC) systems increases job execution time and reduces machine throughput. This problem is expected to become worse in future systems as core counts increase and networks become larger and more complicated. In this paper, we investigate the use of network Quality of Service (QoS) to mitigate the effects of network contention. QoS allocates bandwidth to individual jobs, thus limiting the impact that one job can have on another through network contention. We consider coarse-grained QoS, in which each job runs at a different priority level, by running a number of micro-benchmarks and applications in different QoS configurations on real hardware with QoS capabilities. Our results indicate that while network contention reduces job performance by as much as 70%, coarse-grained QoS is unlikely to improve throughput on HPC systems and may increase job execution times by more than 100%. Based on our analysis, finer-grained QoS is more likely to improve performance and throughput.

KW - Contention

KW - High performance computing

KW - Many core

KW - Network

KW - Network contention

KW - Performance

KW - Quality of service

KW - Service level

UR - http://www.scopus.com/inward/record.url?scp=85052227271&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85052227271&partnerID=8YFLogxK

U2 - 10.1109/IPDPSW.2018.00204

DO - 10.1109/IPDPSW.2018.00204

M3 - Conference contribution

AN - SCOPUS:85052227271

SN - 9781538655559

SP - 1313

EP - 1322

BT - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -