Protocol Customization for Improving MPI Performance on RDMA-Enabled Clusters

Zheng Gu, Matthew Small, Xin Yuan, Aniruddha Marathe, David K Lowenthal

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

Optimizing Message Passing Interface (MPI) point-to-point communication for large messages is critical because most communication in MPI applications is performed by such operations. Remote Direct Memory Access (RDMA) allows one-sided data transfer and provides great flexibility in the design of efficient communication protocols for large messages. However, achieving high point-to-point communication performance on RDMA-enabled clusters is challenging because the communication protocols are complex and because the performance of a given protocol depends on the scenario in which it is invoked. In this work, we analyze existing protocols and show that they are not ideal in many situations. We propose protocol customization, that is, using different protocols for different situations, to improve MPI performance. More specifically, by leveraging the RDMA capability, we develop a set of protocols that can collectively provide high performance for all protocol invocation scenarios. Using this set, we demonstrate the potential of protocol customization by developing a trace-driven toolkit that allows the appropriate protocol to be selected for each communication in an MPI application to maximize performance. We evaluate the proposed techniques using micro-benchmarks and application benchmarks. The results indicate that protocol customization can outperform traditional communication schemes by a large margin in many situations.

Original language: English (US)
Pages (from-to): 682-703
Number of pages: 22
Journal: International Journal of Parallel Programming
Volume: 41
Issue number: 5
DOI: 10.1007/s10766-013-0242-0
State: Published - Oct 2013

Fingerprint

Message Passing Interface
Customization
Message passing
Interfaces (computer)
Network protocols
Data storage equipment
Communication
Communication Protocol
High Performance
Benchmark
Scenarios
Data Transfer
Maximise
Flexibility
Trace

Keywords

  • MPI
  • Point-to-point communication
  • Protocol customization

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Theoretical Computer Science

Cite this

Protocol Customization for Improving MPI Performance on RDMA-Enabled Clusters. / Gu, Zheng; Small, Matthew; Yuan, Xin; Marathe, Aniruddha; Lowenthal, David K.

In: International Journal of Parallel Programming, Vol. 41, No. 5, 10.2013, p. 682-703.

Gu, Zheng ; Small, Matthew ; Yuan, Xin ; Marathe, Aniruddha ; Lowenthal, David K. / Protocol Customization for Improving MPI Performance on RDMA-Enabled Clusters. In: International Journal of Parallel Programming. 2013 ; Vol. 41, No. 5. pp. 682-703.
@article{b1400810a7cb4c42ba1a33b3f374ccab,
title = "Protocol Customization for Improving MPI Performance on RDMA-Enabled Clusters",
abstract = "Optimizing Message Passing Interface (MPI) point-to-point communication for large messages is of paramount importance since most communications in MPI applications are performed by such operations. Remote Direct Memory Access (RDMA) allows one-sided data transfer and provides great flexibility in the design of efficient communication protocols for large messages. However, achieving high point-to-point communication performance on RDMA-enabled clusters is challenging due to both the complexity in communication protocols and the impact of the protocol invocation scenario on the performance of a given protocol. In this work, we analyze existing protocols and show that they are not ideal in many situations, and propose to use protocol customization, that is, different protocols for different situations to improve MPI performance. More specifically, by leveraging the RDMA capability, we develop a set of protocols that can provide high performance for all protocol invocation scenarios. Armed with this set of protocols that can collectively achieve high performance in all situations, we demonstrate the potential of protocol customization by developing a trace-driven toolkit that allows the appropriate protocol to be selected for each communication in an MPI application to maximize performance. We evaluate the performance of the proposed techniques using micro-benchmarks and application benchmarks. The results indicate that protocol customization can out-perform traditional communication schemes by a large degree in many situations.",
keywords = "MPI, Point-to-point communication, Protocol customization",
author = "Gu, Zheng and Small, Matthew and Yuan, Xin and Marathe, Aniruddha and Lowenthal, David K.",
year = "2013",
month = "10",
doi = "10.1007/s10766-013-0242-0",
language = "English (US)",
volume = "41",
pages = "682--703",
journal = "International Journal of Parallel Programming",
issn = "0885-7458",
publisher = "Springer New York",
number = "5",
}

TY - JOUR

T1 - Protocol Customization for Improving MPI Performance on RDMA-Enabled Clusters

AU - Gu, Zheng

AU - Small, Matthew

AU - Yuan, Xin

AU - Marathe, Aniruddha

AU - Lowenthal, David K

PY - 2013/10

Y1 - 2013/10

N2 - Optimizing Message Passing Interface (MPI) point-to-point communication for large messages is of paramount importance since most communications in MPI applications are performed by such operations. Remote Direct Memory Access (RDMA) allows one-sided data transfer and provides great flexibility in the design of efficient communication protocols for large messages. However, achieving high point-to-point communication performance on RDMA-enabled clusters is challenging due to both the complexity in communication protocols and the impact of the protocol invocation scenario on the performance of a given protocol. In this work, we analyze existing protocols and show that they are not ideal in many situations, and propose to use protocol customization, that is, different protocols for different situations to improve MPI performance. More specifically, by leveraging the RDMA capability, we develop a set of protocols that can provide high performance for all protocol invocation scenarios. Armed with this set of protocols that can collectively achieve high performance in all situations, we demonstrate the potential of protocol customization by developing a trace-driven toolkit that allows the appropriate protocol to be selected for each communication in an MPI application to maximize performance. We evaluate the performance of the proposed techniques using micro-benchmarks and application benchmarks. The results indicate that protocol customization can out-perform traditional communication schemes by a large degree in many situations.

KW - MPI

KW - Point-to-point communication

KW - Protocol customization

UR - http://www.scopus.com/inward/record.url?scp=84879181413&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84879181413&partnerID=8YFLogxK

U2 - 10.1007/s10766-013-0242-0

DO - 10.1007/s10766-013-0242-0

M3 - Article

AN - SCOPUS:84879181413

VL - 41

SP - 682

EP - 703

JO - International Journal of Parallel Programming

JF - International Journal of Parallel Programming

SN - 0885-7458

IS - 5

ER -