STAR-MPI

Self tuned adaptive routines for MPI collective operations

Ahmad Faraj, Xin Yuan, David K Lowenthal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

48 Citations (Scopus)

Abstract

Message Passing Interface (MPI) collective communication routines are widely used in parallel applications. In order for a collective communication routine to achieve high performance for different applications on different platforms, it must be adaptable to both the system architecture and the application workload. Current MPI implementations do not support such software adaptability and are not able to achieve high performance on many platforms. In this paper, we present STAR-MPI (Self Tuned Adaptive Routines for MPI collective operations), a set of MPI collective communication routines that are capable of adapting to the system architecture and application workload. For each operation, STAR-MPI maintains a set of communication algorithms that can potentially be efficient in different situations. As an application executes, a STAR-MPI routine applies the Automatic Empirical Optimization of Software (AEOS) technique at run time to dynamically select the best-performing algorithm for the application on the platform. We describe the techniques used in STAR-MPI, analyze its overheads, and evaluate its performance with applications and benchmarks. The results of our study indicate that STAR-MPI is robust and efficient: it is able to find efficient algorithms with reasonable overheads, and it outperforms traditional MPI implementations to a large degree in many cases.
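The adaptive selection the abstract describes — maintain several candidate algorithms per operation, time them across early invocations, then commit to the fastest — can be sketched roughly as follows. This is a hypothetical illustration of AEOS-style run-time tuning, not STAR-MPI's actual implementation; the class name, parameters, and the use of plain Python callables in place of MPI collective algorithms are all invented here for exposition:

```python
import time

class AEOSSelector:
    """Hypothetical sketch of AEOS-style run-time selection: each
    candidate algorithm for one collective operation is timed over a
    fixed number of invocations, after which the one with the lowest
    average time is used for all remaining invocations."""

    def __init__(self, algorithms, trials_per_algorithm=3):
        self.algorithms = list(algorithms)  # callables implementing the same operation
        self.trials = trials_per_algorithm
        self.timings = {i: [] for i in range(len(self.algorithms))}
        self.best = None                    # index of the committed algorithm

    def invoke(self, *args, **kwargs):
        if self.best is not None:
            # Tuning phase is over: always use the winner.
            return self.algorithms[self.best](*args, **kwargs)
        # Measuring phase: run the next algorithm that still needs a trial.
        for i, times in self.timings.items():
            if len(times) < self.trials:
                start = time.perf_counter()
                result = self.algorithms[i](*args, **kwargs)
                times.append(time.perf_counter() - start)
                return result
        # All candidates measured: commit to the lowest average time.
        self.best = min(self.timings,
                        key=lambda i: sum(self.timings[i]) / len(self.timings[i]))
        return self.algorithms[self.best](*args, **kwargs)
```

In the real setting the candidates would be different implementations of, say, an all-to-all or all-reduce (each correct, each fastest under different message sizes and network topologies), and the measured cost of the early "trial" invocations is precisely the tuning overhead the paper analyzes.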

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Supercomputing
Pages: 199-208
Number of pages: 10
DOI: 10.1145/1183401.1183431
State: Published - 2006
Externally published: Yes
Event: 20th Annual International Conference on Supercomputing, ICS 2006 - Cairns, Queensland, Australia
Duration: Jun 28 2006 - Jul 1 2006

Other

Other: 20th Annual International Conference on Supercomputing, ICS 2006
Country: Australia
City: Cairns, Queensland
Period: 6/28/06 - 7/1/06

Fingerprint

Message passing
Communication

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Faraj, A., Yuan, X., & Lowenthal, D. K. (2006). STAR-MPI: Self tuned adaptive routines for MPI collective operations. In Proceedings of the International Conference on Supercomputing (pp. 199-208) https://doi.org/10.1145/1183401.1183431

@inproceedings{2738209c3952496fbb122cbf8fc030ad,
title = "STAR-MPI: Self tuned adaptive routines for MPI collective operations",
abstract = "Message Passing Interface (MPI) collective communication routines are widely used in parallel applications. In order for a collective communication routine to achieve high performance for different applications on different platforms, it must be adaptable to both the system architecture and the application workload. Current MPI implementations do not support such software adaptability and are not able to achieve high performance on many platforms. In this paper, we present STAR-MPI (Self Tuned Adaptive Routines for MPI collective operations), a set of MPI collective communication routines that are capable of adapting to the system architecture and application workload. For each operation, STAR-MPI maintains a set of communication algorithms that can potentially be efficient in different situations. As an application executes, a STAR-MPI routine applies the Automatic Empirical Optimization of Software (AEOS) technique at run time to dynamically select the best-performing algorithm for the application on the platform. We describe the techniques used in STAR-MPI, analyze its overheads, and evaluate its performance with applications and benchmarks. The results of our study indicate that STAR-MPI is robust and efficient: it is able to find efficient algorithms with reasonable overheads, and it outperforms traditional MPI implementations to a large degree in many cases.",
author = "Ahmad Faraj and Xin Yuan and Lowenthal, {David K}",
year = "2006",
doi = "10.1145/1183401.1183431",
language = "English (US)",
isbn = "1595932828",
pages = "199--208",
booktitle = "Proceedings of the International Conference on Supercomputing",

}
