NBTI aware workload balancing in multi-core systems

Jin Sun, Avinash Kodi, Ahmed Louri, Meiling Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

As device feature sizes continue to shrink, reliability becomes a severe issue due to process variation, particle-induced transient errors, and transistor wear-out/stress such as Negative Bias Temperature Instability (NBTI). Unless this problem is addressed, chip multi-processor (CMP) systems face low yields and short mean-time-to-failure (MTTF). This paper proposes a new design framework for multi-core systems that accounts for the impact of device wear-out. Based on a device fractional NBTI model, we propose a new NBTI-aware system workload model and develop a new dynamic tile partition (DTP) algorithm to balance workload among active cores while relaxing stressed ones. Experimental results on 64 cores show that by allowing a small number of cores (around 10%) to relax for a short time period (10 seconds), the proposed methodology improves CMP system yield. We use the percentage of core failures to represent the yield improvement. The new strategy improves the core failure number by 20% and extends MTTF by 30% with little degradation in performance (less than 6%).
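The abstract does not give the details of the DTP algorithm, so the following is only a minimal sketch of the general idea it describes: in each scheduling period, a small fraction of the most stressed cores is relaxed and the workload is spread over the remaining active cores. The function name `plan_period`, the per-core `stress` scores, and the even split of work are all assumptions made for illustration, not the authors' method.

```python
import math


def plan_period(stress, total_work, relax_fraction=0.10):
    """Pick the most stressed cores to relax and split work among the rest.

    stress:         hypothetical per-core wear-out scores (higher = more stressed)
    total_work:     total workload to place in this period (arbitrary units)
    relax_fraction: share of cores allowed to relax (about 10% in the abstract)
    """
    n = len(stress)
    n_relax = math.floor(n * relax_fraction)
    if n_relax >= n:
        raise ValueError("at least one core must stay active")

    # Rank cores from most to least stressed; ties are broken by core index.
    ranked = sorted(range(n), key=lambda c: (-stress[c], c))
    relaxed = set(ranked[:n_relax])

    # Assumption: the active cores share the workload evenly.
    share = total_work / (n - n_relax)
    load = {c: (0.0 if c in relaxed else share) for c in range(n)}
    return relaxed, load


if __name__ == "__main__":
    # 64 cores with made-up stress scores that grow with the core index.
    stress = [i / 64 for i in range(64)]
    relaxed, load = plan_period(stress, total_work=64.0)
    print(sorted(relaxed))  # the 6 most stressed cores
    print(load[0])          # load on each of the 58 active cores
```

With 64 cores and a 10% relax fraction, 6 cores rest and the other 58 each carry 64/58 of the original per-core load in this simplified model. A real scheme would re-evaluate stress every period (10 seconds in the abstract) and would account for tile placement, which this sketch ignores.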

Original language: English (US)
Title of host publication: Proceedings of the 10th International Symposium on Quality Electronic Design, ISQED 2009
Pages: 833-838
Number of pages: 6
DOI: https://doi.org/10.1109/ISQED.2009.4810400
State: Published - 2009
Externally published: Yes
Event: 10th International Symposium on Quality Electronic Design, ISQED 2009 - San Jose, CA, United States
Duration: Mar 16 2009 to Mar 18 2009

Other

Other: 10th International Symposium on Quality Electronic Design, ISQED 2009
Country: United States
City: San Jose, CA
Period: 3/16/09 to 3/18/09

Fingerprint

Wear of materials
Partitions (building)
Tile
Transistors
Degradation
Negative bias temperature instability

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Sun, J., Kodi, A., Louri, A., & Wang, M. (2009). NBTI aware workload balancing in multi-core systems. In Proceedings of the 10th International Symposium on Quality Electronic Design, ISQED 2009 (pp. 833-838). [4810400] https://doi.org/10.1109/ISQED.2009.4810400

@inproceedings{457ea236819c415e9c7061b4d518c00f,
title = "NBTI aware workload balancing in multi-core systems",
abstract = "As device feature sizes continue to shrink, reliability becomes a severe issue due to process variation, particle-induced transient errors, and transistor wear-out/stress such as Negative Bias Temperature Instability (NBTI). Unless this problem is addressed, chip multi-processor (CMP) systems face low yields and short mean-time-to-failure (MTTF). This paper proposes a new design framework for multi-core systems that accounts for the impact of device wear-out. Based on a device fractional NBTI model, we propose a new NBTI-aware system workload model and develop a new dynamic tile partition (DTP) algorithm to balance workload among active cores while relaxing stressed ones. Experimental results on 64 cores show that by allowing a small number of cores (around 10{\%}) to relax for a short time period (10 seconds), the proposed methodology improves CMP system yield. We use the percentage of core failures to represent the yield improvement. The new strategy improves the core failure number by 20{\%} and extends MTTF by 30{\%} with little degradation in performance (less than 6{\%}).",
author = "Jin Sun and Avinash Kodi and Ahmed Louri and Meiling Wang",
year = "2009",
doi = "10.1109/ISQED.2009.4810400",
language = "English (US)",
isbn = "9781424429530",
pages = "833--838",
booktitle = "Proceedings of the 10th International Symposium on Quality Electronic Design, ISQED 2009",

}

TY - GEN

T1 - NBTI aware workload balancing in multi-core systems

AU - Sun, Jin

AU - Kodi, Avinash

AU - Louri, Ahmed

AU - Wang, Meiling

PY - 2009

Y1 - 2009

AB - As device feature sizes continue to shrink, reliability becomes a severe issue due to process variation, particle-induced transient errors, and transistor wear-out/stress such as Negative Bias Temperature Instability (NBTI). Unless this problem is addressed, chip multi-processor (CMP) systems face low yields and short mean-time-to-failure (MTTF). This paper proposes a new design framework for multi-core systems that accounts for the impact of device wear-out. Based on a device fractional NBTI model, we propose a new NBTI-aware system workload model and develop a new dynamic tile partition (DTP) algorithm to balance workload among active cores while relaxing stressed ones. Experimental results on 64 cores show that by allowing a small number of cores (around 10%) to relax for a short time period (10 seconds), the proposed methodology improves CMP system yield. We use the percentage of core failures to represent the yield improvement. The new strategy improves the core failure number by 20% and extends MTTF by 30% with little degradation in performance (less than 6%).

UR - http://www.scopus.com/inward/record.url?scp=67649656433&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=67649656433&partnerID=8YFLogxK

U2 - 10.1109/ISQED.2009.4810400

DO - 10.1109/ISQED.2009.4810400

M3 - Conference contribution

SN - 9781424429530

SP - 833

EP - 838

BT - Proceedings of the 10th International Symposium on Quality Electronic Design, ISQED 2009

ER -