ROBUST

A new self-healing fault-tolerant NoC router

Jacques Henri Collet, Ahmed Louri, Vivek Tulsidas Bhat, Pavan Poluri

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Citations (Scopus)

Abstract

This work addresses the general problem of making Network-on-Chip (NoC) routers totally self-healing in massively defective technologies. There are three main contributions. First, we propose a new hardware approach based on Built-In Self-Test techniques and multi-functional blocks (called Universal Logic Blocks, ULBs) to autonomously diagnose permanent faults and repair faulty units. ULBs can assume the functionality of various functional units within the router through simple reconfiguration and thus enable the repair of multiple permanent faults within the NoC router. Second, we propose a new reliability metric and introduce a probabilistic model to estimate the router reliability improvement achieved by the protection circuitry. Third, we compare our architecture to two router architectures (Vicis and Bulletproof) and show that our design provides superior reliability improvement, especially in extremely defective nanoscale technologies (i.e., typically above 30% of faulty routers). The most striking result is that self-healing routers maintain communications at fault levels where it is normally impossible to preserve them.

Original language: English (US)
Title of host publication: ACM International Conference Proceeding Series
Pages: 11-16
Number of pages: 6
DOIs: https://doi.org/10.1145/2076501.2076504
State: Published - 2011
Event: 4th International Workshop on Network on Chip Architectures, NoCArc 2011, in Conjunction with the 44th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO44 - Porto Alegre, Brazil
Duration: Dec 4 2011 → Dec 4 2011

Other

Other: 4th International Workshop on Network on Chip Architectures, NoCArc 2011, in Conjunction with the 44th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO44
Country: Brazil
City: Porto Alegre
Period: 12/4/11 → 12/4/11

Fingerprint

Routers
Repair
Built-in self test
Communication
Network-on-chip
Hardware

Keywords

  • fault-tolerance
  • multi-core architectures
  • network-on-chip
  • self-healing

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Collet, J. H., Louri, A., Bhat, V. T., & Poluri, P. (2011). ROBUST: A new self-healing fault-tolerant NoC router. In ACM International Conference Proceeding Series (pp. 11-16). https://doi.org/10.1145/2076501.2076504

ROBUST: A new self-healing fault-tolerant NoC router. / Collet, Jacques Henri; Louri, Ahmed; Bhat, Vivek Tulsidas; Poluri, Pavan.

ACM International Conference Proceeding Series. 2011. p. 11-16.

Collet, JH, Louri, A, Bhat, VT & Poluri, P 2011, ROBUST: A new self-healing fault-tolerant NoC router. in ACM International Conference Proceeding Series. pp. 11-16, 4th International Workshop on Network on Chip Architectures, NoCArc 2011, in Conjunction with the 44th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO44, Porto Alegre, Brazil, 12/4/11. https://doi.org/10.1145/2076501.2076504
Collet JH, Louri A, Bhat VT, Poluri P. ROBUST: A new self-healing fault-tolerant NoC router. In ACM International Conference Proceeding Series. 2011. p. 11-16. https://doi.org/10.1145/2076501.2076504
Collet, Jacques Henri; Louri, Ahmed; Bhat, Vivek Tulsidas; Poluri, Pavan. / ROBUST: A new self-healing fault-tolerant NoC router. ACM International Conference Proceeding Series. 2011. pp. 11-16.
@inproceedings{e12a4ef16f3e4514ad17e52227494bdb,
title = "ROBUST: A new self-healing fault-tolerant NoC router",
abstract = "This work addresses the general problem of making Network-on-Chips (NoCs) routers totally self-healing in massively defective technologies. There are three main contributions. First, we propose a new hardware approach based on Built-In Self-Test techniques and multi-functional blocks (called Universal Logic Blocks, ULBs) to autonomously diagnose permanent faults and repair faulty units. ULBs have the capability to assume the functionality of various functional units within the router through simple reconfiguration and thus enable the repair of multiple permanent faults within the NoC router. Second, we propose a new reliability metric and introduce a probabilistic model to estimate the router reliability improvement achieved by the protection circuitry. Third, we compare our architecture to two router architectures (Vicis and Bulletproof) and we show that our design provides superior reliability improvement especially in extremely defective nanoscale technologies (i.e., typically above 30{\%} of faulty routers). The most striking result is that the self-healing of the routers enables maintaining the communications at fault levels, where it is normally impossible to preserve communications.",
keywords = "fault-tolerance, multi-core architectures, network-on-chip, self-healing",
author = "Collet, {Jacques Henri} and Ahmed Louri and Bhat, {Vivek Tulsidas} and Pavan Poluri",
year = "2011",
doi = "10.1145/2076501.2076504",
language = "English (US)",
isbn = "9781450309479",
pages = "11--16",
booktitle = "ACM International Conference Proceeding Series",
}

TY - GEN

T1 - ROBUST

T2 - A new self-healing fault-tolerant NoC router

AU - Collet, Jacques Henri

AU - Louri, Ahmed

AU - Bhat, Vivek Tulsidas

AU - Poluri, Pavan

PY - 2011

Y1 - 2011

N2 - This work addresses the general problem of making Network-on-Chips (NoCs) routers totally self-healing in massively defective technologies. There are three main contributions. First, we propose a new hardware approach based on Built-In Self-Test techniques and multi-functional blocks (called Universal Logic Blocks, ULBs) to autonomously diagnose permanent faults and repair faulty units. ULBs have the capability to assume the functionality of various functional units within the router through simple reconfiguration and thus enable the repair of multiple permanent faults within the NoC router. Second, we propose a new reliability metric and introduce a probabilistic model to estimate the router reliability improvement achieved by the protection circuitry. Third, we compare our architecture to two router architectures (Vicis and Bulletproof) and we show that our design provides superior reliability improvement especially in extremely defective nanoscale technologies (i.e., typically above 30% of faulty routers). The most striking result is that the self-healing of the routers enables maintaining the communications at fault levels, where it is normally impossible to preserve communications.

AB - This work addresses the general problem of making Network-on-Chips (NoCs) routers totally self-healing in massively defective technologies. There are three main contributions. First, we propose a new hardware approach based on Built-In Self-Test techniques and multi-functional blocks (called Universal Logic Blocks, ULBs) to autonomously diagnose permanent faults and repair faulty units. ULBs have the capability to assume the functionality of various functional units within the router through simple reconfiguration and thus enable the repair of multiple permanent faults within the NoC router. Second, we propose a new reliability metric and introduce a probabilistic model to estimate the router reliability improvement achieved by the protection circuitry. Third, we compare our architecture to two router architectures (Vicis and Bulletproof) and we show that our design provides superior reliability improvement especially in extremely defective nanoscale technologies (i.e., typically above 30% of faulty routers). The most striking result is that the self-healing of the routers enables maintaining the communications at fault levels, where it is normally impossible to preserve communications.

KW - fault-tolerance

KW - multi-core architectures

KW - network-on-chip

KW - self-healing

UR - http://www.scopus.com/inward/record.url?scp=84855302363&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84855302363&partnerID=8YFLogxK

U2 - 10.1145/2076501.2076504

DO - 10.1145/2076501.2076504

M3 - Conference contribution

SN - 9781450309479

SP - 11

EP - 16

BT - ACM International Conference Proceeding Series

ER -