A self-tuning cache architecture for embedded systems

Chuanjun Zhang, Frank Vahid, Roman L Lysecky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

Memory accesses can account for about half of a microprocessor system's power consumption. Customizing a microprocessor cache's total size, line size and associativity to a particular program is well known to have tremendous benefits for performance and power. Customizing caches has until recently been restricted to core-based flows, in which a new chip will be fabricated. However, several configurable cache architectures have been proposed recently for use in pre-fabricated microprocessor platforms. Tuning those caches to a program is still however a cumbersome task left for designers, assisted in part by recent computer-aided design (CAD) tuning aids. We propose to move that CAD on-chip, which can greatly increase the acceptance of configurable caches. We introduce on-chip hardware implementing an efficient cache tuning heuristic that can automatically, transparently, and dynamically tune the cache to an executing program. We carefully designed the heuristic to avoid any cache flushing, since flushing is power and performance costly. By simulating numerous Powerstone and MediaBench benchmarks, we show that such a dynamic self-tuning cache can reduce memory-access energy by 45% to 55% on average, and as much as 97%, compared with a four-way set-associative base cache, completely transparently to the programmer.
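The abstract does not spell out the steps of the tuning heuristic, so the following is only a minimal sketch of one plausible approach: a greedy search that adjusts one cache parameter at a time and only ever grows the cache from its smallest configuration, which is one way a tuner could avoid flushing. The configuration values in `SPACE`, the parameter order, the `tune` function, and the `toy_energy` model are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence

Config = Dict[str, int]

# Hypothetical configuration space; the abstract does not list supported values.
SPACE: Dict[str, List[int]] = {
    "size": [2048, 4096, 8192],   # total cache size in bytes
    "line": [16, 32, 64],         # line size in bytes
    "assoc": [1, 2, 4],           # number of ways
}


def tune(energy: Callable[[Config], float],
         order: Sequence[str] = ("size", "line", "assoc")) -> Config:
    """Greedy one-parameter-at-a-time search (illustrative, not the authors' method).

    Starts from the smallest value of every parameter and steps each parameter
    upward while the measured energy keeps decreasing.
    """
    cfg = {name: values[0] for name, values in SPACE.items()}
    best = energy(cfg)
    for name in order:
        for value in SPACE[name][1:]:
            trial = dict(cfg, **{name: value})
            measured = energy(trial)
            if measured >= best:
                break  # no improvement: keep the previous value and move on
            cfg, best = trial, measured
    return cfg


def toy_energy(cfg: Config) -> float:
    """Made-up energy model whose minimum is at 4 KB, 32-byte lines, direct-mapped."""
    return (abs(cfg["size"] - 4096) / 1024
            + abs(cfg["line"] - 32) / 16
            + (cfg["assoc"] - 1))


if __name__ == "__main__":
    print(tune(toy_energy))
```

In this sketch the search evaluates at most one more configuration per parameter than it accepts, so it examines far fewer points than an exhaustive sweep of all 27 combinations. A real on-chip tuner would replace `toy_energy` with energy estimates derived from hit and miss counters observed while the program runs.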

Original language: English (US)
Title of host publication: Proceedings - Design, Automation and Test in Europe Conference and Exhibition
Editors: G. Gielen, J. Figueras
Pages: 142-147
Number of pages: 6
Volume: 1
State: Published - 2004
Externally published: Yes
Event: Proceedings - Design, Automation and Test in Europe Conference and Exhibition, DATE 04 - Paris, France
Duration: Feb 16 2004 to Feb 20 2004

Other

Other: Proceedings - Design, Automation and Test in Europe Conference and Exhibition, DATE 04
Country: France
City: Paris
Period: 2/16/04 to 2/20/04

Fingerprint

Embedded systems
Tuning
Microprocessor chips
Computer aided design
Data storage equipment
Electric power utilization
Hardware

Keywords

  • Architecture tuning
  • Cache
  • Configurable
  • Dynamic optimization
  • Embedded systems
  • Low energy
  • Low power
  • On-chip CAD

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Zhang, C., Vahid, F., & Lysecky, R. L. (2004). A self-tuning cache architecture for embedded systems. In G. Gielen, & J. Figueras (Eds.), Proceedings - Design, Automation and Test in Europe Conference and Exhibition (Vol. 1, pp. 142-147)

@inproceedings{d3229a92954c450cb605d8fd9394a7a8,
title = "A self-tuning cache architecture for embedded systems",
keywords = "Architecture tuning, Cache, Configurable, Dynamic optimization, Embedded systems, Low energy, Low power, On-chip CAD",
author = "Chuanjun Zhang and Frank Vahid and Lysecky, {Roman L}",
year = "2004",
language = "English (US)",
isbn = "0769520855",
volume = "1",
pages = "142--147",
editor = "G. Gielen and J. Figueras",
booktitle = "Proceedings - Design, Automation and Test in Europe Conference and Exhibition",

}

TY - GEN

T1 - A self-tuning cache architecture for embedded systems

AU - Zhang, Chuanjun

AU - Vahid, Frank

AU - Lysecky, Roman L

PY - 2004

Y1 - 2004

KW - Architecture tuning

KW - Cache

KW - Configurable

KW - Dynamic optimization

KW - Embedded systems

KW - Low energy

KW - Low power

KW - On-chip CAD

UR - http://www.scopus.com/inward/record.url?scp=3042522990&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=3042522990&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:3042522990

SN - 0769520855

SN - 9780769520858

VL - 1

SP - 142

EP - 147

BT - Proceedings - Design, Automation and Test in Europe Conference and Exhibition

A2 - Gielen, G.

A2 - Figueras, J.

ER -