Hardware-Level Thread Migration to Reduce On-Chip Data Movement Via Reinforcement Learning

Quintin Fettes, Avinash Karanth, Razvan Bunescu, Ahmed Louri, Kyle Shiflett

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

As the number of processing cores and associated threads in chip multiprocessors (CMPs) continues to scale out, on-chip memory access latency dominates application execution time due to increased data movement. Although tiled CMP architectures with distributed shared caches provide a scalable design, the increased physical distance between requesting and responding cores has led to both increased on-chip memory access latency and excess energy consumption. Near-data processing is a promising approach that can migrate threads closer to data; however, prior hand-engineered rules for fine-grained hardware-level thread migration are either too slow to react to changes in data access patterns or unable to exploit the large variety of data access patterns. In this article, we propose to use reinforcement learning (RL) to learn relatively complex data access patterns and thereby improve on hardware-level thread migration techniques. By using the recent history of memory access locations as input, each thread learns to recognize the relationship between prior access patterns and future memory access locations. This gives the proposed technique the unique ability to make fewer, more effective migrations to intermediate cores that minimize the distance to multiple distinct memory access locations. By allowing a low-overhead RL agent to learn a policy from real interaction with parallel programming benchmarks in a parallel simulator, we show that a migration policy which recognizes more complex data access patterns can be learned. The proposed approach reduces on-chip data movement and energy consumption by an average of 41% and execution time by 43% compared to a simple baseline with no thread migration; furthermore, it reduces energy consumption and execution time by an additional 10% compared to a hand-engineered fine-grained migration policy.
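The abstract describes the RL formulation only at a high level: each thread's recent memory access locations form the state, and the agent chooses whether to stay on its current core or migrate, possibly to an intermediate core that minimizes distance to several recently accessed locations. Below is a minimal sketch of that formulation, not the authors' implementation; the tabular Q-learning agent, the 8x8 mesh, the history length, the learning parameters, and the hop-count reward are all illustrative assumptions.

    # Minimal sketch (not the authors' implementation) of an RL thread-migration
    # agent in the spirit of the abstract: the state is a short history of recent
    # memory access locations (tiles), and the action is to stay on the current
    # core or migrate to a candidate tile. Mesh size, history length, learning
    # parameters, and the hop-count reward are illustrative assumptions.
    import random
    from collections import defaultdict, deque

    MESH_DIM = 8           # assumed 8x8 tiled CMP
    HISTORY_LEN = 4        # assumed length of the access-location history
    ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # assumed learning rate / discount / exploration

    def hops(a, b):
        """Manhattan (hop) distance between two tiles of the mesh."""
        ax, ay = divmod(a, MESH_DIM)
        bx, by = divmod(b, MESH_DIM)
        return abs(ax - bx) + abs(ay - by)

    class MigrationAgent:
        """Per-thread tabular Q-learning agent: given the recent access history,
        decide whether to stay or migrate, and learn from the resulting hops."""

        def __init__(self):
            self.q = defaultdict(float)               # Q[(state, action)]
            self.history = deque(maxlen=HISTORY_LEN)  # recent access locations

        def _state(self, core):
            return (core, tuple(self.history))

        def _actions(self, core):
            # Candidates: stay, move to a recently accessed tile, or move to the
            # tile minimizing total hops to all recent accesses -- an
            # "intermediate" core in the sense used by the abstract.
            cands = {core} | set(self.history)
            if self.history:
                mid = min(range(MESH_DIM * MESH_DIM),
                          key=lambda t: sum(hops(t, h) for h in self.history))
                cands.add(mid)
            return sorted(cands)

        def act(self, core):
            """Epsilon-greedy choice of the next core for this thread."""
            s = self._state(core)
            actions = self._actions(core)
            if random.random() < EPS:
                return random.choice(actions)
            return max(actions, key=lambda a: self.q[(s, a)])

        def update(self, core, new_core, next_access):
            """Q-update after observing the next memory access location; the
            action taken was migrating to (or staying on) new_core. The reward
            is the negative hop distance from new_core to that location, so
            fewer remote hops yields a higher reward."""
            s = self._state(core)
            reward = -hops(new_core, next_access)
            self.history.append(next_access)
            s_next = self._state(new_core)
            best_next = max(self.q[(s_next, a)] for a in self._actions(new_core))
            self.q[(s, new_core)] += ALPHA * (reward + GAMMA * best_next
                                              - self.q[(s, new_core)])

In this sketch, the candidate action set includes the tile that minimizes total hops to the recent access history, which mirrors the abstract's notion of migrating to an intermediate core rather than chasing each access individually.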

Original language: English (US)
Article number: 9211404
Pages (from-to): 3638-3649
Number of pages: 12
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Volume: 39
Issue number: 11
DOIs
State: Published - Nov 2020
Externally published: Yes

Keywords

  • Chip multiprocessors (CMPs)
  • data movement
  • reinforcement learning (RL)
  • thread migration

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Electrical and Electronic Engineering

