Efficient instruction scheduling for delayed-load architectures

Steven M. Kurlander, Todd A. Proebsting

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

A fast, optimal code-scheduling algorithm for processors with a delayed load of one instruction cycle is described. The algorithm minimizes both execution time and register use and runs in time proportional to the size of the expression-tree. An extension that spills registers when too few registers are available is also presented. The algorithm also performs very well for delayed loads of greater than one instruction cycle. A heuristic that schedules DAGs and is based on our optimal expression-tree-scheduling algorithm is presented and compared with Goodman and Hsu's algorithm Integrated Prepass Scheduling (IPS). Both schedulers perform well on benchmarks with small basic blocks, but on large basic blocks our scheduler outperforms IPS and is significantly faster.

Original language: English (US)
Pages (from-to): 740-776
Number of pages: 37
Journal: ACM Transactions on Programming Languages and Systems
Volume: 17
Issue number: 5
DOI: 10.1145/213978.213987
State: Published - Sep 1995

Fingerprint

Scheduling
Scheduling algorithms

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Efficient instruction scheduling for delayed-load architectures. / Kurlander, Steven M.; Proebsting, Todd A.

In: ACM Transactions on Programming Languages and Systems, Vol. 17, No. 5, 09.1995, p. 740-776.

@article{3d47967387ae41e3a0251d95f68bbb6c,
title = "Efficient instruction scheduling for delayed-load architectures",
abstract = "A fast, optimal code-scheduling algorithm for processors with a delayed load of one instruction cycle is described. The algorithm minimizes both execution time and register use and runs in time proportional to the size of the expression-tree. An extension that spills registers when too few registers are available is also presented. The algorithm also performs very well for delayed loads of greater than one instruction cycle. A heuristic that schedules DAGs and is based on our optimal expression-tree-scheduling algorithm is presented and compared with Goodman and Hsu's algorithm Integrated Prepass Scheduling (IPS). Both schedulers perform well on benchmarks with small basic blocks, but on large basic blocks our scheduler outperforms IPS and is significantly faster.",
author = "Kurlander, {Steven M.} and Proebsting, {Todd A.}",
year = "1995",
month = "9",
doi = "10.1145/213978.213987",
language = "English (US)",
volume = "17",
pages = "740--776",
journal = "ACM Transactions on Programming Languages and Systems",
issn = "0164-0925",
publisher = "Association for Computing Machinery (ACM)",
number = "5",
}

TY - JOUR

T1 - Efficient instruction scheduling for delayed-load architectures

AU - Kurlander, Steven M.

AU - Proebsting, Todd A.

PY - 1995/9

Y1 - 1995/9

N2 - A fast, optimal code-scheduling algorithm for processors with a delayed load of one instruction cycle is described. The algorithm minimizes both execution time and register use and runs in time proportional to the size of the expression-tree. An extension that spills registers when too few registers are available is also presented. The algorithm also performs very well for delayed loads of greater than one instruction cycle. A heuristic that schedules DAGs and is based on our optimal expression-tree-scheduling algorithm is presented and compared with Goodman and Hsu's algorithm Integrated Prepass Scheduling (IPS). Both schedulers perform well on benchmarks with small basic blocks, but on large basic blocks our scheduler outperforms IPS and is significantly faster.

UR - http://www.scopus.com/inward/record.url?scp=0029368698&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0029368698&partnerID=8YFLogxK

U2 - 10.1145/213978.213987

DO - 10.1145/213978.213987

M3 - Article

AN - SCOPUS:0029368698

VL - 17

SP - 740

EP - 776

JO - ACM Transactions on Programming Languages and Systems

JF - ACM Transactions on Programming Languages and Systems

SN - 0164-0925

IS - 5

ER -