Toward AI research methodology: Three case studies in evaluation

Paul R Cohen, Adele E. Howe

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

The roles of evaluation in empirical artificial intelligence (AI) research are described, in an idealized cyclic model and in the context of three case studies. The case studies illustrate pitfalls in evaluation and the contributions of evaluation at all stages of the research cycle. Evaluation methods are contrasted with those of the behavioral sciences, and it is concluded that AI must define and refine its own methods. To this end, several experiment schemas and many specific evaluation criteria are described. Recommendations are offered in the hope of encouraging the development and practice of evaluation methods in AI. The first case study illustrates problems with evaluating knowledge-based systems, specifically a portfolio management expert system called FOLIO. The second study focuses on the relationship between evaluation and the evolution of the GRANT system, specifically, how the evaluations changed as GRANT's knowledge base was scaled up. Third, the cyclic nature of a given research model is examined.

Original language: English (US)
Pages (from-to): 634-646
Number of pages: 13
Journal: IEEE Transactions on Systems, Man and Cybernetics
Volume: 19
Issue number: 3
DOIs: 10.1109/21.31069
State: Published - May 1989
Externally published: Yes

Fingerprint

  • Artificial intelligence
  • Knowledge based systems
  • Expert systems
  • Experiments

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Toward AI research methodology: Three case studies in evaluation. / Cohen, Paul R; Howe, Adele E.

In: IEEE Transactions on Systems, Man and Cybernetics, Vol. 19, No. 3, 05.1989, p. 634-646.

Research output: Contribution to journal › Article

@article{c1dffd844eb140c1bd0eb6af14a3811a,
title = "Toward AI research methodology: Three case studies in evaluation",
abstract = "The roles of evaluation in empirical artificial intelligence (AI) research are described, in an idealized cyclic model and in the context of three case studies. The case studies illustrate pitfalls in evaluation and the contributions of evaluation at all stages of the research cycle. Evaluation methods are constrasted with those of the behavioral sciences, and it is concluded that AI must define and refine its own methods. To this end, several experiment schemas and many specific evaluation criteria are described. Recommendations are offered in the hope of encouraging the development and practice of evaluation methods in AI. The first case study illustrates problems with evaluating knowledge-based systems, specifically a portfolio management expert system called FOLIO. The second study focuses on the relationship between evaluation and the evolution of the GRANT system, specifically, how the evaluations changed as GRANT's knowledge base was sealed up. Third, the cyclic nature of a given research model is examined.",
author = "Cohen, {Paul R} and Howe, {Adele E.}",
year = "1989",
month = "5",
doi = "10.1109/21.31069",
language = "English (US)",
volume = "19",
pages = "634--646",
journal = "IEEE Transactions on Systems, Man and Cybernetics",
issn = "0018-9472",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",

}
