Building simple models: A case study with decision trees

David Jensen, Tim Oates, Paul R Cohen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Building correctly-sized models is a central challenge for induction algorithms. Many approaches to decision tree induction fail this challenge. Under a broad range of circumstances, these approaches exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase. These algorithms fail to adjust for the statistical effects of comparing multiple subtrees. Adjusting for these effects produces trees with little or no excess structure.
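
The adjustment mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Bonferroni-style correction, which is one common way to account for multiple comparisons; the abstract does not state which adjustment the authors use. The function names, the significance level, and the example numbers are hypothetical.

```python
def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Upper bound on the chance that the best of n comparisons scores this well by luck."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return min(1.0, p_value * n_comparisons)


def sidak_adjust(p_value: float, n_comparisons: int) -> float:
    """Exact counterpart of the Bonferroni bound when the comparisons are independent."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return 1.0 - (1.0 - p_value) ** n_comparisons


def keep_subtree(p_value: float, n_comparisons: int, alpha: float = 0.05) -> bool:
    """Keep a subtree only if it is still significant after the adjustment.

    p_value is the unadjusted significance of the best candidate;
    n_comparisons is how many candidates were compared to find it
    (both are assumptions of this sketch, not quantities from the paper).
    """
    return bonferroni_adjust(p_value, n_comparisons) <= alpha


# A subtree that looks significant on its own is rejected once the
# ten candidates it was chosen from are taken into account.
print(keep_subtree(0.01, 1))
print(keep_subtree(0.01, 10))
```

Without the adjustment, the chance of accepting a spurious subtree grows with the number of candidates examined, which is one way larger training sets can yield larger trees with no gain in accuracy.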

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 211-222
Number of pages: 12
Volume: 1280
ISBN (Print): 9783540633464
State: Published - 1997
Externally published: Yes
Event: 2nd International Symposium on Intelligent Data Analysis, IDA 1997 - London, United Kingdom
Duration: Aug 4 1997 → Aug 6 1997

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1280
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 2nd International Symposium on Intelligent Data Analysis, IDA 1997
Country: United Kingdom
City: London
Period: 8/4/97 → 8/6/97

Fingerprint

Decision trees
Decision tree
Proof by induction
Excess
Model
Range of data
Training
Relationships

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Jensen, D., Oates, T., & Cohen, P. R. (1997). Building simple models: A case study with decision trees. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1280, pp. 211-222). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1280). Springer Verlag.

Building simple models: A case study with decision trees. / Jensen, David; Oates, Tim; Cohen, Paul R.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 1280. Springer Verlag, 1997. p. 211-222 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1280).

Jensen, D, Oates, T & Cohen, PR 1997, Building simple models: A case study with decision trees. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 1280, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 1280, Springer Verlag, pp. 211-222, 2nd International Symposium on Intelligent Data Analysis, IDA 1997, London, United Kingdom, 8/4/97.
Jensen D, Oates T, Cohen PR. Building simple models: A case study with decision trees. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 1280. Springer Verlag. 1997. p. 211-222. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
Jensen, David; Oates, Tim; Cohen, Paul R. / Building simple models: A case study with decision trees. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 1280. Springer Verlag, 1997. pp. 211-222 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{441ea60fc5184e5baf83bdb54f7e1b32,
title = "Building simple models: A case study with decision trees",
abstract = "Building correctly-sized models is a central challenge for induction algorithms. Many approaches to decision tree induction fail this challenge. Under a broad range of circumstances, these approaches exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase. These algorithms fail to adjust for the statistical effects of comparing multiple subtrees. Adjusting for these effects produces trees with little or no excess structure.",
author = "Jensen, David and Oates, Tim and Cohen, Paul R.",
year = "1997",
language = "English (US)",
isbn = "9783540633464",
volume = "1280",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "211--222",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY  - GEN
T1  - Building simple models
T2  - A case study with decision trees
AU  - Jensen, David
AU  - Oates, Tim
AU  - Cohen, Paul R
PY  - 1997
Y1  - 1997
N2  - Building correctly-sized models is a central challenge for induction algorithms. Many approaches to decision tree induction fail this challenge. Under a broad range of circumstances, these approaches exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase. These algorithms fail to adjust for the statistical effects of comparing multiple subtrees. Adjusting for these effects produces trees with little or no excess structure.
AB  - Building correctly-sized models is a central challenge for induction algorithms. Many approaches to decision tree induction fail this challenge. Under a broad range of circumstances, these approaches exhibit a nearly linear relationship between training set size and tree size, even after accuracy has ceased to increase. These algorithms fail to adjust for the statistical effects of comparing multiple subtrees. Adjusting for these effects produces trees with little or no excess structure.
UR  - http://www.scopus.com/inward/record.url?scp=84867530100&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84867530100&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84867530100
SN  - 9783540633464
VL  - 1280
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 211
EP  - 222
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB  - Springer Verlag
ER  -