Constrained cascade generalization of decision trees

Huimin Zhao, Sudha Ram

Research output: Contribution to journal › Article

29 Citations (Scopus)

Abstract

While decision tree techniques have been widely used in classification applications, a shortcoming of many decision tree inducers is that they do not learn intermediate concepts, i.e., at each node, only one of the original features is involved in the branching decision. Combining other classification methods, which learn intermediate concepts, with decision tree inducers can produce more flexible decision boundaries that separate different classes, potentially improving classification accuracy. We propose a generic algorithm for cascade generalization of decision tree inducers with the maximum cascading depth as a parameter to constrain the degree of cascading. Cascading methods proposed in the past, i.e., loose coupling and tight coupling, are strictly special cases of this new algorithm. We have empirically evaluated the proposed algorithm using logistic regression and C4.5 as base inducers on 32 UCI data sets and found that neither loose coupling nor tight coupling is always the best cascading strategy and that the maximum cascading depth in the proposed algorithm can be tuned for better classification accuracy. We have also empirically compared the proposed algorithm and ensemble methods such as bagging and boosting and found that the proposed algorithm performs marginally better than bagging and boosting on the average.
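The core idea of cascade generalization described in the abstract can be sketched in a few lines. The snippet below is a hypothetical illustration of the "loose coupling" special case, not the authors' exact algorithm: a base inducer (here a hand-rolled logistic regression) is trained first, its predicted class probability is appended to the feature set as an intermediate concept, and a single axis-aligned split on that new feature then realizes an oblique decision boundary that no split on an original feature could.

```python
# Loose-coupling cascade generalization, sketched with NumPy only.
# All names here (rng, X_cascaded, stump_pred, ...) are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Obliquely separable 2-D data: class is 1 iff x0 + x1 > 0, so no single
# axis-aligned split on x0 or x1 alone can separate the classes well.
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Base inducer: logistic regression fit by batch gradient descent.
w, b = np.zeros(2), 0.0
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

# Cascade step: append the learned probability as a new feature, an
# "intermediate concept" the downstream tree inducer can split on.
p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
X_cascaded = np.column_stack([X, p])

# A single axis-aligned split (decision stump) on the cascaded feature
# now reproduces the oblique boundary.
stump_pred = (X_cascaded[:, 2] > 0.5).astype(float)
acc = np.mean(stump_pred == y)
```

In the paper's terms, loose coupling cascades only at the root, tight coupling cascades at every node, and the proposed algorithm interpolates between them via the maximum cascading depth parameter.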

Original language: English (US)
Pages (from-to): 727-739
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 16
Issue number: 6
DOIs: 10.1109/TKDE.2004.3
State: Published - Jun 2004


Keywords

  • Cascade generalization
  • Classification
  • Data mining
  • Decision tree
  • Machine learning

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Information Systems

Cite this

Constrained cascade generalization of decision trees. / Zhao, Huimin; Ram, Sudha.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 6, 06.2004, p. 727-739.

Research output: Contribution to journal › Article

@article{ec929bef78b14851863f69e18ebb542e,
title = "Constrained cascade generalization of decision trees",
abstract = "While decision tree techniques have been widely used in classification applications, a shortcoming of many decision tree inducers is that they do not learn intermediate concepts, i.e., at each node, only one of the original features is involved in the branching decision. Combining other classification methods, which learn intermediate concepts, with decision tree inducers can produce more flexible decision boundaries that separate different classes, potentially improving classification accuracy. We propose a generic algorithm for cascade generalization of decision tree inducers with the maximum cascading depth as a parameter to constrain the degree of cascading. Cascading methods proposed in the past, i.e., loose coupling and tight coupling, are strictly special cases of this new algorithm. We have empirically evaluated the proposed algorithm using logistic regression and C4.5 as base inducers on 32 UCI data sets and found that neither loose coupling nor tight coupling is always the best cascading strategy and that the maximum cascading depth in the proposed algorithm can be tuned for better classification accuracy. We have also empirically compared the proposed algorithm and ensemble methods such as bagging and boosting and found that the proposed algorithm performs marginally better than bagging and boosting on the average.",
keywords = "Cascade generalization, Classification, Data mining, Decision tree, Machine learning",
author = "Huimin Zhao and Sudha Ram",
year = "2004",
month = "6",
doi = "10.1109/TKDE.2004.3",
language = "English (US)",
volume = "16",
pages = "727--739",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "6",

}

TY - JOUR

T1 - Constrained cascade generalization of decision trees

AU - Zhao, Huimin

AU - Ram, Sudha

PY - 2004/6

Y1 - 2004/6

N2 - While decision tree techniques have been widely used in classification applications, a shortcoming of many decision tree inducers is that they do not learn intermediate concepts, i.e., at each node, only one of the original features is involved in the branching decision. Combining other classification methods, which learn intermediate concepts, with decision tree inducers can produce more flexible decision boundaries that separate different classes, potentially improving classification accuracy. We propose a generic algorithm for cascade generalization of decision tree inducers with the maximum cascading depth as a parameter to constrain the degree of cascading. Cascading methods proposed in the past, i.e., loose coupling and tight coupling, are strictly special cases of this new algorithm. We have empirically evaluated the proposed algorithm using logistic regression and C4.5 as base inducers on 32 UCI data sets and found that neither loose coupling nor tight coupling is always the best cascading strategy and that the maximum cascading depth in the proposed algorithm can be tuned for better classification accuracy. We have also empirically compared the proposed algorithm and ensemble methods such as bagging and boosting and found that the proposed algorithm performs marginally better than bagging and boosting on the average.

AB - While decision tree techniques have been widely used in classification applications, a shortcoming of many decision tree inducers is that they do not learn intermediate concepts, i.e., at each node, only one of the original features is involved in the branching decision. Combining other classification methods, which learn intermediate concepts, with decision tree inducers can produce more flexible decision boundaries that separate different classes, potentially improving classification accuracy. We propose a generic algorithm for cascade generalization of decision tree inducers with the maximum cascading depth as a parameter to constrain the degree of cascading. Cascading methods proposed in the past, i.e., loose coupling and tight coupling, are strictly special cases of this new algorithm. We have empirically evaluated the proposed algorithm using logistic regression and C4.5 as base inducers on 32 UCI data sets and found that neither loose coupling nor tight coupling is always the best cascading strategy and that the maximum cascading depth in the proposed algorithm can be tuned for better classification accuracy. We have also empirically compared the proposed algorithm and ensemble methods such as bagging and boosting and found that the proposed algorithm performs marginally better than bagging and boosting on the average.

KW - Cascade generalization

KW - Classification

KW - Data mining

KW - Decision tree

KW - Machine learning

UR - http://www.scopus.com/inward/record.url?scp=3042546261&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=3042546261&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2004.3

DO - 10.1109/TKDE.2004.3

M3 - Article

AN - SCOPUS:3042546261

VL - 16

SP - 727

EP - 739

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 6

ER -