An empirical study of the effects of principal component analysis on symbolic classifiers

Huimin Zhao, Atish P. Sinha, Sudha Ram

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Classification is a frequently encountered data mining problem. While symbolic classifiers have high comprehensibility, their language bias may hamper their classification performance. Incorporating new features constructed based on the original features may relax such language bias and lead to performance improvement. Among others, principal component analysis (PCA) has been proposed as a possible method for enhancing the performance of decision trees. However, since PCA is an unsupervised method, the principal components may not represent the ideal projection directions for optimizing the classification performance. Thus, we expect PCA to have varying effects; it may improve classification performance if the projections enhance class differences, but may degrade performance otherwise. We also posit that the effects of PCA are similar on symbolic classifiers, including decision rules, decision trees, and decision tables. In this paper, we empirically evaluate the effects of PCA on symbolic classifiers and discuss the findings.
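The expectation of varying effects can be illustrated with a small sketch. It is not the authors' experimental setup: the synthetic grid data, the helper names `pca_2d` and `best_stump_accuracy`, and the use of a single axis-parallel split (a depth-one decision tree) scored on training accuracy are all assumptions made for illustration. The same correlated two-feature data set is labeled in two ways: once along the direction of greatest variance, where the PCA rotation should help, and once along an original feature axis, where replacing the features with components should hurt.

```python
import math

def pca_2d(points):
    """Rotate centred 2-D points onto their principal axes (both components kept)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centred) / n
    c = sum(y * y for _, y in centred) / n
    b = sum(x * y for x, y in centred) / n
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    return [(x * cs + y * sn, -x * sn + y * cs) for x, y in centred]

def best_stump_accuracy(points, labels):
    """Training accuracy of the best single axis-parallel split (either polarity)."""
    n = len(points)
    best = 0.0
    for f in range(2):
        values = sorted({p[f] for p in points})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            hits = sum((p[f] > thr) == bool(y) for p, y in zip(points, labels))
            best = max(best, hits / n, 1 - hits / n)
    return best

# Hypothetical data: wide spread along the diagonal (t), narrow spread across it (s).
grid = [(t, s) for t in range(-5, 6) if t != 0 for s in range(-3, 4)]
points = [(t + s, t - s) for t, s in grid]
rotated = pca_2d(points)

labels_diag = [int(t > 0) for t, s in grid]      # class boundary across the first component
labels_axis = [int(t + s > 0) for t, s in grid]  # class boundary across original feature 1

results = {
    "diagonal": (best_stump_accuracy(points, labels_diag),
                 best_stump_accuracy(rotated, labels_diag)),
    "axis": (best_stump_accuracy(points, labels_axis),
             best_stump_accuracy(rotated, labels_axis)),
}
for name, (orig_acc, pca_acc) in results.items():
    print(f"{name}: original={orig_acc:.3f}  pca={pca_acc:.3f}")
```

In this toy setting the split on principal components separates the diagonally labeled classes perfectly while the split on original features cannot, and the reverse holds for the axis-labeled classes. Because PCA never sees the labels, which of the two cases arises depends on the data rather than on the transformation.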

Original language: English (US)
Title of host publication: 14th Americas Conference on Information Systems, AMCIS 2008
Pages: 563-569
Number of pages: 7
Volume: 1
State: Published - 2008
Event: 14th Americas Conference on Information Systems, AMCIS 2008 - Toronto, ON, Canada
Duration: Aug 14 2008 → Aug 17 2008

Other

Other: 14th Americas Conference on Information Systems, AMCIS 2008
Country: Canada
City: Toronto, ON
Period: 8/14/08 → 8/17/08

Fingerprint

Principal component analysis
Classifiers
performance
Decision trees
projection
Decision tables
Data mining
trend
language

Keywords

  • Classification
  • Data mining
  • Decision rule
  • Decision table
  • Decision tree
  • Principal component analysis
  • Symbolic classifier

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Networks and Communications
  • Library and Information Sciences
  • Information Systems

Cite this

Zhao, H., Sinha, A. P., & Ram, S. (2008). An empirical study of the effects of principal component analysis on symbolic classifiers. In 14th Americas Conference on Information Systems, AMCIS 2008 (Vol. 1, pp. 563-569).

An empirical study of the effects of principal component analysis on symbolic classifiers. / Zhao, Huimin; Sinha, Atish P.; Ram, Sudha.

14th Americas Conference on Information Systems, AMCIS 2008. Vol. 1 2008. p. 563-569.

Zhao, H, Sinha, AP & Ram, S 2008, An empirical study of the effects of principal component analysis on symbolic classifiers. in 14th Americas Conference on Information Systems, AMCIS 2008. vol. 1, pp. 563-569, 14th Americas Conference on Information Systems, AMCIS 2008, Toronto, ON, Canada, 8/14/08.
Zhao H, Sinha AP, Ram S. An empirical study of the effects of principal component analysis on symbolic classifiers. In 14th Americas Conference on Information Systems, AMCIS 2008. Vol. 1. 2008. p. 563-569
Zhao, Huimin ; Sinha, Atish P. ; Ram, Sudha. / An empirical study of the effects of principal component analysis on symbolic classifiers. 14th Americas Conference on Information Systems, AMCIS 2008. Vol. 1 2008. pp. 563-569
@inproceedings{854ebec4abab4ac98696768ed064d07a,
title = "An empirical study of the effects of principal component analysis on symbolic classifiers",
abstract = "Classification is a frequently encountered data mining problem. While symbolic classifiers have high comprehensibility, their language bias may hamper their classification performance. Incorporating new features constructed based on the original features may relax such language bias and lead to performance improvement. Among others, principal component analysis (PCA) has been proposed as a possible method for enhancing the performance of decision trees. However, since PCA is an unsupervised method, the principal components may not represent the ideal projection directions for optimizing the classification performance. Thus, we expect PCA to have varying effects; it may improve classification performance if the projections enhance class differences, but may degrade performance otherwise. We also posit that the effects of PCA are similar on symbolic classifiers, including decision rules, decision trees, and decision tables. In this paper, we empirically evaluate the effects of PCA on symbolic classifiers and discuss the findings.",
keywords = "Classification, Data mining, Decision rule, Decision table, Decision tree, Principal component analysis, Symbolic classifier",
author = "Huimin Zhao and Sinha, {Atish P.} and Sudha Ram",
year = "2008",
language = "English (US)",
isbn = "9781605609539",
volume = "1",
pages = "563--569",
booktitle = "14th Americas Conference on Information Systems, AMCIS 2008",

}

TY - GEN

T1 - An empirical study of the effects of principal component analysis on symbolic classifiers

AU - Zhao, Huimin

AU - Sinha, Atish P.

AU - Ram, Sudha

PY - 2008

Y1 - 2008

AB - Classification is a frequently encountered data mining problem. While symbolic classifiers have high comprehensibility, their language bias may hamper their classification performance. Incorporating new features constructed based on the original features may relax such language bias and lead to performance improvement. Among others, principal component analysis (PCA) has been proposed as a possible method for enhancing the performance of decision trees. However, since PCA is an unsupervised method, the principal components may not represent the ideal projection directions for optimizing the classification performance. Thus, we expect PCA to have varying effects; it may improve classification performance if the projections enhance class differences, but may degrade performance otherwise. We also posit that the effects of PCA are similar on symbolic classifiers, including decision rules, decision trees, and decision tables. In this paper, we empirically evaluate the effects of PCA on symbolic classifiers and discuss the findings.

KW - Classification

KW - Data mining

KW - Decision rule

KW - Decision table

KW - Decision tree

KW - Principal component analysis

KW - Symbolic classifier

UR - http://www.scopus.com/inward/record.url?scp=84870366437&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84870366437&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84870366437

SN - 9781605609539

VL - 1

SP - 563

EP - 569

BT - 14th Americas Conference on Information Systems, AMCIS 2008

ER -