Sparsifying the Fisher linear discriminant by rotation

Ning Hao, Bin Dong, Jianqing Fan

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Many high dimensional classification techniques have been proposed in the literature based on sparse linear discriminant analysis. To use them efficiently, sparsity of linear classifiers is a prerequisite. However, this might not be readily available in many applications, and rotations of data are required to create the sparsity needed. We propose a family of rotations to create the sparsity required. The basic idea is to use the principal components of the sample covariance matrix of the pooled samples and its variants to rotate the data first and then to apply an existing high dimensional classifier. This rotate-and-solve procedure can be combined with any existing classifier and is robust against the level of sparsity of the true model. We show that these rotations do create the sparsity that is needed for high dimensional classification, and we provide a theoretical understanding of why such a rotation works empirically. The effectiveness of the proposed method is demonstrated on several simulated and real data examples, and the improvements of our method over some popular high dimensional classification rules are clearly shown.
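The rotate-and-solve procedure described above can be sketched in a few lines: rotate the data by the eigenvectors of the pooled sample covariance matrix, then apply a sparse linear rule in the rotated coordinates. The sketch below is illustrative only — the toy data, the threshold `lam`, and the naive soft-thresholded discriminant are assumptions standing in for the sparse LDA rules the paper actually plugs in.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 20
# Toy two-class Gaussian data with a dense (non-sparse) mean shift,
# the setting in which a rotation is needed to create sparsity.
X0 = rng.normal(size=(n, p))
X1 = rng.normal(size=(n, p)) + 0.4
X = np.vstack([X0, X1])
y = np.repeat([0, 1], n)

# Step 1 (rotate): principal components of the pooled sample covariance.
Xc = np.vstack([X0 - X0.mean(0), X1 - X1.mean(0)])  # center each class
S = Xc.T @ Xc / (2 * n - 2)                         # pooled sample covariance
_, R = np.linalg.eigh(S)                            # columns of R are the PCs
Z = X @ R                                           # rotated data

# Step 2 (solve): a sparse linear classifier in rotated coordinates.
# Here a naive discriminant that soft-thresholds the mean difference,
# a hypothetical stand-in for an existing sparse LDA rule.
d = Z[y == 1].mean(0) - Z[y == 0].mean(0)
lam = 0.2                                           # assumed threshold level
w = np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)   # sparse direction
mid = (Z[y == 1].mean(0) + Z[y == 0].mean(0)) / 2
pred = ((Z - mid) @ w > 0).astype(int)
acc = (pred == y).mean()
```

Because the rotation is an orthogonal transformation, it changes the coordinate system without distorting distances, so any equivariant classifier applied to `Z` solves the same problem as on `X`, but in coordinates where sparse methods can succeed.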

Original language: English (US)
Pages (from-to): 827-851
Number of pages: 25
Journal: Journal of the Royal Statistical Society. Series B: Statistical Methodology
Volume: 77
Issue number: 4
DOI: 10.1111/rssb.12092
State: Published - Sep 1 2015


Keywords

  • Classification
  • Equivariance
  • High dimensional data
  • Linear discriminant analysis
  • Principal components
  • Rotate-and-solve procedure

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Sparsifying the Fisher linear discriminant by rotation. / Hao, Ning; Dong, Bin; Fan, Jianqing.

In: Journal of the Royal Statistical Society. Series B: Statistical Methodology, Vol. 77, No. 4, 01.09.2015, p. 827-851.


@article{e8a9c4cc597747598dede87c5a0b6d38,
title = "Sparsifying the Fisher linear discriminant by rotation",
abstract = "Many high dimensional classification techniques have been proposed in the literature based on sparse linear discriminant analysis. To use them efficiently, sparsity of linear classifiers is a prerequisite. However, this might not be readily available in many applications, and rotations of data are required to create the sparsity needed. We propose a family of rotations to create the sparsity required. The basic idea is to use the principal components of the sample covariance matrix of the pooled samples and its variants to rotate the data first and then to apply an existing high dimensional classifier. This rotate-and-solve procedure can be combined with any existing classifiers and is robust against the level of sparsity of the true model. We show that these rotations do create the sparsity that is needed for high dimensional classifications and we provide theoretical understanding why such a rotation works empirically. The effectiveness of the method proposed is demonstrated by several simulated and real data examples, and the improvements of our method over some popular high dimensional classification rules are clearly shown.",
keywords = "Classification, Equivariance, High dimensional data, Linear discriminant analysis, Principal components, Rotate-and-solve procedure",
author = "Ning Hao and Bin Dong and Jianqing Fan",
year = "2015",
month = "9",
day = "1",
doi = "10.1111/rssb.12092",
language = "English (US)",
volume = "77",
pages = "827--851",
journal = "Journal of the Royal Statistical Society. Series B: Statistical Methodology",
issn = "1369-7412",
publisher = "Wiley-Blackwell",
number = "4",

}
