A sparse structured shrinkage estimator for nonparametric varying-coefficient model with an application in genomics

Z. John Daye, Jichun Xie, Hongzhe Li

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Many problems in genomics are related to variable selection where high-dimensional genomic data are treated as covariates. Such genomic covariates often have certain structures and can be represented as vertices of an undirected graph. Biological processes also vary as functions of some biological state, such as time. High-dimensional variable selection where the covariates are graph-structured and the underlying model is nonparametric presents an important but largely unaddressed statistical challenge. Motivated by the problem of regression-based motif discovery, we consider the problem of variable selection for high-dimensional nonparametric varying-coefficient models and introduce a sparse structured shrinkage (SSS) estimator based on basis function expansions and a novel smoothed penalty function. We present an efficient algorithm for computing the SSS estimator. Results on model selection consistency and estimation bounds are derived. Moreover, finite-sample performance is studied via simulations, with particular attention to the effects of high dimensionality and of structural information in the covariates. We apply our method to the motif-finding problem using a yeast cell-cycle gene expression dataset and word counts in genes' promoter sequences. Our results demonstrate that the proposed method can yield better variable selection and prediction for high-dimensional regression when the underlying model is nonparametric and the covariates are structured. Supplemental materials for the article are available online.
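As a rough illustration of the basis-expansion idea mentioned in the abstract, the sketch below writes a varying-coefficient model y_i = sum_j x_ij * beta_j(t_i) as an ordinary linear model by expanding each beta_j(t) in a small basis. It is not the SSS estimator: the polynomial basis, the function names (`basis`, `expand_design`, `fit_unpenalized`), and the plain least-squares fit are all assumptions made for illustration, and the smoothed, graph-structured penalty from the article is omitted entirely.

```python
import numpy as np

def basis(t, n_basis=3):
    """Polynomial basis 1, t, t^2, ... (a hypothetical stand-in for splines)."""
    return np.vander(t, N=n_basis, increasing=True)  # shape (n, n_basis)

def expand_design(X, t, n_basis=3):
    """Build columns x_ij * B_k(t_i), grouped by covariate j."""
    B = basis(t, n_basis)  # shape (n, K)
    # (n, p, 1) * (n, 1, K) -> (n, p, K), flattened to (n, p * K)
    return (X[:, :, None] * B[:, None, :]).reshape(X.shape[0], -1)

def fit_unpenalized(X, t, y, n_basis=3):
    """Least-squares fit with NO penalty; row j holds the basis coefficients of beta_j."""
    Z = expand_design(X, t, n_basis)
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return gamma.reshape(X.shape[1], n_basis)

# Synthetic, noise-free data: beta_1(t) = 1 + 2t, beta_2(t) = 0.
rng = np.random.default_rng(0)
n, p = 200, 2
t = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=(n, p))
y = X[:, 0] * (1.0 + 2.0 * t)

gamma = fit_unpenalized(X, t, y)
```

Each covariate contributes one group of basis coefficients, so selecting a variable amounts to deciding whether its whole group is zero. A penalized estimator of the kind the abstract describes would act on these groups rather than on individual coefficients.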

Original language: English (US)
Pages (from-to): 110-133
Number of pages: 24
Journal: Journal of Computational and Graphical Statistics
Volume: 21
Issue number: 1
State: Published - Apr 23 2012

Keywords

  • High-dimensional data
  • Model selection
  • Motif analysis
  • Nonparametric regression
  • Sparsity
  • Structured covariates

ASJC Scopus subject areas

  • Statistics and Probability
  • Discrete Mathematics and Combinatorics
  • Statistics, Probability and Uncertainty
