Automatic model selection for partially linear models

Xiao Ni, Hao Zhang, Daowen Zhang

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

We propose and study a unified procedure for variable selection in partially linear models. A new type of double-penalized least squares is formulated, using the smoothing spline to estimate the nonparametric part and applying a shrinkage penalty on the parametric components to achieve model parsimony. Theoretically, we show that, with proper choices of the smoothing and regularization parameters, the proposed procedure can be as efficient as the oracle estimator [J. Fan, R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association 96 (2001) 1348-1360]. We also study the asymptotic properties of the estimator when the number of parametric effects diverges with the sample size. Frequentist and Bayesian estimates of the covariance and confidence intervals are derived for the estimators. One great advantage of this procedure is its linear mixed model (LMM) representation, which greatly facilitates its implementation with standard statistical software. Furthermore, the LMM framework makes it possible to treat the smoothing parameter as a variance component and hence to estimate it conveniently together with the other regression coefficients. Extensive numerical studies are conducted to demonstrate the effective performance of the proposed procedure.
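As a rough illustration of the formulation the abstract describes (the notation here is assumed for this sketch, not quoted from the paper): for responses y_i, parametric covariates x_i, and a scalar nonparametric covariate t_i in the partially linear model y_i = x_i'β + f(t_i) + ε_i, a double-penalized least-squares criterion of this kind combines a smoothing-spline roughness penalty on f with a sparsity-inducing penalty on the coefficients β:

    \min_{\beta,\, f}\; \sum_{i=1}^{n} \bigl\{ y_i - x_i^{\top}\beta - f(t_i) \bigr\}^{2}
        + \lambda_1 \int \bigl\{ f''(t) \bigr\}^{2}\, dt
        + n \sum_{j=1}^{p} p_{\lambda_2}\!\bigl(|\beta_j|\bigr)

where p_{\lambda_2}(\cdot) can be taken as the smoothly clipped absolute deviation (SCAD) penalty of Fan and Li (2001), consistent with the keywords below. In the linear mixed model representation mentioned in the abstract, the spline term corresponds to a random effect, so the smoothing parameter λ_1 plays the role of a ratio of variance components and can be estimated by (restricted) maximum likelihood together with the other parameters.

For illustration only, the following numpy sketch alternates between the two penalized updates, substituting a crude second-difference roughness penalty for the cubic smoothing spline and a local quadratic approximation for SCAD; the function and argument names (fit_dpls, lam_smooth, lam_scad) and the default a = 3.7 are assumptions made for this example, not the authors' implementation.

    import numpy as np

    def scad_deriv(beta, lam, a=3.7):
        # Derivative of the SCAD penalty (Fan & Li, 2001), used in the
        # local quadratic approximation of the coefficient penalty.
        b = np.abs(beta)
        return lam * ((b <= lam) +
                      np.maximum(a * lam - b, 0.0) / ((a - 1) * lam) * (b > lam))

    def fit_dpls(y, X, t, lam_smooth=1.0, lam_scad=0.1, n_iter=100, eps=1e-6):
        # Crude double-penalized least-squares sketch: a second-difference
        # penalty on f(t) at the sorted design points stands in for the
        # smoothing-spline roughness penalty; SCAD is handled by iteratively
        # reweighted ridge updates (local quadratic approximation).
        n, p = X.shape
        order = np.argsort(t)
        D = np.diff(np.eye(n), n=2, axis=0)            # second-difference operator
        K = np.zeros((n, n))
        K[np.ix_(order, order)] = D.T @ D              # roughness penalty matrix
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        f = np.zeros(n)
        for _ in range(n_iter):
            # Update f given beta: ridge-type smoothing step.
            f = np.linalg.solve(np.eye(n) + lam_smooth * K, y - X @ beta)
            # Update beta given f: weighted ridge from the SCAD approximation;
            # near-zero coefficients receive large weights and are shrunk away.
            w = scad_deriv(beta, lam_scad) / (np.abs(beta) + eps)
            beta_new = np.linalg.solve(X.T @ X + n * np.diag(w), X.T @ (y - f))
            if np.max(np.abs(beta_new - beta)) < 1e-8:
                beta = beta_new
                break
            beta = beta_new
        return beta, f

With simulated data, fit_dpls(y, X, t) returns the parametric coefficient estimates and the fitted nonparametric values at the design points; small coefficients are driven toward zero, mimicking the selection behavior the abstract describes.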

Original language: English (US)
Pages (from-to): 2100-2111
Number of pages: 12
Journal: Journal of Multivariate Analysis
Volume: 100
Issue number: 9
DOI: 10.1016/j.jmva.2009.06.009
ISSN: 0047-259X
Publisher: Academic Press Inc.
State: Published - Oct 2009
Externally published: Yes

Fingerprint

Partially Linear Model
Model Selection
Linear Mixed Model
Smoothing Parameter
Variable Selection
Estimator
Estimate
Penalized Least Squares
Oracle Property
Penalized Likelihood
Parsimony
Statistical Software
Smoothing Splines
Variance Components
Regularization Parameter
Regression Coefficient
Shrinkage
Diverge
Splines
Asymptotic Properties

Keywords

  • Semiparametric regression
  • Smoothing splines
  • Smoothly clipped absolute deviation
  • Variable selection

ASJC Scopus subject areas

  • Statistics, Probability and Uncertainty
  • Numerical Analysis
  • Statistics and Probability

Cite this

Ni, X., Zhang, H., & Zhang, D. (2009). Automatic model selection for partially linear models. Journal of Multivariate Analysis, 100(9), 2100-2111. https://doi.org/10.1016/j.jmva.2009.06.009