Group coordinate descent algorithms for nonconvex penalized regression

  • Authors:
  • Fengrong Wei; Hongxiao Zhu

  • Affiliations:
  • Department of Mathematics, University of West Georgia, 1601 Maple Street, Carrollton, GA 30118, United States; Statistical and Applied Mathematical Sciences Institute, 19 T.W. Alexander Drive, P.O. Box 14006, Research Triangle Park, NC 27709-4006, United States

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2012

Abstract

We consider the problem of selecting grouped variables in linear regression and generalized linear regression models based on penalized likelihood. A number of penalty functions have been used for this purpose, including the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP). Compared with the widely used Lasso, these penalties have attractive theoretical properties such as unbiasedness and selection consistency. Although model-fitting methods using these penalties are well developed for individual variable selection, the extension to grouped variable selection is not straightforward, and the fitting can be unstable because the penalty functions are nonconvex. To address these issues, we propose group coordinate descent (GCD) algorithms, which extend the regular coordinate descent algorithms. The GCD algorithms are efficient in that the computational burden increases only linearly with the number of covariate groups. We also show that, using the GCD algorithms, the estimated parameters converge to a global minimum when the sample size is larger than the dimension of the covariates, and to a local minimum otherwise. In addition, we characterize the regions of the parameter space in which the objective function is locally convex even though the penalty is nonconvex. Beyond group selection in the linear model, the GCD algorithms can be extended to generalized linear regression; we present the details of this extension using logistic regression as an example. The efficiency of the proposed algorithms is demonstrated through simulation studies and a real data example, in which the MCP-based and SCAD-based GCD algorithms provide improved group selection results compared with the group Lasso.
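To make the algorithmic idea concrete, the following is a minimal Python sketch of group coordinate descent for penalized least squares with a group MCP penalty. It is not the authors' implementation: the function names (`group_coordinate_descent`, `mcp_group_update`), the assumption that each group's columns have been orthonormalized (so the groupwise update has a closed-form firm-thresholding solution), and the default tuning values are illustrative choices, and the SCAD and logistic-regression variants discussed in the paper are omitted.

```python
import numpy as np

def group_soft_threshold(z, lam):
    """Group soft-thresholding: shrink the vector z toward zero by lam in Euclidean norm."""
    norm_z = np.linalg.norm(z)
    if norm_z <= lam:
        return np.zeros_like(z)
    return (1.0 - lam / norm_z) * z

def mcp_group_update(z, lam, gamma):
    """Closed-form (firm-thresholding) group update for the MCP penalty,
    valid when the group's columns are orthonormalized (X_j' X_j / n = I)."""
    if np.linalg.norm(z) <= gamma * lam:
        return group_soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # beyond gamma*lam the MCP applies no shrinkage (near-unbiasedness)

def group_coordinate_descent(X, y, groups, lam, gamma=3.0, max_iter=200, tol=1e-6):
    """Sketch of GCD for (1/2n)||y - X b||^2 + sum_j MCP(||b_j||; lam, gamma).
    `groups` maps each group label to the column indices of X belonging to it.
    Assumes columns within each group have been orthonormalized beforehand."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta  # current residual
    for _ in range(max_iter):
        beta_old = beta.copy()
        for idx in groups.values():
            Xj = X[:, idx]
            # Unpenalized solution for group j with all other groups held fixed.
            z = Xj.T @ (r + Xj @ beta[idx]) / n
            new_bj = mcp_group_update(z, lam, gamma)
            r += Xj @ (beta[idx] - new_bj)  # update residual incrementally
            beta[idx] = new_bj
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta
```

Cycling through groups and updating each via its one-dimensional (in group norm) closed-form solution is what keeps the per-iteration cost linear in the number of covariate groups; a SCAD version would replace `mcp_group_update` with the corresponding SCAD thresholding rule.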