Finding the Best Regression Subset by Reduction in Nonfull-Rank Cases

  • Authors:
  • Alan H. Feiveson

  • Affiliations:
  • -

  • Venue:
  • SIAM Journal on Matrix Analysis and Applications
  • Year:
  • 1994


Abstract

The computational problem of finding the best-fitting subset of independent variables in least-squares regression with a fixed subset size is addressed, especially in the context of the nonfull-rank case with more variables than observations. For the full-rank case, the most efficient widely used methods work by finding the complementary subset with minimum reduction in the total regression sum of squares, a task that can usually be accomplished with far less computation than exhaustive evaluation of all subsets. Here, a method using Cholesky-type factorizations (Algorithm 2) has been developed, which also takes advantage of the computational savings offered by the "reduction" approach, but which can be used in nonfull-rank cases where existing methods are not applicable. Algorithm 2 is derived by examining the asymptotic properties of a full-rank procedure (Algorithm 1) applied to a "ridge" perturbation of the cross-product matrix. In the course of testing, it was discovered that Algorithm 1, with an appropriate ridge parameter, usually selected the best subset with less computation than Algorithm 2; however, if one requires mathematical certitude, use of Algorithm 2 is indicated. Also, some new approaches are proposed for developing efficient methods of identifying the best subset directly, rather than by complement to the minimum-reduction subset.
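To make the abstract's two central ideas concrete, the sketch below shows how the regression sum of squares for a variable subset can be computed from a Cholesky factorization of a ridge-perturbed cross-product matrix, so that the computation stays well defined even when there are more variables than observations. This is only a minimal illustration under assumed names (`subset_reg_ss`, `best_subset` are hypothetical), and the exhaustive search shown does not reproduce the computational savings of the paper's Algorithm 1 or Algorithm 2.

```python
from itertools import combinations

import numpy as np

def subset_reg_ss(X, y, subset, ridge=1e-8):
    """Regression sum of squares for the variables in `subset`.

    Computed from the ridge-perturbed cross-product matrix
    A = Xs'Xs + ridge*I, whose Cholesky factorization exists even
    in the nonfull-rank case (more variables than observations).
    """
    Xs = X[:, list(subset)]
    A = Xs.T @ Xs + ridge * np.eye(len(subset))  # perturbed cross-product
    b = Xs.T @ y                                 # Xs'y
    L = np.linalg.cholesky(A)                    # A = L L'
    z = np.linalg.solve(L, b)                    # forward solve L z = b
    return float(z @ z)                          # reg SS = b' A^{-1} b = z'z

def best_subset(X, y, k, ridge=1e-8):
    """Exhaustively pick the size-k subset maximizing the regression SS
    (equivalently, whose complement gives the minimum reduction)."""
    p = X.shape[1]
    return max(combinations(range(p), k),
               key=lambda s: subset_reg_ss(X, y, s, ridge))
```

For example, if `y` is an exact linear combination of two columns of `X`, `best_subset(X, y, 2)` recovers those two columns, since that subset attains the full regression sum of squares `y'y`.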