Multinomial logit models with implicit variable selection

  • Authors: Faisal Maqbool Zahid; Gerhard Tutz

  • Affiliations: Government College University Faisalabad, Faisalabad, Pakistan; Ludwig-Maximilians-Universität Munich, Munich, Germany 80799

  • Venue: Advances in Data Analysis and Classification

  • Year: 2013

Abstract

The multinomial logit model is the most widely used model for unordered multi-category responses. However, applications are typically restricted to a few predictors because, in the high-dimensional case, maximum likelihood estimates frequently do not exist. In this paper we develop a boosting technique called multinomBoost that performs variable selection and fits the multinomial logit model even when the predictors are high-dimensional. Since in multi-category models the effect of one predictor variable is represented by several parameters, one has to distinguish between variable selection and parameter selection. A special feature of the approach is that, in contrast to existing approaches, it selects variables, not parameters. The method can also distinguish between mandatory and optional predictors. Moreover, it adapts to metric, binary, nominal and ordinal predictors. Regularization within the algorithm allows the inclusion of nominal and ordinal variables that have many categories. In the case of ordinal predictors, the order information is used. The performance of the boosting technique with respect to mean squared error, prediction error and the identification of relevant variables is investigated in a simulation study. The method is applied to the national Indonesia contraceptive prevalence survey and to the identification of glass. The results are also compared with the Lasso approach, which selects parameters.
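
The distinction between variable selection and parameter selection can be made concrete with the standard form of the multinomial logit model. The notation below is an assumption for illustration and is not taken from the abstract: a response Y with k categories, category k as reference, and a predictor vector x = (x_1, ..., x_p) of metric or binary predictors.

```latex
% Multinomial logit model with reference category k (standard form;
% the symbols are illustrative assumptions, not the authors' notation).
\[
  P(Y = r \mid x)
  = \frac{\exp\bigl(\beta_{r0} + x^{\top}\beta_r\bigr)}
         {1 + \sum_{s=1}^{k-1} \exp\bigl(\beta_{s0} + x^{\top}\beta_s\bigr)},
  \qquad r = 1, \dots, k-1 .
\]
% Each predictor x_j enters through one coefficient per non-reference category:
\[
  \beta_{\cdot j} = (\beta_{1j}, \dots, \beta_{k-1,j})^{\top} .
\]
% Variable selection: x_j is dropped only if the whole group vanishes,
%   \beta_{\cdot j} = 0.
% Parameter selection: single entries \beta_{rj} may be set to zero while
%   others stay nonzero, so x_j can remain in the model.
```

Under this reading, a single metric or binary predictor already contributes k - 1 parameters, and a nominal or ordinal predictor coded by several dummy variables contributes correspondingly more, which is why selecting individual parameters does not necessarily remove a variable from the model.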