Two-stage multinomial logit model

  • Authors:
  • Jin-Hyung Kim; Mijung Kim

  • Affiliations:
  • Department of Information and Industrial Engineering, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, Seoul 120-752, Republic of Korea; Department of Mathematics and Institute for Mathematical Sciences, Yonsei University, 134 Shinchon-Dong, Seodaemun-Gu, Seoul 120-752, Republic of Korea

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.05

Abstract

We propose a two-stage multinomial logit model (TMLM) that incorporates, and allows interpretation of, both the interaction and main effects in a model for multi-category responses. TMLM combines the robustness of the multinomial logit model (MLM) with the desirable properties of the decision tree (DT), which make it possible to cluster homogeneous subjects and thus to incorporate the interaction effects of explanatory variables into the MLM. In the first stage of TMLM, a DT is applied to identify the most influential interaction effects and to create a cluster variable whose categories correspond to the best splits of the optimal tree. In the second stage, the cluster variable enters the MLM as an explanatory variable. With TMLM, it is possible to interpret not only the interactions among explanatory variables but also the main effects. It is also possible to cluster and characterize homogeneous subjects, which is not possible with MLM alone. The model also improves classification accuracy for multi-category responses. We apply TMLM to national pension data on disability pensioners in Korea and compare the results with two types of MLM. We suggest TMLM as a statistical model for characterizing both the interaction and main effects of explanatory variables and for improving accuracy rates compared with MLM.
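
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which tree algorithm or coding scheme was used, so the greedy Gini-based tree, the dummy coding of the cluster variable (cluster 0 as baseline), the gradient-descent fit with class 0 as the reference category, and all function names and toy data below are assumptions made only for illustration. In practice a statistics library would replace the hand-written routines.

```python
import math

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels)) if n else 0.0

def best_split(X, y):
    """Return the (feature, threshold) that most reduces Gini impurity, or None."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def grow_tree(X, y, max_depth, counter, depth=0):
    """Stage 1 (assumed Gini tree): each leaf receives the next cluster id."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        counter[0] += 1
        return ("leaf", counter[0] - 1)
    j, t = split
    lo = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    left = grow_tree([r for r, _ in lo], [l for _, l in lo], max_depth, counter, depth + 1)
    right = grow_tree([r for r, _ in hi], [l for _, l in hi], max_depth, counter, depth + 1)
    return ("node", j, t, left, right)

def cluster_of(tree, row):
    """Follow the splits down to a leaf and return its cluster id."""
    while tree[0] == "node":
        _, j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree[1]

def design_row(row, cluster, n_clusters):
    """Main effects plus the dummy-coded cluster variable (cluster 0 is baseline)."""
    return list(row) + [1.0 if cluster == c else 0.0 for c in range(1, n_clusters)]

def probabilities(W, x):
    """Multinomial logit class probabilities; the last weight is the intercept."""
    xb = x + [1.0]
    scores = [sum(w * v for w, v in zip(Wk, xb)) for Wk in W]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [v / total for v in e]

def fit_mlm(X, y, n_classes, lr=0.5, epochs=2000):
    """Stage 2: fit the MLM by gradient descent; class 0 is the reference category."""
    d = len(X[0]) + 1
    W = [[0.0] * d for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * d for _ in range(n_classes)]
        for x, lab in zip(X, y):
            p, xb = probabilities(W, x), x + [1.0]
            for k in range(1, n_classes):  # reference class keeps zero coefficients
                err = p[k] - (1.0 if k == lab else 0.0)
                for j in range(d):
                    grad[k][j] += err * xb[j]
        for k in range(1, n_classes):
            for j in range(d):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

def predict(W, x):
    p = probabilities(W, x)
    return p.index(max(p))

# Hypothetical toy data: the effect of the second variable depends on the first.
X_raw = [[0, 0], [0, 1], [1, 0], [1, 1]] * 3
y = [0, 0, 1, 2] * 3

counter = [0]
tree = grow_tree(X_raw, y, max_depth=2, counter=counter)
n_clusters = counter[0]
clusters = [cluster_of(tree, r) for r in X_raw]
X_design = [design_row(r, c, n_clusters) for r, c in zip(X_raw, clusters)]
W = fit_mlm(X_design, y, n_classes=3)
predicted = [predict(W, x) for x in X_design]
```

The coefficients on the original columns of `X_design` play the role of main effects, while the coefficients on the cluster dummies carry the interaction structure found by the tree. The toy data are perfectly separated by the clusters, which real data generally are not; in that degenerate case the maximum-likelihood coefficients grow without bound, so the fixed number of epochs here acts as an informal stopping rule.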