Hybrid singular value decomposition: a model of human text classification

  • Authors: Amirali Noorinaeini; Mark R. Lehto; Sze-jung Wu

  • Affiliations: School of Industrial Engineering, Purdue University, West Lafayette, IN (all authors)

  • Venue: Proceedings of the 2007 conference on Human interface: Part I
  • Year: 2007

Quantified Score

Hi-index 0.03

Abstract

This study compared the accuracy of three Singular Value Decomposition (SVD) based models developed to classify injury narratives, which are bodies of free text: two SVD-Bayesian models and one SVD-Regression model. Injury narratives and the corresponding E-codes assigned by human experts, taken from the 1997 and 1998 US National Health Interview Survey (NHIS), were used with all three models. The models were compared using the expert-assigned E-code categories as the reference standard. Further experiments showed that the performance of the equidistant Bayes model and the regression model improved as more SVD vectors were used as input. The regression model was also compared to a fuzzy Bayes model. It was concluded that all three models are capable of learning from human experts to accurately assign cause-of-injury codes to injury narratives, with the regression-based model being the strongest, although all three were outperformed by the multiple-word fuzzy Bayes model.
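
To make the SVD-Regression idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple word-count matrix, a truncated SVD of that matrix, and an ordinary least-squares fit from the SVD coordinates to one-hot category indicators. The narratives, the category names, and the helper names (`build_vocab`, `term_matrix`, `fit`, `predict`) are all hypothetical, invented for illustration. The SVD-Bayesian and fuzzy Bayes models mentioned in the abstract are not shown.

```python
import numpy as np

# Toy narratives and categories, invented for illustration (NOT NHIS data).
narratives = [
    "fell down stairs at home",
    "slipped on ice and fell",
    "fell off ladder",
    "cut finger with kitchen knife",
    "cut hand on broken glass",
    "knife slipped and cut thumb",
]
labels = ["fall", "fall", "fall", "cut", "cut", "cut"]


def build_vocab(texts):
    """Map each distinct word to a column index."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}


def term_matrix(texts, vocab):
    """Document-by-term count matrix; unknown words are ignored."""
    X = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for w in text.split():
            if w in vocab:
                X[row, vocab[w]] += 1.0
    return X


def add_intercept(Z):
    """Append a column of ones so the regression has a bias term."""
    return np.hstack([Z, np.ones((Z.shape[0], 1))])


def fit(texts, labels, k):
    """Project documents onto the top-k right singular vectors, then
    regress one-hot category indicators on those coordinates."""
    vocab = build_vocab(texts)
    X = term_matrix(texts, vocab)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                       # (vocab size) x k projection
    Z = X @ V_k                          # documents in SVD space
    classes = sorted(set(labels))
    Y = np.array([[1.0 if lab == c else 0.0 for c in classes]
                  for lab in labels])
    W, *_ = np.linalg.lstsq(add_intercept(Z), Y, rcond=None)
    return vocab, V_k, W, classes


def predict(model, texts):
    """Assign each text the category with the highest regression score."""
    vocab, V_k, W, classes = model
    Z = term_matrix(texts, vocab) @ V_k
    scores = add_intercept(Z) @ W
    return [classes[i] for i in scores.argmax(axis=1)]


# Training accuracy for an increasing number of SVD vectors.
for k in (1, 2, 4, 6):
    model = fit(narratives, labels, k)
    predicted = predict(model, narratives)
    accuracy = np.mean([p == t for p, t in zip(predicted, labels)])
    print(f"k={k}: training accuracy {accuracy:.2f}")
```

Varying `k` in the loop is one way to explore how the number of SVD vectors affects the fit. On a real corpus, accuracy would be measured on held-out narratives rather than on the training set used here.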