This study compared the accuracy of three Singular Value Decomposition (SVD) based models developed for classifying injury narratives. Two SVD-Bayesian models and one SVD-Regression model were developed to classify bodies of free text. Injury narratives and the corresponding E-codes assigned by human experts, taken from the 1997 and 1998 US National Health Interview Survey (NHIS), were used with all three models. The E-code categories assigned by the experts served as the basis for comparing the methods. Further experiments showed that the performance of the equidistant Bayes model and the regression model improved as more SVD vectors were used as input. The regression model was also compared to a fuzzy Bayes model. It was concluded that all three models are capable of learning from human experts to accurately categorize cause-of-injury codes from injury narratives, with the regression-based model being the strongest of the three, although all were outperformed by the multiple-word fuzzy Bayes model.
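The general idea of feeding SVD vectors into a regression classifier can be illustrated with a minimal sketch. This is not the study's implementation: the toy narratives and labels are invented (they are not NHIS data), the function names are hypothetical, and the use of least-squares regression onto one-hot class indicators is an assumption about what an SVD-Regression model might look like. The parameter `k` plays the role of the number of SVD vectors used as input.

```python
import numpy as np

# Toy narratives and cause labels, invented for illustration (not NHIS data).
train_texts = [
    "fell off ladder while painting",
    "slipped and fell on icy stairs",
    "fell down stairs at home",
    "cut finger with kitchen knife",
    "cut hand on broken glass",
    "knife slipped and cut thumb",
]
train_labels = ["fall", "fall", "fall", "cut", "cut", "cut"]

def build_vocab(texts):
    """Map each distinct word to a column index."""
    words = sorted({w for t in texts for w in t.split()})
    return {w: i for i, w in enumerate(words)}

def term_matrix(texts, vocab):
    """Document-by-term count matrix; words outside the vocabulary are ignored."""
    X = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        for w in text.split():
            if w in vocab:
                X[row, vocab[w]] += 1.0
    return X

def add_bias(Z):
    """Append a constant column so the regression has an intercept."""
    return np.hstack([Z, np.ones((Z.shape[0], 1))])

def fit_svd_regression(texts, labels, k):
    """Project narratives onto the top-k SVD directions, then fit
    least-squares regression onto one-hot class indicators (an assumption)."""
    vocab = build_vocab(texts)
    X = term_matrix(texts, vocab)
    # Thin SVD: X = U @ diag(s) @ Vt; rows of Vt are directions in term space.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Vk = Vt[:k].T                      # shape (n_terms, k)
    Z = X @ Vk                         # k-dimensional document features
    classes = sorted(set(labels))
    Y = np.array([[1.0 if lab == c else 0.0 for c in classes] for lab in labels])
    W, *_ = np.linalg.lstsq(add_bias(Z), Y, rcond=None)
    return vocab, Vk, W, classes

def predict(model, texts):
    """Assign each narrative the class with the highest regression score."""
    vocab, Vk, W, classes = model
    scores = add_bias(term_matrix(texts, vocab) @ Vk) @ W
    return [classes[i] for i in scores.argmax(axis=1)]

model = fit_svd_regression(train_texts, train_labels, k=2)
predictions = predict(model, ["fell on stairs", "cut with glass"])
print(predictions)
```

Raising `k` gives the regression more SVD vectors to work with, which is the knob the abstract refers to when it reports that performance improved as more SVD vectors were used.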