Text classification: combining grouping, LSA and kNN vs support vector machine

Authors:
Naohiro Ishii;Takeshi Murai;Takahiro Yamada;Yongguang Bao;Susumu Suzuki
Affiliations:
Aichi Institute of Technology, Toyota, Japan;Aichi Institute of Technology, Toyota, Japan;Aichi Institute of Technology, Toyota, Japan;Aichi Institute of Technology, Toyota, Japan;Aichi Institute of Technology, Toyota, Japan
Venue:
KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part II
Year:
2006

Abstract

Text classification is a key technique for handling and organizing text data. Among well-known methods, the support vector machine (SVM) has been shown to classify better than the others. This paper proposes a method that groups similar words for the classification of documents. Applied to Reuters news, the grouping of words is shown to achieve classification accuracy equivalent to that of Latent Semantic Analysis (LSA). Further, a new combined method is proposed, consisting of grouping and LSA followed by k-nearest neighbor (kNN) classification. The combined method achieves higher classification accuracy than the conventional kNN method and than LSA followed by kNN, and it achieves almost the same accuracy as the SVM.
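The abstract names the three stages of the combined method but not how they are implemented. The following is only a minimal sketch of one way such a pipeline could be wired together, assuming NumPy is available. The word-to-group table `WORD_GROUPS`, the toy documents, the rank of 2, and k = 3 are all hypothetical choices made for illustration; the paper's actual grouping procedure and parameters are not given here.

```python
from collections import Counter

import numpy as np

# Hypothetical word groups: a hand-written stand-in for the paper's
# grouping of similar words, whose construction the abstract does not describe.
WORD_GROUPS = {
    "wheat": "grain", "corn": "grain", "barley": "grain",
    "oil": "energy", "crude": "energy", "gas": "energy",
    "price": "market", "prices": "market", "export": "market",
}


def build_vocab(texts):
    """Collect the feature names after mapping each word to its group."""
    return sorted({WORD_GROUPS.get(w, w) for t in texts for w in t.lower().split()})


def group_features(text, vocab):
    """Count group occurrences; ungrouped words count as themselves."""
    counts = Counter(WORD_GROUPS.get(w, w) for w in text.lower().split())
    return np.array([counts[term] for term in vocab], dtype=float)


def fit_lsa(term_doc, rank):
    """LSA step: keep the top `rank` left singular vectors of the term-document matrix."""
    u, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :rank]


def knn_predict(query_vec, doc_vecs, labels, k):
    """kNN step: cosine similarity followed by a majority vote over the k nearest."""
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = doc_vecs @ query_vec / np.where(norms == 0, 1.0, norms)
    nearest = np.argsort(-sims)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


def classify(train_texts, train_labels, query, rank=2, k=3):
    """Grouping, then LSA projection, then kNN in the reduced space."""
    vocab = build_vocab(train_texts)
    term_doc = np.column_stack([group_features(t, vocab) for t in train_texts])
    basis = fit_lsa(term_doc, rank)
    doc_vecs = (basis.T @ term_doc).T          # one reduced vector per document
    query_vec = basis.T @ group_features(query, vocab)
    return knn_predict(query_vec, doc_vecs, train_labels, k)


# Toy corpus, invented for this sketch (not Reuters data).
TRAIN_TEXTS = [
    "wheat export prices rise",
    "corn and barley harvest",
    "barley wheat shipment",
    "crude oil prices fall",
    "gas and oil output",
    "crude gas pipeline",
]
TRAIN_LABELS = ["grain", "grain", "grain", "energy", "energy", "energy"]

if __name__ == "__main__":
    print(classify(TRAIN_TEXTS, TRAIN_LABELS, "corn shipment"))
    print(classify(TRAIN_TEXTS, TRAIN_LABELS, "oil pipeline"))
```

Grouping shrinks the vocabulary before the SVD, so the LSA step works on a smaller and denser matrix. Projecting the query with the same basis as the training documents keeps the cosine comparison consistent.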