With the rapid growth of available on-line information, text classification is becoming increasingly important. kNN is a widely used, high-performance text classification method. However, it is inefficient because it requires a large amount of computation for evaluating the similarity between a test document and each training document. In this paper, we propose a fast kNN text classification approach based on pruning the training corpus. With this approach, the size of the training corpus can be reduced sharply, so that the time spent on kNN search is cut significantly and classification efficiency improves substantially, while classification performance remains comparable to that obtained without pruning. An effective algorithm for text corpus pruning is designed. Experiments on the Reuters corpus validate the practicability of the proposed approach. Our approach is especially suitable for on-line text classification applications.
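To make the cost argument concrete, the following is a minimal sketch of kNN text classification over a pruned training corpus. The abstract does not describe the paper's pruning algorithm, so the sketch substitutes Hart's condensed nearest neighbor rule as a stand-in. The function names (`vectorize`, `cosine`, `knn_classify`, `prune_corpus`), the bag-of-words weighting, and the toy documents are all assumptions made for illustration, not taken from the paper.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (an assumed, simple weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(query, corpus, k=3):
    """Label `query` by majority vote among its k most similar training docs.

    Each call compares the query with every document in `corpus`; this is
    the per-query cost that pruning the corpus is meant to reduce.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def prune_corpus(corpus):
    """Condense the corpus with Hart's condensed nearest neighbor rule.

    Stand-in only: NOT the paper's pruning algorithm. A document is kept
    only if the documents kept so far would misclassify it under 1-NN.
    """
    kept_ids = [0]
    changed = True
    while changed:
        changed = False
        for i, (vector, label) in enumerate(corpus):
            if i in kept_ids:
                continue
            kept = [corpus[j] for j in kept_ids]
            if knn_classify(vector, kept, k=1) != label:
                kept_ids.append(i)
                changed = True
    return [corpus[j] for j in kept_ids]


# Toy training documents (hypothetical, for illustration only).
train_texts = [
    ("stocks fell as markets closed lower", "finance"),
    ("bank shares and markets rallied", "finance"),
    ("investors sold stocks and bonds", "finance"),
    ("the team won the football match", "sport"),
    ("the coach praised the team after the match", "sport"),
    ("football players trained before the match", "sport"),
]
corpus = [(vectorize(text), label) for text, label in train_texts]
pruned = prune_corpus(corpus)

query = vectorize("markets and stocks rallied")
label_full = knn_classify(query, corpus, k=3)
label_pruned = knn_classify(query, pruned, k=1)
```

On this toy data the pruned corpus is smaller than the original, so each query needs fewer similarity computations, and the query receives the same label from both corpora. Whether accuracy is preserved on a real collection such as Reuters depends on the pruning rule actually used, which this sketch does not reproduce.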