Text categorization is an important tool for managing and organizing rapidly growing volumes of text data. Many text categorization algorithms have been explored in the literature, such as KNN, Naive Bayes and Support Vector Machine. KNN is an effective text categorization method, but it is computationally inefficient because classifying a document requires computing its similarity to every training document. In this paper, we propose an improved KNN algorithm for text categorization, which builds the classification model by combining a constrained one-pass clustering algorithm with KNN text categorization. Empirical results on three benchmark corpora show that our algorithm substantially reduces the number of text similarity computations and outperforms state-of-the-art KNN, Naive Bayes and Support Vector Machine classifiers. In addition, the classification model constructed by the proposed algorithm can be updated incrementally, which makes it scalable in many real-world applications.
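The general idea of pairing one-pass clustering with KNN can be illustrated with a minimal sketch. This is not the paper's exact algorithm, since the abstract does not specify the constraint or the voting rule. The sketch assumes that a document joins its most similar cluster only when that cluster has the same label and the cosine similarity reaches a threshold, and that a new document is classified by similarity-weighted voting among its k most similar clusters. The names `Cluster`, `build_clusters`, `classify`, `threshold` and `k` are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class Cluster:
    """A single-label cluster summarized by the sum of its member vectors."""

    def __init__(self, vec, label):
        self.total = dict(vec)  # cosine is scale-invariant, so the sum stands in for the centroid
        self.label = label
        self.size = 1

    def add(self, vec):
        for t, w in vec.items():
            self.total[t] = self.total.get(t, 0.0) + w
        self.size += 1


def build_clusters(docs, labels, threshold, clusters=None):
    """Scan the documents once; pass existing clusters to update them incrementally."""
    clusters = [] if clusters is None else clusters
    for vec, label in zip(docs, labels):
        best, best_sim = None, -1.0
        for c in clusters:
            sim = cosine(vec, c.total)
            if sim > best_sim:
                best, best_sim = c, sim
        # Assumed constraint: merge only into a same-label, sufficiently similar cluster.
        if best is not None and best.label == label and best_sim >= threshold:
            best.add(vec)
        else:
            clusters.append(Cluster(vec, label))
    return clusters


def classify(vec, clusters, k=3):
    """Similarity-weighted vote among the k clusters most similar to vec."""
    if not clusters:
        raise ValueError("no clusters to classify against")
    scored = sorted(((cosine(vec, c.total), c.label) for c in clusters),
                    key=lambda pair: pair[0], reverse=True)[:k]
    votes = defaultdict(float)
    for sim, label in scored:
        votes[label] += sim
    return max(votes, key=votes.get)


# Toy data, invented for illustration only.
train_docs = [
    {"ball": 1.0, "goal": 1.0},
    {"ball": 1.0, "team": 1.0},
    {"stock": 1.0, "market": 1.0},
    {"stock": 1.0, "bank": 1.0},
]
train_labels = ["sport", "sport", "finance", "finance"]
clusters = build_clusters(train_docs, train_labels, threshold=0.3)
predicted = classify({"goal": 1.0, "team": 1.0}, clusters, k=1)
```

In this sketch a test document is compared against cluster summaries rather than against every training document, which is where the saving in similarity computations would come from. Calling `build_clusters` again with the existing `clusters` list folds new labeled documents into the model without reprocessing the earlier ones.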