An effective refinement strategy for KNN text classifier

  • Authors:
    • Songbo Tan

  • Affiliations:
    • Software Department, Institute of Computing Technology, Chinese Academy of Sciences, P.O. Box 2704, Beijing 100080, People's Republic of China and Graduate School of the Chinese Academy of Science ...

  • Venue:
    • Expert Systems with Applications: An International Journal
  • Year:
    • 2006

Abstract

Due to the exponential growth of documents on the Internet and the emergent need to organize them, the automated categorization of documents into predefined labels has received ever-increasing attention in recent years. A wide range of supervised learning algorithms has been introduced to deal with text classification. Among these classifiers, K-Nearest Neighbors (KNN) is widely used in the text categorization community because of its simplicity and efficiency. However, KNN still suffers from inductive biases or model misfits that result from its assumptions, such as the presumption that training data are evenly distributed among all categories. In this paper, we propose a new refinement strategy, which we call DragPushing, for the KNN classifier. Experiments on three benchmark evaluation collections show that DragPushing significantly improves the performance of the KNN classifier.
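The abstract does not spell out the DragPushing procedure itself. As an illustration only, the sketch below implements a plausible drag/push-style refinement on top of a similarity-weighted KNN text classifier: per-document weights are increased for correct-class neighbors of misclassified training documents ("drag") and decreased for wrong-class neighbors ("push"), judged by leave-one-out prediction on the training set. Every class name, parameter, and update rule here is an assumption for illustration, not the paper's actual algorithm.

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors (dicts).
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class WeightedKNN:
    """Hypothetical KNN with per-training-document weights (not the paper's code)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        self.docs = [Counter(d.split()) for d in docs]   # bag-of-words vectors
        self.labels = list(labels)
        self.weights = [1.0] * len(docs)                 # one weight per training doc

    def _neighbors(self, vec, exclude=None):
        sims = [(cosine(vec, d), i) for i, d in enumerate(self.docs) if i != exclude]
        sims.sort(reverse=True)
        return sims[: self.k]

    def predict(self, doc, exclude=None):
        vec = Counter(doc.split())
        votes = defaultdict(float)
        for sim, i in self._neighbors(vec, exclude):
            votes[self.labels[i]] += sim * self.weights[i]  # weighted similarity vote
        return max(votes, key=votes.get)

    def drag_push(self, docs, labels, step=0.2, iters=5):
        # Drag/push-style refinement (illustrative): for each training document
        # misclassified under leave-one-out, boost the weights of its
        # correct-class neighbors (drag) and damp wrong-class ones (push).
        for _ in range(iters):
            for i, doc in enumerate(docs):
                if self.predict(doc, exclude=i) == labels[i]:
                    continue
                vec = Counter(doc.split())
                for _sim, j in self._neighbors(vec, exclude=i):
                    if self.labels[j] == labels[i]:
                        self.weights[j] += step                      # drag
                    else:
                        self.weights[j] = max(0.1, self.weights[j] - step)  # push
```

A toy usage: fit on a few two-class documents, run `drag_push` on the training set, then call `predict` on unseen text; the weights simply rescale each neighbor's vote, so the base classifier is recovered when all weights stay at 1.0.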