Supervised and semi-supervised learning in text classification using enhanced KNN algorithm: a comparative study of supervised and semi-supervised classification in text categorisation

Authors:
M. A. Wajeed;T. Adilakshmi
Affiliations:
School of Computer Science & Informatics, Sreenidhi Institute of Science & Technology, Ghatkesar, Hyderabad, India;Department of CSE, Vasavi College of Engineering, Ibrahimbagh, Hyderabad, India
Venue:
International Journal of Intelligent Systems Technologies and Applications
Year:
2012



Abstract

Making efficient decisions requires knowledge in the form of experience, which is acquired through learning. This paper explores the learning process in text classification under the semi-supervised learning paradigm and compares the resulting accuracy with that of a supervised classifier. Semi-supervised learning is applicable when only a limited amount of training data is available. The traditional K-nearest neighbour algorithm assigns similar weights to all features in all classes, which is not reasonable: some features play a vital role in certain classes, while in others their presence has no impact. The paper therefore explores assigning different weights to features in different classes, based on the concept of variance. Finally, to gain insight into the semi-supervised paradigm, supervised and semi-supervised learning are compared on text classification. The results show that semi-supervised learning can be applied when very little training data is available and still achieve reasonable classifier accuracy.
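
The abstract does not give the weighting formula or the semi-supervised procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a feature's weight in a class is the inverse of its within-class variance (normalised per class), and it uses plain self-training as a stand-in for the semi-supervised step. The function names `class_feature_weights`, `weighted_knn_predict` and `self_train`, and the toy term-count vectors, are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def class_feature_weights(X, y, eps=1e-6):
    """Per-class feature weights, normalised to sum to 1 within each class.

    Assumption (not from the paper): weight = 1 / (within-class variance + eps),
    so a feature that is stable inside a class counts more for that class.
    """
    weights = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        raw = [1.0 / (pvariance(col) + eps) for col in zip(*rows)]
        total = sum(raw)
        weights[label] = [w / total for w in raw]
    return weights


def weighted_knn_predict(query, X, y, weights, k=3):
    """Majority vote among the k nearest neighbours.

    The distance to each training document uses the feature weights of
    that document's own class (weighted squared Euclidean distance).
    """
    scored = []
    for x, label in zip(X, y):
        w = weights[label]
        dist = sum(wj * (a - b) ** 2 for wj, a, b in zip(w, query, x))
        scored.append((dist, label))
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def self_train(X_lab, y_lab, X_unlab, k=3):
    """One self-training pass (an assumed semi-supervised scheme).

    Each unlabelled document is pseudo-labelled by the weighted KNN and
    then added to the training set; weights are recomputed at every step.
    """
    X, y = list(X_lab), list(y_lab)
    for doc in X_unlab:
        weights = class_feature_weights(X, y)
        y.append(weighted_knn_predict(doc, X, y, weights, k))
        X.append(doc)
    return X, y


# Toy term-count vectors: [sport term, politics term, common term]
X_lab = [[3, 0, 1], [4, 0, 1], [0, 3, 1], [0, 4, 1]]
y_lab = ["sport", "sport", "politics", "politics"]
X_unlab = [[5, 0, 1], [0, 5, 1]]

X_all, y_all = self_train(X_lab, y_lab, X_unlab)
print(y_all[len(y_lab):])
```

Recomputing the weights after every pseudo-labelled document keeps the sketch short but is costly on large collections; a practical variant might update the weights in batches or only accept pseudo-labels on which the neighbours agree.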