Supervised and semi-supervised learning in text classification using enhanced KNN algorithm: a comparative study of supervised and semi-supervised classification in text categorisation

  • Authors:
  • M. A. Wajeed; T. Adilakshmi

  • Affiliations:
  • School of Computer Science & Informatics, Sreenidhi Institute of Science & Technology, Ghatkesar, Hyderabad, India; Department of CSE, Vasavi College of Engineering, Ibrahimbagh, Hyderabad, India

  • Venue:
  • International Journal of Intelligent Systems Technologies and Applications
  • Year:
  • 2012

Abstract

Efficient decision-making requires knowledge in the form of experience, which is obtained through learning. This paper explores the learning process in text classification under the semi-supervised learning paradigm and compares the resulting accuracy with that of a supervised classifier. Semi-supervised learning can be applied when only a limited amount of training data is available. The traditional K-nearest neighbour algorithm gives every feature the same weight in every class, which is not reasonable: a feature may play a vital role in some classes while its presence has no impact in others. The paper therefore explores assigning different weights to features in different classes, based on the concept of variance. Finally, to gain insight into the semi-supervised paradigm, supervised and semi-supervised learning are compared on text classification. The results show that semi-supervised learning can be applied where very limited training data is available and still yield reasonable classifier accuracy.
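The abstract does not give the weighting formula or the semi-supervised procedure, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that a feature's weight within a class is the inverse of its within-class variance (normalised to sum to 1), and it uses plain self-training as a stand-in for the semi-supervised step. The function names, the `eps` smoothing term, and the toy term-count vectors are all hypothetical.

```python
from collections import Counter, defaultdict


def class_feature_weights(X, y, eps=1e-6):
    """Per-class feature weights.

    ASSUMPTION: weight = 1 / (within-class variance + eps), normalised
    per class. The paper's actual variance-based formula is not stated
    in the abstract.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    weights = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        raw = []
        for j in range(len(rows[0])):
            column = [r[j] for r in rows]
            mean = sum(column) / n
            variance = sum((v - mean) ** 2 for v in column) / n
            raw.append(1.0 / (variance + eps))
        total = sum(raw)
        weights[label] = [w / total for w in raw]
    return weights


def weighted_distance(a, b, w):
    """Euclidean distance with one weight per feature."""
    return sum(wj * (aj - bj) ** 2 for wj, aj, bj in zip(w, a, b)) ** 0.5


def knn_predict(X, y, weights, query, k=3):
    """Majority vote among the k nearest training documents.

    The distance to each training document uses the feature weights of
    that document's class, so features count differently per class.
    """
    scored = sorted(
        (weighted_distance(query, row, weights[label]), label)
        for row, label in zip(X, y)
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def self_train(X_lab, y_lab, X_unlab, k=3, rounds=1):
    """ASSUMPTION: generic self-training, used here only for illustration.

    Each round labels the unlabelled documents with the current
    classifier and retrains on labelled + pseudo-labelled data.
    """
    X, y = list(X_lab), list(y_lab)
    for _ in range(rounds):
        w = class_feature_weights(X, y)
        pseudo = [knn_predict(X, y, w, row, k) for row in X_unlab]
        X = list(X_lab) + list(X_unlab)
        y = list(y_lab) + pseudo
    return X, y


# Hypothetical toy term counts for the features ("goal", "server", "the").
X_train = [[5, 1, 2], [4, 0, 3], [6, 1, 1], [0, 5, 2], [1, 4, 3], [0, 6, 1]]
y_train = ["sports"] * 3 + ["tech"] * 3

w = class_feature_weights(X_train, y_train)
print(knn_predict(X_train, y_train, w, [5, 0, 2]))
print(self_train(X_train, y_train, [[5, 0, 2], [1, 5, 2]])[1])
```

Under this assumed weighting, a term whose count is stable within a class receives a larger weight for that class than a term whose count fluctuates. Other variance-based schemes, such as weighting by between-class variance, would fit the abstract's description equally well.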