Text classification using symbolic similarity measure

  • Authors:
  • B. S. Harish; S. Manjunath; D. S. Guru

  • Affiliations:
  • S J College of Engineering, Mysore, India; JSS College of Arts, Commerce and Science, Mysore, India; University of Mysore, Mysore, India

  • Venue:
  • Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology
  • Year:
  • 2012

Abstract

Automatic text classification is the problem of assigning text documents to pre-defined classes. To classify text documents, a good text representation must be extracted from the training set. In this paper, we present a novel method of representing a text document using symbolic representations. Further, we use a new symbolic similarity measure to classify text documents. Extensive experiments are conducted on various datasets to evaluate the performance of the proposed model. Experimental results reveal that the proposed method gives better results than state-of-the-art techniques. In addition, because it is based on a simple matching scheme, it achieves classification in negligible time and thus appears to be more effective in classification.
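
The abstract does not specify the symbolic representation or the similarity measure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each class is summarized by one interval per term (mean minus and plus one standard deviation of the term counts in the training documents), and that a test document is scored by how many of its term counts fall inside those intervals, with partial credit for near misses. All function names, the toy vocabulary, and the toy training documents are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def term_vector(text, vocab):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocab]


def fit_intervals(docs_by_class, vocab):
    """Assumed symbolic representation: one interval per term per class,
    [mean - std, mean + std], computed over that class's training documents."""
    model = {}
    for label, docs in docs_by_class.items():
        vectors = [term_vector(doc, vocab) for doc in docs]
        intervals = []
        for column in zip(*vectors):
            mu, sigma = mean(column), pstdev(column)
            intervals.append((mu - sigma, mu + sigma))
        model[label] = intervals
    return model


def similarity(vector, intervals):
    """Assumed similarity: 1 for each value inside its interval, otherwise
    partial credit that decays with the distance to the nearer bound."""
    score = 0.0
    for value, (lo, hi) in zip(vector, intervals):
        if lo <= value <= hi:
            score += 1.0
        else:
            score += max(1.0 / (1.0 + abs(value - lo)),
                         1.0 / (1.0 + abs(value - hi)))
    return score


def classify(text, model, vocab):
    """Assign the class whose intervals best match the document."""
    vector = term_vector(text, vocab)
    return max(model, key=lambda label: similarity(vector, model[label]))


# Toy data, made up purely for illustration.
vocab = ["goal", "match", "stock", "market"]
training = {
    "sports": ["goal match goal", "match goal"],
    "finance": ["stock market stock", "market stock"],
}
model = fit_intervals(training, vocab)
print(classify("goal in the match", model, vocab))
print(classify("stock market", model, vocab))
```

Because classification in this sketch reduces to one interval check per term per class, its cost grows linearly with the vocabulary size and the number of classes, which is consistent with the abstract's remark that a simple matching scheme classifies quickly.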