Symbolic representation of text documents

  • Authors:
  • D. S. Guru; B. S. Harish; S. Manjunath

  • Affiliations:
  • University of Mysore, Manasagangotri, Mysore, India; University of Mysore, Manasagangotri, Mysore, India; University of Mysore, Manasagangotri, Mysore, India

  • Venue:
  • Proceedings of the Third Annual ACM Bangalore Conference
  • Year:
  • 2010

Abstract

This paper presents a novel method of representing a text document using interval-valued symbolic features, together with a method of classifying text documents based on the proposed representation. The proposed model significantly reduces the dimension of the feature vectors and the time taken to classify a given document. Extensive experiments are conducted on the vehicles-wikipedia dataset to evaluate the performance of the proposed model. The results are on par with existing results for this dataset. The advantage of the proposed model, however, is that it takes relatively less time for classification, as it is based on a simple matching strategy.
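
The abstract does not say how the intervals are constructed or how the matching is scored, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each class is summarized by one interval per feature, taken here as mean ± one standard deviation over the training documents of that class, and that a document is assigned to the class whose intervals contain the largest number of its feature values. The function names, the toy term-frequency vectors, and the class labels are all hypothetical.

```python
from statistics import mean, pstdev


def build_class_intervals(training_vectors_by_class):
    """Summarize each class as a list of (low, high) intervals, one per feature.

    Assumption: the interval is mean +/- one population standard deviation.
    The abstract does not specify the interval construction.
    """
    intervals = {}
    for label, vectors in training_vectors_by_class.items():
        per_feature = []
        for values in zip(*vectors):  # iterate over feature columns
            mu, sigma = mean(values), pstdev(values)
            per_feature.append((mu - sigma, mu + sigma))
        intervals[label] = per_feature
    return intervals


def classify(vector, intervals):
    """Return the label whose intervals contain the most feature values.

    Assumption: "simple matching" is read here as counting how many
    features of the test vector fall inside the class intervals.
    """
    def matches(label):
        return sum(
            low <= x <= high
            for x, (low, high) in zip(vector, intervals[label])
        )
    return max(intervals, key=matches)


# Toy term-frequency vectors (hypothetical data, three features per document).
training = {
    "car": [[5.0, 1.0, 0.0], [6.0, 2.0, 1.0], [7.0, 1.0, 0.0]],
    "boat": [[0.0, 4.0, 6.0], [1.0, 5.0, 7.0], [0.0, 6.0, 8.0]],
}
model = build_class_intervals(training)
predicted = classify([6.0, 1.0, 0.0], model)
print(predicted)
```

Under these assumptions, each class is stored as a single interval vector rather than one vector per training document, and classifying a document costs one interval comparison per feature per class.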