Text classification using symbolic similarity measure

Authors:
B. S. Harish; S. Manjunath; D. S. Guru
Affiliations:
S J College of Engineering, Mysore, India; JSS College of Arts, Commerce and Science, Mysore, India; University of Mysore, Mysore, India
Venue:
Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology
Year:
2012




Abstract

Automatic text classification is the problem of assigning text documents to predefined classes. To classify text documents, a good set of text representations needs to be extracted from the training set. In this paper, we present a novel method of representing a text document using symbolic representations. Further, we use a new symbolic similarity measure to classify text documents. Extensive experiments are conducted on various datasets to evaluate the performance of the proposed model. Experimental results reveal that the proposed method gives better results when compared to state-of-the-art techniques. In addition, because it is based on a simple matching scheme, it achieves classification in negligible time and thus appears to be more effective in classification.
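
The abstract does not spell out the representation or the similarity measure, so the following is only an illustrative sketch of one common form of symbolic matching, not the authors' method. It assumes each class is summarized by interval-valued features (mean minus and plus one standard deviation of each term weight over the training documents) and that similarity is the fraction of a test document's features that fall inside those intervals. The function names, the interval definition, and the toy vectors are all hypothetical.

```python
from statistics import mean, pstdev


def build_class_intervals(training_vectors):
    """Summarize one class as per-feature intervals [mean - std, mean + std].

    Assumption: interval-valued features stand in for the paper's
    symbolic representation, which the abstract does not define.
    """
    intervals = []
    for values in zip(*training_vectors):  # iterate feature by feature
        m = mean(values)
        s = pstdev(values)
        intervals.append((m - s, m + s))
    return intervals


def symbolic_similarity(doc_vector, intervals):
    """Fraction of features whose value lies inside the class interval."""
    hits = sum(1 for x, (lo, hi) in zip(doc_vector, intervals) if lo <= x <= hi)
    return hits / len(intervals)


def classify(doc_vector, class_intervals):
    """Return the label whose intervals match the document best."""
    return max(
        class_intervals,
        key=lambda label: symbolic_similarity(doc_vector, class_intervals[label]),
    )


# Toy term-frequency vectors (hypothetical data, three features per document).
training = {
    "sports": [[5, 0, 1], [7, 1, 1]],
    "politics": [[0, 6, 2], [1, 8, 2]],
}
class_intervals = {
    label: build_class_intervals(vectors) for label, vectors in training.items()
}
predicted = classify([6, 1, 1], class_intervals)
```

Under these assumptions, classifying a document costs one range check per feature per class, which is consistent with the abstract's remark that a simple matching scheme keeps classification time low.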