Symbolic representation of text documents

Authors:
D. S. Guru;B. S. Harish;S. Manjunath
Affiliations:
University of Mysore, Manasagangotri, Mysore, India;University of Mysore, Manasagangotri, Mysore, India;University of Mysore, Manasagangotri, Mysore, India
Venue:
Proceedings of the Third Annual ACM Bangalore Conference
Year:
2010

Abstract

This paper presents a novel method of representing a text document using interval-valued symbolic features, together with a method of classifying text documents based on that representation. The proposed model significantly reduces the dimension of the feature vectors and the time taken to classify a given document. Extensive experiments are conducted on the vehicles-wikipedia dataset to evaluate the model's performance. The results are on par with existing results for this dataset, while the proposed model takes relatively less time to classify a document because it relies on a simple matching strategy.
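
To make the idea of interval-valued features and simple matching concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure from the paper: the abstract does not say how the intervals are built or how matching is scored, so the sketch assumes each class is summarised by one interval per feature (mean minus and plus one standard deviation over the class's training vectors) and that a document is assigned to the class whose intervals contain the most of its feature values. The function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev

def build_class_intervals(vectors):
    """Summarise one class as a list of per-feature intervals.

    Assumption: each interval is [mean - std, mean + std] of that feature
    over the class's training vectors.
    """
    columns = list(zip(*vectors))  # one tuple of values per feature
    return [(mean(c) - pstdev(c), mean(c) + pstdev(c)) for c in columns]

def match_score(vector, intervals):
    """Count how many feature values fall inside the class's intervals."""
    return sum(1 for x, (lo, hi) in zip(vector, intervals) if lo <= x <= hi)

def classify(vector, class_intervals):
    """Return the label whose intervals contain the most feature values."""
    return max(class_intervals,
               key=lambda label: match_score(vector, class_intervals[label]))

# Toy term-weight vectors (hypothetical data, three features per document).
training = {
    "car": [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1], [1.0, 0.1, 0.3]],
    "boat": [[0.1, 0.9, 0.7], [0.2, 0.8, 0.9], [0.1, 1.0, 0.8]],
}

# One interval vector per class, rather than one feature vector per document.
model = {label: build_class_intervals(vs) for label, vs in training.items()}
prediction = classify([0.85, 0.15, 0.2], model)
print(prediction)
```

In this sketch the stored model holds a single interval vector per class, and classification is a pass of range comparisons, which is one plausible reading of why such a representation could be compact and fast to match against.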