Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
Pattern Recognition with Fuzzy Objective Function Algorithms
Pattern Recognition with Fuzzy Objective Function Algorithms
Enhanced word clustering for hierarchical text classification
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
CBC: Clustering Based Text Classification Requiring Minimal Labeled Data
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Distributional clustering of English words
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Two-dimensional clustering for text categorization
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Regularized locality preserving indexing via spectral regression
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
IEEE Transactions on Knowledge and Data Engineering
Symbolic representation of text documents
Proceedings of the Third Annual ACM Bangalore Conference
Comparing dimension reduction techniques for document clustering
AI'05 Proceedings of the 18th Canadian Society conference on Advances in Artificial Intelligence
Hi-index | 0.00 |
In this paper, a simple and efficient symbolic text classification is presented. We propose a new method of representing documents based on clustering of term frequency vectors. For each class of documents we propose to create multiple clusters to preserve the intraclass variations. Term frequency vectors of each cluster are used to form a symbolic representation by the use of interval valued features. Subsequently, a new feature selection method based on a new dissimilarity measure is also presented. The new feature selection method reduces the features in the representation phase for effective text classification. It keeps the best features for effective representation and simultaneously reduces the time taken to classify a given document. To corroborate the efficacy of the proposed model we conducted experimentation on various datasets. Experimental results reveal that the proposed method gives better results when compared to the state of the art techniques. In addition, as the method is based on a simple matching scheme, it requires a negligible time.