The main steps in designing an automatic document classification system include feature extraction and classification. This paper proposes a method to improve feature extraction. A genetic algorithm (GA) is applied to determine the threshold values of four criteria used to extract the representative keywords for each class. The purpose of these four threshold values is to extract as few representative keywords as possible. The keyword extraction method is combined with two classification algorithms, the vector space model (VSM) and the support vector machine (SVM), to examine the performance of the proposed classification system under various extraction conditions.
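The threshold search described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, since the abstract does not name the four criteria or the fitness function: the criteria are treated as four anonymous scores in [0, 1] per candidate keyword, the toy data is random, and the fitness simply rewards fewer keywords while requiring a minimum number per class. The helper names (`select_keywords`, `fitness`, `evolve_thresholds`, `make_toy_scores`) are hypothetical and not taken from the paper.

```python
import random

N_CRITERIA = 4  # the abstract mentions four criteria; their definitions are not given


def select_keywords(scores, thresholds):
    """Keep the keywords whose four criterion scores all meet their thresholds."""
    return [kw for kw, s in scores.items()
            if all(v >= t for v, t in zip(s, thresholds))]


def fitness(thresholds, class_scores, min_keywords=2):
    """Toy objective (an assumption): fewer keywords is better, but each
    class must keep at least min_keywords or a large penalty applies."""
    total, shortfall = 0, 0
    for scores in class_scores.values():
        n = len(select_keywords(scores, thresholds))
        total += n
        shortfall += max(0, min_keywords - n)
    return -(total + 1000 * shortfall)


def evolve_thresholds(class_scores, pop_size=30, generations=40,
                      mutation_rate=0.2, seed=0):
    """Simple GA: truncation selection, one-point crossover, Gaussian mutation."""
    rng = random.Random(seed)

    def score(ind):
        return fitness(ind, class_scores)

    # Start with an all-zero individual (keeps every keyword), so the
    # population always contains at least one feasible solution.
    pop = [[0.0] * N_CRITERIA]
    pop += [[rng.random() for _ in range(N_CRITERIA)]
            for _ in range(pop_size - 1)]

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[:pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, N_CRITERIA - 1)
            child = a[:cut] + b[cut:]
            # Mutate each gene with probability mutation_rate, clamped to [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)


def make_toy_scores(n_classes=3, n_keywords=20, seed=1):
    """Random stand-in data: four scores per candidate keyword per class."""
    rng = random.Random(seed)
    return {f"class_{c}": {f"kw_{c}_{k}": [rng.random() for _ in range(N_CRITERIA)]
                           for k in range(n_keywords)}
            for c in range(n_classes)}


class_scores = make_toy_scores()
best = evolve_thresholds(class_scores)
for name, scores in class_scores.items():
    print(name, len(select_keywords(scores, best)), "keywords kept")
```

In a real system the fitness would presumably also account for the classification accuracy obtained with VSM or SVM on the selected keywords; the per-class minimum here only stands in for that constraint.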