Text classification can greatly improve the performance of information retrieval and information filtering, but the high dimensionality of documents hinders the application of most classification approaches. This paper proposes a Difference-Similitude Matrix (DSM) based method to address the problem. The method represents a pre-classified collection as an item-document matrix, in which documents in the same category are described by their similarities and documents in different categories by their differences. Using the DSM reduction algorithm, which is simpler and more efficient than rough set reduction, we reduce the dimensionality of the document space and generate rules for text classification.
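The matrix described above can be illustrated with a minimal sketch. The abstract does not give the exact definition, so the following assumes that each document is a set of terms, that the "similarity" of two documents in the same category is the set of terms they share, and that the "difference" of two documents in different categories is the set of terms found in exactly one of them. The function name `build_dsm` and the toy data are hypothetical, and the DSM reduction step itself is not shown.

```python
from itertools import combinations


def build_dsm(docs, labels):
    """Build a pairwise difference-similitude structure (illustrative assumption).

    docs   -- list of sets of terms, one set per document
    labels -- list of category labels, aligned with docs
    Returns a dict mapping each pair (i, j), i < j, to a tuple
    ("sim" | "diff", frozenset_of_terms).
    """
    dsm = {}
    for i, j in combinations(range(len(docs)), 2):
        if labels[i] == labels[j]:
            # Same category: record the terms the two documents share.
            dsm[(i, j)] = ("sim", frozenset(docs[i] & docs[j]))
        else:
            # Different categories: record the terms that tell them apart.
            dsm[(i, j)] = ("diff", frozenset(docs[i] ^ docs[j]))
    return dsm


# Hypothetical toy collection with two categories.
docs = [
    {"goal", "match", "team"},
    {"goal", "team", "coach"},
    {"stock", "market", "team"},
]
labels = ["sport", "sport", "finance"]
dsm = build_dsm(docs, labels)
```

In this toy collection the term "team" occurs in all three documents, so it never appears in a difference entry. Under the stated assumptions, a reduction step could treat such a term as a candidate for removal, which is one way to read the dimensionality reduction the abstract refers to.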