A Hybrid SOM-Based Document Organization System
SBRN '06 Proceedings of the Ninth Brazilian Symposium on Neural Networks
In this paper, we present and evaluate a hybrid document organization system based on Self-Organizing Maps (SOM). The proposed system uses Semantic Mapping for dimensionality reduction and K-means for volume reduction of the document vectors of a medium-sized text collection. The vectors obtained after the dimensionality and volume reduction steps are used to train the document maps with the SOM algorithm, so the training time is reduced without compromising the quality of the generated map. We experimentally compare the hybrid system with the corresponding SOM system in organizing the documents of the Reuters-21578 v1.0 collection. The performance of the systems was measured in terms of classification error in text categorization and training time. The experimental results show that the proposed system generates good document maps with the shortest training time.
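The three-stage pipeline described in the abstract (dimensionality reduction, then volume reduction, then SOM training) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not describe how Semantic Mapping works, so a plain random projection is swapped in as a stand-in for that step. The function names (`random_projection`, `kmeans`, `train_som`), all parameter values, and the toy data (random numbers, not Reuters-21578 documents) are assumptions made for illustration only.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_projection(vectors, out_dim, rng):
    """Stand-in for Semantic Mapping (assumption): project each vector
    onto out_dim random Gaussian directions to lower its dimensionality."""
    in_dim = len(vectors[0])
    proj = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    return [[sum(p * x for p, x in zip(row, v)) for row in proj] for v in vectors]


def kmeans(vectors, k, rng, iters=20):
    """Volume reduction: replace the collection by k cluster centroids."""
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def train_som(vectors, rows, cols, rng, epochs=30, lr0=0.5):
    """Train a rows x cols SOM with a Gaussian neighbourhood whose
    radius and learning rate shrink linearly over the epochs."""
    dim = len(vectors[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = {u: list(rng.choice(vectors)) for u in grid}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.5)
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass
            bmu = min(grid, key=lambda u: sq_dist(v, weights[u]))
            for u in grid:
                d2 = (u[0] - bmu[0]) ** 2 + (u[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                w = weights[u]
                for j in range(dim):
                    w[j] += lr * h * (v[j] - w[j])
    return weights


# Toy data: 200 random 50-dimensional "document vectors" (illustrative only).
rng = random.Random(0)
docs = [[rng.random() for _ in range(50)] for _ in range(200)]

reduced = random_projection(docs, 10, rng)   # 50 -> 10 dimensions
prototypes = kmeans(reduced, 20, rng)        # 200 -> 20 vectors
som = train_som(prototypes, 4, 4, rng)       # map trained on the prototypes
```

In this sketch the SOM sees only 20 prototype vectors of 10 dimensions instead of 200 vectors of 50 dimensions, which is the source of the training-time saving the abstract reports. Whether map quality is preserved depends on the reduction methods and data, and this toy example does not measure it.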