This paper applies unsupervised classification techniques to group the documents of a very large collection into clusters. We approach the task with a simple clustering algorithm (K-Star), applied recursively to subsets of the complete collection. The resulting approach is scalable and can automatically discover the number of clusters. The results obtained outperformed several baselines presented in the INEX 2009 clustering task.
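The idea of clustering subsets recursively can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of K-Star, so a simple greedy, threshold-based grouping (`greedy_cluster`) stands in for it. The function names, the cosine measure over term counts, the `threshold` and `subset_size` parameters, and the rule of doubling the subset size when a pass merges nothing are all assumptions made for illustration. The number of clusters is not fixed in advance; it falls out of the similarity threshold.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def greedy_cluster(vectors, threshold):
    """Stand-in for K-Star (assumption): an item joins the first cluster
    whose seed is similar enough, otherwise it seeds a new cluster."""
    clusters = []  # lists of indices into `vectors`; index 0 of each is the seed
    for i, v in enumerate(vectors):
        for c in clusters:
            if cosine(vectors[c[0]], v) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def centroid(vectors, members):
    """Sum of the member vectors (cosine ignores the overall scale)."""
    total = Counter()
    for i in members:
        total.update(vectors[i])
    return total


def recursive_cluster(vectors, threshold=0.3, subset_size=2):
    """Cluster fixed-size subsets, replace each cluster by its centroid,
    and repeat on the centroids until one subset covers everything."""
    groups = [[i] for i in range(len(vectors))]  # original document ids
    reps = [Counter(v) for v in vectors]         # one representative per group
    while True:
        single_pass = len(groups) <= subset_size
        new_groups, new_reps = [], []
        for start in range(0, len(groups), subset_size):
            chunk_reps = reps[start:start + subset_size]
            for cluster in greedy_cluster(chunk_reps, threshold):
                new_groups.append([d for k in cluster for d in groups[start + k]])
                new_reps.append(centroid(chunk_reps, cluster))
        if single_pass:
            return new_groups
        if len(new_groups) == len(groups):
            subset_size *= 2  # nothing merged: widen the subsets (assumption)
        groups, reps = new_groups, new_reps


# Toy collection (hypothetical data, for illustration only)
docs = [
    "apple banana apple", "banana apple fruit", "car engine wheel",
    "engine car road", "apple fruit banana", "wheel road car",
]
vectors = [Counter(d.split()) for d in docs]
result = recursive_cluster(vectors, threshold=0.3, subset_size=2)
print(result)
```

Because each pass compares only the items inside one subset, the number of similarity computations per pass stays bounded by the subset size, which is what makes this kind of scheme attractive for very large collections.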