The nature of statistical learning theory
Document clustering with committees
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
'1 + 1 > 2': Merging Distance and Density Based Clustering
DASFAA '01 Proceedings of the 7th International Conference on Database Systems for Advanced Applications
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Information discovery within organizations using the Athens system
CASCON '04 Proceedings of the 2004 conference of the Centre for Advanced Studies on Collaborative research
This study proposes a new hybrid incremental clustering method that combines a Support Vector Machine (SVM) with an enhanced Clustering by Committee (CBC) algorithm. The SVM first classifies each incoming document to determine whether it belongs to one of the existing classes. The enhanced CBC algorithm then clusters the documents that remain unclassified. Filtering with the SVM significantly reduces both the amount of computation and the noise in clustering. The enhanced CBC algorithm effectively controls the number of clusters, improves performance, and lets the number of classes grow gradually from the structure of the current classes without reclustering all documents. In the empirical results, the proposed method outperforms the enhanced CBC clustering method and other algorithms, and the enhanced CBC clustering method in turn outperforms the original CBC.
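The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain linear scoring functions stand in for trained SVM decision functions, and a simple centroid-similarity rule stands in for the enhanced CBC step, whose details the abstract does not give. All names (`HybridRouter`, `linear_scorer`, `margin`, `sim_threshold`) are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def linear_scorer(weights: Vector, bias: float) -> Callable[[Vector], float]:
    """Stand-in for a trained SVM decision function: w . x + b."""
    return lambda x: sum(w * v for w, v in zip(weights, x)) + bias


class HybridRouter:
    """Stage 1: classify into existing classes. Stage 2: cluster the rest."""

    def __init__(self, class_scorers: Dict[str, Callable[[Vector], float]],
                 margin: float = 0.0, sim_threshold: float = 0.8) -> None:
        self.class_scorers = dict(class_scorers)  # one scorer per known class
        self.margin = margin                      # minimum score to accept a class
        self.sim_threshold = sim_threshold        # minimum similarity to join a cluster
        self.new_clusters: List[List[List[float]]] = []

    @staticmethod
    def _centroid(members: List[List[float]]) -> List[float]:
        return [sum(col) / len(members) for col in zip(*members)]

    def add(self, doc: Vector) -> str:
        # Stage 1: accept the best-scoring existing class if it clears the margin.
        if self.class_scorers:
            label, score = max(
                ((name, f(doc)) for name, f in self.class_scorers.items()),
                key=lambda pair: pair[1],
            )
            if score > self.margin:
                return label

        # Stage 2 (assumed stand-in for enhanced CBC): only unclassified
        # documents reach this point, so existing classes are never reclustered.
        best_idx, best_sim = -1, -1.0
        for i, members in enumerate(self.new_clusters):
            sim = cosine(doc, self._centroid(members))
            if sim > best_sim:
                best_idx, best_sim = i, sim

        if best_idx >= 0 and best_sim >= self.sim_threshold:
            self.new_clusters[best_idx].append(list(doc))
        else:
            # No sufficiently similar cluster: the set of classes grows by one.
            self.new_clusters.append([list(doc)])
            best_idx = len(self.new_clusters) - 1
        return f"new-{best_idx}"


if __name__ == "__main__":
    router = HybridRouter({
        "sports": linear_scorer([1.0, 0.0, 0.0], -0.5),
        "finance": linear_scorer([0.0, 1.0, 0.0], -0.5),
    })
    for doc in ([0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.1, 0.0, 0.9]):
        print(doc, "->", router.add(doc))
```

The point of the sketch is the routing: documents that an existing class accepts never reach the clustering stage, which is where the claimed savings in computation and noise would come from. In a real system the scorers would be decision functions of SVMs trained on the existing classes, and the second stage would be the enhanced CBC algorithm.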