We present a parallel implementation of the Buckshot document clustering algorithm and demonstrate that it is highly efficient in terms of both load balancing and minimization of communication. In a series of experiments on the 2 GB of SGML data from TREC disks 4 and 5, the parallel approach scaled well with respect to both the number of processors efficiently used and the number of clusters created.
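For readers unfamiliar with Buckshot, the following is a minimal sequential sketch of the general idea, not the parallel implementation described above. It assumes documents are already dense term vectors, uses cosine similarity, and substitutes a naive centroid-linkage agglomeration for the group-average clustering usually associated with Buckshot. All function names (`cosine`, `centroid`, `agglomerate`, `buckshot`) are hypothetical and chosen for illustration only.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerate(vectors, k):
    """Naive agglomerative clustering (assumption: centroid linkage).

    Repeatedly merges the two clusters whose centroids are most similar
    until only k clusters remain, then returns the k centroids.
    """
    clusters = [[v] for v in vectors]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        # j > i, so removing cluster j does not shift cluster i.
        clusters[i].extend(clusters.pop(j))
    return [centroid(c) for c in clusters]


def buckshot(docs, k, seed=0):
    """Sketch of Buckshot: cluster a sqrt(k*n) sample, then assign all docs."""
    rng = random.Random(seed)
    # Phase 1: draw a random sample of about sqrt(k * n) documents.
    sample_size = min(len(docs), max(k, math.ceil(math.sqrt(k * len(docs)))))
    sample = rng.sample(docs, sample_size)
    # Phase 2: cluster only the sample to obtain k seed centroids.
    seeds = agglomerate(sample, k)
    # Phase 3: assign every document to its most similar seed centroid.
    labels = [
        max(range(len(seeds)), key=lambda c: cosine(d, seeds[c]))
        for d in docs
    ]
    return seeds, labels


if __name__ == "__main__":
    docs = [[1, 0], [0.9, 0.1], [1, 0.1], [0, 1], [0.1, 0.9], [0.1, 1]]
    seeds, labels = buckshot(docs, k=2)
    print(labels)
```

In this sketch the assignment phase treats each document independently, so it could in principle be partitioned across processors; how the work is actually divided in the implementation described above is not specified here.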