This paper presents a new approach designed to reduce the computational load of existing clustering algorithms by shrinking documents with fingerprinting methods. A thorough evaluation was performed over three different collections using four different metrics. The presented approach to document clustering achieved good effectiveness with considerable savings in memory space and computation time.
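
To make the idea of shrinking a document with a fingerprint more concrete, here is a minimal sketch in Python. The abstract does not say which fingerprinting method or similarity measure is used, so everything below is an assumption chosen for illustration: winnowing (a well-known k-gram fingerprinting scheme) as the method, MD5-derived hashes, the parameter values `k=5` and `window=4`, the Jaccard comparison, and all function names.

```python
import hashlib


def kgram_hashes(text, k=5):
    """Hash every k-character window of the normalized text (illustrative choice)."""
    # Normalize: keep only letters and digits, lowercased.
    normalized = "".join(ch.lower() for ch in text if ch.isalnum())
    hashes = []
    for i in range(len(normalized) - k + 1):
        gram = normalized[i:i + k]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        # Use the first 4 bytes of the digest as a 32-bit integer hash.
        hashes.append(int.from_bytes(digest[:4], "big"))
    return hashes


def winnow(hashes, window=4):
    """Select the minimum hash of each sliding window (rightmost on ties).

    Returns a dict mapping selected position -> hash value, so a position
    chosen by several overlapping windows is recorded only once.
    """
    if not hashes:
        return {}
    # Assumption: a sequence shorter than one window is treated as one window.
    window = min(window, len(hashes))
    selected = {}
    for start in range(len(hashes) - window + 1):
        win = hashes[start:start + window]
        min_val = min(win)
        # Offset of the rightmost occurrence of the minimum in this window.
        offset = window - 1 - win[::-1].index(min_val)
        selected[start + offset] = min_val
    return selected


def fingerprint(text, k=5, window=4):
    """Reduced representation of a document: the set of winnowed hashes."""
    return set(winnow(kgram_hashes(text, k), window).values())


def jaccard(a, b):
    """Set overlap, usable as a similarity between two fingerprints."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    doc_a = "The quick brown fox jumps over the lazy dog"
    doc_b = "The quick brown fox leaps over the lazy dog"
    fp_a, fp_b = fingerprint(doc_a), fingerprint(doc_b)
    print(len(kgram_hashes(doc_a)), "k-gram hashes reduced to", len(fp_a))
    print("similarity:", jaccard(fp_a, fp_b))
```

Under these assumptions, a clustering algorithm would compare the small hash sets instead of the full documents, which is where savings in memory and computation time could come from. The trade-off between fingerprint size and clustering effectiveness depends on `k` and `window`.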