Document clustering has been a particularly active research field within the Information Retrieval (IR) community. Among the numerous clustering algorithms proposed, single-pass clustering stands out for its efficiency in both time and space. However, it is generally acknowledged that single-pass clustering has a major defect, namely that its output depends on the order in which documents are presented. Building on our previous work, and having identified single-pass clustering as potentially useful for Peer-to-Peer (P2P) IR, we study the extent to which this order dependence holds in practical terms. We do so by experimenting with two large web-based testbeds that are suitable for P2P IR evaluation. The results of our study show that document ordering does not matter in practice for single-pass clustering.
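To make the order dependence concrete, the following is a minimal sketch of single-pass clustering on a toy collection. It is not the configuration used in the study: the cosine similarity measure, the summed term-count centroid, the threshold of 0.4, and the names `cosine` and `single_pass` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.4):
    """Cluster docs in one pass over the input, in the order given.

    Each document joins the most similar existing cluster if that
    similarity reaches the (assumed) threshold; otherwise it starts a
    new cluster. Centroids are summed term counts (an assumption).
    """
    centroids = []  # one Counter per cluster
    clusters = []   # one list of document texts per cluster
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the centroid
            clusters[best].append(text)
        else:
            centroids.append(Counter(vec))
            clusters.append([text])
    return clusters


# The same three toy documents, presented in two different orders.
order_a = ["peer network", "network retrieval", "retrieval index"]
order_b = ["network retrieval", "retrieval index", "peer network"]

print(single_pass(order_a))
# [['peer network', 'network retrieval'], ['retrieval index']]
print(single_pass(order_b))
# [['network retrieval', 'retrieval index'], ['peer network']]
```

On this deliberately adversarial toy input the two orders yield different partitions, which is the defect the abstract refers to. Whether such differences are large enough to matter on realistic collections is the question the study addresses.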