High performance clustering based on the similarity join
Proceedings of the ninth international conference on Information and knowledge management
A scalable content-addressable network
Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
ACM Computing Surveys (CSUR)
Nearest neighbor search in metric spaces through Content-Addressable Networks
Information Processing and Management: an International Journal
A content-addressable network for similarity search in metric spaces
DBISP2P'05/06 Proceedings of the 2005/2006 international conference on Databases, information systems, and peer-to-peer computing
Hi-index | 0.00 |
Similarity join is an interesting complement of the well-established similarity range and nearest neighbors search primitives in metric spaces. However, the quadratic computational complexity of similarity join prevents from applications on large data collections. We present MCAN+, an extension of MCAN (a Content-Addressable Network for metric objects) to support similarity self join queries. The challenge of the proposed approach is to address the problem of the intrinsic quadratic complexity of similarity joins, with the aim of limiting the elaboration time, by involving an increasing number of computational nodes as the dataset size grows. To test the scalability of MCAN+, we used a real-life dataset of color features extracted from one million images of the Flickr photo sharing website.