Similarity Searching in Peer-to-Peer Databases

Authors:
Indrajit Bhattacharya;Srinivas R. Kashyap;Srinivasan Parthasarathy
Affiliations:
University of Maryland at College Park;University of Maryland at College Park;University of Maryland at College Park
Venue:
ICDCS '05 Proceedings of the 25th IEEE International Conference on Distributed Computing Systems
Year:
2005

Cited by (6):

Peer-to-peer similarity search in metric spaces
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

Distributed ranked search
HiPC'07 Proceedings of the 14th international conference on High performance computing

Query based clustering method in structured P2P overlay networks
CCNC'10 Proceedings of the 7th IEEE conference on Consumer communications and networking conference

An efficient mechanism for processing similarity search queries in sensor networks
Information Sciences: an International Journal

A flabellate overlay network for multi-attribute search
Journal of Parallel and Distributed Computing

Hierarchical semantic-based index for ad hoc image retrieval
Journal of Mobile Multimedia

Abstract

We consider the problem of handling similarity queries in peer-to-peer databases. We propose an indexing and searching mechanism that, given a query object, returns the set of objects in the database that are semantically related to the query. The indexing scheme clusters data so that semantically related objects are partitioned into a small set of clusters, which allows a simple and efficient similarity search strategy. It also decouples object and node locations. Our adaptive replication and randomized lookup schemes exploit this feature to ensure that the number of copies of an object is proportional to its popularity and that all replicas are equally likely to serve a given query, thus achieving perfect load balancing. The techniques developed in this work are oblivious to the underlying DHT topology and can be implemented on a variety of structured overlays such as CAN, Chord, Pastry, and Tapestry. We also present DHT-independent analytical guarantees for the performance of our algorithms in terms of search accuracy, cost, and load balance; the experimental results from our simulations confirm the insights derived from these analytical models.
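
The replication and lookup idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is a toy, single-process stand-in for a DHT, and the class `ReplicaDirectory`, the parameter `queries_per_replica`, and the growth rule "one extra copy per fixed number of queries" are all assumptions made for illustration. It shows only the two properties the abstract names: the number of copies grows with an object's popularity, and each lookup is served by a replica chosen uniformly at random.

```python
import random
from collections import Counter


class ReplicaDirectory:
    """Toy in-memory stand-in for a DHT-backed index (hypothetical, not from the paper)."""

    def __init__(self, nodes, queries_per_replica=10, seed=None):
        self.nodes = list(dict.fromkeys(nodes))   # unique node ids, order preserved
        self.queries_per_replica = queries_per_replica  # assumed growth rule
        self.replicas = {}                        # object key -> node ids holding a copy
        self.hits = Counter()                     # object key -> number of lookups seen
        self.rng = random.Random(seed)

    def publish(self, key):
        # Object placement is independent of the key: start with one random holder.
        self.replicas[key] = [self.rng.choice(self.nodes)]

    def target_copies(self, key):
        # Copies grow linearly with popularity, capped by the number of nodes.
        wanted = 1 + self.hits[key] // self.queries_per_replica
        return min(wanted, len(self.nodes))

    def lookup(self, key):
        self.hits[key] += 1
        holders = self.replicas[key]
        # Adaptive replication: add copies on nodes that do not hold one yet.
        while len(holders) < self.target_copies(key):
            free = [n for n in self.nodes if n not in holders]
            holders.append(self.rng.choice(free))
        # Randomized lookup: every replica is equally likely to serve the query.
        return self.rng.choice(holders)
```

For example, with eight nodes and `queries_per_replica=10`, an object queried 35 times ends up with four copies, while an object queried once keeps a single copy. A real deployment would route these operations through the overlay and would also need to shrink replica sets as popularity decays, which this sketch omits.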