Sample-based creation of peer summaries for efficient similarity search in scalable peer-to-peer networks

Authors:
Daniel Blank;Soufyane El Allali;Wolfgang Mueller;Andreas Henrich
Affiliations:
University of Bamberg, Bamberg, Germany;University of Bamberg, Bamberg, Germany;University of Bamberg, Bamberg, Germany;University of Bamberg, Bamberg, Germany
Venue:
Proceedings of the international workshop on multimedia information retrieval
Year:
2007

Abstract

In this paper we introduce a simple yet experimentally convincing approach to source selection for content-based similarity search in P2P networks or, more concretely, in summary-based P2P systems. In these systems, summaries are used to select data sources when performing k-NN queries on distributed collections of documents represented by feature vectors. We introduce a new type of cluster-based summary for source selection that can be calculated and distributed efficiently and cheaply in P2P networks. To generate the summaries, a very large number of sample points is used. Each peer in the network assigns its indexing data to the corresponding closest sample points and publishes the resulting summary. In experiments on real-world image feature data obtained from a large crawl of the Flickr web photo community, we evaluate the quality of these summaries while varying the number of sample points and show that higher numbers of sample points yield better retrieval performance. Our experiments show that the proposed summaries perform four times better than previous methods. Intuitively, the large size of the generated summaries is a disadvantage of this approach. We show experimentally that this disadvantage can easily be overcome by simple compression techniques, because the generated summaries are sparse.
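The summary construction described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: Euclidean distance as the similarity measure, a summary that stores one count per sample point, and a sparse index-to-count mapping standing in for the "simple compression techniques". The names `build_summary` and `to_dense` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist  # Euclidean distance (assumed metric), Python 3.8+


def build_summary(peer_vectors, sample_points):
    """Count, per sample point, how many of the peer's vectors lie closest to it.

    Only sample points that attract at least one vector get an entry,
    so the result is a sparse representation of the summary.
    """
    summary = Counter()
    for vector in peer_vectors:
        nearest = min(
            range(len(sample_points)),
            key=lambda i: dist(vector, sample_points[i]),
        )
        summary[nearest] += 1
    return summary


def to_dense(summary, num_sample_points):
    """Expand the sparse summary into one count per sample point."""
    return [summary.get(i, 0) for i in range(num_sample_points)]


# Toy data: four shared sample points and one peer holding four feature vectors.
sample_points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
peer_vectors = [(0.1, 0.2), (0.9, 1.2), (1.1, 0.8), (8.7, 9.1)]

summary = build_summary(peer_vectors, sample_points)
print(dict(summary))
print(to_dense(summary, len(sample_points)))
```

With many sample points and comparatively few vectors per peer, most counts in the dense form are zero, which is why a representation that stores only the non-zero entries stays small in this sketch.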