On scalability of the similarity search in the world of peers

Authors:
Michal Batko; David Novak; Fabrizio Falchi; Pavel Zezula
Affiliations:
Masaryk University, Brno, Czech Republic; Masaryk University, Brno, Czech Republic; ISTI-CNR, Pisa, Italy; Masaryk University, Brno, Czech Republic
Venue:
InfoScale '06 Proceedings of the 1st international conference on Scalable information systems
Year:
2006

Abstract

Due to the increasing complexity of current digital data, similarity search has become a fundamental computational task in many applications. Unfortunately, its costs remain high, and the linear scalability of single-server implementations prevents efficient searching in large data volumes. In this paper, we briefly describe four recent scalable distributed similarity search techniques and study their query execution performance on three different datasets. Although all the methods employ parallelism to speed up query execution, the experiments reveal different advantages for different objectives. The reported results can be used to choose the best implementation for a specific application and to design new and better indexing structures in the future.