Real-life performance of metric searching

Authors: Vlastislav Dohnal, Pavel Zezula
Affiliation: Masaryk University, Brno, Czech Republic
Venue: SIGSPATIAL Special
Year: 2010

Abstract

Similarity is a central notion throughout human lives, and it will soon become the prevalent strategy for dealing with digital content in computer systems as well. But the exponential growth of data makes scalability and performance serious matters of concern. Contemporary decentralized media of mass communication, which allow cooperative and collaborative practices, enable users to contribute autonomously to the production of global media, whose elements are in fact related by numerous multifaceted links of similarity. As an example, consider sites like Flickr, YouTube, or Facebook, which host user-contributed heterogeneous content for a variety of events. Accordingly, the core ability of future data processing systems is the similarity management of large and ever-growing volumes of data. In a simplified way, real-life performance can be constrained from two points of view: (1) the query response time, and (2) the query execution throughput, i.e., the number of queries processed per unit of time. Typically, the query response time should be on-line, say less than one second, while the query execution throughput can be expected to reach hundreds or even thousands of queries per unit of time in large-scale web applications.
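
To make the two performance measures concrete, the following is a minimal sketch of how query response time and query execution throughput could be measured for a similarity range query. It is not taken from the paper: the brute-force linear scan, the Euclidean distance, and the names `range_query` and `measure` are assumptions chosen only for illustration, and a real system would replace the scan with a metric index.

```python
import time
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance, used here as an example metric (an assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def range_query(
    data: Sequence[Point],
    query: Point,
    radius: float,
    dist: Callable[[Point, Point], float] = euclidean,
) -> List[Point]:
    """Return all objects within `radius` of `query` using a linear scan."""
    return [obj for obj in data if dist(query, obj) <= radius]


def measure(
    data: Sequence[Point], queries: Sequence[Point], radius: float
) -> Tuple[float, float]:
    """Run the queries one after another.

    Returns (mean response time in seconds, throughput in queries per second).
    """
    latencies = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        range_query(data, q, radius)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    mean_response = sum(latencies) / len(latencies)
    throughput = len(queries) / elapsed if elapsed > 0 else float("inf")
    return mean_response, throughput


if __name__ == "__main__":
    # Hypothetical toy data set: 10,000 points on a line in 2-D.
    points = [(float(i), float(i % 7)) for i in range(10_000)]
    workload = [(float(i), 0.0) for i in range(0, 10_000, 500)]
    mean_rt, qps = measure(points, workload, radius=5.0)
    print(f"mean response time: {mean_rt:.6f} s, throughput: {qps:.1f} queries/s")
```

When queries are processed strictly one at a time, as in this sketch, throughput is roughly the inverse of the mean response time. The two measures diverge once queries are executed concurrently, which is why they are treated as separate constraints.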