Similarity is a central notion throughout human lives, and it will soon become the prevalent strategy for dealing with digital content in computer systems as well. However, the exponential growth of data makes scalability and performance serious matters of concern. Contemporary decentralized media of mass communication, which allow cooperative and collaborative practices, enable users to contribute autonomously to the production of global media, whose elements are in fact related by numerous multifaceted links of similarity. As an example, consider sites like Flickr, YouTube, or Facebook that host user-contributed heterogeneous content for a variety of events. Accordingly, the core ability of future data processing systems is the similarity management of large and ever-growing volumes of data. In a simplified way, real-life performance can be constrained from two points of view: (1) the query response time, and (2) the query execution throughput, i.e., the number of queries processed per unit of time. Typically, the query response time should be on-line, say less than one second, while the query execution throughput can be expected to reach hundreds or thousands of queries per unit of time in large-scale web applications.