Efficient Processing of Nearest Neighbor Queries in Parallel Multimedia Databases

Authors:
Jorge Manjarrez-Sanchez; José Martinez; Patrick Valduriez
Affiliations:
INRIA and LINA, Université de Nantes; INRIA and LINA, Université de Nantes; INRIA and LINA, Université de Nantes
Venue:
DEXA '08 Proceedings of the 19th international conference on Database and Expert Systems Applications
Year:
2008

Abstract

This paper deals with the performance problem of nearest neighbor queries in voluminous multimedia databases. We propose a data allocation method that achieves $O(\sqrt{n})$ query processing time in parallel settings. Our proposal is based on the complexity analysis of content-based retrieval when a clustering method is used. We derive a valid range of values for the number of clusters that should be obtained from the database. Then, to process nearest neighbor queries efficiently, we derive the optimal number of nodes to maximize the use of parallel resources. We validated our method through experiments with different high-dimensional databases and implemented a query processing algorithm for full k-nearest neighbors in a shared-nothing cluster.
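
To illustrate why clustering can lead to a bound of this kind, the sketch below assumes a simple cost model that is not taken from the paper: a query first compares against C cluster centroids and then scans clusters of about n/C points, so the cost C + n/C is smallest near C = sqrt(n). The function names (`build_clusters`, `knn`), the random choice of centroids, and the triangle-inequality pruning are all illustrative assumptions, not the authors' method, and the parallel data allocation is not modeled here.

```python
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_clusters(points, seed=0):
    """Partition points into about sqrt(n) clusters (hypothetical helper).

    Centroids are sampled at random from the data (an assumption made for
    brevity; a real system would likely run a clustering algorithm).
    Returns a list of (centroid, radius, members) tuples.
    """
    num_clusters = max(1, math.isqrt(len(points)))
    centroids = random.Random(seed).sample(points, num_clusters)
    members = [[] for _ in centroids]
    for p in points:
        nearest = min(range(num_clusters), key=lambda i: dist(p, centroids[i]))
        members[nearest].append(p)
    clusters = []
    for c, m in zip(centroids, members):
        radius = max((dist(c, p) for p in m), default=0.0)
        clusters.append((c, radius, m))
    return clusters


def knn(clusters, query, k):
    """Exact k-NN over the clusters, pruning with the triangle inequality.

    No point in a cluster can be closer to the query than
    dist(query, centroid) - radius, so clusters are visited in increasing
    order of that lower bound and the scan stops once the bound exceeds
    the current k-th best distance.
    """
    order = sorted(clusters, key=lambda cl: dist(query, cl[0]) - cl[1])
    best = []  # max-heap of (-distance, point), holding at most k entries
    for centroid, radius, members in order:
        lower_bound = dist(query, centroid) - radius
        if len(best) == k and lower_bound > -best[0][0]:
            break  # every remaining cluster has an even larger lower bound
        for p in members:
            d = dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in best)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [tuple(rng.random() for _ in range(8)) for _ in range(1000)]
    index = build_clusters(data)
    q = tuple(rng.random() for _ in range(8))
    for d, p in knn(index, q, 3):
        print(round(d, 4), p[:2])
```

In the favorable case the query touches about sqrt(n) centroids and a few clusters of about sqrt(n) points each. In high-dimensional data the pruning bound may be weak and more clusters may need scanning, which is one reason the choice of cluster count and the distribution of work across nodes matter.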