A data allocation method for efficient content-based retrieval in parallel multimedia databases

Authors:
Jorge Manjarrez-Sanchez; José Martinez; Patrick Valduriez
Affiliations:
INRIA and LINA, Université de Nantes;INRIA and LINA, Université de Nantes;INRIA and LINA, Université de Nantes
Venue:
ISPA'07 Proceedings of the 2007 international conference on Frontiers of High Performance Computing and Networking
Year:
2007



Abstract

Scaling up to large multimedia databases with high-dimensional metadata descriptions while providing fast content-based retrieval (CBR) is increasingly important for many applications. To meet this objective, we exploit the widely used parallel shared-nothing architecture. In this context, a major problem is how to allocate data across the nodes so that parallel content-based retrieval is efficient. In this paper, assuming a clustering process and based on a complexity analysis of CBR, we propose a data allocation method with an optimal number of clusters and nodes. We implemented a query processing algorithm for full k-nearest-neighbor search and validated our method through experiments with several high-dimensional synthetic databases.
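To make the setting concrete, the following is a minimal sketch of exact k-nearest-neighbor search over data spread across shared-nothing nodes. It is not the allocation method proposed in the paper, whose details the abstract does not give. It assumes that clusters are already available, that the members of each cluster are dealt round-robin to the nodes, and that a coordinator merges each node's local top-k. The function names (`allocate_round_robin`, `local_knn`, `parallel_knn`) are hypothetical, and the nodes are simulated by plain lists in one process.

```python
import heapq
import math
import random


def allocate_round_robin(clusters, num_nodes):
    """Deal the points of every cluster across the nodes in turn.

    Assumption for illustration only: spreading each cluster over all
    nodes lets every node share the work of scanning a relevant cluster.
    """
    nodes = [[] for _ in range(num_nodes)]
    slot = 0
    for cluster in clusters:
        for point in cluster:
            nodes[slot % num_nodes].append(point)
            slot += 1
    return nodes


def local_knn(points, query, k):
    """Return the k closest (distance, point) pairs held by one node."""
    return heapq.nsmallest(k, ((math.dist(p, query), p) for p in points))


def parallel_knn(nodes, query, k):
    """Merge the local top-k of every node into the global top-k.

    The loop is sequential here; on a real shared-nothing system each
    iteration would run on its own node. The result is exact because any
    global nearest neighbor is also among the k nearest on its node.
    """
    candidates = []
    for node_points in nodes:
        candidates.extend(local_knn(node_points, query, k))
    return heapq.nsmallest(k, candidates)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic 8-dimensional points, grouped into 5 stand-in "clusters".
    points = [tuple(random.random() for _ in range(8)) for _ in range(200)]
    clusters = [points[i::5] for i in range(5)]
    nodes = allocate_round_robin(clusters, num_nodes=4)
    query = tuple(0.5 for _ in range(8))
    for distance, point in parallel_knn(nodes, query, k=3):
        print(round(distance, 4), point[:2])
```

In this sketch every node scans all of its local points, so the speed-up comes only from dividing the scan across nodes. Using the cluster structure to prune whole clusters before scanning is the kind of refinement a complexity analysis would address, and it is deliberately left out here.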