Efficient k-NN search on vertically decomposed data

Authors:
Arjen P. de Vries;Nikos Mamoulis;Niels Nes;Martin Kersten
Affiliations:
Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ, Amsterdam, The Netherlands;University of Hong Kong, Pokfulam Road, Hong Kong;Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ, Amsterdam, The Netherlands;Centrum voor Wiskunde en Informatica, Kruislaan 413, 1098 SJ, Amsterdam, The Netherlands
Venue:
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Year:
2002

Abstract

Applications such as multimedia retrieval require efficient support for similarity search on large data collections. Yet nearest neighbor search is a difficult problem in high-dimensional spaces, which makes efficient applications hard to realize: index structures degrade rapidly with increasing dimensionality, while sequential search is not an attractive solution for repositories with millions of objects. This paper approaches the problem from a different angle. A solution is sought in an unconventional storage scheme that opens up a new range of techniques for processing k-NN queries and is especially suited to high-dimensional spaces. The suggested (physical) database design accommodates a novel variant of branch-and-bound search that quickly reduces the high-dimensional space to a small candidate set. The paper provides insight into applying this idea to k-NN search using two similarity metrics commonly encountered in image database applications, and discusses techniques for its implementation in relational database systems. The effectiveness of the proposed method is evaluated empirically on both real and synthetic data sets, and the evaluation reports significant improvements in response time.
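
The general idea of pruning candidates while scanning one dimension at a time can be illustrated with a minimal sketch. It is not the paper's algorithm, only an illustration under stated assumptions: the data is stored as one list per dimension (a vertical decomposition), the metric is squared Euclidean distance, and all values and the query lie in [0, 1]. The function name `knn_vertical` and its parameters are hypothetical.

```python
import heapq


def knn_vertical(columns, query, k):
    """Return the k nearest objects as (object_id, squared_distance) pairs.

    columns[j][i] is the value of object i in dimension j.
    Assumption: every value, and every query coordinate, lies in [0, 1].
    """
    n = len(columns[0])
    # Largest squared contribution a dimension can make, given values in [0, 1].
    max_contrib = [max(q, 1.0 - q) ** 2 for q in query]
    # Candidate id -> squared distance accumulated over the dimensions seen so far.
    partial = {i: 0.0 for i in range(n)}

    for j, column in enumerate(columns):
        for i in partial:
            partial[i] += (column[i] - query[j]) ** 2

        if len(partial) > k:
            # Upper bound on what the unseen dimensions can still add.
            remaining = sum(max_contrib[j + 1:])
            # k-th smallest upper bound on the final distance.
            kappa = heapq.nsmallest(k, partial.values())[-1] + remaining
            # The partial distance is a lower bound on the final distance, so a
            # candidate whose partial distance exceeds kappa cannot be in the top k.
            partial = {i: d for i, d in partial.items() if d <= kappa}

    return sorted(partial.items(), key=lambda item: (item[1], item[0]))[:k]
```

The pruning test is conservative: a candidate is discarded only when its lower bound already exceeds the k-th smallest upper bound, so the result matches a full sequential scan. How many candidates survive each step depends on the data distribution and on the order in which dimensions are processed; this sketch makes no attempt to choose that order.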