Array-index: a plug&search K nearest neighbors method for high-dimensional data

Authors:
Zaher Al Aghbari
Affiliations:
Department of Computer Science, University of Sharjah, P.O. Box 27272, Sharjah, UAE
Venue:
Data & Knowledge Engineering
Year:
2005

Citing 28
Cited 4

Spatial query processing in an object-oriented database system

SIGMOD '86 Proceedings of the 1986 ACM SIGMOD international conference on Management of data
Fractals for secondary key retrieval

PODS '89 Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Fast subsequence matching in time-series databases

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Fast multiresolution image querying

SIGGRAPH '95 Proceedings of the 22nd annual conference on Computer graphics and interactive techniques
Nearest neighbor queries

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
BIRCH: an efficient data clustering method for very large databases

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
The SR-tree: an index structure for high-dimensional nearest neighbor queries

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Two algorithms for nearest-neighbor search in high dimensions

STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
Self-organizing maps

Self-organizing maps
Fast algorithms for projected clustering

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
The Grid File: An Adaptable, Symmetric Multikey File Structure

ACM Transactions on Database Systems (TODS)
Locally adaptive dimensionality reduction for indexing large time series databases

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Searching Multimedia Databases by Content

Searching Multimedia Databases by Content
Efficient k-NN search on vertically decomposed data

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
How to improve the pruning ability of dynamic metric access methods

Proceedings of the eleventh international conference on Information and knowledge management
Query by Image and Video Content: The QBIC System

Computer
Fast and Effective Retrieval of Medical Tumor Shapes

IEEE Transactions on Knowledge and Data Engineering
Similarity Indexing with the SS-tree

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Nearest Neighbors Can Be Found Efficiently If the Dimension Is Small Relative to the Input Size

ICDT '03 Proceedings of the 9th International Conference on Database Theory
A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Indexing the Distance: An Efficient Method to KNN Processing

Proceedings of the 27th International Conference on Very Large Data Bases
The X-tree: An Index Structure for High-Dimensional Data

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Fast k-NN Image Search with Self-Organizing Maps

CIVR '02 Proceedings of the International Conference on Image and Video Retrieval
Independent Quantization: An Index Compression Technique for High-Dimensional Data Spaces

ICDE '00 Proceedings of the 16th International Conference on Data Engineering
An efficient indexing method for nearest neighbor searches in high-dimensional image databases

IEEE Transactions on Multimedia

A privacy preserving technique for distance-based classification with worst case privacy guarantees

Data & Knowledge Engineering
On efficient mutual nearest neighbor query processing in spatial databases

Data & Knowledge Engineering
A flexible framework to ease nearest neighbor search in multidimensional data spaces

Data & Knowledge Engineering
An improved K-nearest-neighbor algorithm for text categorization

Expert Systems with Applications: An International Journal

Abstract

In high dimensions, existing data partitioning methods (DPMs) for finding the exact K-nearest neighbors (KNN) are outperformed by a simple linear scan [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of Computing, 1997; R. Weber, H.-J. Schek, S. Blott, A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces, in: Proc. of the 24th VLDB, USA, 1998]. In this paper, we present a "plug&search" method that greatly speeds up the exact KNN search of existing DPMs. The idea is to linearize the data partitions produced by a DPM, rather than the points themselves, into a one-dimensional array-index that is simple, compact and fast. Most DPMs that support KNN search require storage that is linear, or even exponential [J.M. Kleinberg, Two algorithms for nearest neighbor search in high dimensions, 29th ACM Symposium on Theory of Computing, 1997; M. Hagedoorn, Nearest neighbors can be found efficiently if the dimension is small relative to the input size, ICDT 2003], in the number of dimensions; the array-index instead requires storage that is linear in the number of mapped partitions.
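
The abstract does not say how partitions are mapped to one dimension, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes each partition is summarized by a centroid and a covering radius, and that the one-dimensional key is the centroid's distance to a fixed reference point. The names `build_array_index`, `knn_search`, `linear_scan` and the parameter `ref` are hypothetical. The array holds one entry per partition, and the search stays exact because partitions are pruned only by a triangle-inequality lower bound.

```python
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_array_index(partitions, ref):
    """Map each partition (a list of points) to one array entry.

    Assumption: the 1-D key is the distance from `ref` to the partition
    centroid. Entries are (key, radius, points), sorted by key.
    """
    index = []
    for pts in partitions:
        dims = len(pts[0])
        centroid = tuple(sum(p[i] for p in pts) / len(pts) for i in range(dims))
        radius = max(dist(centroid, p) for p in pts)
        index.append((dist(ref, centroid), radius, pts))
    index.sort(key=lambda entry: entry[0])
    return index


def knn_search(index, ref, query, k):
    """Exact KNN: visit partitions by increasing lower bound, stop early."""
    q_key = dist(ref, query)
    # Triangle inequality: every point p in a partition satisfies
    # dist(query, p) >= |q_key - key| - radius.
    candidates = sorted(
        (max(0.0, abs(q_key - key) - radius), i)
        for i, (key, radius, _) in enumerate(index)
    )
    best = []  # max-heap via negated distances: (-distance, point)
    for lower_bound, i in candidates:
        if len(best) == k and lower_bound > -best[0][0]:
            break  # no remaining partition can hold a closer point
        for p in index[i][2]:
            d = dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in best)


def linear_scan(points, query, k):
    """Baseline: compute every distance and keep the k smallest."""
    return sorted((dist(query, p), p) for p in points)[:k]


# Illustrative data: 200 random 16-dimensional points, split into
# partitions of 10 by a stand-in "DPM" that sorts on the first coordinate.
random.seed(1)
points = [tuple(random.random() for _ in range(16)) for _ in range(200)]
points.sort(key=lambda p: p[0])
partitions = [points[i:i + 10] for i in range(0, len(points), 10)]
ref = (0.0,) * 16
index = build_array_index(partitions, ref)
query = tuple(random.random() for _ in range(16))
print(knn_search(index, ref, query, 3))
```

In this sketch the index size grows with the number of partitions rather than with the number of dimensions, which mirrors the storage claim in the abstract. How much pruning it achieves depends on the partitioning and on the choice of `ref`.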