iDistance: An adaptive B+-tree based indexing method for nearest neighbor search

Authors:
H. V. Jagadish;Beng Chin Ooi;Kian-Lee Tan;Cui Yu;Rui Zhang
Affiliations:
University of Michigan, Ann Arbor, MI;National University of Singapore, Singapore;National University of Singapore, Singapore;Monmouth University, West Long Branch, NJ;National University of Singapore, Singapore
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2005

Abstract

In this article, we present an efficient B+-tree based indexing method, called iDistance, for K-nearest neighbor (KNN) search in a high-dimensional metric space. iDistance partitions the data based on a space- or data-partitioning strategy, and selects a reference point for each partition. Each data point in a partition is transformed into a one-dimensional value based on its similarity with respect to the reference point. This allows the points to be indexed using a B+-tree structure and KNN search to be performed using one-dimensional range search. The choice of partition and reference points adapts the index structure to the data distribution. We conducted extensive experiments to evaluate the iDistance technique, and report results demonstrating its effectiveness. We also present a cost model for iDistance KNN search, which can be exploited in query optimization.
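
The mapping and search idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the one-dimensional key is taken to be `i * c + dist(p, O_i)` for a point `p` in partition `i` with reference point `O_i`, the distance is Euclidean, a sorted Python list searched with `bisect` stands in for the B+-tree, and each point is assigned to its closest reference point. The names `IDistanceSketch`, `c`, and `step` are hypothetical.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance (assumed metric; any metric would do)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class IDistanceSketch:
    """Toy iDistance-style index; a sorted list stands in for the B+-tree."""

    def __init__(self, points, refs, c=1000.0):
        # c is an assumed constant that must exceed every point-to-reference
        # distance, so that key ranges of different partitions never overlap.
        self.refs = refs
        self.c = c
        self.max_r = [0.0] * len(refs)  # radius of each partition
        entries = []
        for p in points:
            # Assumed partitioning strategy: closest reference point.
            i = min(range(len(refs)), key=lambda j: dist(p, refs[j]))
            d = dist(p, refs[i])
            self.max_r[i] = max(self.max_r[i], d)
            entries.append((i * c + d, p))  # one-dimensional key
        entries.sort(key=lambda e: e[0])
        self.keys = [key for key, _ in entries]
        self.points = [p for _, p in entries]

    def knn(self, q, k, step=0.1):
        """Return up to k nearest points by growing a search radius r."""
        # Once r reaches this bound, every partition is scanned completely.
        limit = max(dist(q, o) + m for o, m in zip(self.refs, self.max_r))
        r = step
        while True:
            cand = []
            for i, o in enumerate(self.refs):
                dq = dist(q, o)
                if dq - r > self.max_r[i]:
                    continue  # query sphere does not reach this partition
                lo = i * self.c + max(0.0, dq - r)
                hi = i * self.c + min(self.max_r[i], dq + r)
                # One-dimensional range search over the sorted keys.
                a = bisect.bisect_left(self.keys, lo)
                b = bisect.bisect_right(self.keys, hi)
                cand.extend((dist(p, q), p) for p in self.points[a:b])
            best = sorted(cand)[:k]
            # By the triangle inequality, every point within distance r of q
            # lies in a scanned key range, so the answer is final once the
            # k-th candidate is no farther than r.
            if (len(best) == k and best[-1][0] <= r) or r >= limit:
                return [p for _, p in best]
            r += step


pts = [(0.1, 0.1), (0.2, 0.15), (0.9, 0.9), (0.8, 0.85),
       (0.5, 0.5), (0.45, 0.55), (0.05, 0.9)]
index = IDistanceSketch(pts, refs=[(0.0, 0.0), (1.0, 1.0)])
print(index.knn((0.3, 0.2), k=3))
```

For simplicity the sketch rescans the key ranges from scratch each time the radius grows; an actual index would extend the previous scan outward instead. Points are expected to be tuples of floats.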