Optimal multi-step k-nearest neighbor search

Authors:
Thomas Seidl;Hans-Peter Kriegel
Affiliations:
University of Munich, Germany, Institute for Computer Science;University of Munich, Germany, Institute for Computer Science
Venue:
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Year:
1998

References: 29
Cited by: 156

Computational geometry: an introduction

Computational geometry: an introduction
A retrieval technique for similar shapes

SIGMOD '91 Proceedings of the 1991 ACM SIGMOD international conference on Management of data
Searching for geometric molecular shape complementarity using bidimensional surface profiles

Journal of Molecular Graphics
Similar shape retrieval using a structural feature index

Information Systems
Efficient and effective querying by image content

Journal of Intelligent Information Systems - Special issue: advances in visual information management systems
Fast subsequence matching in time-series databases

SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Accounting for boundary effects in nearest neighbor searching

Proceedings of the eleventh annual symposium on Computational geometry
Nearest neighbor queries

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Fast parallel similarity search in multimedia databases

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
S3: similarity search in CAD database systems

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A cost model for nearest neighbor search in high-dimensional data space

PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Multidimensional access methods

ACM Computing Surveys (CSUR)
An Algorithm for Finding Best Matches in Logarithmic Expected Time

ACM Transactions on Mathematical Software (TOMS)
Approximation-Based Similarity Search for 3-D Surface Segments

Geoinformatica
Efficient Color Histogram Indexing for Quadratic Form Distance Functions

IEEE Transactions on Pattern Analysis and Machine Intelligence
PROBE Spatial Data Modeling and Query Processing in an Image Database Application

IEEE Transactions on Software Engineering
Efficient Similarity Search In Sequence Databases

FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms
Fast Nearest Neighbor Search in High-Dimensional Space

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Similarity Indexing with the SS-tree

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Performance of Nearest Neighbor Queries in R-Trees

ICDT '97 Proceedings of the 6th International Conference on Database Theory
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Fast Nearest Neighbor Search in Medical Image Databases

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
The X-tree: An Index Structure for High-Dimensional Data

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
Efficient User-Adaptable Similarity Search in Large Multimedia Databases

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
A Storage and Access Architecture for Efficient Query Processing in Spatial Database Systems

SSD '93 Proceedings of the Third International Symposium on Advances in Spatial Databases
Ranking in Spatial Databases

SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases
3D Similarity Search by Shape Approximation

SSD '97 Proceedings of the 5th International Symposium on Advances in Spatial Databases

A new method for similarity indexing of market basket data

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Density-based indexing for approximate nearest-neighbor queries

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Distance browsing in spatial databases

ACM Transactions on Database Systems (TODS)
Adaptive multi-stage distance join processing

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Image retrieval using flexible image subblocks

SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
Dimensionality reduction and similarity computation by inner product approximations

Proceedings of the ninth international conference on Information and knowledge management
Supporting subseries nearest neighbor search via approximation

Proceedings of the ninth international conference on Information and knowledge management
Locally adaptive dimensionality reduction for indexing large time series databases

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Modeling high-dimensional index structures using sampling

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Similarity-based algebra for multimedia database systems

ADC '01 Proceedings of the 12th Australasian database conference
Efficient k-NN search on vertically decomposed data

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Time-parameterized queries in spatio-temporal databases

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Top-k selection queries over relational databases: Mapping strategies and performance evaluation

ACM Transactions on Database Systems (TODS)
Locally adaptive dimensionality reduction for indexing large time series databases

ACM Transactions on Database Systems (TODS)
Searching in metric spaces with user-defined and approximate distances

ACM Transactions on Database Systems (TODS)
Evaluating continuous nearest neighbor queries for streaming time series via pre-fetching

Proceedings of the eleventh international conference on Information and knowledge management
An Enhanced Technique for k-Nearest Neighbor Queries with Non-Spatial Selection Predicates

Multimedia Tools and Applications
Approximation-Based Similarity Search for 3-D Surface Segments

Geoinformatica
Combining Approximation Techniques and Vector Quantization for Adaptable Similarity Search

Journal of Intelligent Information Systems - Special issue on data warehousing and knowledge discovery
Fast and Effective Retrieval of Medical Tumor Shapes

IEEE Transactions on Knowledge and Data Engineering
A Multistep Approach for Shape Similarity Search in Image Databases

IEEE Transactions on Knowledge and Data Engineering
Indexing the Solution Space: A New Technique for Nearest Neighbor Search in High-Dimensional Space

IEEE Transactions on Knowledge and Data Engineering
Querying Time Series Data Based on Similarity

IEEE Transactions on Knowledge and Data Engineering
High-dimensional nearest neighbor search with remote data centers

Knowledge and Information Systems
VQ-index: an index structure for similarity searching in multimedia databases

Proceedings of the tenth ACM international conference on Multimedia
Improving Adaptable Similarity Query Processing by Using Approximations

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Evaluating Top-k Selection Queries

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Efficient Index Structures for String Databases

Proceedings of the 27th International Conference on Very Large Data Bases
Similarity-Based Operators in Image Database Systems

WAIM '01 Proceedings of the Second International Conference on Advances in Web-Age Information Management
An Efficient Approach to Similarity-Based Retrieval on Top of Relational Databases

EWCBR '00 Proceedings of the 5th European Workshop on Advances in Case-Based Reasoning
Data Mining and Personalization Technologies

DASFAA '99 Proceedings of the Sixth International Conference on Database Systems for Advanced Applications
3D Shape Histograms for Similarity Search and Classification in Spatial Databases

SSD '99 Proceedings of the 6th International Symposium on Advances in Spatial Databases
K-Nearest Neighbor Search for Moving Query Point

SSTD '01 Proceedings of the 7th International Symposium on Advances in Spatial and Temporal Databases
Hashing Moving Objects

MDM '01 Proceedings of the Second International Conference on Mobile Data Management
Search K Nearest Neighbors on Air

MDM '03 Proceedings of the 4th International Conference on Mobile Data Management
Collection fusion using Bayesian estimation of a linear regression model in image databases on the Web

Information Processing and Management: an International Journal - Modelling vagueness and subjectivity in information access
Dynamic vp-tree indexing for n-nearest neighbor search given pair-wise distances

The VLDB Journal — The International Journal on Very Large Data Bases
Efficient retrieval of similar shapes

The VLDB Journal — The International Journal on Very Large Data Bases
Properties of Embedding Methods for Similarity Searching in Metric Spaces

IEEE Transactions on Pattern Analysis and Machine Intelligence
Spatial Index on Air

PERCOM '03 Proceedings of the First IEEE International Conference on Pervasive Computing and Communications
Issues in Multimedia Database Management

IDEAS '99 Proceedings of the 1999 International Symposium on Database Engineering & Applications
Warping indexes with envelope transforms for query by humming

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Using sets of feature vectors for similarity search on voxelized CAD objects

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
ClusterTree: Integration of Cluster Representation and Nearest-Neighbor Search for Large Data Sets with High Dimensions

IEEE Transactions on Knowledge and Data Engineering
Adaptive and Incremental Processing for Distance Join Queries

IEEE Transactions on Knowledge and Data Engineering
Efficient evaluation of relevance feedback for multidimensional all-pairs retrieval

Proceedings of the 2003 ACM symposium on Applied computing
Efficient region-based image retrieval

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
High dimensional reverse nearest neighbor queries

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Dimensionality reduction using magnitude and shape approximations

CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Index-driven similarity search in metric spaces (Survey Article)

ACM Transactions on Database Systems (TODS)
Evaluating Refined Queries in Top-k Retrieval Systems

IEEE Transactions on Knowledge and Data Engineering
Optimizing Similarity Search for Arbitrary Length Time Series Queries

IEEE Transactions on Knowledge and Data Engineering
Efficient Similarity Search in Large Databases of Tree Structured Objects

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Content-based Three-dimensional Engineering Shape Search

ICDE '04 Proceedings of the 20th International Conference on Data Engineering
A novel technique for indexing video surveillance data

IWVS '03 First ACM SIGMM international workshop on Video surveillance
Multi-Way Distance Join Queries in Spatial Databases

Geoinformatica
Energy efficient exact kNN search in wireless broadcast environments

Proceedings of the 12th annual ACM international workshop on Geographic information systems
Spatial queries in wireless broadcast systems

Wireless Networks - Special issue: Pervasive computing and communications
Exact indexing of dynamic time warping

Knowledge and Information Systems
Fast and Exact Warping of Time Series Using Adaptive Segmental Approximations

Machine Learning
Monitoring k-Nearest Neighbor Queries over Moving Objects

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Similarity evaluation on tree-structured data

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Aggregate Nearest Neighbor Queries in Road Networks

IEEE Transactions on Knowledge and Data Engineering
DDR: an index method for large time-series datasets

Information Systems
Exact k-NN queries on clustered SVD datasets

Information Processing Letters
A Shrinking-Based Clustering Approach for Multidimensional Data

IEEE Transactions on Knowledge and Data Engineering
Generalized multidimensional data mapping and query processing

ACM Transactions on Database Systems (TODS)
Two ellipse-based pruning methods for group nearest neighbor queries

Proceedings of the 13th annual ACM international workshop on Geographic information systems
Adaptive nearest neighbor queries in travel time networks

Proceedings of the 13th annual ACM international workshop on Geographic information systems
Alternative Solutions for Continuous K Nearest Neighbor Queries in Spatial Network Databases

Geoinformatica
Range Nearest-Neighbor Query

IEEE Transactions on Knowledge and Data Engineering
Real-Time Processing of Range-Monitoring Queries in Heterogeneous Mobile Databases

IEEE Transactions on Mobile Computing
An error-resilient cell-based distributed index for location-based wireless broadcast services

MobiDE '06 Proceedings of the 5th ACM international workshop on Data engineering for wireless and mobile access
Filter ranking in high-dimensional space

Data & Knowledge Engineering
A non-linear dimensionality-reduction technique for fast similarity search in large databases

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Online summarization of dynamic time series data

The VLDB Journal — The International Journal on Very Large Data Bases
Nearest and reverse nearest neighbor queries for moving objects

The VLDB Journal — The International Journal on Very Large Data Bases
Efficient processing of complex similarity queries in RDBMS through query rewriting

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
A performance comparison of distance-based query algorithms using R-trees in spatial databases

Information Sciences: an International Journal
K nearest neighbor search in navigation systems

Mobile Information Systems
Processing partially specified queries over high-dimensional databases

Data & Knowledge Engineering
Adaptive similarity search in streaming time series with sliding windows

Data & Knowledge Engineering
Processing k nearest neighbor queries in location-aware sensor networks

Signal Processing
Continuous nearest neighbor search

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Exact indexing of dynamic time warping

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
A shrinking-based approach for multi-dimensional data analysis

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Query processing in spatial network databases

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Voronoi-based K nearest neighbor search for spatial network databases

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Optimizing parallel itineraries for knn query processing in wireless sensor networks

Proceedings of the sixteenth ACM conference on information and knowledge management
Ranked subsequence matching in time-series databases

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Efficient Similarity Search over Future Stream Time Series

IEEE Transactions on Knowledge and Data Engineering
Proximity queries in large traffic networks

Proceedings of the 15th annual ACM international symposium on Advances in geographic information systems
The TS-tree: efficient time series search and retrieval

EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Brief paper: Experience inclusion in iterative learning controllers: Fuzzy model based approaches

Engineering Applications of Artificial Intelligence
Data mining on the cell broadband engine

Proceedings of the 22nd annual international conference on Supercomputing
Efficient EMD-based similarity search in multimedia databases via flexible dimensionality reduction

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Scaling and time warping in time series querying

The VLDB Journal — The International Journal on Very Large Data Bases
A multi-resolution surface distance model for k-NN query processing

The VLDB Journal — The International Journal on Very Large Data Bases
Efficient Similarity Search for Tree-Structured Data

SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
On efficiently searching trajectories and archival data for historical similarities

Proceedings of the VLDB Endowment
A novel optimization approach to efficiently process aggregate similarity queries in metric access methods

Proceedings of the 17th ACM conference on Information and knowledge management
Optimal incremental multi-step nearest-neighbor search

Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems
SubSpace Projection: A unified framework for a class of partition-based dimension reduction techniques

Information Sciences: an International Journal
Indexing density models for incremental learning and anytime classification on data streams

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Dimensionality reduction for similarity search with the Euclidean distance in high-dimensional applications

Multimedia Tools and Applications
Techniques for Efficiently Searching in Spatial, Temporal, Spatio-temporal, and Multimedia Databases

DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications
Continually answering constraint k-NN queries in unstructured P2P systems

Journal of Computer Science and Technology
Towards solving similarity search problems using fuzzy concept for multi-dimensional data

Proceedings of the 47th Annual Southeast Regional Conference
Robust Adaptable Video Copy Detection

SSTD '09 Proceedings of the 11th International Symposium on Advances in Spatial and Temporal Databases
SubCOID: an attempt to explore cluster-outlier iterative detection approach to multi-dimensional data analysis in subspace

Proceedings of the 46th Annual Southeast Regional Conference on XX
Continuous range search based on network Voronoi diagram

International Journal of Grid and Utility Computing
Reverse skyline search in uncertain databases

ACM Transactions on Database Systems (TODS)
Anticipatory DTW for efficient similarity search in time series databases

Proceedings of the VLDB Endowment
Effectiveness of NAQ-tree as index structure for similarity search in high-dimensional metric space

Knowledge and Information Systems
A framework for rank computation and aggregation in fuzzy environments

INFOCOM'09 Proceedings of the 28th IEEE international conference on Computer Communications Workshops
Algorithms for constrained k-nearest neighbor queries over moving object trajectories

Geoinformatica
Combining weights into scores: a linear transform approach

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
Pattern detector: fast detection of suspicious stream patterns for immediate reaction

Proceedings of the 13th International Conference on Extending Database Technology
Efficient k-nearest neighbor searching in nonordered discrete data spaces

ACM Transactions on Information Systems (TOIS)
Algorithms for memory hierarchies: advanced lectures

Algorithms for memory hierarchies: advanced lectures
Interval-focused similarity search in time series databases

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Generalizing the optimality of multi-step k-nearest neighbor query processing

SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
Effectiveness of optimal incremental multi-step nearest neighbor search

Expert Systems with Applications: An International Journal
High-dimensional indexing: transformational approaches to high-dimensional range and similarity searches

High-dimensional indexing: transformational approaches to high-dimensional range and similarity searches
K-nearest neighbor search for fuzzy objects

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Instant code clone search

Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering
Towards improving a similarity search approach

Proceedings of the 48th Annual Southeast Regional Conference
Towards improving subspace data analysis

Proceedings of the 48th Annual Southeast Regional Conference
Efficient and effective similarity search over probabilistic data based on earth mover's distance

Proceedings of the VLDB Endowment
Subspace clustering for indexing high dimensional data: a main memory index based on local reductions and individual multi-representations

Proceedings of the 14th International Conference on Extending Database Technology
Path branch points in mobile navigation

Proceedings of the 8th International Conference on Advances in Mobile Computing and Multimedia
Improving web database search incorporating users query information

Proceedings of the International Conference on Web Intelligence, Mining and Semantics
Finding the sites with best accessibilities to amenities

DASFAA'11 Proceedings of the 16th international conference on Database systems for advanced applications: Part II
An instance selection algorithm based on reverse nearest neighbor

PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Scalable kNN search on vertically stored time series

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient processing of multiple DTW queries in time series databases

SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
Probabilistic time consistent queries over moving objects

SSDBM'11 Proceedings of the 23rd international conference on Scientific and statistical database management
New query processing algorithms for range and k-NN search in spatial network databases

CoMoGIS'06 Proceedings of the 2006 international conference on Advances in Conceptual Modeling: theory and practice
Probabilistic similarity join on uncertain data

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Finding data broadness via generalized nearest neighbors

EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Speeding up complex video copy detection queries

DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Nearest neighbor search on vertically partitioned high-dimensional data

DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
The islands approach to nearest neighbor querying in spatial networks

SSTD'05 Proceedings of the 9th international conference on Advances in Spatial and Temporal Databases
High-dimensional similarity search using data-sensitive space partitioning

DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Materialization-Based range and k-nearest neighbor query processing algorithms

FQAS'06 Proceedings of the 7th international conference on Flexible Query Answering Systems
Investigation of broadcast-audio semantic analysis scenarios employing radio-programme-adaptive pattern classification

Speech Communication
Brief Incorporation of experience in iterative learning controllers using locally weighted learning

Automatica (Journal of IFAC)
Model-Based similarity measure in timecloud

APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
Spatial query processing for fuzzy objects

The VLDB Journal — The International Journal on Very Large Data Bases
Dimensionality reduction in high-dimensional space for multimedia information retrieval

DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications
Personalized query evaluation in ring-based P2P networks

Information Sciences: an International Journal
Document selection for tiered indexing in commerce search

Proceedings of the sixth ACM international conference on Web search and data mining
Active labeling application applied to food-related object recognition

Proceedings of the 5th international workshop on Multimedia for cooking & eating activities
VGQ-Vor: extending virtual grid quadtree with Voronoi diagram for mobile k nearest neighbor queries over mobile objects

Frontiers of Computer Science: Selected Publications from Chinese Universities
Efficient Time-Stamped Event Sequence Anonymization

ACM Transactions on the Web (TWEB)

Abstract

For a growing number of modern database applications, efficient support for similarity search is becoming an important task. As objects such as images, molecules, and mechanical parts grow more complex, so do the similarity models. Whereas algorithms that are directly based on indexes work well for simple medium-dimensional similarity distance functions, they do not meet the efficiency requirements of complex high-dimensional and adaptable distance functions. A multi-step query processing strategy is recommended in these cases, and our investigations substantiate that the number of candidates produced in the filter step and evaluated exactly in the refinement step is a fundamental efficiency parameter. After revealing the strong performance shortcomings of the state-of-the-art algorithm for k-nearest neighbor search [Korn et al. 1996], we present a novel multi-step algorithm which is guaranteed to produce the minimum number of candidates. Experimental evaluations demonstrate a significant performance gain over the previous solution: we observed average improvement factors of up to 120 for the number of candidates and up to 48 for the total runtime.
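
The abstract does not spell out the algorithm, so the following Python sketch only illustrates the general filter-and-refine idea under stated assumptions. It assumes a cheap filter distance that never exceeds the exact distance (a lower bound) and that candidates can be visited in ascending filter distance. Here a plain sort stands in for an index-based ranking. The names `multistep_knn`, `prefix_dist`, and `full_dist` are hypothetical and do not come from the paper.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Dist = Callable[[Vector, Vector], float]


def multistep_knn(query: Vector, objects: List[Vector], k: int,
                  filter_dist: Dist, exact_dist: Dist
                  ) -> Tuple[List[Tuple[float, int]], int]:
    """Return the k nearest (exact distance, index) pairs and the number
    of candidates that had to be refined with the exact distance.

    Assumption: filter_dist(q, o) <= exact_dist(q, o) for all objects.
    """
    if k <= 0:
        return [], 0

    # Filter step: rank objects by the cheap filter distance.
    # (A real system would fetch this ranking incrementally from an index.)
    ranking = sorted((filter_dist(query, o), i) for i, o in enumerate(objects))

    result: List[Tuple[float, int]] = []  # max-heap via negated exact distance
    refined = 0
    for d_filter, i in ranking:
        # Stop once the next filter distance exceeds the current k-th best
        # exact distance: no remaining object can enter the result.
        if len(result) == k and d_filter > -result[0][0]:
            break
        # Refinement step: evaluate the expensive exact distance.
        d_exact = exact_dist(query, objects[i])
        refined += 1
        if len(result) < k:
            heapq.heappush(result, (-d_exact, i))
        elif d_exact < -result[0][0]:
            heapq.heapreplace(result, (-d_exact, i))

    neighbors = sorted((-neg, i) for neg, i in result)
    return neighbors, refined


def full_dist(a: Vector, b: Vector) -> float:
    """Exact distance: Euclidean distance over all dimensions."""
    return math.dist(a, b)


def prefix_dist(a: Vector, b: Vector) -> float:
    """Filter distance: Euclidean distance over the first two dimensions,
    which can never exceed the full Euclidean distance."""
    return math.dist(a[:2], b[:2])
```

Calling `multistep_knn(q, data, k, prefix_dist, full_dist)` returns the same neighbors as a full scan, while the second return value reports how many exact distance evaluations were needed. That candidate count is the quantity the abstract identifies as the fundamental efficiency parameter.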