Antipole Tree Indexing to Support Range Search and K-Nearest Neighbor Search in Metric Spaces

Authors:
Domenico Cantone;Alfredo Ferro;Alfredo Pulvirenti;Diego Reforgiato Recupero;Dennis Shasha
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2005



Abstract

Range and k-nearest neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, the goal of range searching is to find the objects of S within some threshold distance of q, whereas k-nearest neighbor searching must produce the k elements of S closest to q. Both problems can be solved with a linear number of distance calculations by comparing the query object against every object in the database; the goal, however, is to solve them much faster. We combine and extend ideas from the M-Tree, the Multivantage Point structure, and the FQ-Tree to create a new structure in the "bisector tree" class, called the Antipole Tree. Bisection is based on proximity to an "Antipole" pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament are far enough apart to approximate the diameter of the splitting set. If dist(a, b) is larger than the chosen cluster diameter threshold, then the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best match searching on generic metric spaces. The Antipole Tree outperforms existing structures such as List of Clusters, M-Trees, and others by a factor of approximately two and, in many cases, achieves better clustering properties.
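
The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The first two functions are the linear-scan baselines the abstract mentions. The function `approx_antipole` is a simplified randomized tournament written under assumptions: the names, the `group_size` parameter, and the rule "keep the farthest pair of each small group" are hypothetical choices made here for illustration, and the paper's actual tournament rule may differ. The function `should_split` applies the split test stated in the abstract.

```python
import heapq
import random
from itertools import combinations


def range_search(db, q, radius, dist):
    """Linear-scan baseline: every object within `radius` of the query q."""
    return [x for x in db if dist(x, q) <= radius]


def knn_search(db, q, k, dist):
    """Linear-scan baseline: the k objects of db closest to q."""
    return heapq.nsmallest(k, db, key=lambda x: dist(x, q))


def farthest_pair(items, dist):
    """Exact farthest pair by brute force; only used on small groups."""
    return max(combinations(items, 2), key=lambda p: dist(p[0], p[1]))


def approx_antipole(items, dist, group_size=3, rng=None):
    """Approximate a far-apart pair with a randomized tournament.

    Assumption (not from the source): each round shuffles the pool, cuts
    it into groups of `group_size`, and keeps only the farthest pair of
    each group. The pool shrinks every round, so the total number of
    distance computations stays linear in len(items).
    """
    if group_size < 3:
        raise ValueError("group_size must be at least 3 so each round shrinks the pool")
    pool = list(items)
    if len(pool) < 2:
        raise ValueError("need at least two objects")
    rng = rng or random.Random(0)
    while len(pool) > group_size:
        rng.shuffle(pool)
        winners = []
        for i in range(0, len(pool), group_size):
            group = pool[i:i + group_size]
            if len(group) < 2:
                winners.extend(group)  # a lone leftover advances unchanged
            else:
                winners.extend(farthest_pair(group, dist))
        pool = winners
    return farthest_pair(pool, dist)


def should_split(cluster, dist, diameter_threshold):
    """Split test from the abstract: split if dist(a, b) exceeds the threshold."""
    a, b = approx_antipole(cluster, dist)
    return dist(a, b) > diameter_threshold, (a, b)


if __name__ == "__main__":
    def abs_dist(x, y):
        return abs(x - y)

    points = list(range(100))
    print(range_search(points, 5, 2, abs_dist))
    print(knn_search(points, 5, 3, abs_dist))
    print(should_split(points, abs_dist, diameter_threshold=10))
```

In a full index, the pair (a, b) returned for a cluster that must be split would serve as the two pivots: each object goes to the side of whichever pivot it is closer to, and the procedure recurses on both halves. That recursion is omitted here to keep the sketch short.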