We investigate how to reduce the cost of nearest neighbor search in metric spaces. Many similarity search indexes recursively divide a region into subregions using pivots and organize the result as a tree. Existing indexes, however, focus only on pruning objects and do not take tree balance into account. Their balance therefore depends on the data distribution, and they do not reduce the search cost for every kind of data. We propose a similarity search index called the Partitioning Capacity Tree (PCTree). PCTree automatically optimizes pivot selection based on both the balance of the regions partitioned by a pivot and the estimated effectiveness of search pruning by that pivot. As a result, PCTree reduces the search cost across various data distributions. In evaluations against four indexes on three real datasets, PCTree reduced the search cost and handled various data distributions well.
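The idea of scoring a pivot on both partition balance and expected pruning can be illustrated with a small sketch. This is not the PCTree algorithm, whose actual criterion the abstract does not give. It is a hypothetical ball-partitioning example in which every name (`select_pivot`, `balance_score`, `pruning_score`, `query_radius`, `weight`) and the scoring formula itself are assumptions made for illustration. The pruning estimate uses the standard triangle-inequality argument: a range query with radius r centred at q can skip one side of a ball partition when |d(p, q) - radius| > r. The indexed objects stand in for future queries.

```python
def balance_score(dists, radius):
    """Return 1.0 when the ball of this radius splits the objects evenly, 0.0 when one side is empty."""
    inside = sum(1 for d in dists if d <= radius)
    return 1.0 - abs(2 * inside - len(dists)) / len(dists)


def pruning_score(dists, radius, query_radius):
    """Fraction of objects that, used as query centres, would let one side be pruned.

    By the triangle inequality, a range query of radius query_radius centred at q
    cannot reach across the partition boundary when |d(p, q) - radius| > query_radius.
    """
    prunable = sum(1 for d in dists if abs(d - radius) > query_radius)
    return prunable / len(dists)


def select_pivot(objects, candidates, dist, query_radius, weight=0.5):
    """Pick the (pivot, radius) pair with the best combined score.

    Hypothetical scoring (an assumption, not the paper's formula):
        score = weight * balance + (1 - weight) * pruning
    Candidate radii are midpoints between consecutive distinct distances.
    Returns (score, pivot, radius), or None if no candidate yields a split.
    """
    best = None
    for pivot in candidates:
        dists = [dist(pivot, obj) for obj in objects]
        distinct = sorted(set(dists))
        # Each midpoint separates the objects into two non-empty groups.
        radii = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
        for radius in radii:
            score = (weight * balance_score(dists, radius)
                     + (1 - weight) * pruning_score(dists, radius, query_radius))
            if best is None or score > best[0]:
                best = (score, pivot, radius)
    return best
```

For example, with the one-dimensional objects `[0, 1, 2, 10, 11, 12]`, absolute difference as the distance, and candidates `[0, 6]`, this sketch prefers pivot `0` with a radius in the gap between the two clusters. That split is both balanced and far from every object. Pivot `6` sits between the clusters, so any ball around it cuts through nearby objects and scores lower on both terms.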