Similarity-based search is a key component of many applications, such as multimedia retrieval, data mining, and web search and retrieval. Two important issues arise in similarity search: designing a distance function that measures similarity, and improving search efficiency. Many distance functions have been proposed that attempt to closely mimic human recognition. Unfortunately, some of these well-designed distance functions do not satisfy the triangle inequality and are therefore non-metric. As a consequence, efficient retrieval with these non-metric distance functions becomes more challenging, since most existing index structures assume that the indexed distance function is metric. In this paper, we address this problem by proposing an efficient method, local constant embedding (LCE), which divides the data set into disjoint groups so that, after constant shifting, the triangle inequality holds within each group. Furthermore, we design a pivot selection approach for the converted metric distance and create an index structure to speed up retrieval. Extensive experiments show that our method works well on various non-metric distance functions and, with no false dismissals, improves retrieval efficiency by an order of magnitude compared to linear scan and existing retrieval approaches.
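To make the idea of constant shifting concrete, here is a minimal sketch in Python. It is not the LCE algorithm from the paper: it assumes a single group that is already given, uses squared Euclidean distance on one-dimensional points as a stand-in non-metric distance, and finds the shift by brute force over all triples. The function names (`min_shift_constant`, `shifted`, `satisfies_triangle`) are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def squared_euclidean(a, b):
    # Stand-in non-metric distance (assumption): it violates the triangle inequality.
    return (a - b) ** 2


def min_shift_constant(points, dist):
    # Smallest c >= 0 such that d(x, z) + c <= (d(x, y) + c) + (d(y, z) + c)
    # for every triple of distinct points, i.e. c >= d(x, z) - d(x, y) - d(y, z).
    # Brute force over all ordered triples: O(n^3), for illustration only.
    c = 0
    for x, y, z in permutations(points, 3):
        c = max(c, dist(x, z) - dist(x, y) - dist(y, z))
    return c


def shifted(dist, c):
    # Add the constant to every distance between distinct objects;
    # the distance from an object to itself stays 0.
    def d(a, b):
        return 0 if a == b else dist(a, b) + c
    return d


def satisfies_triangle(points, dist):
    # Check the triangle inequality on every ordered triple of distinct points.
    return all(
        dist(x, z) <= dist(x, y) + dist(y, z)
        for x, y, z in permutations(points, 3)
    )


group = [0, 1, 2, 4]  # hypothetical group of 1-D objects
c = min_shift_constant(group, squared_euclidean)
d_shifted = shifted(squared_euclidean, c)

print(satisfies_triangle(group, squared_euclidean))
print(c)
print(satisfies_triangle(group, d_shifted))
```

A larger group generally needs a larger constant, and a large constant makes all shifted distances look alike, which weakens pruning in a metric index. This is one plausible reading of why the abstract describes shifting within disjoint groups rather than over the whole data set.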