Metric access methods (MAMs) serve as a tool for speeding up similarity queries. However, all MAMs developed so far are index-based: they need to build an index on a given database. The indexing is either static (the whole database is indexed at once) or dynamic (insertions and deletions are supported), but a preprocessing step is always needed. In this paper, we propose D-file, the first MAM that requires no indexing at all. This feature is especially beneficial in domains such as data mining and streaming databases, where data is produced much more intensively than it is queried, so that indexing becomes the bottleneck of the entire production/querying scheme. The D-file extends the trivial sequential file (in effect, an abstraction over the original database) with the so-called D-cache. The D-cache is a main-memory structure that keeps track of the distance computations performed while processing all similarity queries so far (within a runtime session). Based on the distances stored in the D-cache, the D-file can cheaply determine lower bounds on some distances without computing those distances explicitly, which results in faster queries. Our experimental evaluation shows that the query efficiency of D-file is comparable to that of state-of-the-art index-based MAMs, but at zero indexing cost.
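The lower-bounding idea can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that lower bounds are derived from the triangle inequality, using earlier query objects as pivots: if d(p, o) is cached for a past query p, then |d(q, p) - d(p, o)| is a lower bound on d(q, o). The names `DCache`, `lower_bound`, and `range_query`, and the unbounded dictionary used for storage, are hypothetical choices for illustration and not the paper's actual data structure.

```python
from typing import Any, Callable, Dict, Hashable, List, Tuple


class DCache:
    """Hypothetical in-memory store of distances computed in this session."""

    def __init__(self) -> None:
        # (past_query_id, object_id) -> distance computed earlier
        self._store: Dict[Tuple[Hashable, Hashable], float] = {}

    def put(self, qid: Hashable, oid: Hashable, dist: float) -> None:
        self._store[(qid, oid)] = dist

    def lower_bound(self, pivot_dists: Dict[Hashable, float], oid: Hashable) -> float:
        """Best triangle-inequality lower bound on d(current query, object).

        pivot_dists maps each past query id p to d(current query, p).
        """
        best = 0.0
        for pid, d_qp in pivot_dists.items():
            d_po = self._store.get((pid, oid))
            if d_po is not None:
                best = max(best, abs(d_qp - d_po))
        return best


def range_query(
    db: Dict[Hashable, Any],
    query: Any,
    radius: float,
    dist: Callable[[Any, Any], float],
    cache: DCache,
    qid: Hashable,
    past_queries: Dict[Hashable, Any],
) -> Tuple[List[Hashable], int]:
    """Sequential scan that skips objects whose lower bound exceeds the radius.

    Returns the matching object ids and the number of query-to-object
    distances actually computed (distances to past queries are not counted).
    """
    pivot_dists = {pid: dist(query, p) for pid, p in past_queries.items()}
    result: List[Hashable] = []
    computed = 0
    for oid, obj in db.items():
        if cache.lower_bound(pivot_dists, oid) > radius:
            continue  # filtered out without computing d(query, obj)
        d = dist(query, obj)
        computed += 1
        cache.put(qid, oid, d)
        if d <= radius:
            result.append(oid)
    past_queries[qid] = query  # this query can serve as a pivot later
    return result, computed
```

In this sketch the first query of a session finds an empty cache and degenerates to a plain sequential scan. Later queries can skip objects that are provably outside the query radius. A practical D-cache would presumably bound its memory use and choose which past queries to use as pivots, which this sketch does not attempt.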