Algorithms for clustering data
Algorithms for clustering data
R-trees: a dynamic index structure for spatial searching
Readings in database systems
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
A new method for similarity indexing of market basket data
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Efficient and tumble similar set retrieval
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
A Fast Algorithm to Cluster High Dimensional Basket Data
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
The A-tree: An Index Structure for High-Dimensional Spaces Using Relative Approximation
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
On B-Tree Indices for Skewed Distributions
VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Efficient similarity search for market basket data
The VLDB Journal — The International Journal on Very Large Data Bases
Similarity search in transaction databases with a two-level bounding mechanism
DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
Hi-index | 0.00 |
Recently, techniques for supporting efficient similarity search over huge transaction datasets have emerged as an important research area. Several indexing schemes have been proposed towards this direction. Typically, these schemes provide a tradeoff between searching efficiency and indexing overhead in terms of space. In this paper, we propose a novel indexing scheme for similarity search on transaction data. Based on well-studied clustering techniques, we develop a construction algorithm for the proposed index and a branch-and-bound searching strategy for answering similarity search. Unlike previous techniques, our indexing scheme exhibits high search efficiency and low space requirements by trading-off the pre-computation time. This behavior is ideal for applications with low update but high read volume e.g., data warehousing, collaborative filtering, etc.). Moreover, our experimental results illustrate that our method is robust to the varying characteristics of the datasets.