Similarity search in transaction databases with a two-level bounding mechanism

  • Authors: Jo-Chun Chuang; Chung-Wen Cho; Arbee L. P. Chen
  • Affiliations: Department of Computer Science, National Tsing Hua University, Taiwan, R.O.C.; Department of Computer Science, National Tsing Hua University, Taiwan, R.O.C.; Department of Computer Science, National Chengchi University, Taiwan, R.O.C.
  • Venue: DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
  • Year: 2006

Abstract

In this paper, we propose a novel indexing method for similarity search in transaction databases where the frequency of database updates can be high. In our method, incoming transactions are incrementally classified into clusters. The transactions in a cluster are represented by two features, namely the union and the intersection of all the transactions in that cluster. Based on these two features, the transactions in a cluster are further divided into disjoint groups. As a result, all the transactions are organized into a two-level index structure. With this index, a transaction can be inserted quickly because only one cluster needs to be modified. Moreover, when conducting a similarity search, we can compute, at each level, lower and upper bounds on the distance between the query and every transaction in a cluster or group. Based on these bounds, the cost of distance computation can be greatly reduced.
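The bounding idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the distance measure, the clustering rule, or the grouping rule, so everything below is an assumption made for illustration: transactions are item sets, the distance is the Hamming distance (size of the symmetric difference), the caller chooses which cluster receives a transaction, and only a single level of bounding is shown. The names `Cluster`, `bounds`, and `range_search` are hypothetical and are not taken from the paper. Since every transaction T in a cluster satisfies intersection ⊆ T ⊆ union, the distance from a query Q to T is at least |Q − union| + |intersection − Q| and at most |Q − intersection| + |union − Q|.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


def hamming(a: frozenset, b: frozenset) -> int:
    """Assumed distance: number of items in exactly one of the two sets."""
    return len(a ^ b)


@dataclass
class Cluster:
    """Hypothetical cluster summary: members plus their union and intersection."""
    transactions: List[frozenset] = field(default_factory=list)
    union: frozenset = frozenset()
    intersection: Optional[frozenset] = None  # None until the first insert

    def insert(self, items) -> None:
        # Only this cluster's summary changes on insertion.
        t = frozenset(items)
        self.transactions.append(t)
        self.union = self.union | t
        self.intersection = t if self.intersection is None else self.intersection & t

    def bounds(self, query) -> Tuple[int, int]:
        """Lower and upper bounds on hamming(query, T) for every member T."""
        if self.intersection is None:
            raise ValueError("bounds are undefined for an empty cluster")
        q = frozenset(query)
        lower = len(q - self.union) + len(self.intersection - q)
        upper = len(q - self.intersection) + len(self.union - q)
        return lower, upper


def range_search(clusters, query, radius):
    """Return (matches, number of exact distance computations)."""
    q = frozenset(query)
    matches, computed = [], 0
    for c in clusters:
        if not c.transactions:
            continue
        lower, upper = c.bounds(q)
        if lower > radius:
            continue                        # no member can qualify
        if upper <= radius:
            matches.extend(c.transactions)  # every member qualifies
            continue
        for t in c.transactions:            # bounds are inconclusive
            computed += 1
            if hamming(q, t) <= radius:
                matches.append(t)
    return matches, computed
```

In this sketch a cluster is skipped or accepted wholesale whenever its bounds are conclusive, and exact distances are computed only for the remaining clusters. Under the same assumptions, the same two bounds could be applied a second time to the groups inside a cluster to obtain a two-level filter.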