Association-based similarity testing and its applications

  • Authors:
  • Tao Li; Mitsunori Ogihara; Shenghuo Zhu

  • Affiliations:
  • Computer Science Department, University of Rochester, Rochester, NY 14627-0226, USA (all three authors). Correspondence: Tel.: +1 585 275 8479; Fax: +1 585 273 4556; E-mail: taoli@cs.rochester.edu

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2003

Abstract

This paper proposes a new association-based similarity measure between basket datasets. The measure is calculated from support counts using a formula inspired by information entropy. Experiments on both real and synthetic datasets show the effectiveness of the measure. The paper then investigates applications of the measure. It first studies the problem of finding a mapping between categorical database attribute sets using similarity measures, and proposes a generic approach for identifying such a mapping. The approach is implemented using the proposed similarity measure, and its performance is evaluated and validated. The paper also explores the use of the similarity measure in mining distributed datasets.
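
The abstract does not state the formula itself, so the following is only a minimal sketch of what a similarity "calculated from support counts" with an entropy-inspired formula could look like. Everything in it is an assumption made for illustration, not the authors' measure: the function names (`supports`, `similarity`), the restriction to itemsets of at most two items, and the use of the Jensen-Shannon divergence between per-itemset supports as a stand-in for the paper's formula.

```python
from itertools import combinations
from math import log2


def supports(baskets, max_size=2):
    """Relative support of every itemset with up to max_size items."""
    n = len(baskets)
    if n == 0:
        return {}
    counts = {}
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates inside a basket
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {itemset: c / n for itemset, c in counts.items()}


def binary_entropy(p):
    """Entropy in bits of a yes/no event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two supports, in [0, 1]."""
    m = (p + q) / 2
    return binary_entropy(m) - (binary_entropy(p) + binary_entropy(q)) / 2


def similarity(baskets_a, baskets_b, max_size=2):
    """Illustrative stand-in, NOT the paper's formula:
    1 minus the mean divergence over all itemsets seen in either dataset."""
    sup_a = supports(baskets_a, max_size)
    sup_b = supports(baskets_b, max_size)
    itemsets = set(sup_a) | set(sup_b)
    if not itemsets:
        return 1.0
    total = sum(
        js_divergence(sup_a.get(s, 0.0), sup_b.get(s, 0.0)) for s in itemsets
    )
    return 1.0 - total / len(itemsets)


if __name__ == "__main__":
    store_a = [["milk", "bread"], ["milk"], ["bread", "eggs"]]
    store_b = [["milk", "bread"], ["bread"], ["eggs"]]
    print(similarity(store_a, store_b))
```

Under these assumptions, two identical datasets score 1, two datasets with no items in common score 0, and the score depends only on support counts, so each site of a distributed dataset would need to share counts rather than raw baskets.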