Database classification for multi-database mining

  • Authors:
  • Xindong Wu; Chengqi Zhang; Shichao Zhang

  • Affiliations:
  • Department of Computer Science, University of Vermont, Burlington, VT; Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia; Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia and State Key Laboratory of Intelligent Technology and Systems, Tsinghua University, ...

  • Venue:
  • Information Systems
  • Year:
  • 2005

Abstract

Many large organizations have multiple databases distributed across different branches, so multi-database mining is an important data mining task. To reduce the cost of searching the data in all databases, we need to identify which databases are most likely to be relevant to a data mining application. This is referred to as database selection. In real-world applications, database selection has to be carried out multiple times to identify the relevant databases for different applications. In particular, a mining task may not refer to any specific application at all. In this paper, we present an efficient approach for classifying multiple databases based on their similarity to one another. Our approach is application-independent.
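
The abstract does not say how similarity is measured or how classes are formed, so the following is only a minimal sketch of the general idea of application-independent classification. It assumes that each database is a list of transactions, that similarity is the Jaccard overlap of the item sets of two databases, and that databases are grouped whenever a chain of pairwise similarities reaches a chosen threshold. The names item_set, jaccard, classify_databases, the threshold value, and the sample branch data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def item_set(database):
    """Collect the distinct items that occur in any transaction."""
    return set().union(*database)


def jaccard(a, b):
    """Jaccard similarity of two item sets (assumed measure, not the paper's)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify_databases(databases, threshold):
    """Group databases whose item sets are linked by similarity >= threshold.

    Groups are the connected components of the graph that has one node per
    database and an edge between every sufficiently similar pair.
    """
    names = list(databases)
    items = {name: item_set(databases[name]) for name in names}
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if jaccard(items[a], items[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical branch databases, each a list of transactions.
databases = {
    "branch_a": [{"milk", "bread"}, {"bread", "butter"}],
    "branch_b": [{"milk", "butter"}, {"bread"}],
    "branch_c": [{"hammer", "nails"}, {"nails", "saw"}],
}

groups = classify_databases(databases, threshold=0.5)
print(groups)
```

With this sample data, branch_a and branch_b share all of their items and fall into one class, while branch_c shares none and forms a class of its own. No application-specific query is needed to reach that grouping, which is the sense in which such a classification is application-independent.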