Mining globally interesting patterns from multiple databases using kernel estimation

  • Authors:
  • Shichao Zhang;Xiaofang You;Zhi Jin;Xindong Wu

  • Affiliations:
  • Department of Computer Science, Zhejiang Normal University, China and Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia;Department of Computer Science, Zhejiang Normal University, China;College of Information Technology, Peking University, China;Department of Computer Science, Hefei University of Technology, Hefei 230009, China and Department of Computer Science, University of Vermont, Burlington, Vermont 05405, USA

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.05

Abstract

When extracting knowledge (or patterns) from multiple databases, three difficulties arise: the combined data may be too large to merge into one database for centralized mining on a single computer; the local information sources may be hidden from a global decision maker because of privacy concerns; and different local databases may contribute differently to the global pattern. Mining multiple databases is therefore essentially different from mining a single database. In multi-database mining, global patterns must be obtained by carefully analyzing the local patterns from the individual databases. In this paper, we propose a nonlinear method, named KEMGP (kernel estimation for mining global patterns), which uses kernel estimation to synthesize local patterns into global patterns. We also adopt a method that partitions the data in the different databases according to attribute dimensionality, which reduces the total space complexity. We test our algorithm on a customer management system, where the task is to obtain all globally interesting patterns by analyzing the individual databases. The experimental results show that our method is efficient.
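
The abstract does not give the details of KEMGP, so the following is only a rough illustration of the general idea of kernel-based synthesis, not the authors' algorithm. It assumes each local database reports a support value for the same pattern, fits a Gaussian kernel density estimate over those values, and takes the densest point as a candidate global support. The function names, the Gaussian kernel, the bandwidth of 0.05, and the sample support values are all assumptions made for this sketch.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice; the abstract names no kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_density(x, samples, bandwidth):
    """Kernel density estimate at x from the given sample values."""
    n = len(samples)
    total = sum(gaussian_kernel((x - s) / bandwidth) for s in samples)
    return total / (n * bandwidth)


def synthesize_support(local_supports, bandwidth=0.05, grid_size=101):
    """Return the grid point in [0, 1] where the estimated density is highest.

    Hypothetical helper: it treats the mode of the density over local
    supports as the synthesized global support for one pattern.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda x: kernel_density(x, local_supports, bandwidth))


# Made-up supports of one pattern in five local databases; the last is an outlier.
local_supports = [0.30, 0.32, 0.31, 0.29, 0.90]
global_support = synthesize_support(local_supports)
mean_support = sum(local_supports) / len(local_supports)
print(global_support, mean_support)
```

In this sketch the estimate stays close to the cluster of four agreeing databases, whereas a plain average is pulled toward the outlier. That is one way a nonlinear synthesis can differ from a linear, weighted-sum one.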