Efficient classification from multiple heterogeneous databases

  • Authors:
  • Xiaoxin Yin;Jiawei Han

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

With the fast expansion of computer networks, it is inevitable to study data mining on heterogeneous databases. In this paper we propose MDBM, an accurate and efficient approach for classification on multiple heterogeneous databases. We propose a regression-based method for predicting the usefulness of inter-database links that serve as bridges for information transfer, because such links are automatically detected and may or may not be useful or even valid. Because of the high cost of inter-database communication, MDBM employs a new strategy for cross-database classification, which finds and performs actions with high benefit-to-cost ratios. The experiments show that MDBM achieves high accuracy in cross-database classification, with much higher efficiency than previous approaches.