Mining relational data through correlation-based multiple view validation

  • Authors: Hongyu Guo; Herna L. Viktor
  • Affiliations: University of Ottawa; University of Ottawa
  • Venue: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  • Year: 2006

Abstract

Commercial relational databases currently store vast amounts of real-world data. The data within these relational repositories are represented by multiple relations, which are interconnected by means of foreign key joins. The mining of such interrelated data poses a major challenge to the data mining community. Unfortunately, traditional data mining algorithms usually explore only one relation, the so-called target relation, thus excluding crucial knowledge embedded in the related, so-called background relations. In this paper, we propose a novel approach for classifying such relational domains. This strategy employs multiple views to capture crucial information not only from the target relation, but also from related relations. This information is integrated into the relational mining process. The framework presented here first explores the relational domain to partition its feature space into multiple subsets. Subsequently, these subsets are used to construct multiple uncorrelated views, which are validated against the target concept by a novel correlation-based view validation method. Finally, the knowledge possessed by the multiple views is incorporated into a meta-learning mechanism so that the views augment one another. Based on this framework, a wide range of conventional data mining methods can be applied to mine relational databases. Our experiments on benchmark real-world data sets show that the proposed method achieves promising results in terms of both overall accuracy and run time when compared with two other relational data mining approaches.
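
The three steps named in the abstract (per-relation views, correlation-based view validation, meta-learning) can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract gives no formulas, so the threshold rule in select_views and the weighted vote in meta_predict are simple stand-ins chosen for illustration, and the view names, thresholds, and data are all hypothetical. Each view is represented only by the 0/1 predictions that a learner trained on one relation's features might produce.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def select_views(view_preds, labels, min_target_corr=0.3, max_view_corr=0.8):
    """Assumed validation rule (not from the paper): keep views that correlate
    with the target concept but not strongly with an already kept view."""
    scores = {name: pearson(p, labels) for name, p in view_preds.items()}
    # Strongest views first; ties are broken alphabetically for determinism.
    ranked = sorted(view_preds, key=lambda name: (-scores[name], name))
    kept = []
    for name in ranked:
        if scores[name] < min_target_corr:
            continue  # weakly related to the target concept
        redundant = any(
            abs(pearson(view_preds[name], view_preds[k])) > max_view_corr
            for k in kept
        )
        if not redundant:
            kept.append(name)
    return kept, {k: scores[k] for k in kept}

def meta_predict(weights, instance_preds):
    """Stand-in for the meta-learner: a correlation-weighted vote over the kept views."""
    total = sum(w * (2 * instance_preds[v] - 1) for v, w in weights.items())
    return 1 if total > 0 else 0

# Hypothetical training labels and per-view predictions for eight instances.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
view_preds = {
    "target":       [1, 1, 1, 0, 0, 0, 0, 0],
    "target_copy":  [1, 1, 1, 0, 0, 0, 0, 0],  # duplicates the target view
    "background_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "background_b": [1, 1, 0, 0, 1, 0, 1, 0],
    "noise":        [1, 0, 1, 0, 1, 0, 0, 1],  # uncorrelated with the labels
}

kept, weights = select_views(view_preds, labels)
print(kept)
print(meta_predict(weights, {"target": 0, "background_a": 1, "background_b": 1}))
```

In this toy data the duplicated view is discarded as redundant and the noise view as uninformative, and two agreeing background views can outvote the target view. That is the sense in which views drawn from background relations can augment the target relation.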