Exploring optimization of semantic relationship graph for multi-relational Bayesian classification

  • Authors:
  • Hailiang Chen; Hongyan Liu; Jiawei Han; Xiaoxin Yin; Jun He

  • Affiliations:
  • Krannert School of Management, Purdue University, West Lafayette, IN, USA; School of Economics and Management, Tsinghua University, Beijing, China; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA; School of Information, Renmin University of China, Beijing, China

  • Venue:
  • Decision Support Systems
  • Year:
  • 2009

Abstract

In recent years, there has been growing interest in multi-relational classification, in both research and applications. Such classification must cope with a large relation search space, complex relationships between relations, and a daunting number of attributes. The Bayesian classifier is a simple but effective probabilistic classifier that has been shown to achieve good results in most real-world applications. Existing work on multi-relational Naive Bayes classifiers focuses mainly on extending the traditional flat Naive Bayes classification method to a multi-relational environment. In this paper, we investigate how to increase the accuracy of a multi-relational Bayesian classifier while retaining its efficiency. We develop a Semantic Relationship Graph (SRG) to describe the relationships among multiple tables and to guide the search within the relation space. We then optimize the SRG by avoiding undesirable joins between relations and by eliminating unnecessary attributes and relations. Experiments on real-world and synthetic databases show that the proposed optimization strategies improve the accuracy of the multi-relational Naive Bayesian classifier at the cost of a small increase in running time.
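
The idea of a graph over tables that guides the search can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify how the SRG is built or which criteria drive its optimization. The sketch assumes the graph is an undirected map of joinable tables, that the search is a breadth-first walk from the target table, and that pruning simply drops a chosen table. The table names, `build_graph`, and `join_paths` are all hypothetical.

```python
from collections import deque

# Hypothetical schema: each edge links two tables that share a key.
# Table names are illustrative only and are not taken from the paper.
EDGES = [
    ("Loan", "Account"),
    ("Account", "Order"),
    ("Account", "Transaction"),
    ("Account", "Disposition"),
    ("Disposition", "Client"),
]


def build_graph(edges):
    """Build an undirected adjacency map from table-to-table join edges."""
    graph = {}
    for left, right in edges:
        graph.setdefault(left, []).append(right)
        graph.setdefault(right, []).append(left)
    return graph


def join_paths(graph, target, pruned=frozenset()):
    """Breadth-first walk from the target table.

    Returns {table: join path from the target}. Tables listed in
    `pruned` are skipped, so anything reachable only through them
    is dropped as well (an assumed stand-in for graph optimization).
    """
    paths = {target: [target]}
    queue = deque([target])
    while queue:
        table = queue.popleft()
        for neighbor in graph.get(table, []):
            if neighbor in paths or neighbor in pruned:
                continue
            paths[neighbor] = paths[table] + [neighbor]
            queue.append(neighbor)
    return paths


graph = build_graph(EDGES)
full = join_paths(graph, "Loan")
optimized = join_paths(graph, "Loan", pruned={"Disposition"})
```

Here `full` contains a join path to every table, while `optimized` omits `Disposition` and, as a consequence, `Client`. A classifier built on the pruned graph would estimate attribute probabilities only from the tables that remain.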