Finding the optimal feature representations for Bayesian network learning

  • Authors:
  • LiMin Wang; ChunHong Cao; XiongFei Li; HaiJun Li

  • Affiliations:
  • Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, JiLin University, ChangChun, China
  • College of Information Science and Engineering, Northeastern University, ShenYang, China
  • Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, JiLin University, ChangChun, China
  • College of Computer Science, YanTai University, YanTai, China

  • Venue:
  • PAKDD'07: Proceedings of the 11th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2007

Abstract

Naive Bayes is often used in text classification applications and experiments because of its simplicity and effectiveness. However, many variants of the Bayes model consider only a single representation of each word. In this paper we define an information criterion, Projective Information Gain, to decide which representation is appropriate for a specific word. Based on this criterion, the conditional independence assumption is extended to make it more efficient and feasible, and we then propose a novel Bayes model, General Naive Bayes (GNB), which can handle two representations concurrently. We present experimental results and theoretical justification that demonstrate the feasibility of our approach.
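
The abstract names the Projective Information Gain criterion but does not define it. The Python sketch below is a purely illustrative take on the underlying idea: score a word under a binary (presence/absence) view and under a frequency (count) view, and keep the more informative one. It uses Quinlan's gain ratio as a stand-in scoring function; the function names, the gain-ratio choice, and the tie-breaking rule are assumptions for illustration, not the paper's method.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())

def information_gain(feature, labels):
    """IG(C; X) = H(C) - H(C | X), computed from empirical counts."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - h_cond

def gain_ratio(feature, labels):
    """Quinlan's gain ratio: IG normalized by the feature's own entropy,
    so finer-grained representations are not automatically favored.
    (A stand-in for the paper's Projective Information Gain.)"""
    h_x = entropy(feature)
    return information_gain(feature, labels) / h_x if h_x > 0 else 0.0

def choose_representation(word_counts, labels):
    """Score one word under two representations and keep the better one.

    word_counts: raw count of the word in each document
    labels:      class label of each document
    NOTE: the selection rule (gain ratio, tie -> binary) is an assumption
    made for this sketch, not the criterion defined in the paper.
    """
    binary = [int(c > 0) for c in word_counts]  # presence/absence view
    if gain_ratio(binary, labels) >= gain_ratio(word_counts, labels):
        return "binary"
    return "frequency"

# Toy usage: one word's counts across six labeled documents.
counts = [0, 2, 1, 0, 3, 0]
labels = ["spam", "ham", "ham", "spam", "ham", "spam"]
print(choose_representation(counts, labels))  # -> 'binary' (the simpler view scores higher here)
```

Under this kind of per-word selection, a GNB-style classifier would then condition each word's likelihood on its chosen representation, which is one plausible reading of handling two representations concurrently.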