Propositionalized attribute taxonomies from data for data-driven construction of concise classifiers

  • Authors:
  • Dae-Ki Kang; Myoung-Jong Kim

  • Affiliations:
  • Department of Computer and Information Engineering, Dongseo University, San69-1, Churye-2Dong, Sasang-Gu, Busan 617-716, Republic of Korea; Department of Business Administration, College of Business Administration, Pusan National University, 30 Jangjeon-dong, Geumjeong-gu, Busan 609-735, Republic of Korea

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Abstract

In this paper, we consider the problem of generating concise but accurate naive Bayes classifiers using a taxonomy of propositionalized attributes. To address this problem, we introduce the propositionalized attribute taxonomy guided naive Bayes learner (PAT-NBL), a machine learning algorithm that exploits a taxonomy to generate compact classifiers. PAT-NBL extends the classical naive Bayes learner with a bottom-up search over a propositionalized taxonomy for a locally optimal cut. Candidate cuts are evaluated with conditional log-likelihood, conditional minimum description length, and conditional Akaike information criterion. The selected cut defines an instance space that corresponds to the taxonomy and the data; that is, once PAT-NBL has determined a cut according to its information-theoretic criteria, it generates a concise naive Bayes classifier based on that cut. Our experimental results on UCI Machine Learning benchmark data sets indicate that the proposed algorithm can generate naive Bayes classifiers that are more compact than, and often comparably accurate to, those produced by standard naive Bayes learners.
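
The idea of a bottom-up search for a locally optimal cut can be illustrated with a minimal sketch. It is not the authors' implementation: the taxonomy, the toy data, the function names, the Laplace smoothing, the greedy acceptance rule, and the exact penalty terms (conditional log-likelihood minus a parameter count for the AIC-style score, minus half of log N times the parameter count for the MDL-style score) are all assumptions made for illustration, since the abstract does not state them.

```python
import math
from collections import Counter

# Hypothetical taxonomy over the values of one attribute: parent -> children.
TAXONOMY = {
    "any": ["warm", "cool"],
    "warm": ["red", "orange"],
    "cool": ["blue", "green"],
}
LEAVES = ["red", "orange", "blue", "green"]

# Toy data (value, class): red and orange behave alike, blue and green do not.
DATA = (
    [("red", "yes")] * 20 + [("red", "no")] * 5
    + [("orange", "yes")] * 20 + [("orange", "no")] * 5
    + [("blue", "yes")] * 25 + [("green", "no")] * 25
)

def leaf_to_cut(cut):
    """Map every leaf to the cut node that covers it."""
    mapping = {}
    def descend(node, owner):
        owner = node if node in cut else owner
        children = TAXONOMY.get(node, [])
        if not children:
            mapping[node] = owner
        for child in children:
            descend(child, owner)
    descend("any", None)
    return mapping

def score(cut, data, criterion="cmdl"):
    """Penalized conditional log-likelihood of a naive Bayes model on a cut."""
    mapping = leaf_to_cut(cut)
    classes = sorted({c for _, c in data})
    class_count = Counter(c for _, c in data)
    joint = Counter((mapping[v], c) for v, c in data)
    n = len(data)

    def log_joint(node, c):
        # Laplace-smoothed log P(c) + log P(node | c); smoothing is an assumption.
        prior = (class_count[c] + 1) / (n + len(classes))
        likelihood = (joint[(node, c)] + 1) / (class_count[c] + len(cut))
        return math.log(prior) + math.log(likelihood)

    cll = 0.0
    for v, c in data:
        logs = [log_joint(mapping[v], k) for k in classes]
        top = max(logs)
        log_norm = top + math.log(sum(math.exp(x - top) for x in logs))
        cll += log_joint(mapping[v], c) - log_norm

    size = len(classes) * len(cut)  # rough parameter count (assumption)
    if criterion == "cll":
        return cll
    if criterion == "caic":
        return cll - size
    return cll - 0.5 * math.log(n) * size  # "cmdl"

def bottom_up_cut(data, criterion="cmdl"):
    """Greedily replace sibling groups by their parent while the score improves."""
    cut = set(LEAVES)
    best = score(cut, data, criterion)
    improved = True
    while improved:
        improved = False
        for parent, children in TAXONOMY.items():
            if all(child in cut for child in children):
                candidate = (cut - set(children)) | {parent}
                candidate_score = score(candidate, data, criterion)
                if candidate_score > best:
                    cut, best, improved = candidate, candidate_score, True
                    break
    return cut, best

cut, value = bottom_up_cut(DATA, "cmdl")
print(sorted(cut), round(value, 2))
```

On this toy data the search merges red and orange into warm, because the two values have the same class distribution and the merge saves parameters, while it keeps blue and green separate, because merging them would discard the information that distinguishes the classes. The resulting classifier has three attribute values instead of four, which is the sense in which a cut yields a more compact model.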