Construction of Classifiers by Iterative Compositions of Features with Partial Knowledge

  • Authors:
  • Kazuya Haraguchi; Toshihide Ibaraki

  • Affiliations:
  • The author is with the Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto-shi, 606-8501 Japan. E-mail: kazuyah@amp.i.kyoto-u.ac.jp
  • The author is with the Department of Informatics, School of Science and Technology, Kwansei Gakuin University, Sanda-shi, 669-1337 Japan. E-mail: ibaraki@ksc.kwansei.ac.jp

  • Venue:
  • IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
  • Year:
  • 2006

Abstract

We consider the classification problem of constructing a classifier c: {0,1}^n → {0,1} from a given set of examples (the training set) that (approximately) realizes the hidden oracle y: {0,1}^n → {0,1} describing the phenomenon under consideration. A number of approaches to this problem are already known in computational learning theory, e.g., decision trees, support vector machines (SVM), and iteratively composed features (ICF). The last one, ICF, was proposed in our previous work (Haraguchi et al., 2004). A feature, composed of a nonempty subset S of other features (including the original data attributes), is a Boolean function f_S: {0,1}^S → {0,1} and is constructed according to the proposed rule. The ICF algorithm iterates the generation and selection of features and finally adopts one of the generated features as the classifier. The generation process may be regarded as embodying the idea of boosting, since new features are generated from the available features. In this paper, we generalize a feature to an extended Boolean function f_S: {0,1,*}^S → {0,1,*} to allow partial knowledge, where * denotes the state of uncertainty. We then propose the algorithm ICF* to generate such generalized features. The selection process of ICF* also differs from that of ICF in that features are selected so as to cover the entire training set. Our computational experiments indicate that ICF* is better than ICF in terms of both classification performance and computation time. It is also competitive with other representative learning algorithms such as decision trees and SVM.
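
To make the idea of an extended Boolean function concrete, the following is a minimal Python sketch of a feature that may answer * instead of 0 or 1. It is an illustration only, not the construction rule of ICF*: the abstract does not state that rule, so the rule used here (a pattern maps to a label only when all training examples with that pattern agree, and to * otherwise) is an assumption. The names `build_feature`, `evaluate`, and `covered` are likewise hypothetical.

```python
from collections import Counter, defaultdict

STAR = "*"  # the state of uncertainty


def build_feature(examples, labels, subset):
    """Build a lookup-table feature over the attribute indices in `subset`.

    Assumed rule (not taken from the paper): a projected pattern maps to a
    label only if every training example with that pattern has that label;
    conflicting patterns map to STAR.
    """
    votes = defaultdict(Counter)
    for x, y in zip(examples, labels):
        key = tuple(x[i] for i in subset)
        votes[key][y] += 1
    table = {}
    for key, counter in votes.items():
        # A single distinct label means the pattern is unambiguous.
        table[key] = next(iter(counter)) if len(counter) == 1 else STAR
    return table


def evaluate(table, subset, x):
    """Evaluate the feature on x, whose entries are 0, 1, or STAR."""
    key = tuple(x[i] for i in subset)
    if STAR in key:
        # Simplifying assumption: any uncertain input yields STAR.
        return STAR
    # Patterns never seen in training are also treated as uncertain.
    return table.get(key, STAR)


def covered(table, subset, examples):
    """Indices of examples on which the feature gives a definite answer."""
    return [k for k, x in enumerate(examples)
            if evaluate(table, subset, x) != STAR]


if __name__ == "__main__":
    examples = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
    labels = [0, 0, 1, 1, 0]
    narrow = build_feature(examples, labels, (0,))
    wide = build_feature(examples, labels, (0, 2))
    print(covered(narrow, (0,), examples))
    print(covered(wide, (0, 2), examples))
```

In this toy data, the feature over attribute 0 alone is definite only where that attribute is 0, while the feature over attributes 0 and 2 gives a definite answer on every training example. Under the stated assumptions, this is the sense in which a set of partial features can be chosen to cover the training set.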