Random relevant and non-redundant feature subspaces for co-training

  • Authors:
  • Yusuf Yaslan; Zehra Cataltepe

  • Affiliations:
  • Istanbul Technical University, Computer Engineering Department, Istanbul, Turkey; Istanbul Technical University, Computer Engineering Department, Istanbul, Turkey

  • Venue:
  • IDEAL'09: Proceedings of the 10th International Conference on Intelligent Data Engineering and Automated Learning
  • Year:
  • 2009

Abstract

Random feature subspace selection can produce diverse classifiers and help Co-training, as shown by the RASCO algorithm of Wang et al. (2008). For data sets with many irrelevant or noisy features, however, RASCO may end up with inaccurate classifiers. To remedy this problem, we introduce two algorithms for selecting relevant and non-redundant feature subspaces for Co-training. The first algorithm, Rel-RASCO (Relevant Random Subspaces for Co-training), produces subspaces by drawing features with probabilities proportional to their relevances. We also modify a successful feature selection algorithm, mRMR (Minimum Redundancy Maximum Relevance), for random feature subset selection and introduce Prob-mRMR (Probabilistic-mRMR). Experiments on five data sets demonstrate that the proposed algorithms outperform both RASCO and Co-training in terms of the accuracy achieved at the end of Co-training. A theoretical analysis of the proposed algorithms is also provided.
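
The subspace-drawing step described for Rel-RASCO can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a non-negative relevance score per feature has already been computed (the abstract does not say how), that features within one subspace are distinct, and that the draw falls back to uniform sampling once only zero-relevance features remain. The names `draw_relevant_subspace` and `draw_subspaces` and the parameters `subspace_size`, `n_subspaces`, and `seed` are hypothetical.

```python
import random


def draw_relevant_subspace(relevance, subspace_size, rng):
    """Draw distinct feature indices, each pick proportional to relevance.

    Assumption: `relevance` holds one non-negative score per feature.
    """
    weights = [float(r) for r in relevance]
    if any(w < 0 for w in weights):
        raise ValueError("relevance scores must be non-negative")
    if subspace_size > len(weights):
        raise ValueError("subspace_size exceeds the number of features")

    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(subspace_size):
        current = [weights[i] for i in remaining]
        if sum(current) > 0:
            # Pick one remaining feature with probability proportional
            # to its relevance among the features not yet chosen.
            pick = rng.choices(remaining, weights=current, k=1)[0]
        else:
            # Assumed fallback: only zero-relevance features are left,
            # so choose uniformly among them.
            pick = rng.choice(remaining)
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)


def draw_subspaces(relevance, n_subspaces, subspace_size, seed=0):
    """Draw several subspaces, e.g. one per classifier in the ensemble."""
    rng = random.Random(seed)
    return [
        draw_relevant_subspace(relevance, subspace_size, rng)
        for _ in range(n_subspaces)
    ]


if __name__ == "__main__":
    # Hypothetical relevance scores for six features.
    scores = [0.0, 0.0, 5.0, 1.0, 0.0, 3.0]
    for subspace in draw_subspaces(scores, n_subspaces=4, subspace_size=2):
        print(subspace)
```

With uniform weights this reduces to plain random subspace selection, which is the setting the abstract attributes to RASCO. Non-uniform weights bias each subspace toward relevant features while keeping the subspaces random.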