Random relevant and non-redundant feature subspaces for co-training

Authors:
Yusuf Yaslan;Zehra Cataltepe
Affiliations:
Istanbul Technical University, Computer Engineering Department, Istanbul, Turkey;Istanbul Technical University, Computer Engineering Department, Istanbul, Turkey
Venue:
IDEAL'09 Proceedings of the 10th international conference on Intelligent data engineering and automated learning
Year:
2009

References (9):
Combining labeled and unlabeled data with co-training. COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Partitioning-based clustering for Web document categorization. Decision Support Systems - Special issue on WITS '97
Combining Pattern Classifiers: Methods and Algorithms
Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence
Co-training by Committee: A New Semi-supervised Learning Framework. ICDMW '08 Proceedings of the 2008 IEEE International Conference on Data Mining Workshops
Using co-training and self-training in semi-supervised multiple classifier systems. SSPR'06/SPR'06 Proceedings of the 2006 joint IAPR international conference on Structural, Syntactic, and Statistical Pattern Recognition
Semi-supervised multiple classifier systems: background and research directions. MCS'05 Proceedings of the 6th international conference on Multiple Classifier Systems
Modeling timbre distance with temporal statistics from polyphonic music. IEEE Transactions on Audio, Speech, and Language Processing
Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans

Cited by (1):
Co-training with relevant random subspaces. Neurocomputing

Abstract

Random feature subspace selection can produce diverse classifiers and thereby help Co-training, as shown by the RASCO algorithm of Wang et al. (2008). For data sets with many irrelevant or noisy features, however, RASCO may end up with inaccurate classifiers. To remedy this problem, we introduce two algorithms for selecting relevant and non-redundant feature subspaces for Co-training. The first algorithm, Rel-RASCO (Relevant Random Subspaces for Co-training), produces subspaces by drawing features with probabilities proportional to their relevance. The second, Prob-mRMR (Probabilistic-mRMR), modifies a successful feature selection algorithm, mRMR (Minimum Redundancy Maximum Relevance), to perform random feature subset selection. Experiments on 5 datasets demonstrate that the proposed algorithms outperform both RASCO and Co-training in terms of accuracy achieved at the end of Co-training. A theoretical analysis of the proposed algorithms is also provided.
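
The subspace-drawing idea behind Rel-RASCO can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how relevance is measured, so the sketch assumes discrete features and uses empirical mutual information between each feature and the class label as the relevance score. The function names `mutual_information` and `draw_relevant_subspace`, the toy data, and the subspace size are all hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences.

    Assumption: used here as the relevance score of a feature; the
    abstract does not specify the relevance measure.
    """
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return max(0.0, mi)  # clamp tiny negative rounding errors


def draw_relevant_subspace(relevance, m, rng):
    """Draw m distinct feature indices without replacement.

    Each draw picks a remaining feature with probability proportional
    to its relevance score.
    """
    remaining = list(range(len(relevance)))
    weights = [relevance[j] for j in remaining]
    chosen = []
    for _ in range(m):
        if sum(weights) > 0:
            k = rng.choices(range(len(remaining)), weights=weights)[0]
        else:
            # Only zero-relevance features are left: fall back to uniform
            k = rng.randrange(len(remaining))
        chosen.append(remaining.pop(k))
        weights.pop(k)
    return chosen


# Toy data (hypothetical): three discrete features stored column-wise.
# Feature 0 copies the label, feature 1 is constant, feature 2 is noisy.
columns = [
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
labels = [0, 1, 0, 1, 1, 0]

relevance = [mutual_information(col, labels) for col in columns]
rng = random.Random(0)
# One subspace per classifier in the ensemble
subspaces = [draw_relevant_subspace(relevance, 2, rng) for _ in range(3)]
```

In this sketch the constant feature has zero relevance, so it is drawn only once every informative feature has been used. A classifier would then be trained on each drawn subspace before the Co-training rounds begin.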