Joint feature re-extraction and classification using an iterative semi-supervised support vector machine algorithm

Authors:
Yuanqing Li;Cuntai Guan
Affiliations:
Institute for Infocomm Research, Singapore, Singapore 119613;Institute for Infocomm Research, Singapore, Singapore 119613
Venue:
Machine Learning
Year:
2008

Abstract

This paper addresses joint feature re-extraction and classification when the training data set is small. An iterative semi-supervised support vector machine (SVM) algorithm is proposed in which each iteration consists of both feature re-extraction and classification, and the feature re-extraction is based on the classification results from the previous iteration. Feature extraction is first discussed in the framework of Rayleigh coefficient maximization. The effectiveness of the common spatial pattern (CSP) feature, which is commonly used in electroencephalogram (EEG) data analysis and EEG-based brain-computer interfaces (BCIs), can be explained by Rayleigh coefficient maximization. Two other features are also defined using the Rayleigh coefficient. These features are effective for discriminating two classes with different means or different variances. Feature extraction based on Rayleigh coefficient maximization generally requires a large labeled training data set; otherwise, the extracted features are not reliable. Thus we present an iterative semi-supervised SVM algorithm embedded with feature re-extraction. This iterative algorithm can be used to extract these three features reliably and perform classification simultaneously in cases where the training data set is small. Each iteration is composed of two main steps: (i) the training data set is updated/augmented using unlabeled test data with their predicted labels, and features are re-extracted based on the augmented training data set; (ii) the re-extracted features are classified by a standard SVM. For parameter setting and model selection, we also propose a semi-supervised learning-based method using the Rayleigh coefficient, in which both training data and test data are used. This method is suitable when cross-validation model selection may not work because the training data set is small. Finally, the results of data analysis are presented to demonstrate the validity of our approach.
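The two-step iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how CSP is computed, which unlabeled trials are added, or when the loop stops, so the following are assumptions: CSP filters come from the generalized eigenproblem that maximizes the Rayleigh coefficient w'C0w / w'(C0 + C1)w, all test trials are added with their predicted labels, and the loop stops when predictions no longer change. The function names (`csp_filters`, `log_var_features`, `iterative_semi_supervised_svm`, `make_toy_trials`) and the synthetic data are hypothetical. The semi-supervised model selection method is not covered.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.svm import SVC


def csp_filters(trials, labels, n_pairs=1):
    """Spatial filters maximizing the Rayleigh coefficient w'C0w / w'(C0+C1)w."""
    covs = []
    for c in (0, 1):
        # Average trace-normalized spatial covariance over the trials of class c.
        covs.append(np.mean(
            [x @ x.T / np.trace(x @ x.T) for x in trials[labels == c]], axis=0))
    # Generalized eigenproblem C0 w = lambda (C0 + C1) w, eigenvalues ascending.
    _, vecs = eigh(covs[0], covs[0] + covs[1])
    # Keep filters from both ends of the spectrum (smallest and largest ratios).
    picks = np.r_[np.arange(n_pairs), np.arange(-n_pairs, 0)]
    return vecs[:, picks].T


def log_var_features(trials, filters):
    """Log of the normalized variance of each spatially filtered trial."""
    projected = np.einsum("fc,nct->nft", filters, trials)
    var = projected.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))


def iterative_semi_supervised_svm(x_train, y_train, x_test, n_iter=5):
    """Alternate feature re-extraction and SVM classification (assumed scheme)."""
    x_aug, y_aug = x_train, y_train
    y_pred = None
    for _ in range(n_iter):
        # Re-extract features from the current (possibly augmented) training set.
        filters = csp_filters(x_aug, y_aug)
        # Classify the re-extracted features with a standard SVM.
        clf = SVC(kernel="linear", C=1.0)
        clf.fit(log_var_features(x_aug, filters), y_aug)
        new_pred = clf.predict(log_var_features(x_test, filters))
        if y_pred is not None and np.array_equal(new_pred, y_pred):
            break  # assumed stopping rule: predictions are stable
        y_pred = new_pred
        # Augment the training set with test trials and their predicted labels.
        x_aug = np.concatenate([x_train, x_test])
        y_aug = np.concatenate([y_train, y_pred])
    return y_pred


def make_toy_trials(rng, n_per_class, n_channels=4, n_samples=200):
    """Synthetic two-class trials: class c has extra variance on channel c."""
    xs, ys = [], []
    for c in (0, 1):
        scale = np.ones(n_channels)
        scale[c] = 3.0
        noise = rng.standard_normal((n_per_class, n_channels, n_samples))
        xs.append(noise * scale[None, :, None])
        ys.append(np.full(n_per_class, c))
    return np.concatenate(xs), np.concatenate(ys)
```

As a usage illustration, calling `make_toy_trials` with a seeded `np.random.default_rng` to build a few labeled trials per class and a larger unlabeled set, then passing them to `iterative_semi_supervised_svm`, returns one predicted label per unlabeled trial. The original labeled trials are kept in every iteration, so both classes are always present when the filters and the SVM are refit.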