Feature selection in the Laplacian support vector machine

Authors:
Sangjun Lee; Changyi Park; Ja-Yong Koo
Affiliations:
Data Mining Team, NHN Inc., Gyeonggi-do 463-847, Republic of Korea; Department of Statistics, University of Seoul, Seoul 130-743, Republic of Korea; Department of Statistics, Korea University, Seoul 136-701, Republic of Korea
Venue:
Computational Statistics & Data Analysis
Year:
2011

Abstract

Traditional classifiers, including support vector machines, use only labeled data in training. However, labeled instances are often difficult, costly, or time-consuming to obtain, while unlabeled instances are relatively easy to collect. The goal of semi-supervised learning is to improve classification accuracy by training classifiers on unlabeled data together with a small amount of labeled data. Recently, the Laplacian support vector machine has been proposed as an extension of the support vector machine to semi-supervised learning. Like the support vector machine, the Laplacian support vector machine is difficult to interpret. It also performs poorly when the training data contain many non-informative features, because the final classifier is expressed as a linear combination of informative as well as non-informative features. We introduce a variant of the Laplacian support vector machine that is capable of feature selection, based on the functional analysis of variance decomposition. Through analysis of synthetic and benchmark data, we illustrate that our method can be a useful tool in semi-supervised learning.
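
For readers unfamiliar with how a Laplacian support vector machine uses unlabeled data, the following minimal sketch may help. It is not the authors' method and does not perform feature selection; it only illustrates the graph smoothness penalty f^T L f that manifold regularization adds to the usual support vector machine objective. The function names (`knn_graph_laplacian`, `smoothness_penalty`), the binary k-nearest-neighbor weights, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    Assumption: binary edge weights; other weightings (e.g. heat kernel)
    are equally common in practice.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all points.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(sq, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(sq[i])[:k]
        W[i, neighbors] = 1.0
    # Keep an edge if either endpoint lists the other as a neighbor.
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    return D - W


def smoothness_penalty(f, L):
    """Return f^T L f, the sum over graph edges of (f_i - f_j)^2."""
    return float(f @ L @ f)


# Toy data: two well-separated clusters on a line, labeled or not.
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
L = knn_graph_laplacian(X, k=2)

# Decision values that are constant within each cluster.
f_smooth = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
# Decision values that flip sign between neighboring points.
f_rough = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])

print(smoothness_penalty(f_smooth, L))
print(smoothness_penalty(f_rough, L))
```

The penalty is small when the decision function varies little between neighboring points, whether or not those points are labeled, which is how unlabeled instances can influence the fitted classifier.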