Feature selection is an effective way to deal with high-dimensional data. In many applications, such as multimedia and web mining, the data are high-dimensional and very large scale, while labeled data are often scarce. In such applications, it is important that the feature selection algorithm be efficient and able to exploit labeled and unlabeled data simultaneously. In this paper, we address this problem and propose an efficient semi-supervised feature selection algorithm that selects relevant features using both labeled and unlabeled data. First, we analyze the popular trace ratio criterion for dimensionality reduction and point out that it tends to select features with very small variance. To solve this problem, we propose a noise insensitive trace ratio criterion for feature selection with a rescaling preprocessing step. Interestingly, feature selection with the noise insensitive trace ratio criterion can be solved much more efficiently. Based on the noise insensitive trace ratio criterion, we propose a new semi-supervised feature selection algorithm that fully explores the distribution of the labeled and unlabeled data with a special label propagation method. Experimental results verify the effectiveness of the proposed algorithm and show improvement over traditional supervised feature selection algorithms.
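To make the trace ratio criterion concrete, the sketch below ranks features by an iterative trace-ratio score on labeled data: per-feature between-class scatter a_i and within-class scatter b_i are computed, and a subset of k features maximizing sum(a_i)/sum(b_i) is found by the standard fixed-point iteration on the ratio λ. This is only a minimal supervised illustration of the criterion the abstract analyzes, not the paper's semi-supervised algorithm; the function name, the per-feature scatter decomposition, and the convergence tolerance are our own assumptions. Note how a feature with tiny b_i can dominate the ratio, which is exactly the small-variance pathology the abstract's rescaling step is designed to remove.

```python
import numpy as np

def trace_ratio_select(X, y, k, n_iter=20):
    """Illustrative trace-ratio feature ranking (not the paper's method).

    a[i] = between-class scatter of feature i,
    b[i] = within-class scatter of feature i.
    Seeks the k features maximizing a[sel].sum() / b[sel].sum()
    via the usual fixed-point iteration on the ratio lam.
    """
    d = X.shape[1]
    mean = X.mean(axis=0)
    a = np.zeros(d)  # between-class scatter per feature
    b = np.zeros(d)  # within-class scatter per feature
    for c in np.unique(y):
        Xc = X[y == c]
        a += len(Xc) * (Xc.mean(axis=0) - mean) ** 2
        b += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)

    lam = a.sum() / b.sum()  # initial trace ratio over all features
    for _ in range(n_iter):
        # pick the k features with the largest score a_i - lam * b_i
        sel = np.argsort(a - lam * b)[-k:]
        new_lam = a[sel].sum() / b[sel].sum()
        if np.isclose(new_lam, lam):
            break
        lam = new_lam
    return np.sort(sel), lam
```

On a toy two-class problem where only the first feature separates the classes, the iteration converges in a couple of steps and selects that feature; with real noisy data, a near-constant feature (b_i close to 0) would instead be favored, motivating the rescaling preprocessing described above.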