Feature selection is an important problem in pattern classification systems. Supervised feature selection methods generally outperform unsupervised ones. However, almost all existing supervised methods use class labels as supervision; very little work has addressed other forms of supervision such as pairwise constraints, which specify whether a pair of data samples belongs to the same class (must-link constraints) or to different classes (cannot-link constraints). In practice, pairwise constraints are easy to obtain, since an annotator need only decide whether two examples belong to the same class or not. A filter method for feature selection with pairwise constraints, called Constraint Score, was recently proposed to exploit such supervision. Unfortunately, Constraint Score does not handle the case where only cannot-link constraints are given. Moreover, its conclusion that must-link constraints are more important than cannot-link constraints needs further verification, since from the viewpoint of the hypothesis-margin, cannot-link constraints appear to be the more informative of the two. In addition, like existing supervised feature selection methods, the hypothesis-margin based approach Simba also relies on class labels as supervision. In this paper, to further study feature selection with pairwise constraints, we introduce a novel hypothesis-margin based approach, called Simba-sc, which uses only cannot-link constraints as supervision. We compare our algorithm with the well-known Constraint Score, Fisher Score, and Laplacian Score algorithms in experiments on six UCI data sets using three different classifiers.
Experimental results show that, given only a few cannot-link constraints, Simba-sc achieves performance similar to or even better than that of Fisher Score with full class labels on all training data, and performs better than or comparably to Constraint Score.
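To make the role of pairwise constraints concrete, the following is a minimal sketch of a Constraint Score-style filter criterion: each feature is scored by the ratio of its squared differences over must-link pairs to those over cannot-link pairs, so that a good feature keeps same-class pairs close and different-class pairs far apart (lower scores are better). The function name, the exact ratio form, and the `eps` smoothing term are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def constraint_score(X, must_link, cannot_link, eps=1e-12):
    """Score each feature by a must-link / cannot-link ratio.

    X           : (n_samples, n_features) data matrix
    must_link   : list of (i, j) index pairs known to share a class
    cannot_link : list of (i, j) index pairs known to differ in class
    eps         : small constant to avoid division by zero (assumption)

    Lower scores indicate features that better separate the classes.
    """
    n_features = X.shape[1]
    scores = np.empty(n_features)
    for f in range(n_features):
        # Spread of the feature within same-class pairs (want small).
        ml = sum((X[i, f] - X[j, f]) ** 2 for i, j in must_link)
        # Spread of the feature across different-class pairs (want large).
        cl = sum((X[i, f] - X[j, f]) ** 2 for i, j in cannot_link)
        scores[f] = ml / (cl + eps)
    return scores
```

On a toy data set where feature 0 separates two classes and feature 1 varies independently of class, feature 0 receives the lower (better) score; selecting the k lowest-scoring features then yields a filter-style subset.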