A wrapper feature selection process is proposed in the context of robust clustering based on Laplace mixture models. The clustering approach we consider generalizes the K-median algorithm. The selection process exploits the statistical model and recursively deletes features using hypothesis tests. We report simulations and applications to real data sets that illustrate the relevance of the proposed approach. We also propose a strategy for selecting a reasonable number of features: the test statistic first ranks the features by relevance, and an evaluation of the clustering error then discards the redundant ones among them. This strategy appears to offer a good compromise between feature selection and clustering performance.
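The overall scheme can be sketched as follows. This is a minimal illustration, not the paper's method: it uses plain K-median clustering (L1 distances, componentwise medians) in place of the Laplace mixture model, and the spread of the cluster medians along each feature as a stand-in for the paper's hypothesis-test statistic; the function names and the fixed iteration count are illustrative choices.

```python
import numpy as np

def k_median(X, k, n_iter=50, seed=0):
    """Basic K-median clustering: L1 assignments, componentwise medians."""
    rng = np.random.default_rng(seed)
    # Farthest-point initialisation keeps the starting centers spread out.
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([np.abs(X - c).sum(axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign each point to the nearest center in L1 distance.
        d = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update each center to the componentwise median of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = np.median(X[labels == j], axis=0)
    return labels, centers

def backward_select(X, k, n_keep):
    """Recursively delete the least relevant feature until n_keep remain.

    Relevance proxy (NOT the paper's test statistic): the range of the
    cluster medians along a feature -- if the medians barely differ across
    clusters, that feature contributes little to the partition.
    """
    features = list(range(X.shape[1]))
    while len(features) > n_keep:
        _, centers = k_median(X[:, features], k)
        spread = centers.max(axis=0) - centers.min(axis=0)
        features.pop(int(np.argmin(spread)))
    return features
```

In the paper's strategy, the ranking step would use the actual test statistic from the Laplace model, and the stopping point `n_keep` would be chosen by monitoring the clustering error rather than fixed in advance.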