Rapid advances in computer and database technologies have enabled organizations to accumulate vast amounts of data. The sheer volume of these data complicates data analysis. Feature selection is an effective dimensionality reduction technique that removes irrelevant, redundant, or noisy features. This research proposes a novel feature-selection measure for evaluating feature importance in clustering. The measure extracts information from the dissimilarity between pairs of data objects, since dissimilarity is a common criterion for deciding whether two objects belong to the same cluster. The probability distribution of the dissimilarity variable is modeled as a mixture of two Gaussian distributions, one for "intra-cluster" and one for "inter-cluster" dissimilarity. The means of the two Gaussians are estimated with the EM algorithm, and the difference between the two means serves as the measure for selecting features that are important for clustering. A set of experiments demonstrates the effectiveness of the proposed measure.
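The idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the dissimilarity function, any feature scaling, or the EM initialization, so the choices below (per-feature absolute difference, min-max scaling, quartile-based starting means) and the function names `pairwise_abs_diff`, `two_gaussian_em`, and `feature_scores` are all assumptions made for illustration.

```python
import numpy as np


def pairwise_abs_diff(x):
    """Return |x_i - x_j| for every pair i < j (assumed dissimilarity)."""
    i, j = np.triu_indices(len(x), k=1)
    return np.abs(x[i] - x[j])


def two_gaussian_em(d, n_iter=200, tol=1e-8):
    """Fit a two-component 1-D Gaussian mixture to d with EM.

    Returns (means, stds, weights). Initialization from the quartiles
    is an assumption of this sketch, not taken from the paper.
    """
    d = np.asarray(d, dtype=float)
    mu = np.percentile(d, [25.0, 75.0])
    sigma = np.full(2, d.std() + 1e-6)
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_p = (np.log(w) - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)
                 - 0.5 * ((d[:, None] - mu) / sigma) ** 2)
        log_norm = np.logaddexp(log_p[:, 0], log_p[:, 1])
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: re-estimate weights, means, and standard deviations.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / len(d)
        mu = (resp * d[:, None]).sum(axis=0) / nk
        var = (resp * (d[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.sqrt(var) + 1e-6
        # Stop when the log-likelihood no longer improves.
        ll = log_norm.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return mu, sigma, w


def feature_scores(X):
    """Score each column by the gap between the two fitted mixture means."""
    X = np.asarray(X, dtype=float)
    scores = []
    for f in range(X.shape[1]):
        col = X[:, f]
        # Min-max scaling (an assumption) so scores are comparable.
        span = max(col.max() - col.min(), 1e-12)
        d = pairwise_abs_diff((col - col.min()) / span)
        mu, _, _ = two_gaussian_em(d)
        scores.append(abs(mu[1] - mu[0]))
    return np.array(scores)


# Usage on synthetic data: column 0 separates two clusters,
# column 1 is pure Gaussian noise.
rng = np.random.default_rng(0)
clustered = np.concatenate([rng.normal(0.0, 1.0, 100),
                            rng.normal(10.0, 1.0, 100)])
noise = rng.normal(0.0, 1.0, 200)
X = np.column_stack([clustered, noise])
scores = feature_scores(X)
print(scores)
```

Under these assumptions, a feature whose values separate the clusters produces pairwise dissimilarities that fall into two well-separated groups (small within a cluster, large across clusters), so the two fitted means lie far apart. A noise feature produces a single-mode dissimilarity distribution, and the two fitted means stay closer together.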