Feature Selection for Unsupervised Learning

Authors:
Jennifer G. Dy; Carla E. Brodley
Venue:
The Journal of Machine Learning Research
Year:
2004

Abstract

In this paper, we identify two issues involved in developing an automated feature subset selection algorithm for unlabeled data: the need for finding the number of clusters in conjunction with feature selection, and the need for normalizing the bias of feature selection criteria with respect to dimension. We explore the feature selection problem and these issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood. We present proofs on the dimensionality biases of these feature criteria, and present a cross-projection normalization scheme that can be applied to any criterion to ameliorate these biases. Our experiments show the need for feature selection, the need for addressing these two issues, and the effectiveness of our proposed solutions.
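
The abstract names the scatter separability criterion and the cross-projection normalization scheme without giving formulas. The following is a minimal sketch, not the authors' implementation. It assumes that scatter separability takes the commonly used form trace(S_w^{-1} S_b), where S_w is the within-cluster scatter and S_b the between-cluster scatter, and that cross-projection compares two feature subsets by multiplying each subset's criterion value under its own clustering with the other subset's value under that same clustering. The function names, the hard cluster labels (standing in for the output of EM clustering), and the use of a pseudo-inverse are all assumptions made for illustration.

```python
import numpy as np


def scatter_separability(X, labels):
    """Return trace(S_w^{-1} S_b) for data X (n x d) under hard cluster labels.

    Assumed form of the criterion; S_w and S_b are weighted by cluster proportions.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, d = X.shape
    overall_mean = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for k in np.unique(labels):
        X_k = X[labels == k]
        weight = len(X_k) / n              # cluster proportion
        mean_k = X_k.mean(axis=0)
        centered = X_k - mean_k
        S_w += weight * (centered.T @ centered) / len(X_k)
        diff = (mean_k - overall_mean)[:, None]
        S_b += weight * (diff @ diff.T)
    # Pseudo-inverse is an illustrative safeguard against a singular S_w.
    return float(np.trace(np.linalg.pinv(S_w) @ S_b))


def cross_projection_compare(X, feats_a, labels_a, feats_b, labels_b,
                             criterion=scatter_separability):
    """Return normalized scores for two (feature subset, clustering) pairs.

    Assumed scheme: score_a = crit(A, C_A) * crit(B, C_A) and
                    score_b = crit(B, C_B) * crit(A, C_B),
    where C_A and C_B are the clusterings found in subsets A and B.
    """
    X = np.asarray(X, dtype=float)
    X_a, X_b = X[:, feats_a], X[:, feats_b]
    score_a = criterion(X_a, labels_a) * criterion(X_b, labels_a)
    score_b = criterion(X_b, labels_b) * criterion(X_a, labels_b)
    return score_a, score_b


# Hypothetical data: feature 0 separates two groups, feature 1 is pure noise.
rng = np.random.default_rng(0)
informative = np.r_[rng.normal(-4, 1, 100), rng.normal(4, 1, 100)]
noise = rng.normal(0, 1, 200)
X_demo = np.column_stack([informative, noise])
labels_demo = (informative > 0).astype(int)  # stand-in for an EM clustering

# Under a fixed clustering the raw criterion cannot decrease when a feature
# is added, which is the dimensionality bias the abstract refers to.
print(scatter_separability(X_demo[:, [0]], labels_demo))
print(scatter_separability(X_demo, labels_demo))
```

Because the raw criterion never decreases as features are added under a fixed clustering, comparing raw values across subsets of different sizes favors larger subsets. Scoring each subset in the other's clustering as well, as in `cross_projection_compare`, is one way to put the two candidates on a common footing before choosing between them.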