Feature weighting is one of the most popular and effective ways to improve clustering quality, yet choosing a proper weighting method for a given data set is widely recognized as a difficult problem. For most weighting schemes and combination weighting methods, the traditional approach is to evaluate a feature weighting by measuring the quality of the clustering it produces. This is time-consuming, however, because the clustering algorithm must be run many times, with the number of runs depending on the number of weighting schemes or the number of combination weighting iterations. To address this issue, we propose to apply mutual information to predict the performance of feature weighting: the quality of a weighting is judged by the gain in mutual information it yields. The top s weighted data representations can therefore be selected from the set of candidate weighted representations, and the best (or second-best) clustering result is then obtained by clustering only those top s representations. Experimental results show that the mutual-information evaluation reduces running time without sacrificing clustering quality.
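The selection scheme described above can be sketched in code. This is a minimal illustration, not the paper's method: the `mi_score` criterion below (total histogram-estimated mutual information between each weighted feature and the first principal direction of the weighted data) is a hypothetical stand-in for the paper's mutual-information gain, and the tiny k-means and the candidate weight vectors are likewise assumptions for the sake of a self-contained example. The point it demonstrates is the workflow: score every candidate weighting cheaply with MI, keep only the top s, and run the (expensive) clustering on those alone.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information between two 1-D arrays (nats)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def mi_score(X, w):
    """Hypothetical MI-based proxy score for a weight vector w: sum of MI
    between each weighted feature and the first principal direction of the
    weighted data. (Illustrative only; not the paper's exact criterion.)"""
    Xw = X * w
    proj = Xw @ np.linalg.svd(Xw - Xw.mean(0), full_matrices=False)[2][0]
    return sum(mutual_information(Xw[:, j], proj) for j in range(X.shape[1]))

def kmeans(X, k, iters=50, seed=0):
    """Bare-bones Lloyd's k-means; returns labels and within-cluster inertia."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == c].mean(0) if (labels == c).any()
                            else centers[c] for c in range(k)])
    inertia = float(((X - centers[labels]) ** 2).sum())
    return labels, inertia

def select_and_cluster(X, weight_vectors, s=2, k=3):
    """Rank candidate weightings by the MI proxy, keep the top s, and run
    k-means only on those instead of on every candidate weighting."""
    ranked = sorted(weight_vectors, key=lambda w: mi_score(X, w), reverse=True)
    results = [(w, *kmeans(X * w, k)) for w in ranked[:s]]
    return min(results, key=lambda r: r[2])  # best of the top s by inertia
```

With m candidate weightings, the naive approach runs k-means m times; here k-means runs only s times, and the remaining m - s candidates cost one cheap MI evaluation each.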