Many clustering algorithms have been developed, but most cannot process objects in a hybrid numerical/nominal attribute space or objects with missing values. Most also require the number of clusters to be specified manually, and their results are sensitive to the input order of the objects to be clustered. These limitations restrict the applicability of clustering and reduce its quality. To address them, an improved clustering algorithm based on rough set (RS) theory and entropy theory is presented. It avoids the need to prespecify the number of clusters, and it clusters in both numerical and nominal attribute spaces by using a similarity measure in place of a distance index. RS theory also equips the algorithm to deal with vagueness and uncertainty in data analysis. Shannon's entropy is used to refine the clustering results by assigning relative weights to the attributes according to their mutual entropy values. A novel measure of clustering quality is also presented to evaluate the clusters. The algorithm is analyzed and then applied to cluster a data set for an industrial product. The experimental results confirm that the algorithm improves both efficiency and clustering quality.
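To make the idea of an entropy-weighted similarity over mixed attributes concrete, the following is a minimal sketch and not the algorithm of the paper. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names, the use of each attribute's plain Shannon entropy (normalized to sum to 1) as a stand-in for the mutual-entropy weighting, the range-scaled similarity for numerical attributes, exact matching for nominal attributes, and the choice to skip attributes whose value is missing (`None`) in either object.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the observed, non-missing values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return 0.0
    counts = Counter(observed)
    n = len(observed)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_weights(columns):
    """Assumed weighting: each attribute's entropy divided by the total.

    Falls back to uniform weights when every attribute has zero entropy.
    """
    entropies = [shannon_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)
    return [h / total for h in entropies]


def similarity(a, b, weights, numeric_ranges):
    """Weighted similarity in [0, 1] between two objects.

    numeric_ranges maps the index of each numerical attribute to its
    value range (max - min); all other attributes are treated as nominal.
    Attributes missing (None) in either object are skipped, and the
    result is renormalized over the weights actually used.
    """
    score = 0.0
    used = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if x is None or y is None:
            continue  # assumption: ignore attributes with missing values
        if i in numeric_ranges:
            span = numeric_ranges[i]
            s = 1.0 - abs(x - y) / span if span > 0 else 1.0
        else:
            s = 1.0 if x == y else 0.0
        score += weights[i] * s
        used += weights[i]
    return score / used if used > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical objects: (colour, length); None marks a missing value.
    rows = [("red", 1.0), ("red", 2.0), ("blue", None), ("blue", 4.0)]
    columns = list(zip(*rows))
    weights = entropy_weights(columns)
    ranges = {1: 3.0}  # attribute 1 is numerical with range 4.0 - 1.0
    print(weights)
    print(similarity(rows[0], rows[1], weights, ranges))
    print(similarity(rows[0], rows[2], weights, ranges))
```

Because the similarity is renormalized over the attributes that are present, two objects can still be compared when one of them has missing values, which is the property the abstract highlights. A real implementation would substitute the paper's own weighting and similarity definitions.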