EEW-SC: Enhanced Entropy-Weighting Subspace Clustering for high dimensional gene expression data clustering analysis

Authors:
Zhaohong Deng; Kup-Sze Choi; Fu-Lai Chung; Shitong Wang
Affiliations:
School of Digital Media, Jiangnan University, Wuxi, Jiangsu, PR China and Ctr. for Int. Digital Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong and Jiangsu Engineering R ...; Ctr. for Int. Digital Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong; Department of Computing, The Hong Kong Polytechnic University, Hong Kong; School of Digital Media, Jiangnan University, Wuxi, Jiangsu, PR China
Venue:
Applied Soft Computing
Year:
2011

Abstract

Clustering techniques have been used extensively for the analysis of gene expression data. Among the various clustering methods, the soft subspace clustering algorithms developed in recent years have demonstrated more promising performance than most traditional clustering algorithms and hard subspace clustering algorithms. Many soft subspace clustering algorithms make effective use of within-cluster information, such as within-cluster compactness, but few of them pay enough attention to other important information, such as between-cluster information. Enhancing soft subspace clustering by integrating more useful information into the clustering procedure therefore deserves further study. In this study, enhanced subspace clustering techniques are investigated for the clustering analysis of high dimensional gene expression data by integrating within-cluster and between-cluster information simultaneously. First, a new optimization objective function is presented that integrates the fuzzy within-cluster compactness and the between-cluster separation in the weighted subspace. The corresponding learning rules for clustering are then derived from the proposed objective function, and a new soft subspace clustering algorithm, named Enhanced Entropy-Weighting Subspace Clustering (EEW-SC), is proposed. The performance of the proposed algorithm on various high dimensional gene expression datasets is experimentally compared with that of several competitive subspace clustering algorithms. The experimental studies demonstrate that the proposed algorithm can achieve better performance than most existing soft subspace clustering algorithms.
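
To make the idea of the objective function more concrete, the sketch below evaluates one plausible form of an entropy-weighted soft subspace objective: a fuzzy within-cluster compactness term, an entropy term on the feature weights, and a between-cluster separation term that is subtracted. The abstract does not give the exact EEW-SC formula, so the function name, the parameters `m`, `gamma`, and `eta`, the use of the global data mean as the separation reference, and the precise combination of terms are all assumptions made for illustration only.

```python
import numpy as np

def soft_subspace_terms(X, V, U, W, m=2.0, gamma=1.0, eta=0.1):
    """Illustrative (assumed) objective terms, not the paper's exact formula.

    X: (n, d) data, V: (c, d) cluster centers,
    U: (c, n) fuzzy memberships, W: (c, d) per-cluster feature weights.
    Returns (within, entropy, separation, total).
    """
    Um = U ** m  # fuzzified memberships, shape (c, n)

    # Per-feature squared distances of every point to every center: (c, n, d)
    diff2 = (X[None, :, :] - V[:, None, :]) ** 2

    # Fuzzy within-cluster compactness measured in the weighted subspace
    within = np.sum(Um[:, :, None] * W[:, None, :] * diff2)

    # Entropy term on the weights (small constant avoids log(0))
    entropy = gamma * np.sum(W * np.log(W + 1e-12))

    # Between-cluster separation: weighted distance of each center to the
    # global data mean (an assumed choice of reference point)
    v0 = X.mean(axis=0)
    separation = np.sum(Um.sum(axis=1)[:, None] * W * (V - v0) ** 2)

    # Minimizing this total favors compact and well-separated clusters
    total = within + entropy - eta * separation
    return within, entropy, separation, total

# Tiny usage example: two points, two clusters, uniform feature weights
X = np.array([[0.0, 0.0], [2.0, 2.0]])
V = X.copy()                  # each center sits on one point
U = np.eye(2)                 # crisp memberships for simplicity
W = np.full((2, 2), 0.5)      # equal weight on both features
within, entropy, separation, total = soft_subspace_terms(X, V, U, W)
```

In an iterative algorithm of this family, update rules for `U`, `V`, and `W` would be derived by minimizing such an objective under the usual sum-to-one constraints; those rules are not stated in the abstract and are not reproduced here.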