The sum squared residue is widely used as a quality measure for clustering and co-clustering, yet its detailed properties have received little study. Recent research shows that the residue is useful for discovering shifting patterns but unsuitable for finding scaling patterns. To remedy this weakness, we propose applying specific data transformations that adjust latent scaling factors and thereby lower the residue. First, we consider data matrix models with varied shifting and scaling factors. Then, we formally analyze the effect of several data transformations on the residue. Finally, we empirically validate the analysis on publicly available human cancer gene expression datasets. Both the analytical and experimental results show that column standard deviation normalization and column Z-score transformation enable the residue to handle scaling factors, which in turn yields better tissue sample clustering accuracy.
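The interaction between the residue and the two column transformations can be illustrated with a small numerical sketch. The abstract does not state the residue formula, so the code assumes the common form in which each entry's residue is the entry minus its row mean, minus its column mean, plus the overall mean. The function names, the toy vectors `base`, `shift`, and `scale`, and the choice of rows as genes and columns as samples are all hypothetical and not taken from the paper.

```python
import numpy as np


def sum_squared_residue(a):
    """Sum of squared residues of a matrix treated as a single co-cluster.

    Assumed residue form: a_ij - row_mean_i - col_mean_j + overall_mean.
    """
    a = np.asarray(a, dtype=float)
    row_mean = a.mean(axis=1, keepdims=True)
    col_mean = a.mean(axis=0, keepdims=True)
    residue = a - row_mean - col_mean + a.mean()
    return float((residue ** 2).sum())


def column_std_normalize(a):
    """Divide each column by its standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    a = np.asarray(a, dtype=float)
    return a / a.std(axis=0, keepdims=True)


def column_zscore(a):
    """Center each column at zero and scale it to unit standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0, keepdims=True)) / a.std(axis=0, keepdims=True)


# Toy data (hypothetical): one base profile over four rows, three columns.
base = np.array([1.0, 2.0, 4.0, 7.0])
shift = np.array([0.0, 3.0, -1.0])   # per-column additive factors
scale = np.array([1.0, 2.0, 5.0])    # per-column positive multiplicative factors

shifting = base[:, None] + shift[None, :]   # pure shifting pattern
scaling = base[:, None] * scale[None, :]    # pure scaling pattern

print("shifting pattern:        ", sum_squared_residue(shifting))
print("scaling pattern:         ", sum_squared_residue(scaling))
print("scaling, std-normalized: ", sum_squared_residue(column_std_normalize(scaling)))
print("scaling, Z-scored:       ", sum_squared_residue(column_zscore(scaling)))
```

In this toy setting the shifting pattern has a residue of essentially zero, while the scaling pattern does not. After either column transformation the scaled columns become identical up to floating-point error, so the residue drops to essentially zero. This is only a sanity check on idealized, noise-free data with positive scaling factors, not a reproduction of the paper's analysis or experiments.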