Bi-clustering gene expression data using co-similarity

Authors:
Syed Fawad Hussain
Affiliations:
Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Pakistan
Venue:
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I
Year:
2011

Abstract

We propose a new framework for bi-clustering gene expression data based on the notion of co-similarity between genes and samples. The framework iteratively learns the similarity between the rows of a matrix from the similarity between its columns, and vice versa. The underlying task, usually referred to as bi-clustering in bioinformatics, is to find groupings of the feature set that exhibit similar behavior across subsets of samples. The algorithm has previously been shown to work well for document clustering, where the data is represented as a sparse matrix. We propose a variation of the method suited to data that is represented as a dense matrix and is non-homogeneous, as is the case for gene expression data. Our experiments on real-world cancer datasets show that, with the proposed variations, the method is well suited to finding bi-clusters with a high degree of homogeneity.
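
The row/column alternation described in the abstract can be illustrated with a minimal sketch. The abstract gives no update formulas, so everything below is an assumption made for illustration: the function names `co_similarity` and `_normalize`, the identity initialization, the cosine-style normalization, and the iteration count are all hypothetical, and the variation for dense, non-homogeneous data that the paper proposes is not reproduced here.

```python
import numpy as np


def _normalize(S):
    """Scale a similarity matrix so that its diagonal is 1 (cosine-style).

    This normalization is an assumption of the sketch, not taken from the paper.
    """
    d = np.sqrt(np.clip(np.diag(S), 1e-12, None))  # guard against zero rows
    return S / np.outer(d, d)


def co_similarity(M, n_iter=4):
    """Alternately refine row and column similarity matrices of M.

    Row similarity is computed through the current column similarity,
    and column similarity through the current row similarity.
    """
    M = np.asarray(M, dtype=float)
    n_rows, n_cols = M.shape
    SR = np.eye(n_rows)  # rows start out similar only to themselves
    SC = np.eye(n_cols)  # likewise for columns
    for _ in range(n_iter):
        SR_new = _normalize(M @ SC @ M.T)  # rows compared via column similarity
        SC_new = _normalize(M.T @ SR @ M)  # columns compared via row similarity
        SR, SC = SR_new, SC_new
    return SR, SC


if __name__ == "__main__":
    # Toy matrix with two obvious blocks (rows 0-1 with columns 0-1, rows 2-3 with columns 2-3).
    M = np.array([[5, 4, 0, 0],
                  [4, 5, 0, 0],
                  [0, 0, 3, 4],
                  [0, 0, 4, 3]])
    SR, SC = co_similarity(M)
    print(np.round(SR, 3))
    print(np.round(SC, 3))
```

In a sketch like this, bi-clusters would be obtained by clustering the rows using `SR` and the columns using `SC` with any standard clustering method; that final step is likewise an assumption and is left out.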