Biclustering of Expression Data
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
Enhanced Biclustering on Expression Data
BIBE '03 Proceedings of the 3rd IEEE Symposium on BioInformatics and BioEngineering
Mining Deterministic Biclusters in Gene Expression Data
BIBE '04 Proceedings of the 4th IEEE Symposium on Bioinformatics and Bioengineering
Data analysis and bioinformatics
PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence
A bicluster of a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering algorithms aim to find subsets of genes and subsets of conditions such that a single cellular process is the main contributor to the expression of the gene subset over the condition subset. We believe that biclusters should be small compared to the gene expression data matrix. We have also observed that a conceptually simpler way to bicluster gene expression data is to apply standard one-way clustering algorithms to the rows and to the columns of the data matrix separately, and then to combine the results to obtain bicluster seeds. Our algorithm has three phases. In the first phase, we generate a set of high-quality bicluster seeds. In the second phase, these seeds are enlarged by adding more genes and conditions using a technique based on simulated annealing. In the third phase, we compute the p-values of the resulting biclusters for statistical validation.
Keywords: gene expression data, k-means clustering, biclustering of expression data, p-value, simulated annealing.
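The seed-generation phase described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how seed quality is measured, so the sketch assumes the mean squared residue score of Cheng and Church, and the names `kmeans_labels`, `mean_squared_residue`, `bicluster_seeds` and the threshold `max_msr` are hypothetical. Rows and columns are clustered independently with a plain k-means, and every (row cluster, column cluster) pair whose submatrix scores below the threshold is kept as a seed.

```python
import numpy as np


def kmeans_labels(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of `points`."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(points), size=k, replace=False)
    centers = points[start].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members):  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return labels


def mean_squared_residue(sub):
    """Cheng-Church score (assumed quality measure): 0 for a purely additive block."""
    row_mean = sub.mean(axis=1, keepdims=True)
    col_mean = sub.mean(axis=0, keepdims=True)
    residue = sub - row_mean - col_mean + sub.mean()
    return float((residue ** 2).mean())


def bicluster_seeds(data, k_rows, k_cols, max_msr):
    """Cluster genes (rows) and conditions (columns) separately, then combine."""
    row_labels = kmeans_labels(data, k_rows)
    col_labels = kmeans_labels(data.T, k_cols)
    seeds = []
    for r in range(k_rows):
        rows = np.flatnonzero(row_labels == r)
        for c in range(k_cols):
            cols = np.flatnonzero(col_labels == c)
            if len(rows) < 2 or len(cols) < 2:
                continue  # too small to be a meaningful seed
            score = mean_squared_residue(data[np.ix_(rows, cols)])
            if score <= max_msr:
                seeds.append((rows, cols, score))
    # Lowest residue (most coherent) seeds first.
    return sorted(seeds, key=lambda s: s[2])
```

Calling `bicluster_seeds(data, k_rows, k_cols, max_msr)` on a genes-by-conditions NumPy array returns `(gene_indices, condition_indices, score)` tuples. Under the same assumptions, these would be the starting points that the simulated annealing phase then tries to enlarge.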