Approximation algorithms for bi-clustering problems

Authors:
Lusheng Wang;Yu Lin;Xiaowen Liu
Affiliations:
Department of Computer Science, City University of Hong Kong, Hong Kong;Department of Computer Science, City University of Hong Kong, Hong Kong;Department of Computer Science, City University of Hong Kong, Hong Kong
Venue:
WABI'06 Proceedings of the 6th international conference on Algorithms in Bioinformatics
Year:
2006

Abstract

One of the main goals in the analysis of microarray data is to identify groups of genes and groups of experimental conditions (including environments, individuals and tissues) that exhibit similar expression patterns. This is the so-called bi-clustering problem. In this paper, we consider two variations of the bi-clustering problem: the Consensus Submatrix Problem and the Bottleneck Submatrix Problem. The input to both problems consists of an m×n matrix A and integers l and k. The Consensus Submatrix Problem is to find an l×k submatrix with l ≤ m and k ≤ n and a consensus vector such that the sum of the distances between the rows of the submatrix and the vector is minimized. The Bottleneck Submatrix Problem is to find an l×k submatrix with l ≤ m and k ≤ n, an integer d and a center vector such that the distance between every row of the submatrix and the vector is at most d, and d is minimized. We show that both problems are NP-hard and give randomized approximation algorithms for special cases of the two problems. Using standard techniques, we can derandomize the algorithms to get polynomial time approximation schemes for the two problems. To our knowledge, this is the first time that approximation algorithms with guaranteed ratio are presented for microarray analysis.
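
To make the two objectives concrete, the following minimal Python sketch evaluates them for a given choice of rows and columns, and solves a toy instance of the Consensus Submatrix Problem by exhaustive search. It is an illustration only, not the approximation algorithms of the paper. The abstract does not name the distance measure, so Hamming distance is assumed here, and the function names (`consensus_cost`, `bottleneck_cost`, `brute_force_consensus`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming(u, v):
    """Number of positions where u and v differ (assumed distance measure)."""
    return sum(a != b for a, b in zip(u, v))


def submatrix(A, rows, cols):
    """Rows and columns of A selected by index, as a list of lists."""
    return [[A[i][j] for j in cols] for i in rows]


def consensus_cost(A, rows, cols):
    """Best consensus vector and total distance for a fixed submatrix.

    Under Hamming distance, taking the most frequent symbol in each column
    minimizes the sum of distances to the rows.
    """
    sub = submatrix(A, rows, cols)
    consensus = [Counter(col).most_common(1)[0][0] for col in zip(*sub)]
    return consensus, sum(hamming(row, consensus) for row in sub)


def bottleneck_cost(A, rows, cols, center):
    """Largest distance d from any selected row to the given center vector."""
    return max(hamming(row, center) for row in submatrix(A, rows, cols))


def brute_force_consensus(A, l, k):
    """Exhaustive search over all l-by-k submatrices (exponential time).

    Only usable on tiny inputs; the problem is NP-hard in general.
    Returns (cost, rows, cols, consensus) for the first optimum found.
    """
    m, n = len(A), len(A[0])
    best = None
    for rows in combinations(range(m), l):
        for cols in combinations(range(n), k):
            consensus, cost = consensus_cost(A, rows, cols)
            if best is None or cost < best[0]:
                best = (cost, rows, cols, consensus)
    return best


if __name__ == "__main__":
    # Toy 4x4 binary matrix; rows are genes, columns are conditions.
    A = [
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [1, 1, 1, 0],
        [1, 0, 0, 1],
    ]
    print(brute_force_consensus(A, l=3, k=2))
    print(bottleneck_cost(A, rows=(0, 1, 2), cols=(0, 1), center=[0, 1]))
```

The exhaustive search enumerates every choice of l rows and k columns, which is why it is practical only for very small matrices and why approximation schemes are of interest for realistic microarray sizes.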