A novel approach to revealing positive and negative co-regulated genes

Authors:
Yu-Hai Zhao;Guo-Ren Wang;Ying Yin;Guang-Yu Xu
Affiliations:
Department of Computer Science and Engineering, Northeastern University, Shenyang, China;Department of Computer Science and Engineering, Northeastern University, Shenyang, China;Department of Computer Science and Engineering, Northeastern University, Shenyang, China;Department of Computer Science and Engineering, Northeastern University, Shenyang, China
Venue:
Journal of Computer Science and Technology
Year:
2007

Abstract

Biologists have an emerging need to identify co-regulated gene clusters, which include both positively and negatively regulated gene clusters. However, existing pattern-based and tendency-based clustering approaches are designed to find only positively regulated gene clusters. In this paper, a new subspace clustering model called g-Cluster is proposed for gene expression data. The proposed model has the following advantages: 1) it finds both positively and negatively co-regulated genes in a single pass, 2) it does not require co-regulated genes to be related by a magnitude transformation, and 3) it guarantees the quality of clusters and the significance of regulations using a novel similarity measurement, gCode, and a user-specified regulation threshold δ, respectively. No previous work accomplishes all of these tasks. Moreover, the MDL (minimum description length) technique is introduced to avoid generating insignificant g-Clusters. A tree structure, namely GS-tree, is also designed, and two algorithms combined with efficient pruning and optimization strategies are developed to identify all qualified g-Clusters. Extensive experiments are conducted on real and synthetic datasets. The experimental results show that 1) the algorithms are able to find a number of co-regulated gene clusters missed by previous models, which are potentially of high biological significance, and 2) the algorithms are effective and efficient, and outperform the existing approaches.
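
The abstract does not define gCode, so the following is only an illustrative sketch of the general idea of positive versus negative co-regulation under a regulation threshold δ. It is not the paper's g-Cluster model or its gCode measure. The function names `tendency_code` and `relation`, the up/down encoding, and the sample values are all assumptions made for this example. Each expression profile is encoded by the direction of its change between consecutive conditions, a change counts only if its size is at least δ, and two genes are compared by whether their codes match (positive) or are exact inverses (negative).

```python
from typing import List, Sequence


def tendency_code(profile: Sequence[float], delta: float) -> List[int]:
    """Encode each consecutive change as +1 (up), -1 (down), or 0 (smaller than delta).

    Hypothetical helper for illustration; not the gCode of the paper.
    """
    code = []
    for prev, curr in zip(profile, profile[1:]):
        diff = curr - prev
        if diff >= delta:
            code.append(1)
        elif diff <= -delta:
            code.append(-1)
        else:
            code.append(0)
    return code


def relation(a: Sequence[float], b: Sequence[float], delta: float) -> str:
    """Return 'positive', 'negative', or 'none' for two expression profiles."""
    code_a = tendency_code(a, delta)
    code_b = tendency_code(b, delta)
    # Profiles with fewer than two conditions carry no tendency information.
    if not code_a or not code_b:
        return "none"
    # Require every change to be significant with respect to delta.
    if 0 in code_a or 0 in code_b:
        return "none"
    if code_a == code_b:
        return "positive"
    if code_a == [-s for s in code_b]:
        return "negative"
    return "none"


# Made-up expression values across four conditions.
g1 = [1.0, 3.0, 2.0, 5.0]
g2 = [10.0, 14.0, 12.5, 20.0]  # same directions as g1, but not a shift or scaling of it
g3 = [8.0, 5.0, 7.0, 2.0]      # opposite directions to g1
g4 = [1.0, 1.2, 3.0, 2.0]      # first change is below delta

print(relation(g1, g2, delta=1.0))
print(relation(g1, g3, delta=1.0))
print(relation(g1, g4, delta=1.0))
```

In this sketch, g1 and g2 are grouped even though neither is a shifted or scaled copy of the other, which loosely mirrors the abstract's point that co-regulated genes need not be related by a magnitude transformation.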