Mining gene–sample–time microarray data: a coherent gene cluster discovery approach

Authors:
Daxin Jiang;Jian Pei;Murali Ramanathan;Chuan Lin;Chun Tang;Aidong Zhang
Affiliations:
Nanyang Technological University, School of Computing Engineering, Blk N4, 2a-32, 639798, Nanyang Avenue, Singapore;Simon Fraser University, Burnaby, Canada;State University of New York at Buffalo, Buffalo, New York, USA;State University of New York at Buffalo, Buffalo, New York, USA;State University of New York at Buffalo, Buffalo, New York, USA;State University of New York at Buffalo, Buffalo, New York, USA
Venue:
Knowledge and Information Systems
Year:
2007

Abstract

Extensive studies have shown that mining microarray data sets is important in bioinformatics research and biomedical applications. In this paper, we explore a novel type of data set, the gene–sample–time microarray data set, which records the expression levels of various genes under a set of samples over a series of time points. In particular, we propose mining coherent gene clusters from such data sets. Each cluster contains a subset of genes and a subset of samples such that the genes are coherent on the samples along the time series. The coherent gene clusters may identify the samples corresponding to some phenotypes (e.g., diseases) and suggest candidate genes correlated with those phenotypes. We present two efficient algorithms, namely the Sample–Gene Search and the Gene–Sample Search, to mine the complete set of coherent gene clusters. We empirically evaluate the performance of our approaches on both a real microarray data set and synthetic data sets. The test results show that our approaches are both efficient and effective in finding meaningful coherent gene clusters.
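
To make the cluster definition concrete, the sketch below checks whether a candidate pair (genes, samples) forms a coherent cluster. The abstract does not state how coherence is measured, so the sketch assumes one plausible reading: a gene is coherent on a set of samples when the Pearson correlation between its time series in every pair of those samples is at least a threshold delta. The data layout and all function and parameter names are hypothetical, and the code only verifies a given candidate; it is not the Sample–Gene Search or the Gene–Sample Search.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat series is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def gene_is_coherent(data, gene, samples, delta):
    """True if the gene's time series agree (correlation >= delta)
    across every pair of the given samples."""
    return all(
        pearson(data[gene][s1], data[gene][s2]) >= delta
        for s1, s2 in combinations(samples, 2)
    )


def is_coherent_cluster(data, genes, samples, delta):
    """True if every gene in `genes` is coherent on `samples`."""
    return all(gene_is_coherent(data, g, samples, delta) for g in genes)


# Toy data (made up for illustration): data[gene][sample] is a list of
# expression levels, one per time point.
data = {
    "g1": {"s1": [1, 2, 3, 4], "s2": [2, 4, 6, 8], "s3": [4, 3, 2, 1]},
    "g2": {"s1": [5, 1, 5, 1], "s2": [6, 2, 6, 2], "s3": [1, 5, 1, 5]},
}

on_two_samples = is_coherent_cluster(data, ["g1", "g2"], ["s1", "s2"], 0.9)
on_three_samples = is_coherent_cluster(data, ["g1", "g2"], ["s1", "s2", "s3"], 0.9)
```

In this toy data both genes follow the same trend in s1 and s2 but the opposite trend in s3, so the pair of genes is coherent on {s1, s2} and not on {s1, s2, s3}.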