Clustering short time series gene expression data

Authors:
Jason Ernst; Gerard J. Nau; Ziv Bar-Joseph
Affiliations:
Center for Automated Learning and Discovery, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA; Department of Molecular Genetics and Biochemistry, University of Pittsburgh School of Medicine, 200 Lothrop Street, Pittsburgh, PA 15261, USA; Center for Automated Learning and Discovery, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: Time series expression experiments are used to study a wide range of biological systems. More than 80% of all time series expression datasets are short (8 time points or fewer). These datasets present unique challenges. On account of the large number of genes profiled (often tens of thousands) and the small number of time points many patterns are expected to arise at random. Most clustering algorithms are unable to distinguish between real and random patterns.

Results: We present an algorithm specifically designed for clustering short time series expression data. Our algorithm works by assigning genes to a predefined set of model profiles that capture the potential distinct patterns that can be expected from the experiment. We discuss how to obtain such a set of profiles and how to determine the significance of each of these profiles. Significant profiles are retained for further analysis and can be combined to form clusters. We tested our method on both simulated and real biological data. Using immune response data we show that our algorithm can correctly detect the temporal profile of relevant functional categories. Using Gene Ontology analysis we show that our algorithm outperforms both general clustering algorithms and algorithms designed specifically for clustering time series gene expression data.

Availability: Information on obtaining a Java implementation with a graphical user interface (GUI) is available from http://www.cs.cmu.edu/~jernst/st/

Contact: jernst@cs.cmu.edu

Supplementary information: Available at http://www.cs.cmu.edu/~jernst/st/
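
The approach outlined in the abstract (assign each gene to one of a predefined set of model profiles, then ask which profiles attract more genes than chance would explain) can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the significance test, so the choices here are assumptions made for illustration only: Pearson correlation for assignment, and an empirical p-value obtained by shuffling the time points of each gene. The function names (`pearson`, `assign`, `profile_counts`, `profile_significance`) and the example profiles are hypothetical and are not taken from the authors' Java implementation.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def assign(gene, profiles):
    """Index of the model profile most correlated with the gene (assumed measure)."""
    return max(range(len(profiles)), key=lambda i: pearson(gene, profiles[i]))


def profile_counts(genes, profiles):
    """Number of genes assigned to each model profile."""
    counts = [0] * len(profiles)
    for gene in genes:
        counts[assign(gene, profiles)] += 1
    return counts


def profile_significance(genes, profiles, n_perm=200, seed=0):
    """Observed counts plus an empirical p-value per profile.

    Null model (an assumption): the time points of every gene are shuffled
    independently, and genes are re-assigned to profiles. The p-value is the
    fraction of rounds in which the null count reaches the observed count.
    """
    rng = random.Random(seed)
    observed = profile_counts(genes, profiles)
    exceed = [0] * len(profiles)
    for _ in range(n_perm):
        shuffled = [rng.sample(gene, len(gene)) for gene in genes]
        null = profile_counts(shuffled, profiles)
        for i, count in enumerate(null):
            if count >= observed[i]:
                exceed[i] += 1
    # Add-one correction keeps the estimate strictly above zero.
    pvals = [(e + 1) / (n_perm + 1) for e in exceed]
    return observed, pvals


if __name__ == "__main__":
    # Three hypothetical model profiles over four time points.
    profiles = [[0, 1, 2, 3], [0, -1, -2, -3], [0, 1, 0, -1]]
    genes = [[0.0, 0.9, 2.1, 3.2]] * 30 + [[0.0, 1.1, 0.1, -0.8]] * 2
    observed, pvals = profile_significance(genes, profiles)
    for i, (count, p) in enumerate(zip(observed, pvals)):
        print(f"profile {i}: {count} genes, empirical p = {p:.3f}")
```

Profiles with a small empirical p-value would be the ones retained for further analysis in this sketch. How significant profiles are then merged into clusters is not covered here, since the abstract gives no details on that step.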