A new approach to analyzing gene expression time series data

Authors:
Ziv Bar-Joseph;Georg Gerber;David K. Gifford;Tommi S. Jaakkola;Itamar Simon
Affiliations:
MIT, Cambridge, MA;MIT, Cambridge, MA;MIT, Cambridge, MA;MIT, Cambridge, MA;Whitehead Institute for Biomedical Research, Cambridge, MA
Venue:
Proceedings of the sixth annual international conference on Computational biology
Year:
2002

Abstract

We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data, and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene-specific parameters. We show that unobserved time points can be reconstructed using our method with 10-15% less error than the previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to non-uniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knockout data that produces biologically meaningful results.
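
To make the idea of a continuous expression curve concrete, here is a minimal sketch of estimating an unobserved time point from a non-uniformly sampled profile. It is not the authors' method: it fits a plain interpolating natural cubic spline to a single profile and omits the class-level constraints on spline coefficients described in the abstract. The function name `natural_cubic_spline`, the sampling times, and the expression values are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def natural_cubic_spline(ts, ys):
    """Return a function evaluating the natural cubic spline through (ts, ys).

    Illustrative only: a single-profile interpolating spline, not the
    class-constrained model described in the abstract.
    """
    n = len(ts) - 1
    if n < 2 or len(ys) != len(ts):
        raise ValueError("need at least three matching (time, value) pairs")
    h = [ts[i + 1] - ts[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("time points must be strictly increasing")

    # Tridiagonal system for the interior second derivatives m[1..n-1];
    # the natural boundary condition fixes m[0] = m[n] = 0.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]

    # Thomas algorithm: forward elimination ...
    for k in range(1, n - 1):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]

    # ... then back substitution into the full second-derivative vector.
    m = [0.0] * (n + 1)
    for k in range(n - 2, -1, -1):
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(t):
        if not ts[0] <= t <= ts[-1]:
            raise ValueError("t lies outside the sampled time range")
        # Locate the interval [ts[i], ts[i + 1]] that contains t.
        i = min(bisect_right(ts, t) - 1, n - 1)
        left, right = t - ts[i], ts[i + 1] - t
        return ((m[i] * right ** 3 + m[i + 1] * left ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * right
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left)

    return evaluate


# Hypothetical profile sampled at non-uniform times (arbitrary units).
times = [0.0, 1.0, 2.0, 4.0, 7.0]
expression = [0.1, 0.9, 1.4, 0.6, -0.3]

profile = natural_cubic_spline(times, expression)
print(profile(3.0))  # estimate at a time point that was never measured
```

Because the second derivatives are obtained from one linear system over all knots, every observed value affects the whole curve, which loosely mirrors the abstract's remark that every time point influences the overall expression curve. A model closer to the one described would share spline coefficients across genes in the same class rather than fitting each profile independently.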