A Simple Hyper-Geometric Approach for Discovering Putative Transcription Factor Binding Sites

Authors:
Yoseph Barash;Gill Bejerano;Nir Friedman
Venue:
WABI '01 Proceedings of the First International Workshop on Algorithms in Bioinformatics
Year:
2001

Citing 6

Context-specific Bayesian clustering for gene expression data
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology

Finding motifs using random projections
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology

Regulatory Element Detection Using a Probabilistic Segmentation Model
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology

Mining for Putative Regulatory Elements in the Yeast Genome Using Gene Expression Data
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology

Combinatorial Approaches to Finding Subtle Signals in DNA Sequences
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology

A Statistical Method for Finding Transcription Factor Binding Sites
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology

Cited 5

Discovery of distinctive patterns in music
Intelligent Data Analysis - Machine Learning and Music

Computational molecular biology of genome expression and regulation
PReMI'05 Proceedings of the First international conference on Pattern Recognition and Machine Intelligence

Condition transition analysis reveals TF activity related to nutrient-limitation-specific effects of oxygen presence in yeast
CMSB'06 Proceedings of the 2006 international conference on Computational Methods in Systems Biology

Generalized planted (l,d)-motif problem with negative set
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics

A new clustering approach for learning transcriptional modules
International Journal of Data Mining and Bioinformatics

Abstract

A central issue in molecular biology is understanding the regulatory mechanisms that control gene expression. The recent flood of genomic and post-genomic data opens the way for computational methods that elucidate the key components of these mechanisms. One important consequence is the ability to recognize groups of co-expressed genes using microarray expression data. We then wish to identify, in silico, putative transcription factor binding sites in the promoter regions of these genes that might explain the co-regulation and hint at possible regulators. In this paper we describe a simple and fast, yet powerful, two-stage approach to this task. Using a rigorous hypergeometric statistical analysis and a straightforward computational procedure, we find small conserved sequence kernels. These are then stochastically expanded into position-specific scoring matrices (PSSMs) using an EM-like procedure. We demonstrate the utility and speed of our methods by applying them to several data sets from the recent literature. We also compare these results with those of MEME when run on the same sets.
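
The abstract does not spell out the hypergeometric analysis, so the following is only a minimal sketch of a standard hypergeometric enrichment test applied to k-mers, not the authors' procedure. It assumes that a k-mer is scored by how many promoters in the co-expressed cluster contain it, relative to how many promoters contain it overall. The function names `hypergeom_tail` and `kmer_enrichment`, and the split into `cluster` and `background` promoter lists, are hypothetical choices made for illustration.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total number of promoters, K: promoters containing the k-mer,
    n: promoters in the cluster, k: cluster promoters containing the k-mer.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def kmer_enrichment(cluster, background, k):
    """Rank k-mers found in the cluster promoters by hypergeometric p-value.

    Assumption (not from the abstract): each promoter counts at most once
    per k-mer, and only exact k-mer matches on the given strand are used.
    """
    def kmers(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    cluster_sets = [kmers(s) for s in cluster]
    all_sets = cluster_sets + [kmers(s) for s in background]
    N, n = len(all_sets), len(cluster_sets)

    scores = {}
    for kmer in set().union(*cluster_sets):
        in_all = sum(kmer in s for s in all_sets)
        in_cluster = sum(kmer in s for s in cluster_sets)
        scores[kmer] = hypergeom_tail(N, in_all, n, in_cluster)

    # Smallest p-value first: the most surprising k-mers are candidate kernels.
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    cluster = ["AAACGT", "TTACGT"]
    background = ["GGGGGG", "CCCCCC", "TTTTTT", "AAAAAA"]
    for kmer, p in kmer_enrichment(cluster, background, 4):
        print(kmer, p)
```

In this sketch the lowest-scoring k-mers would play the role of the "small conserved sequence kernels" mentioned above. The second stage (expanding kernels into PSSMs with an EM-like procedure) is not illustrated here, and a real analysis would also need a multiple-testing correction over all k-mers tested.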