Finding novel transcripts in high-resolution genome-wide microarray data using the GenRate model

Authors:
Brendan J. Frey; Quaid D. Morris; Mark Robinson; Timothy R. Hughes
Affiliations:
Elec. and Comp. Eng., Univ. of Toronto, Toronto, ON, Canada; Elec. and Comp. Eng., Univ. of Toronto, Toronto, ON, Canada; Elec. and Comp. Eng., Univ. of Toronto, Toronto, ON, Canada; Banting and Best Dep. Med. Res., Univ. of Toronto, Toronto, ON, Canada
Venue:
RECOMB'05: Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology
Year:
2005

Abstract

Genome-wide microarray designs containing millions to tens of millions of probes will soon become available for a variety of mammals, including mouse and human. These “tiling arrays” can potentially lead to significant advances in science and medicine, e.g., by indicating new genes and alternative primary and secondary transcripts. While bottom-up pattern matching techniques (e.g., hierarchical clustering) can be used to find gene structures in tiling data, we believe the many interacting hidden variables and complex noise patterns more naturally lead to an analysis based on generative models. We describe a generative model of tiling data and show how the iterative sum-product algorithm can be used to infer hybridization noise, probe sensitivity, new transcripts and alternative transcripts. We apply our method, called GenRate, to a new exon tiling data set from mouse chromosome 4 and show that it makes significantly more predictions than a previously described hierarchical clustering method at the same false positive rate. GenRate correctly predicts many known genes, and also predicts new gene structures. As new problems arise, additional hidden variables can be incorporated into the model in a principled fashion, so we believe that GenRate will prove to be a useful tool in the new era of genome-wide tiling microarray analysis.
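
To make the inference idea in the abstract concrete, here is a minimal sketch of sum-product message passing on a toy model. It is not the GenRate model: it assumes a simple chain in which each probe has one binary hidden state (background or expressed), Gaussian intensity noise, and a fixed probability of neighbouring probes sharing a state. On a chain like this, sum-product is exact and reduces to the forward-backward recursion; the model in the paper has more hidden variables and uses the iterative form of the algorithm. The function names and all parameter values (`stay`, `means`, `std`) are hypothetical choices made for illustration only.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used as the local evidence factor."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def chain_sum_product(intensities, stay=0.9, means=(0.0, 2.0), std=0.5):
    """Return P(probe i is expressed | all intensities) for a 2-state chain.

    State 0 = background, state 1 = expressed. All parameters are
    illustrative assumptions, not values from the paper.
    """
    n = len(intensities)
    if n == 0:
        return []

    states = (0, 1)
    # Evidence factors: likelihood of each intensity under each state.
    like = [[gaussian_pdf(x, means[s], std) for s in states] for x in intensities]
    # Pairwise factor: neighbouring probes tend to share a state.
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]

    # Forward messages, normalized at each step to limit underflow.
    fwd = []
    for i in range(n):
        if i == 0:
            msg = [0.5 * like[0][s] for s in states]  # uniform prior
        else:
            msg = [
                like[i][s] * sum(fwd[i - 1][r] * trans[r][s] for r in states)
                for s in states
            ]
        total = sum(msg)
        fwd.append([m / total for m in msg])

    # Backward messages, computed from the last probe to the first.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        msg = [
            sum(trans[s][r] * like[i + 1][r] * bwd[i + 1][r] for r in states)
            for s in states
        ]
        total = sum(msg)
        bwd[i] = [m / total for m in msg]

    # Marginal belief at each probe is the product of incoming messages.
    posteriors = []
    for i in range(n):
        belief = [fwd[i][s] * bwd[i][s] for s in states]
        posteriors.append(belief[1] / (belief[0] + belief[1]))
    return posteriors
```

Under these assumptions, calling `chain_sum_product([2.0, 1.0, 2.0])` assigns the ambiguous middle probe a posterior above 0.5, because the messages from its two high-intensity neighbours pull it toward the expressed state. This is the kind of context sharing that a purely local threshold on each probe would miss.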