Motivation: Phylogenetic shadowing is a comparative genomics principle that allows for the discovery of conserved regions in sequences from multiple closely related organisms. We develop a formal probabilistic framework for combining phylogenetic shadowing with feature-based functional annotation methods. The resulting model, a generalized hidden Markov phylogeny (GHMP), applies to a variety of situations where functional regions are to be inferred from evolutionary constraints.

Results: We show how GHMPs can be used to predict complete shared gene structures in multiple primate sequences. We also describe shadower, our implementation of such a prediction system. We find that shadower outperforms previously reported ab initio gene finders, including comparative human-mouse approaches, on a small sample of diverse exonic regions. Finally, we report on an empirical analysis of shadower's performance which reveals that as few as five well-chosen species may suffice to attain maximal sensitivity and specificity in exon demarcation.

Availability: A Web server is available at http://bonaire.lbl.gov/shadower