New methods for detecting lineage-specific selection

Authors:
Adam Siepel;Katherine S. Pollard;David Haussler
Affiliations:
Center for Biomolecular Science and Engineering, U.C. Santa Cruz, Santa Cruz, CA;Center for Biomolecular Science and Engineering, U.C. Santa Cruz, Santa Cruz, CA;Center for Biomolecular Science and Engineering, U.C. Santa Cruz, Santa Cruz, CA
Venue:
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
Year:
2006

Abstract

So far, most methods for identifying sequences under selection based on comparative sequence data have either assumed selectional pressures are the same across all branches of a phylogeny, or have focused on changes in specific lineages of interest. Here, we introduce a more general method that detects sequences that have either come under selection, or begun to drift, on any lineage. The method is based on a phylogenetic hidden Markov model (phylo-HMM), and does not require element boundaries to be determined a priori, making it particularly useful for identifying noncoding sequences. Insertions and deletions (indels) are incorporated into the phylo-HMM by a simple strategy that uses a separately reconstructed “indel history.” To evaluate the statistical significance of predictions, we introduce a novel method for computing P-values based on prior and posterior distributions of the number of substitutions that have occurred in the evolution of predicted elements. We derive efficient dynamic-programming algorithms for obtaining these distributions, given a model of neutral evolution. Our methods have been implemented as computer programs called DLESS (Detection of LinEage-Specific Selection) and phyloP (phylogenetic P-values). We discuss results obtained with these programs on both real and simulated data sets.
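
The idea of a prior distribution over substitution counts can be illustrated with a small sketch. This is not the DLESS or phyloP algorithm. It assumes a much simpler neutral model in which substitutions on each branch arrive as a Poisson process with mean equal to the branch length, independently across branches and sites. The function names, the truncation bound `max_count`, and the example branch lengths are all hypothetical. The sketch builds the prior by repeated convolution, a basic form of dynamic programming, and then computes a one-sided P-value for observing unusually few substitutions. The posterior distribution conditioned on an alignment, which the abstract also mentions, is not covered here.

```python
import math


def poisson_pmf(rate, max_count):
    """Poisson probabilities for counts 0..max_count (truncated, not renormalized)."""
    probs = [math.exp(-rate)]
    for k in range(1, max_count + 1):
        probs.append(probs[-1] * rate / k)
    return probs


def convolve(a, b, max_count):
    """Distribution of the sum of two independent counts, truncated at max_count."""
    out = [0.0] * (max_count + 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            if i + j > max_count:
                break  # larger j only pushes the sum further past the bound
            out[i + j] += pa * pb
    return out


def prior_substitution_counts(branch_lengths, n_sites, max_count):
    """Prior over the total number of substitutions across all branches and sites.

    Assumption (not from the paper): each branch contributes a Poisson count
    with mean equal to its length, independently of other branches and sites.
    """
    # Per-site distribution: fold in one branch at a time.
    per_site = [1.0] + [0.0] * max_count
    for length in branch_lengths:
        per_site = convolve(per_site, poisson_pmf(length, max_count), max_count)

    # Element-wide distribution: fold in one site at a time.
    total = [1.0] + [0.0] * max_count
    for _ in range(n_sites):
        total = convolve(total, per_site, max_count)
    return total


def conservation_p_value(prior, observed):
    """P(count <= observed) under the prior: small values suggest conservation."""
    return sum(prior[: observed + 1])


if __name__ == "__main__":
    # Hypothetical three-branch tree, 10-site element, 1 observed substitution.
    prior = prior_substitution_counts([0.1, 0.2, 0.05], n_sites=10, max_count=30)
    print(conservation_p_value(prior, observed=1))
```

Under these assumptions the total count is itself Poisson, with mean equal to the number of sites times the total branch length, which gives a simple check on the convolution. A realistic neutral model would need the substitution process to depend on the nucleotide states, which is where the dynamic-programming algorithms described in the abstract come in.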