Computers and Intractability: A Guide to the Theory of NP-Completeness
Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination
CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
The Number of Recombination Events in a Sample History: Conflict Graph and Lower Bounds
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Approximation algorithms for combinatorial problems
Journal of Computer and System Sciences
Parsimony Score of Phylogenetic Networks: Hardness Results and a Linear-Time Heuristic
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
ISBRA'07 Proceedings of the 3rd international conference on Bioinformatics research and applications
Accurate computation of likelihoods in the coalescent with recombination via parsimony
RECOMB'08 Proceedings of the 12th annual international conference on Research in computational molecular biology
WABI'09 Proceedings of the 9th international conference on Algorithms in bioinformatics
Minimum recombination histories by branch and bound
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
COCOON'07 Proceedings of the 13th annual international conference on Computing and Combinatorics
Recombination is an important evolutionary mechanism responsible for genetic diversity in humans and other organisms. There has recently been extensive research on the fine-scale variation in recombination rates across the human genome using DNA polymorphism data. One combinatorial approach is to estimate the minimum number of recombination events in any history of the sample. Myers and Griffiths [1] proposed two measures, Rh and Rs, that give lower bounds on this minimum. In this paper, we provide new and improved methods (in terms of both running time and ability to detect past recombination events) for computing recombination lower bounds. Our principal results include:

- We show that computing the lower bound Rh is NP-hard, and we adapt the greedy algorithm for the set cover problem [2] to obtain a polynomial-time algorithm for computing a diversity-based bound Rg. This algorithm is several orders of magnitude faster than the Recmin program [1], and the bound Rg almost always matches the bound Rh.
- We show that computing the lower bound Rs is also NP-hard, using a reduction from MAX-2SAT. We give an O(m 2^n) time algorithm for computing Rs for a dataset with n haplotypes and m SNPs.
- We propose a new bound RI, which extends the history-based bound Rs using the notion of intermediate haplotypes. This bound detects more recombination events than both Rh and Rs on many real datasets.
- We extend our algorithms for computing Rg and Rs to obtain lower bounds for haplotypes with missing data. These methods can detect more recombination events for the LPL dataset [3] than previous bounds and provide stronger evidence for the presence of a recombination hotspot.
- We apply our lower bounds to a real dataset [4] and demonstrate that they can provide a good indication of the presence and location of recombination hotspots.
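For readers unfamiliar with the greedy algorithm for set cover that the abstract says is adapted to compute Rg, the following is a minimal sketch of the textbook greedy heuristic only. It is not the authors' Rg procedure: the function name `greedy_set_cover`, the toy universe, and the toy subsets are all illustrative assumptions, and how haplotype data would be mapped onto a set cover instance is not described in the abstract.

```python
def greedy_set_cover(universe, subsets):
    """Textbook greedy heuristic for set cover (illustrative, not the paper's Rg code).

    Repeatedly picks the subset that covers the most still-uncovered
    elements. Returns the indices of the chosen subsets, in pick order.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset with the largest overlap with what is left;
        # ties are broken in favour of the lowest index.
        best = max(range(len(subsets)), key=lambda i: len(uncovered & subsets[i]))
        gain = uncovered & subsets[best]
        if not gain:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy instance, unrelated to any dataset in the paper.
universe = range(1, 8)
subsets = [{1, 2, 3, 4}, {4, 5, 6}, {6, 7}, {1, 5, 7}]
cover = greedy_set_cover(universe, subsets)
print(cover)
```

The heuristic runs in polynomial time and does not always return a smallest cover, which is consistent with the abstract's description of Rg as a fast bound that matches Rh almost always rather than always.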