Phylogenetic reconstruction from gene-rearrangement data is attracting increasing attention from biologists and computer scientists. Methods used in reconstruction include distance-based methods, parsimony methods using sequence encodings, and direct optimization. The last of these, pioneered by Sankoff and extended by us with the software suite GRAPPA, is the most accurate; however, because it searches exhaustively, it can be applied only to small datasets of fewer than 15 taxa. We have successfully scaled it up to 1,000 genomes by integrating it with a disk-covering method (DCM-GRAPPA), but the recursive decomposition may need many levels of recursion to handle datasets with 1,000 or more genomes. We thus investigated quartet-based approaches, which directly decompose a dataset into subsets of four taxa each; such approaches have been well studied for sequence data, but not for gene-rearrangement data. We give an optimization algorithm for the NP-hard problem of computing an optimal tree for each quartet, present a variation of the dyadic method (using heuristics to choose suitable short quartets), and use both in simulation studies. We find that our quartet-based method can handle more genomes than the base version of GRAPPA, which lets us reduce the number of levels of recursion in DCM-GRAPPA, but that it is more sensitive to the rate of evolution, with error rates increasing rapidly as saturation is approached.
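To make the quartet decomposition concrete, the following is a minimal sketch of splitting a gene-order dataset into four-taxon subsets and picking one of the three possible topologies for each. It is not the optimization algorithm described above: as a stand-in, it assumes a simple distance-based rule (choose the split with the smallest sum of within-pair distances) applied to a simplified breakpoint distance on signed linear genomes without end caps. All function names are hypothetical.

```python
from itertools import combinations


def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that are absent from g2.

    Assumption: signed, linear genomes with identical gene content and
    no end caps. An adjacency (a, b) equals (-b, -a) on the other strand.
    """
    adj2 = set()
    for a, b in zip(g2, g2[1:]):
        adj2.add((a, b))
        adj2.add((-b, -a))
    return sum((a, b) not in adj2 for a, b in zip(g1, g1[1:]))


def quartet_topology(names, dist):
    """Return the split ((a, b), (c, d)) with the smallest pair-distance sum.

    Assumption: a distance-based stand-in for per-quartet tree scoring.
    """
    a, b, c, d = names
    splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(
        splits,
        key=lambda s: dist[frozenset(s[0])] + dist[frozenset(s[1])],
    )


def all_quartets(genomes):
    """Map every four-taxon subset (sorted names) to its chosen split."""
    dist = {
        frozenset((x, y)): breakpoint_distance(genomes[x], genomes[y])
        for x, y in combinations(genomes, 2)
    }
    return {
        q: quartet_topology(q, dist)
        for q in combinations(sorted(genomes), 4)
    }


if __name__ == "__main__":
    # Toy data (hypothetical): A and B differ by one inversion, as do C and D.
    genomes = {
        "A": [1, 2, 3, 4, 5, 6],
        "B": [1, -2, 3, 4, 5, 6],
        "C": [1, 3, 5, 2, 4, 6],
        "D": [1, 3, 5, 2, -4, 6],
    }
    for quartet, split in all_quartets(genomes).items():
        print(quartet, "->", split)
```

A dataset of n genomes yields n-choose-4 quartets, so enumerating all of them grows as n^4; that growth is one reason to select a subset of short quartets rather than use every one.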