Motivation: Phylogenies, the evolutionary histories of groups of organisms, play a major role in representing relationships among biological entities. Although many biological processes can be effectively modeled as tree-like relationships, others, such as hybrid speciation and horizontal gene transfer (HGT), result in networks, rather than trees, of relationships. Hybrid speciation is a significant evolutionary mechanism in plants, fish and other groups of species. HGT plays a major role in bacterial genome diversification and is a significant mechanism by which bacteria develop resistance to antibiotics. Maximum parsimony is one of the most commonly used criteria for phylogenetic tree inference. Roughly speaking, inference based on this criterion seeks the tree that minimizes the amount of evolution. In 1990, Jotun Hein proposed using this criterion for inferring the evolution of sequences subject to recombination. Preliminary results on small synthetic datasets in Nakhleh et al. (2005) demonstrated the criterion's application to phylogenetic network reconstruction in general and HGT detection in particular. However, the naive algorithms used by the authors are inapplicable to large datasets due to their demanding computational requirements. Further, no rigorous theoretical analysis of computing the criterion was given, nor was it tested on biological data.

Results: In the present work we prove that the problem of scoring the parsimony of a phylogenetic network is NP-hard and provide an improved fixed-parameter tractable algorithm for it. Further, we devise efficient heuristics for parsimony-based reconstruction of phylogenetic networks. We test our methods on both synthetic and biological data (rbcL gene in bacteria) and obtain very promising results.

Contact: ssagi@math.berkeley.edu
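
As an illustration of what "minimizing the amount of evolution" means for a single fixed tree, the following is a minimal sketch of Fitch's classical small-parsimony algorithm. It is not the network-scoring algorithm or the heuristics of this work; it only scores ordinary rooted binary trees. The tree encoding (nested tuples), the function names `fitch_site` and `parsimony_score`, and the toy sequences are assumptions made for the example.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a tree is a leaf name (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate root states, minimum number of changes) for one site."""
    if isinstance(tree, str):
        # A leaf carries its observed state and needs no changes.
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change at this node.
        return common, left_cost + right_cost
    # The children disagree: one change is needed on an edge below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, sequences: Dict[str, str]) -> int:
    """Sum the per-site Fitch costs over all aligned sites."""
    n_sites = len(next(iter(sequences.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in sequences.items()}
        total += fitch_site(tree, column)[1]
    return total


# Toy aligned sequences (made up for illustration only).
sequences = {"A": "ACGT", "B": "ACGA", "C": "TCGA", "D": "TCCA"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_score(tree_1, sequences))  # 3
print(parsimony_score(tree_2, sequences))  # 4
```

On this toy input the first tree explains the data with three changes and the second with four, so the parsimony criterion would prefer the first. Scoring a tree this way takes time linear in the number of nodes per site, in contrast to the network case, which the Results show to be NP-hard.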