We describe a novel method for efficient reconstruction of phylogenetic trees, based on sequences of whole genomes or proteomes. The core of our method is a new measure of pairwise distance between sequences whose lengths may vary greatly. This measure is based on information-theoretic tools (Kullback-Leibler relative entropy). We present an algorithm for efficiently computing these distances. The algorithm uses suffix arrays to compute the distance between two sequences of length ℓ in O(ℓ) time. It is fast enough to enable the construction of the phylogenomic tree for hundreds of species, and the phylogenomic forest for almost two thousand viruses. An initial analysis of the results exhibits a remarkable agreement with “acceptable phylogenetic truth”. To assess our approach, we implemented it together with a number of alternative approaches, including two that were previously published in the literature. When their outcomes were compared to ours, using a “traditional” tree and a standard tree comparison method, our algorithm improved upon the “competition” by a substantial margin.
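To make the notion of a relative-entropy distance between sequences concrete, here is a minimal sketch in Python. It is not the measure or the suffix-array algorithm described above, whose details the abstract does not give. It assumes a much simpler stand-in: each sequence is summarized by its smoothed k-mer frequencies, and the distance is the symmetrized Kullback-Leibler relative entropy between the two distributions. The function names, the DNA alphabet, the value of k, and the pseudocount are all illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import log2


def kmer_distribution(seq, k, alphabet="ACGT", pseudocount=1.0):
    """Smoothed k-mer frequencies over every word of length k (hypothetical helper)."""
    # Count every overlapping window of length k in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(w) for w in product(alphabet, repeat=k)]
    # The pseudocount keeps every probability positive, so the
    # relative entropy below is always finite.
    total = sum(counts[w] for w in words) + pseudocount * len(words)
    return {w: (counts[w] + pseudocount) / total for w in words}


def relative_entropy(p, q):
    """Kullback-Leibler relative entropy D(p || q), in bits."""
    return sum(p[w] * log2(p[w] / q[w]) for w in p)


def symmetric_distance(seq_a, seq_b, k=2):
    """Symmetrized relative entropy between the k-mer distributions of two sequences."""
    p = kmer_distribution(seq_a, k)
    q = kmer_distribution(seq_b, k)
    return relative_entropy(p, q) + relative_entropy(q, p)


if __name__ == "__main__":
    a = "ACGTACGTACGTACGTTTGA"
    b = "GGGGCCCCGGGGCCCCGGCC"
    print(symmetric_distance(a, a))  # identical sequences
    print(symmetric_distance(a, b))  # sequences with different composition
```

Because the sketch works on normalized frequencies, it can compare sequences of very different lengths, which is the property the abstract emphasizes. It enumerates all |alphabet|^k words, however, so it is only practical for small k and makes no claim to the O(ℓ) running time of the suffix-array approach.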