Research article: Estimating sufficient statistics in co-evolutionary analysis by mutual information

Authors:
Philipp Weil;Franziska Hoffgaard;Kay Hamacher
Affiliations:
Theoretical Biology and Bioinformatics, Institute of Microbiology and Genetics, Department of Biology, TU Darmstadt, Schnittspahnstr. 10, 64287 Darmstadt, Germany (all authors)
Venue:
Computational Biology and Chemistry
Year:
2009

Abstract

Mutual information (MI) is a standard measure in information theory for detecting and quantifying correlated signals and events in both empirical data sets and theoretical models. In computational biology, MI has turned out to be particularly useful in studies of co-evolutionary signals between sites within biomolecules. A key issue in applying MI is, however, the choice of a correct reference system, or null model, for understanding finite-size effects in the underlying finite data set. Although some bioinformatics studies report rigorous results for theoretical, well-designed random distributions, data from real-world proteins have never been used to quantify the effect of finite-size samples. The impact of real-world statistics is, however, highly relevant for researchers in all fields concerned with detecting evolutionary signals within biological sequences. We present results on such effects in finite-sized biological data sets and point to future research directions. We focus primarily on bacterial ribosomal proteins as a prototypical example in molecular evolution. We compare our results with previously published suggestions, give an empirical formula, and propose a protocol to guide future research projects.
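
To make the finite-size issue concrete, the following is a minimal sketch of a plug-in MI estimate for two alignment columns together with a simple shuffling null model. It is not the empirical formula or protocol of the article, whose details are not given in this abstract. The function names `mutual_information` and `shuffle_null`, the use of column shuffling as the null model, and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(col_x, col_y):
    """Plug-in (maximum-likelihood) MI estimate in bits for two columns."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same length")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def shuffle_null(col_x, col_y, n_shuffles=200, seed=0):
    """MI values after shuffling col_y, which destroys any real coupling.

    Hypothetical null model for illustration only: shuffling keeps the
    per-column symbol frequencies but removes the pairing between columns.
    """
    rng = random.Random(seed)
    shuffled = list(col_y)
    values = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        values.append(mutual_information(col_x, shuffled))
    return values


if __name__ == "__main__":
    # Toy data (assumption): two independent columns over a 20-letter
    # alphabet, observed in only 30 sequences.
    alphabet = "ACDEFGHIKLMNPQRSTVWY"
    rng = random.Random(1)
    col_a = [rng.choice(alphabet) for _ in range(30)]
    col_b = [rng.choice(alphabet) for _ in range(30)]

    observed = mutual_information(col_a, col_b)
    null = shuffle_null(col_a, col_b)
    print("observed MI (bits):", round(observed, 3))
    print("mean null MI (bits):", round(sum(null) / len(null), 3))
```

Although the two toy columns are generated independently, both the observed and the shuffled MI values come out clearly above zero. This is the kind of finite-size effect that a reference system has to account for before an MI value is read as a co-evolutionary signal.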