Co-evolution and information signals in biological sequences

Authors:
A. Carbone;L. Dib
Affiliations:
-;-
Venue:
Theoretical Computer Science
Year:
2011

Citing 4
Cited 0

Physical complexity of symbolic sequences. Physica D
A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments. Bioinformatics
Using information theory to search for co-evolving residues in proteins. Bioinformatics
Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics

Quantified Score

Hi-index	5.23

Abstract

The information content of a pool of sequences has been defined in information theory through entropic measures aimed at capturing the amount of variability within sequences. When dealing with biological sequences coding for proteins, a first approach is to align these sequences, estimate the probability of each amino acid occurring at each alignment position, and combine these values through an "entropy" function whose minimum corresponds to the case where, at each position, every amino acid is equally likely to occur. This model is too restrictive when the purpose is to evaluate the sequence constraints that have to be conserved to maintain the function of the proteins under random mutations. In fact, co-evolution of amino acids appearing in pairs or tuples of positions in sequences constitutes a fine signal of important structural, functional and mechanical information for protein families. It is clear that classical information theory should be revisited when applied to biological data. A large number of approaches to co-evolution of biological sequences have been developed in the last decade. We present a few of them and discuss their limitations and some related questions, such as the generation of random structures to validate predictions based on co-evolution, which appear crucial for new advances in structural bioinformatics.
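
To make the per-position model and the pairwise signal concrete, here is a minimal Python sketch. It is not the authors' method: it uses plain Shannon entropy per alignment column and plain mutual information between two columns, which is only the simplest information-theoretic co-variation measure and ignores phylogeny, gaps and small-sample bias. The function names and the toy alignment are hypothetical. Under the standard Shannon convention, entropy is largest for a column where all amino acids are equally likely, so it is the matching information content (maximum entropy minus observed entropy) that reaches its minimum there.

```python
import math
from collections import Counter


def column(alignment, i):
    """Return the symbols found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_content(symbols, alphabet_size=20):
    """Maximum possible entropy minus observed entropy (assumes 20 amino acids)."""
    return math.log2(alphabet_size) - entropy(symbols)


def mutual_information(alignment, i, j):
    """MI(i, j) = H(i) + H(j) - H(i, j), estimated from raw column counts."""
    col_i = column(alignment, i)
    col_j = column(alignment, j)
    joint = list(zip(col_i, col_j))  # paired symbols give the joint distribution
    return entropy(col_i) + entropy(col_j) - entropy(joint)


# Hypothetical toy alignment (illustration only, not real protein data):
# position 0 is fully conserved, positions 1 and 3 vary together (D with L,
# E with I), and position 2 varies almost independently of position 1.
toy_alignment = [
    "ADKL",
    "ADRL",
    "AEKI",
    "AERI",
    "ADKL",
    "AERI",
]

for i in range(4):
    print("position", i, "entropy", round(entropy(column(toy_alignment, i)), 3))

print("MI(1, 3):", round(mutual_information(toy_alignment, 1, 3), 3))
print("MI(1, 2):", round(mutual_information(toy_alignment, 1, 2), 3))
```

In this toy case positions 1, 2 and 3 have the same per-column entropy, so a purely per-position measure cannot tell them apart, whereas the pairwise measure separates the covarying pair (1, 3) from the nearly independent pair (1, 2). This is the kind of signal the abstract says a per-position model misses.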