Methodologies for studying protein co-evolution
Co-evolution, which can be defined as the interdependence of evolutionary histories, has been a fundamental part of evolutionary theory since the time of Darwin himself. Inter-species co-evolution and co-adaptation are known to have a large effect on the evolutionary paths and characteristics of the organisms involved. More recently, co-evolutionary concepts have been brought to the molecular level. The evolution of many genes/proteins is not independent but entangled with that of others. The same happens at a lower level for individual residues within proteins. Molecular co-evolution can be due to specific co-adaptation between the two co-evolving elements, where changes in one of them are compensated by changes in the other, or to a less specific external force that affects the evolutionary rates of both elements to a similar extent. In both cases, independently of the underlying cause, co-evolutionary signatures between genes/proteins serve as markers of physical interactions and/or functional relationships. For this reason, a plethora of computational methods have emerged for studying co-evolution at the protein or residue level in order to predict features such as protein-protein interactions, residue contacts within protein structures and protein functional sites. Co-evolution allows proteins to change while maintaining their interactions and, consequently, it plays a very important role in key biological systems. Accordingly, applying these co-evolution-inspired methodologies has provided insight into the functioning of these systems.
Available co-evolution related methods
Category | Method | Input | Analysis | Main application | Software availability | Servers and databases | Refs |
Inter-residue co-evolution | Mutual Information | One large family MSA | Simple inter-position co-evolution | Protein contacts (model selection for homology modeling) | Java code (http://www.afodor.net/covariance1_1.zip) | Coevolution analysis server (http://coevolution.gersteinlab.org/coevolution/) | 1 |
Inter-residue co-evolution | Mutual information corrected (Mip) | One large family MSA | Inter-position co-evolution without phylogenetic contribution | Protein contacts (model selection for homology modeling) | Perl code (Suppl. Mat. Ref) | - | 2 |
Inter-residue co-evolution | McBASC | One family alignment | Simple inter-position co-evolution | Protein contacts (model selection for homology modeling) | Binary files for every OS (*) & Java code (http://www.afodor.net/covariance1_1.zip) | Coevolution analysis server (http://coevolution.gersteinlab.org/coevolution/) | 3 |
Inter-residue co-evolution | CAPS | One small alignment [optional: second alignment or PDB] | Inter-position co-evolution without phylogenetic contribution | Protein contacts (model selection for homology modeling) | Perl code (http://bioinf.gen.tcd.ie/~faresm/software/software.html#caps) | CAPS server (http://bioinf.gen.tcd.ie/caps/home.html) | 4 |
Inter-residue co-evolution | DCA / DCA optimized | One large family MSA | Pair-specific inter-position co-evolution | Protein contacts (ab initio protein structure prediction) | Matlab code (*) | - | 5,6 |
Inter-residue co-evolution | PSICOV | One large family MSA | Pair-specific inter-position co-evolution | Protein contacts (ab initio protein structure prediction) | Fortran & C code (http://bioinfadmin.cs.ucl.ac.uk/downloads/PSICOV/) | - | 7 |
SDPs | Evolutionary trace | One family alignment [optional: PDB and/or tree] | SDPs | Ligand and protein interaction specificity | - | Evolutionary Trace Server (http://mammoth.bcm.tmc.edu/ETserver.html) | 8 |
SDPs | SDPsite | One family alignment and a tree | SDPs | Ligand and protein interaction specificity | - | SDPsite Server (http://bioinf.fbb.msu.ru/SDPsite/index.jsp) | 9 |
SDPs | Mutational behaviour | One family alignment | SDPs | Ligand and protein interaction specificity | - | Treedet Server (http://treedetv2.bioinfo.cnio.es/treedet/index.html) | 10 |
SDPs | Sequence Space | One family alignment | SDPs and subfamilies | Ligand and protein interaction specificity | Binary files for every OS (*) | - | 11 |
SDPs | S3det | One family alignment | SDPs | Ligand and protein interaction specificity | - | Treedet Server (http://treedetv2.bioinfo.cnio.es/treedet/index.html) | 12 |
SCA-like | SCAold | One family alignment | Conditioned conservation | Intra-protein pathways (allostery) | Binary file for Windows (*) & Java code (http://www.afodor.net/covariance1_1.zip) | - | 13 |
SCA-like | SCAnew | One family alignment | Subfamily-specific conservation | Intra-protein pathways (allostery) | Matlab toolbox (*) | - | 14 |
Inter-protein co-evolution | MirrorTree | Two alignments of orthologous sequences | Simple inter-protein co-evolution | Physical and functional interactions | Binary files for every OS (*) | MirrorTree Server (http://csbg.cnb.csic.es/mtserver/) | 15 |
Inter-protein co-evolution | i2h | Two alignments of orthologous sequences | Simple inter-protein co-evolution | Physical and functional interactions | Binary files for every OS (*) | - | 16 |
Inter-protein co-evolution | tol-MirrorTree | Sequence distance matrices for two sets of orthologs and for the species tree (16S rRNA tree) | Inter-protein co-evolution without phylogenetic contribution | Physical and functional interactions | Binary files for every OS (*) | - | 17 |
Inter-protein co-evolution | ContextMirror | Evolutionary distances of a big set of groups of orthologs | Pair-specific inter-protein co-evolution | Physical and functional interactions | Binary files for every OS (*) | EcID database (http://ecid.bioinfo.cnio.es/) | 18 |
Inter-protein co-evolution | MMM | Sequence distance matrices for two sets of homologs | Inter-protein co-evolution of the strongest co-evolving sequences in the alignments | Physical and functional interactions | Binary files for every OS (*) | MatrixMatchMaker Web interface (http://www.uhnresearch.ca/labs/tillier/MMMWEBvII/MMMWEBvII.php); MMM-D: database of co-evolving proteins (http://tillier.uhnres.utoronto.ca/MMMD.php) | 19 |
Inter-protein co-evolution | Phylogenetic Profiles | - | Sequence presence/absence-associated inter-protein co-evolution | Physical and functional interactions | - | STRING database (http://www.string-db.org/) | 20 |
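To make the two simplest ideas in the table concrete, the following is a minimal Python sketch, not the code distributed with any of the listed tools. It computes plain mutual information between two alignment columns (the uncorrected measure, without the phylogeny or entropy corrections used by Mip) and a MirrorTree-style score, taken here as the Pearson correlation between the upper triangles of two distance matrices whose rows follow the same species order. The function names, the toy alignment and the toy matrices are all illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an MSA.

    `msa` is a list of equal-length aligned sequences (hypothetical input
    format chosen for this sketch). No gap handling or correction is applied.
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlation between two inter-ortholog distance matrices.

    Both matrices must be square, symmetric and indexed by the same
    species in the same order; only the upper triangle is compared.
    """
    n = len(dist_a)
    xs = [dist_a[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [dist_b[i][j] for i in range(n) for j in range(i + 1, n)]
    return pearson(xs, ys)


# Toy data (made up for illustration only).
toy_msa = ["AKLE", "AKLE", "ARLD", "ARLD", "GKME", "GRMD"]
toy_dist_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
toy_dist_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]

if __name__ == "__main__":
    print(mutual_information(toy_msa, 1, 3))  # perfectly covarying columns
    print(mutual_information(toy_msa, 0, 1))  # statistically independent columns
    print(mirrortree_score(toy_dist_a, toy_dist_b))
```

On real alignments, raw mutual information is inflated by shared phylogeny and by column entropy, which is what the corrected and pair-specific methods in the table are designed to address. Likewise, the plain correlation of distance matrices includes the background similarity due to the species tree, which tol-MirrorTree attempts to remove.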
References:
1. Korber, B. T., Farber, R. M., Wolpert, D. H. & Lapedes, A. S. Covariation of mutations in the V3 loop of human immunodeficiency virus type 1 envelope protein: an information theoretic analysis. Proc. Natl. Acad. Sci. U.S.A. 90, 7176–7180 (1993).
2. Dunn, S. D., Wahl, L. M. & Gloor, G. B. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Bioinformatics 24, 333–340 (2008).
3. Göbel, U., Sander, C., Schneider, R. & Valencia, A. Correlated mutations and residue contacts in proteins. Proteins 18, 309–317 (1994).
4. Fares, M. A. & Travers, S. A. A. A novel method for detecting intramolecular coevolution: adding a further dimension to selective constraints analyses. Genetics 173, 9–23 (2006).
5. Weigt, M., White, R. A., Szurmant, H., Hoch, J. A. & Hwa, T. Identification of direct residue contacts in protein-protein interaction by message passing. Proc. Natl. Acad. Sci. U.S.A. 106, 67–72 (2009).
6. Morcos, F. et al. Direct-coupling analysis of residue coevolution captures native contacts across many protein families. Proc. Natl. Acad. Sci. U.S.A. 108, E1293–301 (2011).
7. Jones, D. T., Buchan, D. W. A., Cozzetto, D. & Pontil, M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 28, 184–190 (2012).
8. Mihalek, I., Res, I. & Lichtarge, O. A family of evolution-entropy hybrid methods for ranking protein residues by importance. J. Mol. Biol. 336, 1265–1282 (2004).
9. Kalinina, O. V., Gelfand, M. S. & Russell, R. B. Combining specificity determining and conserved residues improves functional site prediction. BMC Bioinformatics 10, 174 (2009).
10. del Sol Mesa, A., Pazos, F. & Valencia, A. Automatic Methods for Predicting Functionally Important Residues. J. Mol. Biol. 326, 1289–1302 (2003).
11. Casari, G., Sander, C. & Valencia, A. A method to predict functional residues in proteins. Nat. Struct. Biol. 2, 171–178 (1995).
12. Rausell, A., Juan, D., Pazos, F. & Valencia, A. Protein interactions and ligand binding: from protein subfamilies to functional specificity. Proc. Natl. Acad. Sci. U.S.A. 107, 1995–2000 (2010).
13. Lockless, S. W. & Ranganathan, R. Evolutionarily conserved pathways of energetic connectivity in protein families. Science 286, 295–299 (1999).
14. Reynolds, K. A., McLaughlin, R. N. & Ranganathan, R. Hot spots for allosteric regulation on protein surfaces. Cell 147, 1564–1575 (2011).
15. Pazos, F. & Valencia, A. Similarity of phylogenetic trees as indicator of protein-protein interaction. Protein Eng. 14, 609–614 (2001).
16. Pazos, F. & Valencia, A. In silico two-hybrid system for the selection of physically interacting protein pairs. Proteins 47, 219–227 (2002).
17. Pazos, F., Ranea, J. A. G., Juan, D. & Sternberg, M. J. E. Assessing protein co-evolution in the context of the tree of life assists in the prediction of the interactome. J. Mol. Biol. 352, 1002–1015 (2005).
18. Juan, D., Pazos, F. & Valencia, A. High-confidence prediction of global interactomes based on genome-wide coevolutionary networks. Proc. Natl. Acad. Sci. U.S.A. 105, 934–939 (2008).
19. Tillier, E. R. M. & Charlebois, R. L. The human protein coevolution network. Genome Res. 19, 1861–1871 (2009).
20. Pellegrini, M., Marcotte, E. M., Thompson, M. J., Eisenberg, D. & Yeates, T. O. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl. Acad. Sci. U.S.A. 96, 4285–4288 (1999).
© 2012, Computational Systems Biology Group. CNB-CSIC