Deciphering function and mechanism of calcium-binding proteins from their evolutionary imprints

Reginald O. Morgan a,b,⁎, Silvia Martin-Almedina a,b, Montserrat Garcia a,b, Jorge Jhoncon-Kooyip a,b,c, Maria-Pilar Fernandez a,b

a Department of Biochemistry and Molecular Biology, Edificio Santiago Gascon, Faculty of Medicine, University of Oviedo, 33006 Oviedo, Spain
b University Institute of Biotechnology of Asturias, Spain
c Universidad Nacional de Educación “Enrique Guzmán y Valle”, Lima, Peru

Received 17 August 2006; received in revised form 18 September 2006; accepted 19 September 2006
Available online 26 September 2006

Abstract

Calcium-binding proteins regulate ion metabolism and vital signalling pathways in all living organisms. Our aim is to rationalize the molecular basis of their function by studying their evolution using computational biology techniques. Phylogenetic analysis is of primary importance for classifying cognate orthologs; profile hidden Markov models (HMM) of individual subfamilies discern functionally relevant sites by conservation probability analysis; and 3-dimensional structures display the integral protein in context. The major classifications of calcium-binding proteins, viz. EF-hand, C2 and ANX, exhibit structural diversity in their HMM fingerprints at the subfamily level, with functional consequences for protein conformation, exposure of receptor interaction sites and/or binding to membrane phospholipids. Calmodulin, S100 and annexin families were characterized in Petromyzon marinus (sea lamprey) to document genome duplication and gene creation events during the key evolutionary transition to primitive vertebrates. Novel annexins from diverse organisms revealed calcium-binding domains with accessory structural features that define their unique molecular fingerprints, protein interactivity and functional specificity.
These include the first single-domain, bacterial annexin in Cytophaga hutchinsonii, the 21 tetrad annexins from the unicellular protist Giardia intestinalis, an ancestor to land plant annexins from the green alga Ostreococcus lucimarinus, invertebrate octad annexins and a critical polymorphism in human ANXA7. Receptor docking models supported the hypothesis of a potential interaction between annexin and C2 domains as a propitious mechanism for ensuring membrane translocation during signal transduction.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Computational biology; Functional determinant; Hidden Markov model (HMM); Membrane-binding mechanism; Molecular evolution; Receptor docking

Biochimica et Biophysica Acta 1763 (2006) 1238–1249
⁎ Corresponding author. Department of Biochemistry and Molecular Biology, Edificio Santiago Gascon, Faculty of Medicine, University of Oviedo, 33006 Oviedo, Spain. Tel.: +34 985 104214; fax: +34 985 103157.
E-mail address: (R.O. Morgan).
0167-4889/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.bbamcr.2006.09.028

1. Introduction

Calcium-binding sites can be defined, in the narrowest sense, by the pattern of carboxyl and carbonyl oxygens in acidic and polar residues that coordinate calcium ions. They may be further classified by their loop geometry as EF-hand (helix–loop–helix), annexin (discontiguous pair of helix–loop–helix) or C2 domain (multiple β-strand). Since this binding is often cooperative with respect to membrane phospholipids and/or local pH, it is important to consider the influence of adjacent residues on bond angle flexibility and other protein properties that influence binding kinetics. The literature on EF-hand motifs and C2 domains, and more recently for annexins, amply demonstrates the essential contribution of the surrounding structural environment, not only to the specificity of calcium-binding itself but to its subsequent coupling with other active domains responsible for conformational changes, membrane binding and ensuing receptor interactions [1–4]. Thus, special residues such as cysteine can add sensitivity to the cellular reduction–oxidation state, hydrophobic or bulky aromatic residues such as tryptophan and phenylalanine perturb protein flexibility for deeper membrane or receptor penetration, while basic residues can strengthen protein bonds with anionic phospholipids independent of a calcium bridge.

A biological definition of calcium-binding protein function requires additional knowledge of the interrelationships among all other functional domains in the integral protein and the physiological network role that protein fulfills. This objective still remains elusive for the majority of calcium-binding proteins, but it is becoming increasingly viable and reliable to infer functionality from structural information using algorithms of computational biology, particularly those that are based on evolutionary theory [5–8]. The reason is that the functional adaptation of a protein originates in the natural selection of its structural features, and this progression can be examined by the systematic comparison of its cognate orthologs as structural and functional equivalents in different species. The sequence data available from genome sequencing projects now enable the elaboration of refined, evolutionary imprints or molecular profiles of the conservation and divergence pattern for every protein subfamily and this information can be viewed in realistic context by incorporating it into 3-dimensional structures.
Such evolutionary models facilitate inferences about function and mechanism and can be further enhanced by accumulated biological knowledge and a guided imagination.

The fact that calcium-binding domains frequently occur in multimers or heteromers within many protein sequences suggests some level of interactive diversity in their function, and alterations of gross structure or occasional ablation of calcium-binding sites also point to supplementary roles or more intricate mechanisms [9–12]. We therefore sought to compare evolutionary aspects of the highly conserved EF-hand protein calmodulin, the more recent and highly diversified vertebrate S100 family, and the ancient, unique superfamily of annexins, with the aims of tracing their distinctive origins and the divergence profiles of their calcium-binding sites. The characterization of new homologs in each group provides insight into the varied mechanisms by which calcium binding and signal transduction are controlled [13–15]. These include the introduction into calcium-binding domains of other residues (esp. Cys, Trp and Lys) capable of modifying calcium- and phospholipid-binding kinetics, rearranging domain architecture, and fostering cross-interactions like annexins with S100 and C2 domains.

2. Materials and methods

Database searches identified potential homologs of calmodulin, S100 and annexin by BLAST (esp. PSI-BLAST) and FASTA sequence comparison of authentic members against protein, cDNA transcript and genomic trace databases at the National Center for Biotechnology Information, USA (http:// ). Protein sequences were retrieved from UniProt ( ) and the PFAM database of protein families ( ). Original genome sequence data for Petromyzon marinus (sea lamprey) originated from Washington University, USA ( ). Genome sequence data for the bacterium Cytophaga hutchinsonii, the green alga Ostreococcus lucimarinus, and the annelid worm Capitella capitata were from the DOE Joint Genome Institute ( ).
The Marine Biological Laboratory (Woods Hole, MA, USA) was responsible for the Giardia intestinalis genome project ( ).

Progressively refined sequence alignments of cognate homologs were subjected to phylogenetic analysis to define their evolutionary relationships, transformed into profile hidden Markov models to create sequence logo signatures, and mapped into 3-dimensional structures to reveal site-specific patterns of conservation and divergence for each subfamily. Such representations focus attention on the evolutionarily (i.e. functionally) important sites and reveal how key structural features interact in a physical context.

Phylogenetic analysis by MEGA [16] performed neighbor-joining analysis on >1000 bootstrapped alignments and results were confirmed by maximum likelihood or Bayesian analysis of selected clades. Molecular profiles of individual subfamilies were created as hidden Markov models (HMMs) by HMMER [17] and visualized with the sequence Logo-Mat server (http://logos. ). Gene structures and chromosomal linkage maps were deduced by visual inspection and the integration of contig data based on existing models [7,15,18]. Site-specific conservation of multiple alignments and sequence threading into 3D models utilized the CONSURF server [19] and protein structure modelling of evolutionary information employed HHPRED and MODELLER [20,21] for the bacterial annexin, SWISS-MODEL DEEPVIEW [22] for template-based protein models, and MOLMOL for molecular presentation ( molmol/ ). Molecular docking models of annexins with C2 domains used the CLUSPRO server [23].

3. Results and discussion

3.1. Origin and evolution of EF-hand motifs

A survey of some 15,000 recognized EF-hand motifs in PFAM confirmed their presence in the extracellular protein milieu of bacteria and intracellularly in hundreds of eukaryotic protein families, where a significant proportion have acquired conformational sensitivity to calcium and varied modular architecture for participation in calcium signalling mechanisms [5]. The typical EF-hand structure consists of a 12-residue loop flanked by α-helical domains, in which Asp side-chain oxygens of loop residues 1, 3 and 5, other atoms at positions 7 and 9, and a side-chain oxygen from Glu in position 12 participate in calcium coordination. Its representation as a hidden Markov model (HMM) sequence logo (Fig. 1) also reveals a conserved, central Gly and flanking Phe residues that further define what is regarded as a particularly successful variation of the more generalized DxDxDG motif [24]. The latter has been observed in distinct contexts, including the excalibur and thrombospondin domains where vicinal cysteines contribute structural support for calcium ligand presentation. The classical example of an EF-hand protein is the highly conserved calmodulin (Fig. 1), which contrasts with pseudo-EF-hand domains of the S100 protein family. The molecular profiles and context of each domain, its response to calcium and target recognition are important determinants of the functional selectivity for diverse EF-hand proteins such as calmodulin and the S100 family [2,12].

Calmodulin presents certain challenges for evolutionary study, because of its uncertain origin in primitive eukaryotes, the strong functional constraint on protein structure conservation, and the divergence in regulatory control of gene expression at 3 distinct loci (encoding 3 identical proteins!) in mammalian genomes [25]. The extreme conservation level of calmodulin (especially in vertebrates) required an extensive alignment of 300 full-length eukaryotic proteins to develop a statistical HMM profile that identified the (dis)similarity in its 4 EF-hand motifs and the strategic conservation of Met and Phe residues secondarily involved in the conformational response and hydrophobic target recognition (Fig. 1). It should be noted that more selective profiles of specific subfamilies or functionally related clades can reveal additional features, such as the replacement by Cys at position 27, Leu-72, Leu-86 and Gln-97 in plants, Phe-100 and Lys-144 also common to invertebrates, or Ser-148 unique to invertebrate calmodulins [25]. Nucleotide HMM profiles of noncoding regions, phylogenetic footprints of promoter regions, and genetic linkage maps of calmodulins can effectively identify conserved DNA regulatory elements to distinguish orthologs from paralogs.

HMM statistical representations of sequence conservation are especially informative about functionally important DNA elements and protein residues when confined to individual subfamilies. It is frequently meaningful to view this information in the context of a 3-dimensional model to identify discontiguous segments that interact physically or function cooperatively. The protein sequence alignment used to create the profile HMM was subjected to phylogenetic analysis and amino acid (aa) conservation analysis for incorporation of this information into compatible 3D structures available in the Protein Data Bank. CONSURF3 was used to create such a functional map of the calmodulin protein structure (Fig. 2), based on site-specific variation in aa conservation across 300 calmodulin proteins, using two different conformational models in the absence (pdb:1cfd) and presence of calcium (pdb:1cll).
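As a rough illustration of the site-specific conservation analysis described above, the following Python sketch scores each column of a protein alignment by its Shannon information content, the quantity that a basic sequence logo displays as column height. It is a simplified stand-in, not the HMMER, LOGO-MAT or CONSURF computation used in this study: it assumes a uniform background over 20 amino acids and ignores gaps, sequence weighting and phylogeny. The function name `column_information` and the toy alignment are invented for the example.

```python
import math
from collections import Counter


def column_information(alignment):
    """Return the information content (bits) of each alignment column.

    Assumes equal-length aligned protein sequences and a uniform
    background over 20 amino acids. Gap characters ('-') are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(20)  # score of a fully conserved column
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)  # column of gaps carries no information
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(max_bits - entropy)
    return scores


# Hypothetical toy alignment (six columns, four sequences), for illustration only.
toy_alignment = ["DKDGDG", "DADGNG", "DKDGNG", "DIDGDG"]
scores = column_information(toy_alignment)
for position, bits in enumerate(scores, start=1):
    print(f"column {position}: {bits:.2f} bits")
```

Fully conserved columns reach the maximum of log2(20) bits, while variable columns score lower. Colouring a 3D structure by such per-site scores is the general idea behind the conservation map in Fig. 2, although the published map rests on a more elaborate evolutionary model.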
The dynamic, functional roles of individual aa have been clarified by experimental models of calmodulin [26], but such evolutionary profiles can offer both insight and corroborative evidence about 3D structure–function relationships in less well characterized protein families.

Fig. 1. Profile hidden Markov models (pHMM) of the canonical EF-hand motif (center panel), variants in excalibur and thrombospondin 3 (upper panels) and distribution in the calmodulin protein family (bottom panels). Amino acid sequence alignments were compiled for EF-hand motifs from 1000 diverse proteins, excalibur (38 aa in 10 bacterial proteins), thrombospondin 3 (15 aa in 44 eukaryotic proteins) and full-length calmodulin (149 aa in 300 proteins). Analysis by HMMER and visualization as sequence logos by LOGO-MAT show residue frequencies by their relative height and the site-specific probabilities as total column height, reflecting informative value with respect to conservation and function. Acidic residues implicated in calcium ion coordination are starred (open where poorly conserved) and the prominent conservation of cysteines in excalibur/thrombospondin or methionines and phenylalanines in calmodulin affirms their accessory functional contribution to calcium-binding kinetics, conformational change and hydrophobic receptor interaction.

3.2. Sea lamprey as a model organism for vertebrate evolution

The emergence of vertebrates from invertebrate stock about 500–600 million years ago has been associated with successive whole genome duplications [27]. We were interested in comparing the gene duplication patterns for calmodulin as a vital regulatory gene, S100 proteins that originated in vertebrates, and annexins that expanded into a distinct vertebrate family. We focused on genome analysis of Petromyzon marinus (sea lamprey), a primitive, jawless vertebrate that may represent an intermediate state, having many duplicated genes compared to invertebrates and suggestive of at least one whole genome duplication. The retrieval and assembly of whole genome shotgun traces with high-scoring matches to calmodulin or S100 authentic sequences led to the reconstruction of 3 genes in each family (Fig. 3), apparently representing the complete repertoires, given the intermediate stage of genome sequencing. Two of the calmodulins had 99% aa identity with each other and 85% nucleotide identity in coding regions, with 98% aa identity to human calmodulin and identical gene structure with 6 coding exons. A third lamprey calmodulin with 95% aa identity to the others was represented by a single expressed sequence tag without genomic confirmation, but with sequence characteristics (e.g. Phe-100, Lys-144, Ser-148) resembling invertebrate calmodulins [25].

Genomic trace assembly of lamprey annexin sequences has reached a gene number of 12, equivalent to that of mammals, but this so far appears to represent half the expected number of gene families present in duplicate copy. Thus, both the founding member annexins A13, A7 and A11 and annexins A1 through A4 seem to be present as paralogous duplicates, while annexins A5, A6, A8, A9 and A10 remain to be detected. This is interpreted to suggest that not all vertebrate annexin subfamilies may have been created at the divergence time of jawless fish from the teleost–tetrapod lineage, and that the lamprey genome itself may have suffered a unique tetraploidization event that duplicated an incomplete set of annexin genes, including annexin A7 (Fig. 3), which exists as a single gene in cartilaginous fish and is absent from all teleost fish genomes. Further comparative genome studies will be required to determine the model organisms most suitable for documenting gene and genome expansion in formative vertebrates.

The three S100 proteins deduced from lamprey genomic traces shared approximately 42% aa pairwise identity between them, indicating they are unlikely to be recent duplication products within the lamprey lineage. Phylogenetic analysis (Fig. 4) and HMM models provided clear statistical identification for the known S100 subfamilies and just one of the lamprey proteins was thereby confirmed to be a true S100P ortholog. A second lamprey S100 branched from the base of the S100A2–A6 clade and may therefore have originated from their common ancestor prior to its amplification in higher vertebrates. The third lamprey protein was weakly associated with the S100A10–A11 pair. The pending completion and genome assembly for lamprey and direct comparison with more teleost fish S100 subfamilies should resolve the true genetic relationships of these S100 loci with their mammalian homologs.

Fig. 2. Structural representation of amino acid evolutionary conservation in calmodulin. Different conformational models in the absence of calcium (pdb:1cfd top) and presence (pdb:1cll bottom) were used to depict the spatial arrangement of conserved (burgundy) and variable (blue) atoms and bonds in calmodulin structures, computed from the sequence alignment by the CONSURF server and modelled by MolMol. Conserved residues directly associated with calcium ion coordination are supplemented by others with accessory roles in maintaining conformational flexibility and hydrophobic receptor interactions (see text).

Fig. 4. Phylogeny of sea lamprey S100 proteins.
Three novel lamprey S100 proteins were subjected to neighbor-joining analysis of 1000 bootstrap alignments in MEGA together with 150 representatives from known subfamilies to determine their evolutionary relationships. The fanning of branch tips reflects the number of orthologs pertaining to each subfamily and numbers at bifurcations give the branch bootstrap support. S100_Pma1 was thus confirmed to pertain to the S100P subfamily, while Pma2 and Pma3 did not strongly associate with any particular clade for formal classification. Orthologous representatives (n=12) determined to belong to each of the S100A6, S100A10 and S100A11 subfamilies of annexin partners were selected for more detailed pHMM analysis.

Fig. 3. Novel sequences of calcium-binding proteins from sea lamprey. Sequence searches and concatenation of contigs from the genome of Petromyzon marinus permitted the reconstruction of gene organization and prediction of coding regions for putative lamprey calmodulins, S100 proteins and annexins. Calmodulins designated Pma1 and Pma2 were derived from complete gene structures with splicing patterns identical to their vertebrate homologs, whereas Pma3 derived from the dual read of a single expressed sequence tag (GenBank accession no. EB717185) without genomic evidence and included Phe-100, Lys-144 and Ser-148 characteristic of invertebrate calmodulins (reverse highlight). The three S100 sequences represent the apparently complete and earliest known gene family repertoire in this primitive vertebrate. The detection of paralogous copies of annexin A7 in lamprey suggests that this lineage (Agnathans) may have suffered ancient genome tetraploidy to cause a doubling of its half complement of mammalian annexin subfamilies. This is consistent with the presence of a single ANXA7 copy (with characteristic Trp and Cys in reverse highlight) in later-diverging cartilaginous fishes and the subsequent silencing of this gene altogether in bony fishes (see text).
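The pairwise identity values quoted in Section 3.2 (for example 99% aa identity between two lamprey calmodulins, or about 42% among the three S100 proteins) are computed from aligned sequences. A minimal Python sketch of one common convention follows. It assumes the sequences are already aligned to equal length and counts only positions where neither sequence has a gap. The helper names and the three toy sequences are hypothetical, and the source does not state which convention was used; other denominators (full alignment length, length of the shorter sequence) would give somewhat different percentages.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip positions gapped in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


def pairwise_identities(aligned):
    """Map each unordered pair of sequence names to its percent identity."""
    return {
        (name_a, name_b): percent_identity(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


# Hypothetical toy alignment, invented for illustration only.
toy = {
    "seqA": "MKLV-EELD",
    "seqB": "MKIVAEELD",
    "seqC": "MRLV-DELE",
}
identities = pairwise_identities(toy)
for pair, value in identities.items():
    print(pair, f"{value:.1f}%")
```

A uniformly low set of pairwise values, as reported for the three lamprey S100 proteins, is the kind of pattern the text takes to argue against recent duplication within the lineage.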