In Silico Biology 9, 0029 (2009); ©2009, Bioinformation Systems e.V.  


Interface of apoptotic protein complexes has distinct properties


Pralay Mitra1,2, Riddhiman Dhar2 and Debnath Pal1,2*




1 Supercomputer Education and Research Centre,
2 Bioinformatics Centre,
   Indian Institute of Science, Bangalore-560012, India

* Corresponding author

   Phone: +91-80-2293 2901, Fax: +91-80-2360 0551
   Email: dpal@serc.iisc.ernet.in





Edited by E. Wingender; received August 11, 2009; revised September 17; accepted September 19, 2009; published November 26, 2009



Abstract

Apoptosis is a programmed mechanism of cell death that is a normal component of development and health of multi-cellular organisms. In this study, we ask if interface properties of apoptotic protein complexes are different from those of protein complexes in general. We find that although the overall distributions of interface size, surface complementarity, hydrogen bonding and hydrophobicity in apoptotic protein complexes are similar to general interface properties, apoptotic complexes tend to have more fragmented interfaces and different secondary structural preferences. The statistics on the number of interfaces where specific amino acid(s) occur with significantly enhanced frequency suggest that Arg, Met and Asp are the most important functional residues. The role of Met is believed to be unique, as evidenced from the existing data on hot spot potential of residues. These findings together provide insight into the possible role of various physico-chemical attributes at the protein interface in regulation of the apoptosis process.

Keywords: apoptosis, protein-protein interaction, residue propensity, protein interface, secondary structure



Introduction

Apoptosis is the mechanism of cell death in which cells die in a regulated fashion. This programmed cell death is a normal component of development and health of multi-cellular organisms. Inappropriate apoptosis leads to major diseases including cancer, acquired immunodeficiency syndrome and neurodegenerative disorders such as Alzheimer's disease and Huntington's disease [Thompson, 1995].

Protein complexes play a major role in the apoptosis process. For example, the caspases, which are the central molecules in apoptosis, are dimeric cysteine proteases that cleave after an aspartate (Asp) [Cohen, 1997]. They are divided into two categories: the initiators, which include caspase-2, -8, -9 and -10 and are activated in response to death signals, and the executioners, which include caspase-3, -6 and -7 and carry out the execution of the apoptosis process in the dimeric state. These caspases remain present inside the cells as inactive zymogens unless activated during the apoptosis process [Shi, 2002]. The activation of caspases occurs via two signaling pathways, extrinsic or intrinsic. The extrinsic pathway is initiated by the binding of a death ligand to the death receptor, followed by the recruitment of FADD and procaspase-8 to form the death-inducing signaling complex (DISC) [Algeciras-Schimnich et al., 2002]. The intrinsic pathway is initiated by the formation of the apoptosome complex, where procaspase-9 binds to apaf-1 (apoptotic peptidase activating factor 1) in the presence of cytochrome c and dATP, which leads to the active caspase-9 [Li et al., 1997]. The BCL-2 family member proteins play an important role in this intrinsic pathway of mitochondria mediated apoptosis [Hengartner, 2000]. The active initiator caspases activate the executioner caspases, which break down cellular proteins such as lamins, DNA-PK etc. The executioner caspases also activate DNA Fragmentation Factor 40 (DFF40) (CAD) by degrading DNA Fragmentation Factor 45 (ICAD). DFF40 triggers DNA fragmentation and chromatin condensation [Liu et al., 1997; 1998]. This highlights the important role of protein interaction in the apoptosis process.

Looking further into individual apoptotic protein complexes where structural information is currently available from the Protein Data Bank [Bernstein et al., 1977] (PDB), specific residues have been documented for their crucial role in the protein-protein interaction. In Rap1A/cRaf1 (PDB:1c1y) [Nassar et al., 1995], which is a Ras-dependent signaling complex, the Ras binding domain (cRaf1) is anchored by highly conserved Arg89 at the centre of the binding interface to Asp38 on Rap1A through a salt bridge. A similar complex of Ras/Byr2RBD (PDB:1k8r) [Scheffzek et al., 2001] shows the Ras binding domain anchored by Lys101 through a network of salt bridges to Asp38 and Asp33 from the Ras protein. Mutation of Lys101 to Glu renders the complex non-functional. The high affinity between the components of the TRADD/TRAF2 complex (PDB: 1f3v) [Park et al., 2000] is also due to salt bridges involving Arg146 in TRADD and Asp450 in TRAF2. Mutations of both these residues lower the affinity of the complex and apoptosis repression. In caspases, where the main structural differences are in the loop regions, charged residues and aromatic residues play an equally important role. In its complex with p53 protein, the conserved Arg260, Arg413 of caspase 8 are important. In another complex (PDB: 1i4o) [Huang et al., 2001] of an inhibitory apoptosis protein with baculoviral IAP repeats and human caspase-7, the important residues are Trp, Tyr and His around Arg232, conserved across all caspases. Interestingly, the β6 dimerization strands in the caspases 3, 7, 6, 9 contain a conserved Met, which reflects its important role. In the TRAIL/DR5 complex (PDB: 1d4v) [Mongkolsapaya et al., 1999], which consists of TNF related apoptosis inducing ligand and a cell surface receptor protein DR5, the interface is anchored by a key Tyr216, the mutation of which abolishes the complex; the salt bridge between Arg149 of TRAIL and Glu147 of DR5 stabilizes the interface. At the interface of DNA fragmentation factor 40 (CAD) (PDB: 1ibx) [Zhou et al., 2001], Lys18 and Gly20 form a cluster and interact with conserved Ile60 and Val70 of ICAD, and Lys32 at the DFF40 interface interacts with Asp72 and Asp74 of ICAD. Thus these interactions may be a major driving force in the inhibition of CAD by ICAD and hence evolutionarily conserved. Mostly, the charged residues at surfaces are conserved even among sequences across kingdoms. For example, in cytochrome c (PDB: 1j3s), Lys53, Lys80, Lys73, Lys74 and Lys87 present at the surface are evolutionarily conserved.

As already outlined, the apoptosis pathway is a mechanism of "programmed cell death"; therefore the biological tasks, which involve protein-protein interaction, need to be specifically regulated. From the literature, there is a strong indication that residue preferences may give insight into their important role in apoptotic protein functionality. So a pertinent question is whether these residue trends are common across apoptotic protein interfaces in a statistically significant manner. A further question is whether apoptotic protein-protein interfaces have unique characteristics, such as size of the interface area, amount of surface complementarity, hydrophobicity, hydrogen bonds, protrusion index, solvation potential, or packing of secondary structures [Jones and Thornton, 1996; 1997; Lo Conte et al., 1999; Chakrabarti and Janin, 2002]. It is well documented that protein interfaces have distinct properties from the rest of the protein surface [Gruber et al., 2007] and this information is being used to design new protein interfaces [Karanicolas and Kuhlman, 2009]. In our current study, features of protein-protein interactions are analyzed using the size of the interface area, hydrophobicity, hydrogen bonds, salt bridges, secondary structures and residue propensity. Our observations underline the role of protein-protein interaction in the apoptosis process.



Materials and methods


Constructing the dataset of apoptotic protein complexes

In the first stage, protein subunits from the PDB with <25 residues were excluded. Next, the protein complexes involved in apoptosis process for which structures were available in PDB were selected using a regular expression search for the word APOPTOSIS. In addition, the KEGG database was searched for PDB complexes for the human apoptotic pathway. To ensure that we culled all biologically relevant interfaces (for calculation of an interface, see below) from each PDB, we looked at the Protein Quaternary Structure (PQS) database [Jones and Thornton, 1996] for the corresponding PDB entry. If an interface present in the PDB file was absent in the PQS database, we surveyed the literature to verify if indeed the proteins formed a physiologically relevant interface. At this stage, non-apoptotic proteins were rejected and only one protein complex was chosen among homologous complexes available from more than one species. Protein complex from the highest organism in the taxonomic tree was preferred. This was to ensure that we retained the closest homolog of apoptotic proteins in humans, interactions of which are the target of this study. To double-check that we retained genuine apoptotic proteins, the protein sequences were searched against the ApoptosisDB (Apoptosis Database: www.apoptosis-db.org) using BLAST [Altschul et al., 1990]. To further exclude redundant protein interfaces, we compared all protein interfaces in the dataset for 90% redundancy. Here the interface residues were picked from the three-dimensional structure and arranged in the same order as in the primary structure. The pair of sequences obtained for each interface was compared among each other and those with sequence identity not extending beyond 90% were kept. Only one protein complex from the pool showing redundant interfaces was kept. Lastly, we removed any proteins that were synthetic constructs or had engineered mutations that were located at the interface of the protein complex. 
This filtering procedure gave us a total of 71 unique PDB entries, contributing 51 homodimer interfaces and 61 heterodimer interfaces, details of which can be found in Supplementary Table SI. The proteins in our data set covered all the nodes described at http://www.genome.jp/kegg/pathway/has/hsa04210.html, except for nodes for which we did not have any structures or which were monomers.


Constructing the dataset for general protein complexes subdivided into signaling and non-signaling categories

To compare any data, there is a need for a reference set. Therefore, a generic proteome was defined using a set of non-redundant proteins with less than 30% sequence identity. The aim of creating the dataset was to provide the mean physico-chemical properties at the interfaces, to which we could compare the apoptotic protein interfaces. We further reduced redundancy by excluding proteins of the same fold class as defined by the SCOP database [Andreeva et al., 2008]. The PQS database search and literature survey were performed in the same manner as above to screen for biologically relevant interfaces. The final data set contained 1176 unique PDB entries, contributing 1657 homodimer interfaces and 652 heterodimer interfaces. We refer to this dataset in the text as general protein complexes (Supplementary Table SII). To better compare against the apoptotic dataset, it was further subdivided into two categories: signaling and non signaling complexes. The idea was to separate the dataset into complexes that are permanent and non-permanent in nature. The biological process Gene Ontology (GO) annotations of the PDB files were used for segregating signaling from non signaling complexes. For those PDB files for which no GO annotations were available, predicted annotations were used as available from the ProKnow (http://proknow.mbi.ucla.edu) function annotation server. At first the word "signal" was used to extract all matching GO terms from the ontology dictionary. Thereafter a unique list of GO terms for all child terms of the signal-associated GO terms was built. All PDB files having GO annotations matching the "signal" associated GO terms and their child terms were considered as proteins involved in the signaling pathway. This gave a total of 251 protein complexes involved in the signaling pathway and the remaining 925 complexes involved elsewhere. The signaling protein complexes comprised 271 homodimer interfaces and 208 heterodimer interfaces. Likewise, non signaling protein complexes comprised 1386 homodimer interfaces and 444 heterodimer interfaces.
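The GO-based selection described above (keyword match on term names, followed by inclusion of all child terms) can be sketched as a graph traversal. This is only an illustrative sketch: the function names, the dictionary layout of the ontology and the toy term identifiers are assumptions made for the example, not the scripts used in this study.

```python
from collections import deque

def signal_terms(names, children, keyword="signal"):
    """Return ids of terms whose name contains `keyword`, plus all descendants.

    names:    {term_id: term_name}             (hypothetical layout)
    children: {term_id: [child_term_id, ...]}  (hypothetical layout)
    """
    seeds = {term for term, name in names.items() if keyword in name.lower()}
    selected = set(seeds)
    queue = deque(seeds)
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in selected:  # visit each child term only once
                selected.add(child)
                queue.append(child)
    return selected

def is_signaling(annotations, selected):
    """A complex counts as signaling if any of its GO annotations is selected."""
    return any(term in selected for term in annotations)

# Toy ontology with placeholder identifiers (not real GO ids).
names = {"T1": "signal transduction", "T2": "kinase cascade", "T3": "metabolic process"}
children = {"T1": ["T2"]}
selected = signal_terms(names, children)
```

In this toy case "T2" is selected only because it is a child of the keyword-matched term "T1", which mirrors the two-step term list described in the text.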


Interface area

The interface area was defined as the portion of the solvent accessible surface area (ASA) buried due to the formation of a complex by proteins. It was estimated by calculating the difference between the sum of the ASAs of the individual subunits and the ASA of the complex. The ASA was computed using the program NACCESS [Hubbard and Thornton, 1993] with the default (1.4 Å) probe radius, using an implementation of the Lee-Richards algorithm [Lee and Richards, 1971]. Those atoms which lost more than 0.1 Å² of their ASA due to complex formation were identified as interface atoms.
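The two definitions above reduce to simple arithmetic once per-atom ASA values are available. The sketch below assumes the ASA values have already been parsed from the accessibility program's output into dictionaries; the function names and the atom keys are hypothetical and chosen only for illustration.

```python
def interface_area(asa_a, asa_b, asa_complex):
    """Buried area: sum of the free-subunit ASAs minus the ASA of the complex."""
    return asa_a + asa_b - asa_complex

def interface_atoms(asa_free, asa_bound, min_loss=0.1):
    """Atoms whose ASA drops by more than `min_loss` (in square Angstrom)
    on complex formation.

    asa_free / asa_bound: {atom_key: ASA} for the isolated subunit and for
    the same atoms inside the complex (hypothetical input layout).
    """
    return [atom for atom, free in asa_free.items()
            if free - asa_bound.get(atom, 0.0) > min_loss]

# Toy values, not taken from any structure in the dataset.
area = interface_area(5000.0, 4000.0, 8000.0)
atoms = interface_atoms({"A:1:CA": 10.0, "A:2:CB": 5.0},
                        {"A:1:CA": 2.0, "A:2:CB": 5.0})
```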


Interface patch analysis

A patch was defined as the portion of interface area where no water molecule can penetrate. To compute a patch, we defined an interface from subunit pairs A and B. We started with an interface atom ia of subunit A and then found the subset of interface atoms from subunit B which are within the threshold distance, and labelled those as patch atoms. Then, for the set of labelled patch atoms selected from subunit B, we found the new set of interface atoms from subunit A within the threshold distance, and included them as patch atoms. We iterated this process for all the interface atoms until no atoms were left. In our method we used threshold distances of 4.5, 5.0 and 6.0 Å to screen all atom pairs present between the subunit interfaces. All spatially contiguous atom pairs at the interface were grouped together to denote a single patch. The results discussed in the text are for the threshold of 5.0 Å. We used the hydrophobicity scale of Fauchère and Pliska, 1983, for assessing hydrophobicity at the patch area. Gly, Ser, Thr, Asn, Gln, His, and charged residues (Arg, Lys, Asp, Glu) were treated as hydrophilic and the rest of the amino acids as hydrophobic. We also evaluated the solvation energy at the interface using the method of Eisenberg and McLachlan, 1986.
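The iterative labelling described above amounts to finding connected components in a contact graph that links interface atoms of subunit A to interface atoms of subunit B. A minimal sketch follows, assuming that interface atoms are given as coordinate tuples, that "within the threshold" means a distance less than or equal to it, and that atoms without any cross-interface partner are left out of all patches; `find_patches` is a hypothetical name.

```python
import math
from collections import deque

def find_patches(atoms_a, atoms_b, threshold=5.0):
    """Group interface atoms into patches of mutually reachable A-B contacts.

    atoms_a, atoms_b: lists of (x, y, z) coordinates of interface atoms.
    Returns a list of (indices_in_a, indices_in_b) pairs, one per patch.
    """
    unvisited = {"A": set(range(len(atoms_a))), "B": set(range(len(atoms_b)))}
    coords = {"A": atoms_a, "B": atoms_b}
    patches = []
    while unvisited["A"]:
        seed = unvisited["A"].pop()
        patch = {"A": {seed}, "B": set()}
        queue = deque([("A", seed)])
        while queue:
            side, i = queue.popleft()
            other = "B" if side == "A" else "A"
            # Unlabelled atoms on the opposite subunit within the threshold.
            hits = [j for j in unvisited[other]
                    if math.dist(coords[side][i], coords[other][j]) <= threshold]
            for j in hits:
                unvisited[other].discard(j)
                patch[other].add(j)
                queue.append((other, j))
        if patch["B"]:  # keep only groups that actually span the interface
            patches.append((patch["A"], patch["B"]))
    return patches

# Toy coordinates: two well separated contact pairs.
a = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
b = [(3.0, 0.0, 0.0), (22.0, 0.0, 0.0)]
n_tight = len(find_patches(a, b, threshold=5.0))
n_loose = len(find_patches(a, b, threshold=25.0))
```

With the toy coordinates, the two contact pairs form separate patches at 5.0 Å and merge into one patch when the threshold is made large enough to bridge them.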


Hydrogen bonds and salt bridges at the interface

The hydrogen bonds at the interface were identified using geometric criteria [Baker and Hubbard, 1984]. Excluding the positions of the hydrogen atoms which are not generally available in crystal structures, we have used the distance between the donor (D) and acceptor (A) atoms at less than 3.9 Å and the angle for donor-acceptor-acceptor antecedent (D-A-Aa) greater than 90° for screening hydrogen bonds (Fig. 1).



Figure 1: Geometric criteria used to calculate hydrogen bonds in protein interfaces [Baker and Hubbard, 1984]. H = proton, D = proton donor, A = proton acceptor, Aa = acceptor antecedent.
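The two geometric criteria for hydrogen bonds (D-A distance below 3.9 Å and D-A-Aa angle above 90°) can be checked as in the following sketch. The function names are hypothetical, coordinates are plain (x, y, z) tuples, and the selection of chemically valid donor and acceptor atoms is assumed to have been done beforehand.

```python
import math

def angle_deg(p, vertex, q):
    """Angle p-vertex-q in degrees."""
    v1 = [a - b for a, b in zip(p, vertex)]
    v2 = [a - b for a, b in zip(q, vertex)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos))

def is_hydrogen_bond(donor, acceptor, antecedent, max_dist=3.9, min_angle=90.0):
    """Distance D-A below max_dist and angle D-A-Aa above min_angle."""
    return (math.dist(donor, acceptor) < max_dist
            and angle_deg(donor, acceptor, antecedent) > min_angle)

# Toy geometry: the D-A-Aa angle is 135 degrees in the first case
# and 45 degrees in the second.
good = is_hydrogen_bond((0, 0, 0), (3, 0, 0), (4, 1, 0))
bad_angle = is_hydrogen_bond((0, 0, 0), (3, 0, 0), (2, 1, 0))
```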

A salt bridge was defined as a nonbonded contact between two atoms from Arg (NH, NE) / Lys (NZ) and Asp (OD) / Glu (OE), separated by no more than 6 Å. We did not consider the main-chain termini. Our distance threshold was borrowed from Barlow and Thornton, 1983, who showed that the preferred distances for separation of oppositely charged groups lie in the range 1.75 Å to 6.0 Å.
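A sketch of the salt-bridge screen is given below. It assumes standard PDB atom names (NH1/NH2, NE, NZ, OD1/OD2, OE1/OE2) for the side-chain atoms listed above and counts every qualifying atom pair separately; whether the original analysis counted atom pairs or residue pairs is not stated, so this choice, like the function name and input layout, is an assumption.

```python
import math

POSITIVE = {("ARG", "NH1"), ("ARG", "NH2"), ("ARG", "NE"), ("LYS", "NZ")}
NEGATIVE = {("ASP", "OD1"), ("ASP", "OD2"), ("GLU", "OE1"), ("GLU", "OE2")}

def salt_bridges(atoms_a, atoms_b, cutoff=6.0):
    """List oppositely charged atom pairs across the interface within `cutoff`.

    atoms_a, atoms_b: lists of (residue_name, atom_name, (x, y, z)).
    Returns tuples ((res_a, atom_a), (res_b, atom_b), distance).
    """
    found = []
    for res_a, name_a, xyz_a in atoms_a:
        for res_b, name_b, xyz_b in atoms_b:
            key_a, key_b = (res_a, name_a), (res_b, name_b)
            opposite = ((key_a in POSITIVE and key_b in NEGATIVE)
                        or (key_a in NEGATIVE and key_b in POSITIVE))
            if opposite:
                d = math.dist(xyz_a, xyz_b)
                if d <= cutoff:
                    found.append((key_a, key_b, d))
    return found

# Toy atoms: only the Arg NH1 / Asp OD1 pair is charged and close enough.
subunit_a = [("ARG", "NH1", (0.0, 0.0, 0.0)), ("LEU", "CD1", (1.0, 0.0, 0.0))]
subunit_b = [("ASP", "OD1", (4.0, 0.0, 0.0)), ("GLU", "OE2", (10.0, 0.0, 0.0))]
bridges = salt_bridges(subunit_a, subunit_b)
```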


Residue propensity at interface

The propensity was calculated for each residue at a protein interface using the following formula:

where n is the count of number of atoms, I is the interface, S is the total surface of the individual protein subunit in the complex including the interface region, X represents an individual amino acid (see above for the definition of interface). Since our sample size is not very large, it is possible that some amino acids are present in low frequency, but yet have a high propensity. To examine the statistical significance of propensity calculated at each interface, we calculated the p-value associated with each residue type at the interface using the hypergeometric distribution in the following way:

where the symbols are the same as in the propensity equation. The probability of a given amino acid X at the interface in K amino acids at the interface is given by , and applying Bonferroni correction, the p-value of the amino acid X occurring k times in the interface is: k * (). A canonical p-value threshold of ≤0.05 was used to identify the statistically significant residue occurrences using the said formula, in conjunction with a propensity threshold of >2.0. The interfaces which had particular residue(s) satisfying the mentioned thresholds are referred to as conforming to a statistically significant high propensity (SSHP) value, and they were counted for calculating statistics on the interface, which are presented in the text and figures.
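As an illustration of the kind of calculation described in this section, the sketch below computes an interface propensity and a hypergeometric probability from counts. It assumes the commonly used ratio form of propensity (fraction of X at the interface divided by fraction of X on the whole surface) and the standard hypergeometric mass function; the exact expressions and the Bonferroni-corrected form used in this study may differ, and all function and argument names are hypothetical. Only the thresholds (propensity >2.0, p-value ≤0.05) are taken from the text.

```python
from math import comb

def propensity(n_x_int, n_int, n_x_surf, n_surf):
    """Assumed form: (fraction of X at interface) / (fraction of X on surface)."""
    return (n_x_int / n_int) / (n_x_surf / n_surf)

def hypergeom_pmf(k, K, n_x, n):
    """P(exactly k items of type X among K drawn without replacement
    from n items, of which n_x are of type X)."""
    return comb(n_x, k) * comb(n - n_x, K - k) / comb(n, K)

def hypergeom_tail(k, K, n_x, n):
    """P(at least k items of type X among the K drawn); uncorrected."""
    return sum(hypergeom_pmf(i, K, n_x, n) for i in range(k, min(K, n_x) + 1))

def is_sshp(prop, p_value, min_prop=2.0, max_p=0.05):
    """Thresholds from the text: propensity > 2.0 and p-value <= 0.05."""
    return prop > min_prop and p_value <= max_p

# Toy counts: 6 of 20 interface positions are X, against 10 of 100 on the surface.
prop = propensity(6, 20, 10, 100)
p_val = hypergeom_tail(6, 20, 10, 100)
flag = is_sshp(prop, p_val)
```

The upper-tail probability is used here as one reasonable reading of "occurring k times"; no multiple-testing correction is applied in this sketch.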


Secondary structure packing at the interfaces

The packing and interaction of secondary structures at the interfaces were studied by dividing the secondary structures into four classes: helix, β-strand, turns and the rest, which included the non-regular secondary structure. The regular secondary structure tags output by the DSSP [Kabsch and Sander, 1983] program were merged as follows for helix: α-helix (H), 3₁₀-helix (G), π-helix (I); β-strand: β-sheet (E), β-ladder (B); turn: hydrogen bonded turn (T), geometric bend (S). Two secondary structures were said to be interacting with each other if the minimum distance between them was less than or equal to 6.0 Å. To calculate how often an individual residue with a given secondary structure is present at the interface with an SSHP value, we repeated the calculation of secondary structure propensities and their p-values using the hypergeometric distribution, following a procedure similar to that outlined above. Here, the amino acid label was replaced by one of the four secondary structure labels for the calculations.
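The merging of DSSP tags into four classes can be written as a lookup table. The mapping below follows the grouping stated in the text; the class labels, function names and the example string are illustrative assumptions.

```python
from collections import Counter

# DSSP one-letter codes merged into the four classes used in the text.
DSSP_TO_CLASS = {
    "H": "helix", "G": "helix", "I": "helix",
    "E": "strand", "B": "strand",
    "T": "turn", "S": "turn",
}

def secondary_class(code):
    """Map a DSSP code to its class; anything unlisted falls into 'rest'."""
    return DSSP_TO_CLASS.get(code, "rest")

def class_counts(dssp_string):
    """Count residues per class for a per-residue DSSP string."""
    return Counter(secondary_class(code) for code in dssp_string)

# Toy per-residue assignment: three helix, one strand, one blank, one turn.
counts = class_counts("HHGE T")
```

Such per-class counts for the interface and for the whole surface are the inputs needed to repeat the propensity and p-value calculation with secondary structure labels in place of amino acid labels.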



Results


Properties common to both apoptotic and general protein complexes

Hydrophobicity. We used the free energy values from the scale of Fauchère and Pliska, 1983, to compute the overall hydrophobicity / hydrophilicity by summing the free energy values for individual residues at the interfaces. The distribution of the overall hydrophobicity of the interfaces from the apoptotic proteins was similar to that of general proteins. The estimates based on solvation energy also yielded similar results. However, there are specific residue abundances unique to apoptotic protein complexes, as discussed later. For both data sets, general and apoptotic proteins, the peak of the distribution suggests that, overall, hydrophobic interfaces are more common (data not shown). Separation of the general data set into signaling and non signaling complexes did not alter the results.

Shape complementarity. Surface complementarity scoring based on the formula from Katchalski-Katzir and coworkers [Katchalski-Katzir et al., 1992] was used to estimate complementarity at protein interface. The distribution of surface complementary scores showed similar distribution for both apoptotic and general protein complexes indicating that the packing of residues at the interfaces is similar (data not shown). The results did not change on partitioning of the general data set into signaling and non signaling complexes. The results are consistent with the analysis of patches discussed below.


Distinct properties of apoptotic protein interface

Hydrogen bonds and salt bridges. In total, 2162 hydrogen bonds were identified across the subunit interfaces of all the complexes in apoptotic proteins. We compared the distribution against 35722 hydrogen bonds from general protein complexes (Tab. 1). The hydrogen bond types rank by their fraction at the interface as main chain - side chain > side chain - side chain > main chain - main chain. There is only a minor difference in the hydrogen bond distribution between apoptotic complexes and general complexes. The contribution of main chain - side chain hydrogen bonds is marginally depleted for apoptotic protein complexes compared to general protein complexes, at 40% and 43%, respectively. The trend is reversed for main chain - main chain hydrogen bonds, at 27% compared to 23% for apoptotic and general interfaces, respectively. The percentage of side chain - side chain hydrogen bonds is the same, at 34%, for both apoptotic and general protein complexes. Overall, the correlation between the number of hydrogen bonds and the interface area is high, in agreement with Xu et al. [Xu et al., 1997], and is similar for both apoptotic and general proteins (Tab. 1). It may be noted that for apoptotic proteins, the correlation between the number of main chain - side chain hydrogen bonds and the interface area is 0.82, which is lowered to 0.68 for side chain - side chain hydrogen bonds and interface area. This drop is substantial when compared to the same set of parameters for general proteins (0.80 to 0.74) (Supplementary Table SIII and SIV). This suggests some distinction in hydrogen bonding patterns between apoptotic and general protein interfaces. Specifically, these differences are more pronounced for main chain - main chain hydrogen bonds in both signal and non signal proteins, for main chain - side chain hydrogen bonds only in signal proteins, and for side chain - side chain hydrogen bonds only in non signaling proteins.


Table 1: Hydrogen bond statistics on apoptotic and general proteins.

                 Main-Main   Main-Side   Side-Side   Overall
Fraction of hydrogen bond types
  Apoptotic        27%         40%         34%        100%
  General          23%         43%         34%        100%
  signal           23%         42%         35%        100%
  non-signal       23%         43%         34%        100%
Correlation between number of hydrogen bonds and interface area
  Apoptotic        0.63        0.82        0.68       0.91
  General          0.52        0.80        0.74       0.88
  signal           0.54        0.77        0.68       0.86
  non-signal       0.52        0.80        0.75       0.88

When we compare the frequency of salt bridges across interfaces, we find a poor correlation (0.49) with the interface area in apoptotic proteins (Supplementary Table SIII). The 1232 salt bridges across apoptotic protein interfaces are unevenly distributed. For example, we did not find any salt bridge across 21 subunit interfaces and found only one salt bridge for 7 subunit interfaces. A relatively higher correlation (0.61) is found in general protein interfaces (Supplementary Table SIV), suggesting that the varied salt-bridge distribution may be a special attribute of proteins in the apoptotic process. Interestingly, the signaling proteins in the general dataset yielded a correlation of 0.53 for the number of salt bridges to interface area, indicating that in terms of the distribution of salt bridges at the interfaces they are more similar to apoptotic proteins; non signaling proteins yielded a correlation of 0.62.

Interface patch. The average subunit interface areas of apoptotic proteins vary widely from ~51.1 Å² to ~4761 Å². To check if these interfaces were segmented into tightly packed patches, we performed patch analysis using a near neighbour method for detecting patches with a starting distance threshold of 4.5 Å, and found that the size of the patch area gradually increases with the increase in distance threshold. For a distance threshold of 6.0 Å, the patch area almost converges to the interface area in the majority of the cases. As the size of the patch increases with the increase in distance threshold, the number of patches also decreases due to merging of patches. This suggests that the interfacial voids in the apoptotic protein complexes do not exceed 6 Å in dimension. Looking into the area of these individual patches, their distribution in apoptotic interfaces is similar to general protein interfaces (data not shown), with the highest number of patches being small, within 500 Å². Subdivision of the dataset into signaling and non signaling proteins did not show significant alteration in the distribution pattern. If we look at the ratio of the number of patches from apoptotic and general signal and non signal datasets, irrespective of the interface area size, it can be seen that the ratio value is near 1 for interfaces with up to three patches (Fig. 2, main). In this category, the bulk (32-41%) of the interfaces in the apoptotic proteins are single-patch (Fig. 2, inset). However, if we look at interfaces with four patches or more, we find that one or more of the interface size categories have a distribution that is distinct between apoptotic and general signal and non signal interfaces. It may be noted that these interfaces collectively cover a substantial fraction of the apoptotic protein interfaces (smaller bars, see Fig. 2, inset). Overall, the plot shows that a larger proportion of apoptotic interfaces are fragmented into patches when compared to both general signal and non signal protein complexes. This suggests that apoptotic proteins have a distinct distribution of patches, and as such this allows for the creation of clefts, which may harbour solvent molecules that may play an important role in modulating the stability of the interfaces.



Figure 2: Plot showing the distribution of the ratio values computed using the fraction of occurrence of a given "number of patch" for a specified interface area range in apoptotic and general signal and non signal datasets. The inset shows the fraction of occurrence of values of a given "number of patch" for a specified interface area range in apoptotic dataset only. The values are the numerator from which the ratio values have been computed in the corresponding data points in the main plot. The graph suggests that the proportions of the number of fragmented patches are more in apoptotic proteins, compared to general signal and non signal proteins.

Interface residue propensity. The percentage of interfaces with significant propensities for individual amino acids shows some clear trends (Fig. 3). When we compare the number of cases where we have significant occurrence of amino acid pairs in at least 10% of the interfaces (bold letters in Fig. 3), we find that the top ranked residues are R>Y>M>W>D, Y>F>H~M>W~D>R and Y>R>F~W>H>M for the apoptotic, signaling and non signaling data sets, respectively. It is clear that the aromatic residues Y, H, and W are found in SSHP numbers across all interfaces. F is conspicuous by its diminished presence in SSHP numbers at apoptotic protein interfaces. Comparing rank orders of residues, the main difference arises from a significant lack of Asp and a somewhat diminished presence of Met in signaling protein interfaces vis-à-vis apoptotic interfaces. Comparing apoptotic interfaces with non-signaling protein interfaces, we see Arg to be significantly diminished in the latter. This points to their important role in apoptotic complexes, and taken together with the documented evidence from structural biology studies outlined in the Introduction section, it appears that the role of the residues is most likely functional. The trends remain consistent even when the data is divided into homomeric and heteromeric complexes.



Figure 3: Matrix showing in percentage how often amino acids occur individually and in pairs at the interface with SSHP values. A p-value threshold of ≤0.05, in conjunction with a propensity value of >2.0 for the amino acid, was used as the cut-off to determine if the presence of the amino acid at the interface is SSHP. For each amino acid residue, the data are divided as heteromers (top row) and homomers (bottom row). Within the bold-line-bounded box, we have given the information for the paired amino acid groups. In the lower triangle, for each cell, the results from general signal complexes are shown on the left and the results from general non signal complexes on the right. In the upper triangle the same data is shown for the apoptosis dataset. Values ≥10 are marked in bold, while values ≤2 are italicised. Each row under the labels "General" and "Apoptosis" contains values for a single residue at the interface as indicated by the one letter amino acid code.

Fig. 3 also shows that Val, Ile and Leu are rarely present in SSHP values for both apoptotic and general protein complexes. To confirm unambiguously whether these amino acids are also less preferred as a non-aromatic hydrophobic group, we grouped them (Ile/Leu/Val) together and recalculated their frequency of SSHP occurrence at protein interfaces. We find that although they are found in only a few interfaces with SSHP values, they are overwhelmingly preferred by homomers in apoptotic complexes (data not shown).

We also looked in detail at individual proteins that show abundance of particular residues in a statistically significant manner (Supplementary Table SIII). In the case of the RNA directed RNA polymerase that induces apoptosis in the host cell (PDB: 2hwiA_B), we found a maximum of 53% of the residues at the interface to be charged, the lowest charged fraction being 8% for tumor necrosis factor β and its receptor p55 (1tnrA_B). In the former case, Asp is present in SSHP values as evident from propensity and p-value calculations, whereas the latter is dominated by the aromatic residues Tyr and Trp (see Supplementary Table SIII for details). Looking at interfaces dominated by positively charged residues, we find that the p53 complex (1p35A_B) has a maximum of 31% of positively charged residues, and the tumor suppressor p53 complex has a maximum of 33% negatively charged residues. For caspases, which are one of the central players in the apoptosis process, we find that caspase-2 (1pyo) has 33% of the interface composed of charged residues; the corresponding values are 39% for caspase-3 (1nmq) and 22% for caspase-8 (1qtn). It is possible that these charged residues are important for proximity-induced dimerization and activation during the early stages of apoptosis [Boatright and Salvesen, 2003].


Secondary structures at the interfaces

Based on DSSP-derived data [Kabsch and Sander, 1983], we checked if secondary structures at the interfaces are different in apoptotic protein complexes. For this we calculated the p-value of occurrence of each secondary structure type at the interface based on the hypergeometric distribution. We then calculated the frequency of occurrence of interfaces where a given secondary structure was present in SSHP values. From Fig. 4A, we can see that the average distribution of secondary structure frequency varies between 0.2 and 0.4. It can be seen that for apoptotic proteins, the number of interfaces with a significant number of residues in the helical state is rather small, in contrast to the β-sheet, which occurs at high frequency. The number of interfaces with residues predominantly in coil conformation is also significant. When we divide the same data between homomeric and heteromeric protein interfaces, we find that all interfaces abundant with residues in helical conformation come from homomeric proteins (data not shown). The same is true for interfaces with turn elements. More coil-dominant interfaces are present in heteromeric proteins. Comparing these observations with general proteins, we find that only in signal proteins is there a distinct proclivity for helical residues, and there are no distinct preferences for any other secondary structure at the interfaces. The interfaces are essentially equally distributed among homomeric and heteromeric proteins. Hoskins et al. [Hoskins et al., 2006] recently showed that β-strands play an important role in stabilizing protein-protein interactions. Our data suggest that secondary structures may play roles in specific functionalities; however, there is no propensity for a specific secondary structure at the interface in general proteins. 
To further confirm our observations, instead of looking at the prevalence of secondary structure elements at individual interfaces of a subunit, we checked secondary structure contacts across the protein interface. In proportion to the abundance of helical and β-sheet residues, we found (as in Fig. 4A) helix-mediated interactions to be lower than sheet-mediated interactions in apoptotic interfaces, in contrast to general (signal and non signal) interfaces where helix-mediated interactions dominate compared to β-sheet (Fig. 4B). Only a few interfaces are found where both the interacting residues across the interface are in regular secondary structure conformation.



Figure 4: (A) Plot showing, as a fraction, how often an interface with a given secondary structure occurs with an SSHP value. The frequency of occurrence of residue secondary structure counts was used to calculate the p-value and propensity, and corresponding thresholds of ≤0.05 and >2.0 were used as cut-offs. We used 25%, 30% and 35% residue exposure to count whether a residue was present or absent at the interface. The mean and the standard deviation of the results obtained from the three calculations using the stated thresholds are plotted. For each secondary structure label on the x-axis three bars are given: the first refers to the apoptotic protein complexes, the second to the signaling protein complexes, and the third to the non signaling complexes. The plot shows that helical secondary structures are less preferred in apoptotic protein interfaces compared to both general signaling and non signaling protein interfaces, while β-strand and irregular structures are more preferred in interfaces of apoptotic proteins. (B) Distribution showing how often different secondary structure segments are found per unit area of interfaces. Here, if an α-helical, strand, turn, or coil segment is present at the interface, it is counted only once. If more than one segment is present, the count is increased cumulatively. All such secondary structure segments making contacts across interfaces are counted for calculating the statistics for the histogram. Results show that helical segments are fewer in apoptotic protein interfaces, compared to β and coil segments. This is consistent with the results in Fig. 4A. Note also that the secondary structure pairing is commutative; therefore data for HB/BH and TC/CT follow a similar pattern.



Discussion

There appear to be multiple levels of control in the protein-protein interactions of the apoptosis process. The first level of control appears to be guided by the area-dependent stability of the interacting interfaces. Apoptosis protein complexes contain a large number of small-area patches (area up to 500 Å²); however, the nature of these interfaces varies compared to general signal and non signal protein complexes. The small size of the interface area does not necessarily indicate its stability [Brooijmans et al., 2002]; nevertheless, small-patch interfaces are more prone to solvent-mediated destabilization, especially when the dominant nature of the interface is hydrophilic.

To investigate whether the fragmented interfaces indeed correlate with lower affinity, we looked at the available dissociation constants of the solved protein complexes. Since structures and kinetic data are most often determined in different chemical environments, one cannot be absolutely sure of the relation seen between the physicochemical data in the structure and the dissociation constant. Nevertheless, when we compiled the data from available literature on the complexes in our list, we did find the majority of the protein complexes not to be tightly bound (Tab. 2).


Table 2: Dissociation constants of apoptotic complexes.
PDB code   Dissociation constant   Interaction strength*   Reference
1czy       0.04-1.5 mM             Weak                    Ye et al., 1999
1f2r       50 nM                   Medium                  Otomo et al., 2000
1f3v       7.8 ± 3.6 μM            Weak                    Park et al., 2000
1hx1       0.1 ± 0.035 μM          Weak                    Stuart et al., 1998
1i4e       0.116 ± 0.001 μM        Weak                    Xu et al., 2001
2a5y       48 ± 8 nM               Medium                  Yan et al., 2005
2clq       0.22 ± 0.2 μM           Weak                    Bunkoczi et al., 2007
2jm6       35 nM                   Medium                  Czabotar et al., 2007
1rgi       2.0 μM                  Weak                    Burtnick et al., 2004
2an6       0.176 μM                Weak                    House et al., 2006
1ikn       3.0 nM                  Strong                  Malek et al., 1998
1d2z       0.5 μM                  Weak                    Schiffmann et al., 1999
1dt7       0.96 μM                 Weak                    Nowotny et al., 2000
2h9g       <0.200 μM               Weak-Medium             Li et al., 2006
* ≥10⁻⁷ M (0.1 μM): Weak; <10⁻⁷ M (0.1 μM) and ≥10⁻⁸ M (10 nM): Medium; <10⁻⁸ M (10 nM): Strong
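
The classification rule in the footnote of Tab. 2 can be written as a short function. This is only an illustrative sketch: the function name is hypothetical, the input is assumed to be a single dissociation constant already converted to mol/L, and ranges or upper bounds (such as the entries for 1czy and 2h9g) are not handled.

```python
def classify_kd(kd_molar):
    """Map a dissociation constant in mol/L to the strength classes of Tab. 2."""
    if kd_molar >= 1e-7:   # 0.1 uM or weaker
        return "Weak"
    if kd_molar >= 1e-8:   # from 10 nM up to, but not including, 0.1 uM
        return "Medium"
    return "Strong"        # tighter than 10 nM


# Values taken from Tab. 2, converted to mol/L.
for pdb, kd in [("1f2r", 50e-9), ("1f3v", 7.8e-6), ("1ikn", 3.0e-9)]:
    print(pdb, classify_kd(kd))
```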

It has been reported that mitochondria-induced acidification of the cytosol is important for caspase activation by CytC; it has been found that an acidic environment increases caspase activity by up-regulating the Bax protein activity [Park et al., 1999; Matsuyama et al., 2000]. Most interestingly, it has been found that the caspase 3 proenzyme dormancy is maintained by a "safety catch" which functions through multiple ionic interactions, and this "safety catch" is disrupted by acidification [Roy et al., 2001]. The abundance of charged residues at the protein interfaces suggests that these charged residues play a major role in the regulation of the apoptosis process in response to the change in pH. At sufficiently acidic pH, the proteins mostly exist in positively charged form, where the carboxyl side chains of aspartic acid and glutamic acid are neutralized and the side chains of arginine and lysine are positively charged. With a pKa of 6, the His side chain is also positively charged at low pH; but at weakly acidic to neutral pH, it would have a mixed distribution of ionic states. It can therefore be proposed that the arginine, histidine and lysine residues are more important than aspartic acid and glutamic acid with respect to the regulation of the apoptosis process, if indeed the process proceeds at low pH. In such environments arginine, histidine and lysine residues, being charged, may serve as the recognition sites.
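
The pH dependence of side-chain charge discussed above follows from the Henderson-Hasselbalch relation, and a minimal sketch of that calculation is given here. The pKa of 6 for His is the value quoted in the text; the other pKa values are approximate textbook numbers for isolated side chains and are assumptions of this example, since the actual values in a folded protein depend on the local environment. All names are hypothetical.

```python
# Approximate side-chain pKa values (assumed, except His = 6.0 from the text).
PKA = {"Asp": 3.9, "Glu": 4.1, "His": 6.0, "Lys": 10.5, "Arg": 12.5}
BASIC = {"His", "Lys", "Arg"}  # protonated form carries a +1 charge


def fraction_protonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of the group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def mean_charge(residue, pH):
    """Average side-chain charge: +f for basic groups, -(1 - f) for acidic ones."""
    f = fraction_protonated(pH, PKA[residue])
    return f if residue in BASIC else f - 1.0


for pH in (4.0, 6.0, 7.4):
    charges = {res: round(mean_charge(res, pH), 2) for res in PKA}
    print(pH, charges)
```

With these assumed values, His is half protonated at pH 6, which corresponds to the mixed ionic state mentioned in the text.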

It has been shown by Strickler et al. [Strickler et al., 2006] that surface charged residues are important for protein stability and that optimised surface charge-charge interactions can confer increased thermostability on a protein. The surface charges may also have an important role in the thermostability of the apoptotic protein complexes which remain functional in heat-shock induced apoptosis [Samali and Orrenius, 1998]. The importance of charged residues is therefore further emphasized by Fig. 3.

Hotspot residues have frequently been used to assess the contribution of individual amino acid residues to the stability of interfaces through alanine scanning experiments. The preference of amino acids in hotspots has been estimated statistically by calculating the frequency of each amino acid in the experimental database as a whole and in the subset with binding energies >2 kcal/mol (www.asedb.org). The ratio of these quantities gives the fold enrichment of amino acids in hotspots. The order of prevalence in hotspots is W>>Y>>R>>I>D>N>P>K>H>Q>E>F>V>M>S>T>L>C. When we compare the order of overabundance of residues in interfaces of apoptotic proteins, it is R>D>M, excluding the two aromatic amino acids in the top six. Although one can anticipate Arg to be present in higher numbers due to its known hotspot potential, the high prevalence of Asp and Met, ranked 5th and 14th in the hotspot scale, points to other important roles for them in apoptosis. Since the role of methionine cannot be substantial in terms of imparting stability to the interface, as evident from its ranking in the hotspot scale, its role appears to be more functional. In this regard, the known role of oxidation/reduction of the methionine sulfur in the control of the apoptosis process appears most likely.
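
The fold enrichment described above is the ratio of an amino acid's frequency in the hotspot subset to its frequency in the whole alanine scanning database. A minimal sketch follows; the function name is hypothetical and the counts are invented for illustration only, they are not taken from ASEdb.

```python
def fold_enrichment(all_counts, hotspot_counts):
    """Frequency in the hotspot subset divided by frequency in the whole set."""
    total_all = sum(all_counts.values())
    total_hot = sum(hotspot_counts.values())
    enrichment = {}
    for aa, n_all in all_counts.items():
        if n_all == 0:
            continue  # enrichment is undefined for residues absent from the database
        freq_all = n_all / total_all
        freq_hot = hotspot_counts.get(aa, 0) / total_hot
        enrichment[aa] = freq_hot / freq_all
    return enrichment


# Invented counts (NOT real ASEdb data): residues mutated overall and
# residues whose mutation changed the binding energy by more than 2 kcal/mol.
all_counts = {"W": 20, "R": 60, "M": 20, "L": 100}
hotspot_counts = {"W": 8, "R": 9, "M": 1, "L": 2}

fe = fold_enrichment(all_counts, hotspot_counts)
ranking = sorted(fe, key=fe.get, reverse=True)
print(fe, ranking)
```

Sorting the residues by this ratio gives a prevalence order of the kind quoted in the text.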

In summary, apoptotic protein interfaces have unique properties distinct from those of both signaling and non signaling complexes. Indeed, the important differences in patch distribution, residue propensities and secondary structures indicate special functional requirements in the apoptosis process. First, small patch size may increase the efficiency of the regulation system because recognition interfaces can be formed and destroyed relatively easily. Second, overabundance of residues with particular physicochemical properties ensures that there is less heterogeneity at the interface, which may increase the specificity of interaction despite its non-permanent nature.



Acknowledgements

PM is grateful to the All India Council for Technical Education, New Delhi, for the National Doctoral Fellowship. The authors thank the Department of Biotechnology, New Delhi, for funding support.



References