The Death Factors: a Combinatorial Analysis

Anne Bergeron, Paul Geanta1 and Dominique Bergeron2





1LACIM, Université du Québec à Montréal,
C.P. 8888 Succ. Centre-Ville, Montréal, Québec, Canada, H3C 3P8
E-mail:anne@lacim.uqam.ca


2Département de microbiologie et immunologie, Université de Montréal
C.P. 6128 Centre-Ville, Montréal, Canada, H3C 3J7
E-mail:bergerdo@ere.umontreal.ca





Edited by T. Werner; received August 6, 1998; revised October 27, 1998; accepted November 17, 1998


ABSTRACT

This paper presents theoretical and computational tools to understand how a small group of proteins, the death factors, are involved in widely different cellular behaviors. Experiments were done using a virtual laboratory that can simulate cellular response to different external stimuli.
WARNING: It is not certain which of the theoretical protein clusters described here actually occur in nature. In addition, the rules of cluster assembly are purely combinatorial, and thus oversimplify the real situation.

Key words: regulatory proteins, death factors, virtual experiments, signalization pathway



INTRODUCTION

In this paper, we develop theoretical and computational tools to understand how a small group of proteins can modulate signals to trigger two opposite cellular responses. The key point is in recognizing the basic modular properties of the proteins, and their ability to form molecular clusters whose characteristics can be statistically analyzed. We began to explore these ideas in [Bergeron et al., 1997] which describes a virtual laboratory used to predict the effects of changes in transcriptional factors on the rate of transcription.

The current project, whose models and results are presented in the following sections, focuses on the death factors. These proteins are known to participate in the first step of a process that can signal a diverse range of activities, including cellular proliferation, or death by apoptosis. Our goal is to understand, qualitatively and quantitatively, how minor variations among this group of proteins can generate opposite effects, even in the presence of similar stimuli.

The next section presents the basic hypothesis, which can be summed up as follows: a regulatory protein is characterized by the set of its binding domains. These domains are used to construct clusters of different compositions and properties; cellular response depends on the characteristics of the population of possible clusters. The section 'Programmed Cellular Death' presents the principles underlying apoptosis, and shows how the virtual laboratory can help to explain some experimental observations. Finally, in 'The Combinatorial Model and the Virtual Laboratory', we present the combinatorial model underlying the laboratory.




The Biological Model

The Death Factors

Each protein is characterized by an amino acid chain which, during synthesis, folds into a unique three-dimensional structure. The surface of this globular entity contains regions, called here binding domains, of specific shape, charge and/or hydrophobic properties. Types of domains are relatively few in number, and similar domains (that is, regions with similar composition and properties) are frequently shared by different proteins.

In this project, we focus on a set of proteins, the death factors, that are involved in the first step of signaling processes initiated by the TNF (Tumor Necrosis Factor) receptor family. Fig. 1 identifies schematically some of their binding domains [Liu et al., 1996; Malinin et al., 1997].

Figure 1: Death factors and their relatives

Once the proteins are folded, these domains lie on their external surface, allowing interactions with other molecules present in the cell. Tab. 1 gives the rules of such interactions between the domains present on the death factors, as derived from known biological interactions; a more refined version is given in the section 'The Combinatorial Model and the Virtual Laboratory'.


Table 1: Rules of Interaction between Binding Domains
Domain 1    Domain 2
DD          DD
T           T'
P           P'
P2          P2'
Ring        Ring
TrafC-1     TrafC-1'
Traf        Traf



Some domains are self-complementary, such as DD or Traf, while others bind to specific but less well characterized domains. In the latter case, we use the convention of labeling by D' the domain complementary to the domain D.
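The rules of Tab. 1, together with the D' convention, can be pictured as a small lookup table. The following is a minimal sketch only: the class name BindingRules and the method canBind are hypothetical names chosen for illustration, and only the domain pairs themselves come from Tab. 1.

```java
import java.util.Map;

/** Illustrative lookup for the binding rules of Tab. 1 (class and method names are hypothetical). */
final class BindingRules {
    // Each domain maps to the domain it binds; self-complementary domains map to themselves.
    private static final Map<String, String> PARTNER = Map.ofEntries(
        Map.entry("DD", "DD"),
        Map.entry("Ring", "Ring"),
        Map.entry("Traf", "Traf"),
        Map.entry("T", "T'"),
        Map.entry("P", "P'"),
        Map.entry("P2", "P2'"),
        Map.entry("TrafC-1", "TrafC-1'"));

    /** Returns true if the two domains are listed as partners in Tab. 1, in either order. */
    static boolean canBind(String d1, String d2) {
        return d2.equals(PARTNER.get(d1)) || d1.equals(PARTNER.get(d2));
    }

    public static void main(String[] args) {
        System.out.println(canBind("DD", "DD")); // self-complementary pair
        System.out.println(canBind("T'", "T"));  // order of the arguments does not matter
        System.out.println(canBind("T", "P'"));  // pair not listed in Tab. 1
    }
}
```

Storing each pair once and testing both orders keeps the relation symmetric without duplicating entries.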

Molecular cluster formation is the driving force behind all processes in the cell. In response to a given stimulus, proteins form clusters that are bound together by complementary domains through non-covalent interaction. Fig. 2 shows an example of a cluster containing four death factors associated with a stimulated triad of receptors at the cell surface. This kind of representation does not imply that the relative positions of the domains are known, but is used to display the interconnections between proteins in a Lego-like fashion: DD binds to DD, T' binds to T, etc.

Figure 2: A hypothetical cluster of four proteins showing three free domains: DD, P2, Traf


Populations of Clusters

On the surface of the cell, receptors are proteins that cross the membrane and that possess at least three types of domains. In the middle, they are anchored to the membrane by a hydrophobic domain. On the external side, they interact with molecules in the environment and propagate, through changes in physical or chemical properties, signals to their internal part.

Inside the cell, this modification allows the formation of a cluster of regulatory proteins, eventually triggering enzymatic activity whose effect can propagate the signal, through the formation of successive waves of clusters, down to the nucleus. For example, in response to an external stimulus, the receptors of the TNF family form trimeric structures, exposing three active DD domains to the cytoplasm (Fig. 3). These domains recruit cytosolic factors having a complementary domain DD, beginning the formation of clusters.

Figure 3: Triads of receptors

Initiation of a signaling process is not the result of the stimulation of a single receptor, but of a set of identical receptors. Since different factors share similar binding domains, clusters of different composition can be associated with each stimulated receptor. Fig. 4 gives a snapshot of three possible clusters that can be formed with some of the death factors, given the rules of Tab. 1.



Figure 4: Examples of Different Hypothetical Clusters associated with Identical Receptors

Fig. 4 also hints at the combinatorial nature of clusters. Indeed, many different clusters can be formed, and it is possible to enumerate them according to their size. Tab. 2 gives the number of possible clusters with a given number of proteins, assuming that there is enough of each of the death factors in the cytoplasm.

Table 2: Number of Different Clusters with a given Number of Proteins
Number of Proteins in Cluster    Number of different Clusters
1                                4
2                                14
3                                41
4                                72
5                                131
6                                210
7                                245
8                                220
9                                119
10                               46


For example, the second line of Tab. 2 lists 14 different clusters of 2 proteins with TNFR1 receptors. These can be briefly described as follows.

Since each of the four proteins TRADD, FADD, RAIDD, and RIP has a DD domain, any two of them can bind to two of the three DD domains of TNFR1, yielding the 10 clusters:

{TRADD, TRADD} {TRADD, FADD} {TRADD, RAIDD} {TRADD, RIP} {FADD, FADD}
{FADD, RAIDD}  {FADD, RIP}   {RAIDD, RAIDD} {RAIDD, RIP} {RIP, RIP}

The four remaining clusters are obtained when one of TRADD, FADD, RAIDD or RIP recruits another factor through a non-DD domain:

{TRADD, TRAF-2} {FADD, FLICE} {RAIDD, ICE} {RIP, TRAF-2}
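The count of 14 can be checked with a few lines of code. The sketch below enumerates the unordered pairs, repetitions allowed, of the four DD-bearing factors, and adds the four chained clusters listed above as literal strings. The class and method names are hypothetical, and the code illustrates only this one line of Tab. 2, not the enumeration performed by the virtual laboratory.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative count of the two-protein clusters on a TNFR1 triad (names are hypothetical). */
final class TwoProteinClusters {
    static final String[] DD_FACTORS = {"TRADD", "FADD", "RAIDD", "RIP"};

    /** Unordered pairs, repetition allowed, of factors that each bind a receptor DD domain. */
    static List<String> ddPairs() {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < DD_FACTORS.length; i++) {
            for (int j = i; j < DD_FACTORS.length; j++) {
                pairs.add("{" + DD_FACTORS[i] + ", " + DD_FACTORS[j] + "}");
            }
        }
        return pairs;
    }

    /** Clusters in which the first factor recruits the second through a non-DD domain (as listed in the text). */
    static List<String> chainPairs() {
        return List.of("{TRADD, TRAF-2}", "{FADD, FLICE}", "{RAIDD, ICE}", "{RIP, TRAF-2}");
    }

    public static void main(String[] args) {
        System.out.println(ddPairs().size() + " + " + chainPairs().size()
            + " = " + (ddPairs().size() + chainPairs().size()));
    }
}
```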

Clearly, this kind of computation cannot be done by hand. We are also aware that some of the clusters could be unrealizable, either because of geometry, or because of instability resulting from incompatible participating molecules. Nevertheless, the number of different clusters at a given level of complexity is likely to remain high, and statistical study of the formal populations can still shed light on what is going on.

Obviously, in the more complex and still unknown situation in the cell, some of the virtual clusters could be replaced by completely different protein assemblies.

Our basic hypothesis is that cellular response does not depend on a simple sequence of recruitment events, but rather on the characteristics of the population of clusters that can form in response to a triggering event stimulating a group of identical receptors. Different triggers will result in a different distribution of clusters among the same group of cytosolic proteins.

In the next section, we discuss in detail how cell death, or proliferation, can be triggered. We also present the results of various virtual experiments involving the death factors.



Programmed Cellular Death

Life and death in the immune system

In the human organism, the proper working of the immune system involves the proliferation and natural death of cells such as T lymphocytes. In response to environmental stimuli, lymphocytes present diverse cellular responses including apoptosis [Hale et al., 1996]. This form of cell death leads to the elimination of undesired cells without inducing any inflammatory response. Apoptosis is associated with several profound alterations of cell morphology and composition, including DNA fragmentation.

We focus on three types of receptors of the TNF family that are present on the surface of lymphocytes. Their basic binding domains are given in Fig. 5.

Figure 5: Receptors of the TNF Family


All these receptors are able, upon stimulation, to form clusters with the same group of signaling proteins [Nagata 1997]. Curiously, however, the ligation of these receptors can promote two different cell fates: proliferation or death. There are also situations in which the same stimulus, mediated by the same member of the TNF receptor family, can trigger either proliferation or apoptosis. One reason for this apparently aberrant behavior is that the two cellular processes, although they start through similar pathways, diverge at a signaling bifurcation [Nagata 1997; Malinin et al., 1997]. Indeed, each of these receptors can transmit one signal eliciting cell death, and another that induces proliferation. This dichotomous signaling is dictated by the nature of the effector molecules recruited by the receptors, and the final cellular output depends on the relative frequency of these two signaling events.

TNFR1 and TNFR2, but not FAS, respond to the same external cytokine, TNF-alpha; however, their intracellular domains are not the same (Fig. 5). On the other hand, TNFR1 and FAS share similar intracellular domains, the DD or Death Domains, which are able to interact with DD-containing factors, although with different affinities [Liu et al., 1996; Hsu et al., 1995; Chinnaiyan et al., 1995]. Thus, in response to their respective stimuli, these three receptors will give rise to different cluster populations. The main activity of TNFR2 is to trigger cellular proliferation, and activation of FAS mainly signals cell death [Nagata 1997]. In different physiological states, stimulation of TNFR1 can lead to either of these opposite responses.

Four of the signaling proteins involved are known to possess enzymatic properties (Fig. 1). Two of them, RIP and MAPKKK, belong to the kinase family. When activated, they trigger a chain reaction called a kinase cascade, leading to the activation of transcription factors, gene expression, and eventually to proliferation of the cell. The same principle applies to the two proteins of the protease family, FLICE and ICE, which generate a protease cascade leading to the fragmentation of the DNA, which is essentially gene destruction [Hale et al., 1996; Nagata 1997].

Fig. 6 shows three possible clusters that can be formed with stimulated TNFR1 receptors. The first cluster may activate a protease cascade, the third one, a kinase cascade, and the middle one has no associated enzymatic activity.

Figure 6: Possible responses to TNFR1 stimulation


In order to predict the final cellular response, we simulated cluster formation in a virtual laboratory. Starting with a medium composed of several copies of each of the death factors and their relatives, we computed, among cluster populations containing n proteins, the expected number of clusters containing a protein kinase or a protease. Fig. 7 shows the frequencies of clusters containing proteins with an enzymatic domain (kinase or protease) in the presence of, respectively, the receptors TNFR1, TNFR2, and FAS.

The expected number of clusters exhibiting a characteristic in a population is obtained by summing the probabilities of existence of each cluster with that characteristic. For example, among the 14 clusters of 2 proteins discussed in the section 'Populations of Clusters', two contain a protease:

{FADD, FLICE}, {RAIDD, ICE}

and five contain a kinase:

{TRADD, RIP}, {RAIDD, RIP}, {FADD, RIP}, {RIP, RIP}, {RIP, TRAF-2}

The probability of occurrence of each cluster is computed according to the formula developed in the section 'Modeling Cluster Formation'.
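The summation described above can be sketched as follows. The probabilities used here are placeholder values invented for the example, not results from the paper, and the three-cluster population is deliberately tiny; the class ExpectedCount and its members are hypothetical names.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative expected-count computation (names and probabilities are placeholders). */
final class ExpectedCount {
    static final Set<String> PROTEASES = Set.of("FLICE", "ICE");

    /** A cluster reduced to its protein list and an assumed probability of formation. */
    static final class WeightedCluster {
        final List<String> proteins;
        final double probability;

        WeightedCluster(double probability, String... proteins) {
            this.proteins = List.of(proteins);
            this.probability = probability;
        }
    }

    static boolean containsProtease(WeightedCluster c) {
        return c.proteins.stream().anyMatch(PROTEASES::contains);
    }

    /** Sums the probabilities of the clusters that exhibit the given characteristic. */
    static double expectedCount(List<WeightedCluster> population, Predicate<WeightedCluster> hasTrait) {
        return population.stream().filter(hasTrait).mapToDouble(c -> c.probability).sum();
    }

    /** Placeholder population: the probabilities are NOT taken from the paper. */
    static List<WeightedCluster> placeholderPopulation() {
        return List.of(
            new WeightedCluster(0.25, "FADD", "FLICE"),
            new WeightedCluster(0.25, "RAIDD", "ICE"),
            new WeightedCluster(0.5, "TRADD", "RIP"));
    }

    public static void main(String[] args) {
        System.out.println(expectedCount(placeholderPopulation(), ExpectedCount::containsProtease));
    }
}
```

The same function gives the expected number of kinase-containing clusters when passed a predicate that tests for RIP or MAPKKK, and the ratio of the two values is the kind of quantity reported in Tab. 3.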


Figure 7: Frequency of clusters containing protease and kinase among populations of clusters of n proteins in simulated stimulation of receptors


Results and Discussion

Our results, as expected from biological data, show that the three receptors lead to the formation of clusters associated with kinase or protease (Fig. 7). However, the profile of cluster formation completely differs from one receptor to the other.

With the TNFR2 receptors, the expected number of clusters associated with a protein kinase reaches a high level compared to clusters with protease. Almost none of the clusters with fewer than six proteins are associated with protease.

In contrast, in the presence of FAS receptors, a higher proportion of clusters with protease is observed, even with clusters containing very few proteins. These characteristics correlate well with the main biological activities mediated by these two receptors, namely kinase activation (proliferation) for TNFR2 and protease activation (apoptosis) for FAS.

As previously mentioned, the stimulation of TNFR1 may lead to the transmission of either a proliferation or an apoptotic signal. Interestingly, under the same experimental conditions, the proportion of TNFR1-signaling clusters associated with protease is intermediate compared to those calculated with TNFR2 and FAS. However, and in contrast with cluster formation associated with TNFR2, a high proportion of clusters of fewer than four proteins could contain kinase. This intermediate behavior could reflect the ability of TNFR1 to mediate opposite cellular responses upon physiological variation in a given cell. Indeed, minor changes in the composition of cytosolic factors could favor kinase or protease recruitment among populations of TNFR1-associated clusters.

In order to verify this hypothesis, we introduced in our virtual cell a new protein, FADD-, which contains only the DD domain of the FADD factor (Fig. 1). This protein is able to bind to the TNFR1 DD domains but is unable to recruit additional members into the cluster. The presence of this protein, which will compete with FADD, is expected to protect cells from apoptosis.

As presented in the first three columns of Tab. 3, a high kinase/protease ratio is associated with stimulation of TNFR2, a receptor associated mainly with cellular proliferation, while a lower ratio is associated with induction of apoptosis (FASR). We computed this ratio for clusters of different sizes (n).


Table 3: The kinase/protease ratios with increasing concentration of FADD-
n    TNFR2     FASR     TNFR1, FADD-*0    TNFR1, FADD-*3    TNFR1, FADD-*6
2    -         3.01     6.37              7.09              7.80
3    -         1.82     3.73              3.89              4.08
4    -         1.80     3.71              3.05              2.59
5    176.40    5.60     8.61              9.28              10.01
6    84.40     4.04     6.80              7.38              8.21
7    26.96     10.83    17.85             21.87             26.02


In vitro, the artificial over-expression of FADD- was shown to block TNF-induced apoptosis [Chinnaiyan et al., 1995; Liu et al., 1996]. Similarly, the computational data reveal that the introduction of an increasing number of FADD- proteins results in an increase of the kinase/protease ratio recruited by the TNFR1 receptors. The initial medium was composed of 3 proteins of each of the death factors. In Tab. 3, the last three columns give the ratios for a medium containing no FADD- protein, 3 FADD- proteins, and then 6 FADD- proteins.



The Combinatorial Model and the Virtual Laboratory

The combinatorial model presented here is a refinement and a simplification of [Bergeron et al., 1997]. The first modification is to consider a more general binding relation, which allows us to adjust the degree of affinity between domains. This, in turn, influences the probability of formation of a given cluster. We also simplified the recruitment operation, allowing a protein to link to a cluster with only one domain.



The Binding Relation

Given a set T of interaction domains, the binding relation is given by a symmetric function

$$\mathrm{Aff} : T \times T \to \mathbb{R}$$

which assigns to a pair of domains (d1, d2) a real number Aff(d1, d2) called the affinity of (d1, d2), with Aff(d1, d2) = Aff(d2, d1). This value is set to 0 if the domains do not interact, and to a positive value if the domains are known to interact.

In the experiments presented in the section 'Programmed Cellular Death', we treated the binding of DD domains with special care. Theoretically, any two DD domains can interact, but DD domains on different proteins can have different compositions, and some attractions are stronger than others. For example, it is known [Hsu et al., 1995] that the protein TRADD has a high affinity for the DD domain of the receptor TNFR1, but a low affinity for the DD domains of FAS. On the other hand, the FADD protein binds strongly to FAS but weakly to TNFR1 [Chinnaiyan et al., 1995].

We arbitrarily assigned values from 1 to 4 to distinguish between reported low to high affinities. For the DD domains present on the two receptors TNFR1 and FAS, and in the four death factors TRADD, FADD, RIP and RAIDD, we have the following matrix for the binding relation:

         TNFR1   FASR   TRADD   FADD   RIP   RAIDD
TNFR1    0       0      4       1      2     2
FASR             0      1       4      2     2
TRADD                   1       1      1     1
FADD                            1      1     1
RIP                                    4     4
RAIDD                                        1

All other interactions among the factors involve domains other than DD. Their affinities were set to 4.
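Because the binding relation is symmetric, only the upper triangle of the DD matrix needs to be stored. The sketch below shows one possible encoding; the class DdAffinity and the method aff are hypothetical names, and the values are those of the matrix above.

```java
/** Illustrative symmetric lookup for the DD affinity matrix (names are hypothetical). */
final class DdAffinity {
    static final String[] NAMES = {"TNFR1", "FASR", "TRADD", "FADD", "RIP", "RAIDD"};

    // Upper triangle of the matrix, row by row, each row starting at its diagonal entry.
    static final int[][] UPPER = {
        {0, 0, 4, 1, 2, 2},
        {0, 1, 4, 2, 2},
        {1, 1, 1, 1},
        {1, 1, 1},
        {4, 4},
        {1}
    };

    static int indexOf(String name) {
        for (int i = 0; i < NAMES.length; i++) {
            if (NAMES[i].equals(name)) {
                return i;
            }
        }
        throw new IllegalArgumentException("unknown DD carrier: " + name);
    }

    /** Affinity between the DD domains of two carriers; the order of the arguments is irrelevant. */
    static int aff(String a, String b) {
        int i = indexOf(a);
        int j = indexOf(b);
        int row = Math.min(i, j);
        int col = Math.max(i, j);
        return UPPER[row][col - row];
    }

    public static void main(String[] args) {
        System.out.println(aff("TRADD", "TNFR1")); // high affinity reported for TRADD and TNFR1
        System.out.println(aff("FADD", "FASR"));   // high affinity reported for FADD and FAS
        System.out.println(aff("FADD", "TNFR1"));  // low affinity reported for FADD and TNFR1
    }
}
```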



Modeling Cluster Formation

In the virtual laboratory, a protein is modeled as the (multi)set of its domains. For example, the protein TRAF-2 is described by:

TRAF-2 = {Ring, T', Traf}

A cluster is a set of proteins, together with a set of free domains. For example, the cluster depicted in Figure 2 is represented by:

Proteins = {RAIDD, TRADD, TRAF-2, MAPKKK}
Domains = {DD, P2, Traf}



A cluster C can recruit a protein P with the link (d1, d2) if:

  1. The affinity of (d1, d2) is different from 0.
  2. d1 is a free domain of the cluster C.
  3. d2 is a domain of the protein P.

The resulting cluster is obtained by adding the protein P to cluster C, removing d1 from the free domains of C, and adding to them any domain of P other than d2.
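The recruitment operation can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the code of the virtual laboratory: the class and method names are hypothetical, the affinity function is a stub covering only the two pairs used in the example, and the domain composition TRADD = {DD, T} is an assumption made for the example (TRAF-2 = {Ring, T', Traf} is taken from the text).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative recruitment operation (all names are hypothetical). */
final class Recruitment {
    /** A cluster: its proteins plus the multiset of its free domains (lists allow repeats). */
    static final class Cluster {
        final List<String> proteins;
        final List<String> freeDomains;

        Cluster(List<String> proteins, List<String> freeDomains) {
            this.proteins = List.copyOf(proteins);
            this.freeDomains = List.copyOf(freeDomains);
        }
    }

    /** Stub affinity: 1 for the two pairs used in this example, 0 otherwise. */
    static int aff(String d1, String d2) {
        if (d1.equals("DD") && d2.equals("DD")) return 1;
        if ((d1.equals("T") && d2.equals("T'")) || (d1.equals("T'") && d2.equals("T"))) return 1;
        return 0;
    }

    /** Applies the three conditions, then builds the resulting cluster. */
    static Cluster recruit(Cluster c, String name, List<String> proteinDomains, String d1, String d2) {
        if (aff(d1, d2) == 0) throw new IllegalArgumentException("domains do not interact");
        if (!c.freeDomains.contains(d1)) throw new IllegalArgumentException(d1 + " is not free in the cluster");
        if (!proteinDomains.contains(d2)) throw new IllegalArgumentException(d2 + " is not a domain of " + name);

        List<String> proteins = new ArrayList<>(c.proteins);
        proteins.add(name);
        List<String> free = new ArrayList<>(c.freeDomains);
        free.remove(d1);                 // removes a single occurrence of d1
        List<String> rest = new ArrayList<>(proteinDomains);
        rest.remove(d2);                 // the linking domain of the protein is consumed
        free.addAll(rest);               // its other domains become free domains of the cluster
        return new Cluster(proteins, free);
    }

    /** Receptor triad, then TRADD (assumed domains {DD, T}), then TRAF-2. */
    static Cluster demo() {
        Cluster receptor = new Cluster(List.of(), List.of("DD", "DD", "DD"));
        Cluster withTradd = recruit(receptor, "TRADD", List.of("DD", "T"), "DD", "DD");
        return recruit(withTradd, "TRAF-2", List.of("Ring", "T'", "Traf"), "T", "T'");
    }

    public static void main(String[] args) {
        Cluster c = demo();
        System.out.println(c.proteins + " " + c.freeDomains);
    }
}
```

Lists are used for the free domains because a cluster can expose the same domain several times, as with the three DD domains of a receptor triad.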

Given a cluster C and a medium M which contains proteins that can be recruited, the probability that C recruits a protein P ∈ M with the link (d1, d2) is given by:

where:

Statistics on Pop_n

Starting with an initial medium M, and an initial cluster representing the domains of a receptor, the laboratory computes successively the sets of different clusters containing 1, 2, 3, ... proteins. The set of clusters containing n proteins is called Pop_n. We consider two clusters to be equal if they contain the same proteins, and have the same free domains.

Each cluster in Pop_n is obtained by a sequence of recruitments, thus we can compute its probability of formation. Given these probabilities, it is possible to compute the expected level of protease and kinase of a given population [Bergeron et al., 1997].
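The equality convention for clusters stated above (same proteins, same free domains) means that the order in which proteins were recruited is ignored. One way to implement it, shown here as a sketch with hypothetical names, is to build a canonical key by sorting both multisets; clusters reached through different recruitment sequences then collapse to a single entry in a set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative canonical key for cluster equality (names are hypothetical). */
final class ClusterKey {
    /** Sorts both multisets so that clusters differing only in order share the same key. */
    static String key(List<String> proteins, List<String> freeDomains) {
        List<String> p = new ArrayList<>(proteins);
        List<String> d = new ArrayList<>(freeDomains);
        Collections.sort(p);
        Collections.sort(d);
        return p + "|" + d;
    }

    public static void main(String[] args) {
        // The cluster of Fig. 2, written in two different orders.
        Set<String> population = new HashSet<>();
        population.add(key(List.of("RAIDD", "TRADD", "TRAF-2", "MAPKKK"), List.of("DD", "P2", "Traf")));
        population.add(key(List.of("TRADD", "RAIDD", "MAPKKK", "TRAF-2"), List.of("Traf", "DD", "P2")));
        System.out.println(population.size());
    }
}
```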


The Virtual Laboratory

The virtual laboratory is a program written in Java which can be used to study any process involving cluster formation. The experimenter must provide the description of the proteins involved: the domains present on each factor, the quantity of each factor, and the rules of interaction between domains (affinities).

An experiment is initiated by giving an initial cluster, which is simply the list of its free domains. Several parameters can be adjusted to control the growth of clusters. For example, in the present experiments, we chose to stop the growth of clusters as soon as one of the proteins MAPKKK, FLICE, or ICE was recruited.

For the experiments described in this paper, we had an initial medium of about 30 proteins - 3 of each of the death factors. The computations took, typically, a few minutes to generate all possible clusters. Once the populations are known and stored, the program can compute statistics on the composition of clusters.




CONCLUSION

As shown in the present experiments, the use of computational tools can provide guidelines to experiments on the cellular response upon stimulation in different cellular contexts. Although presently limited, our virtual laboratory could be improved to include parameters such as geometrical localization of the protein domains, and the relative affinity of domain interactions based on experimental values. These values could be derived from experimental systems such as the yeast two-hybrid system [Fields et al., 1989]. In this system, different protein domains can be linked, using genetic engineering, to different chimeric proteins. When two of these chimeric proteins are artificially expressed in yeast cells, the binding affinity between the two domains under scrutiny directly correlates with the intensity of an indicator biochemical reaction. Thus, it is possible to systematically measure the two-by-two interactions of a set of domains, and to construct the weight matrix of their relative affinities.

Further implementations will also include serial connections between virtual laboratories. For example, the output of a first experiment on a signaling pathway can be fed to a second one, modifying the composition of transcription factors accordingly. Such simulations are expected to mimic more closely the sequence of cellular events leading to the decision of a cell to divide, or to commit suicide. We also plan to construct biological databases of regulatory proteins in terms of their binding domains, or binding partners. Such protein mappings will be useful to rapidly construct virtual experiments based on protein interactions.


REFERENCES