Edited by T. Werner; received August 6, 1998; revised October 27, 1998; accepted November 17, 1998
This paper presents theoretical and computational tools to
understand how a small group of proteins, the death factors, is involved
in widely different cell behaviors. Experiments were done using
a virtual laboratory that simulates cellular response to different
external stimuli.
WARNING: It is not certain which of the theoretical protein clusters
described here really occur in nature. In addition, the rules of cluster
assembly are combinatorial, and thus oversimplify
the real situation.
Key words: regulatory proteins, death factors, virtual experiments, signalization pathway
In this paper, we develop theoretical and computational tools to understand how a small group of proteins can modulate signals to trigger two opposite cellular responses. The key point is to recognize the basic modular properties of the proteins, and their ability to form molecular clusters whose characteristics can be statistically analyzed. We began to explore these ideas in [Bergeron et al., 1997], which describes a virtual laboratory used to predict the effects of changes in transcriptional factors on the rate of transcription.
The current project, whose models and results are presented in the following sections, focuses on the death factors. These proteins are known to participate in the first step of a process that can signal a diverse range of activities, including cellular proliferation, or death by apoptosis. Our goal is to understand, qualitatively and quantitatively, how minor variations among this group of proteins can generate opposite effects, even in the presence of similar stimuli.
The next section presents the basic hypothesis, which can be summed up as follows: a regulatory protein is characterized by the set of its binding domains. These domains are used to construct clusters of different compositions and properties; cellular response depends on the characteristics of the population of possible clusters. 'Programmed Cellular Death' presents the principles underlying apoptosis, and shows how the virtual laboratory can help to explain some experimental observations. Finally, in 'Combinatorial Model', we present the combinatorial model underlying the laboratory.
The Death Factors
Each protein is characterized by an amino acid chain which, during synthesis, folds into a unique three-dimensional structure. The surface of this globular entity contains regions, called here binding domains, of specific shape, charge and/or hydrophobic properties. Types of domains are relatively few in number, and similar domains (that is, regions with similar composition and properties) are frequently shared by different proteins.
In this project, we focus on a set of proteins, the death factors,
that are involved in the first step of signaling processes initiated by the
TNF (Tumor Necrosis Factor) receptor family. Fig. 1
identifies schematically some of their
binding domains [Liu et al., 1996; Malinin et al., 1997].
Figure 1: Death factors and their relatives
Once the proteins are folded, these domains lie on their external surface, allowing interactions with
other molecules present in the cell. Tab. 1 gives the rules of such interactions between the domains present on the death factors, as derived from known biological interactions; a more refined version is given under The Combinatorial Model and the Virtual Laboratory.
Table 1: Rules of Interaction between Binding Domains
| Domain 1 | Domain 2 |
| DD | DD |
| T | T' |
| P | P' |
| P2 | P2' |
| Ring | Ring |
| TrafC-1 | TrafC-1' |
| Traf | Traf |
Some domains, such as DD or Traf, are self-complementary, while others bind to specific but less well characterized domains. In the latter case, we use the convention of labeling by D' a domain complementary to the domain D.
Molecular cluster formation is the driving force behind all processes
in the cell.
In response to a given stimulus, proteins form clusters
that are bound together by
complementary domains through non-covalent interaction. Fig. 2
shows an example of a cluster containing four death factors associated with
a stimulated triad of receptors at the cell surface.
This kind of representation
does not imply that the relative positions of the domains are known, but
is used to display the interconnections between proteins in a Lego-like
fashion: DD binds to DD, T' binds to T, etc.
Figure 2: A hypothetical cluster of four proteins showing three free domains: DD, P2, Traf
Populations of Clusters
On the surface of the cell, receptors are proteins that cross the membrane and that possess at least three types of domains. In the middle, they are anchored to the membrane by a hydrophobic domain. On the external side, they interact with molecules in the environment and propagate signals to their internal part through changes in physical or chemical properties.
Inside the cell, this modification allows the formation of a cluster
of regulatory proteins, eventually triggering enzymatic activity
whose effect can propagate the signal, through the formation of successive
waves of clusters, down to the nucleus.
For example, in response to an external stimulus,
the receptors of the TNF family form trimeric structures, exposing
three active DD domains to the cytoplasm (Fig. 3).
These domains recruit cytosolic factors having a complementary domain DD,
beginning the formation of clusters.
Figure 3: Triads of receptors
Initiation of a signaling process is not the result of the stimulation of a single receptor, but of a set of identical receptors. Since different factors share similar binding domains, clusters of different composition can be associated with each stimulated receptor. Fig. 4 gives a snapshot of three possible clusters that can be formed with some of the death factors, given the rules of Tab. 1.
Figure 4: Examples of Different Hypothetical Clusters associated with Identical Receptors
Fig. 4 also hints at the combinatorial nature
of clusters. Indeed, many different clusters can be formed, and it
is possible to enumerate them according to their size.
Tab. 2 gives the number of possible clusters
with a given number of proteins, assuming that there is enough of
each of the death factors in the cytoplasm.
Table 2: Number of Different Clusters with a given Number of Proteins
| Number of Proteins in Cluster | Number of different Clusters |
| 1 | 4 |
| 2 | 14 |
| 3 | 41 |
| 4 | 72 |
| 5 | 131 |
| 6 | 210 |
| 7 | 245 |
| 8 | 220 |
| 9 | 119 |
| 10 | 46 |
For example, the second line of Tab. 2 lists 14 different clusters of 2 proteins with TNFR1 receptors. These can be briefly described as follows.
Since each of the four proteins TRADD, FADD, RAIDD, and RIP has a DD domain, any two of them can bind to two of the three DD domains of TNFR1, yielding the 10 clusters:
{TRADD, TRADD} {TRADD, FADD} {TRADD, RAIDD} {TRADD, RIP} {FADD, FADD}
{FADD, RAIDD} {FADD, RIP} {RAIDD, RAIDD} {RAIDD, RIP} {RIP, RIP}
The four remaining clusters are obtained when one of TRADD, FADD, RAIDD or RIP recruits another factor through a non-DD domain:
{TRADD, TRAF2} {FADD, FLICE} {RAIDD, ICE} {RIP, TRAF2}
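The count of 14 can be checked with a short enumeration. The following is a minimal Python sketch, not the authors' program (which was written in Java); the names `DD_FACTORS`, `NON_DD_PARTNERS` and `two_protein_clusters` are illustrative, and the partner list is simply copied from the four clusters above rather than derived from domain rules.

```python
from itertools import combinations_with_replacement

# Proteins that carry a DD domain and can bind a receptor DD domain directly.
DD_FACTORS = ["TRADD", "FADD", "RAIDD", "RIP"]

# Non-DD partner recruited by each DD factor, copied from the four clusters in the text.
NON_DD_PARTNERS = {"TRADD": "TRAF2", "FADD": "FLICE", "RAIDD": "ICE", "RIP": "TRAF2"}


def two_protein_clusters():
    """Return the two-protein clusters as sorted tuples (unordered pairs)."""
    # Two DD factors on two receptor DD domains: order is irrelevant, repeats allowed.
    direct = [tuple(sorted(pair))
              for pair in combinations_with_replacement(DD_FACTORS, 2)]
    # One DD factor on the receptor, which in turn recruits a non-DD partner.
    chained = [tuple(sorted((factor, partner)))
               for factor, partner in NON_DD_PARTNERS.items()]
    return direct + chained


clusters = two_protein_clusters()
print(len(clusters))  # 10 direct pairs + 4 chained pairs
```

Choosing two items out of four with repetition gives the 10 direct pairs; the four chained pairs bring the total to the 14 listed in Tab. 2.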
Clearly, this kind of computation cannot be done by hand. We are also aware that some of the clusters could be unrealizable, either because of geometry, or because of instability resulting from incompatible participating molecules. Nevertheless, the number of different clusters at a given level of complexity is likely to remain high, and statistical study of the formal populations can still shed light on what is going on.
Obviously, in the more complex and still unknown situation in the cell, some of the virtual clusters could be replaced by completely different protein assemblies.
Our basic hypothesis is that cellular response does not depend on a simple sequence of recruitment events, but rather on the characteristics of the population of clusters that can form in response to a triggering event stimulating a group of identical receptors. Different triggers will result in a different distribution of clusters among the same group of cytosolic proteins.
In the next section, we will discuss in detail how cell death, or
proliferation, can be triggered. We also present the results of
various virtual experiments involving the death factors.
Life and death in the immune system
In the human organism, the proper working of the immune system involves the proliferation and natural death of cells such as T lymphocytes. In response to environmental stimuli, lymphocytes present diverse cellular responses including apoptosis [Hale et al., 1996]. This form of cell death leads to the elimination of undesired cells without inducing any inflammatory response. Apoptosis is associated with several profound alterations of cell morphology and composition, including DNA fragmentation.
We focus on three types of receptors of the TNF family that are
present on the surface of lymphocytes. Their basic binding domains are
given in Fig. 5.
Figure 5: Receptors of the TNF Family
All these receptors are able, upon stimulation, to form clusters with the same group of signaling proteins [Nagata 1997]. Curiously, however, the ligation of these receptors can promote two different cell fates: proliferation or death. There also exist situations in which the same stimulus, mediated by the same member of the TNF receptor family, can trigger either proliferation or apoptosis. One reason for this apparently aberrant behavior is that the two cellular processes, although they proceed through similar initial pathways, diverge at a signaling bifurcation [Nagata 1997; Malinin et al., 1997]. Indeed, each of these receptors can transmit one signal eliciting cell death, and another that induces proliferation. This dichotomous signaling is dictated by the nature of the effector molecules recruited by the receptors, and the final cellular output depends on the relative frequency of these two signaling events.
TNFR1 and TNFR2, but not FAS, respond to the same external cytokine, TNF-alpha; however, their intracellular domains are not the same (Fig. 5). On the other hand, TNFR1 and FAS share similar intracellular domains, the DD or Death Domains, which are able to interact with DD-containing factors, although with different affinities [Liu et al., 1996; Hsu et al., 1995; Chinnaiyan et al., 1995]. Thus, in response to their respective stimuli, these three receptors will give rise to different cluster populations. The main activity of TNFR2 is to trigger cellular proliferation, whereas activation of FAS mainly signals cell death [Nagata 1997]. In different physiological states, stimulation of TNFR1 can lead to either of these opposite responses.
Four of the signaling proteins involved are known to possess enzymatic properties (Fig. 1). Two of them, RIP and MAPKKK, belong to the kinase family. When activated, they trigger a chain reaction called a kinase cascade, leading to the activation of transcription factors, gene expression, and eventually to proliferation of the cell. The same principle applies to the two proteins of the protease family, FLICE and ICE, which generate a protease cascade leading to the fragmentation of the DNA, which is essentially gene destruction [Hale et al., 1996; Nagata 1997].
Fig. 6 shows three possible clusters that
can be formed with stimulated TNFR1 receptors. The first cluster
may activate a protease cascade, the third one, a kinase cascade, and
the middle one has no associated enzymatic activity.
Figure 6: Possible responses to TNFR1 stimulation
In order to predict the final cellular response, we simulated cluster formation in a virtual laboratory. Starting with a medium composed of several copies of each of the death factors and their relatives, we computed, among cluster populations containing n proteins, the expected number of clusters containing protein kinase and protease. Fig. 7 shows the frequencies of clusters containing proteins with an enzymatic domain (kinase or protease) in the presence of, respectively, the receptors TNFR1, TNFR2, and FAS.
The expected number of clusters exhibiting a characteristic in a population is obtained by summing the probabilities of existence of each cluster with the characteristic. For example, among the 14 clusters of 2 proteins discussed in section "Populations of Clusters", two contain a protease:
{FADD, FLICE}, {RAIDD, ICE}
and five contain a kinase:
{TRADD, RIP}, {RAIDD, RIP}, {FADD, RIP}, {RIP, RIP}, {RIP, TRAF-2}
The probability of occurrence of each cluster is computed according to the formula developed in the section 'Modeling Cluster Formation'.
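The summation described above can be illustrated on the 14 two-protein clusters. This is a minimal Python sketch under a loudly stated assumption: the equal probabilities used below are placeholders chosen only to exercise the sum, and are not values from the paper, whose probabilities come from the formula of 'Modeling Cluster Formation'. The function name `expected_count` is illustrative.

```python
# The 14 two-protein clusters listed in the text, as tuples of protein names.
CLUSTERS = [
    ("TRADD", "TRADD"), ("TRADD", "FADD"), ("TRADD", "RAIDD"), ("TRADD", "RIP"),
    ("FADD", "FADD"), ("FADD", "RAIDD"), ("FADD", "RIP"), ("RAIDD", "RAIDD"),
    ("RAIDD", "RIP"), ("RIP", "RIP"),
    ("TRADD", "TRAF2"), ("FADD", "FLICE"), ("RAIDD", "ICE"), ("RIP", "TRAF2"),
]

# Enzymatic proteins named in the text.
KINASES = {"RIP", "MAPKKK"}
PROTEASES = {"FLICE", "ICE"}


def expected_count(probabilities, enzymes):
    """Sum the probabilities of the clusters containing at least one of `enzymes`."""
    return sum(p for cluster, p in probabilities.items() if enzymes & set(cluster))


# PLACEHOLDER probabilities (assumption): every cluster equally likely.
uniform = {cluster: 1 / len(CLUSTERS) for cluster in CLUSTERS}

kinase_clusters = [c for c in CLUSTERS if KINASES & set(c)]
protease_clusters = [c for c in CLUSTERS if PROTEASES & set(c)]
print(len(kinase_clusters), len(protease_clusters))
print(expected_count(uniform, KINASES), expected_count(uniform, PROTEASES))
```

With real formation probabilities in place of `uniform`, the same function would give the expected kinase and protease levels of a population.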
Figure 7: Frequency of clusters containing protease and kinase among population of clusters of n proteins in simulated stimulation of receptors
Results and Discussion
Our results, as expected from biological data, show that the three receptors lead to the formation of clusters associated with kinase or protease (Fig. 7). However, the profile of cluster formation completely differs from one receptor to the other.
With the TNFR2 receptors, the expected number of clusters associated with a protein kinase reaches a high level compared to clusters with protease. Almost none of the clusters with fewer than six proteins are associated with protease.
In contrast, in the presence of FAS receptors, a higher proportion of clusters with protease is observed, even with clusters containing very few proteins. These characteristics correlate well with the main biological activities respectively mediated by these two receptors, namely kinase activation (proliferation) and protease activation (apoptosis).
As previously mentioned, the stimulation of TNFR1 may lead to transmission of either a proliferation or an apoptotic signal. Interestingly, under the same experimental conditions, the proportion of TNFR1 signaling clusters associated with protease is intermediate compared to those calculated with TNFR2 and FAS. However, in contrast with cluster formation associated with TNFR2, a high proportion of clusters of fewer than four proteins could contain kinase. This intermediate behavior could reflect the adaptability of TNFR1 to mediate opposite cellular responses upon physiological variation in a given cell. Indeed, minor changes in the composition of cytosolic factors could favor kinase or protease recruitment among populations of TNFR1-associated clusters.
In order to verify this hypothesis, we
introduced in our virtual cell a new protein, FADD-, which
contains only the DD domain of the FADD factor (Fig. 1).
This protein is able
to bind to the TNFR1 DD domains but is unable to recruit additional
members in the cluster. The presence of this protein, which will compete
with FADD, is expected to protect cells from apoptosis.
As presented in the first three columns of Tab. 3, a high kinase/protease ratio is associated with stimulation of TNFR2, a receptor associated mainly with cellular proliferation, while a lower ratio is associated with induction of apoptosis (FASR). We computed this ratio for clusters of different sizes (n).
Table 3: The kinase/protease ratios with increasing concentration of FADD-
| n | TNFR2 | FASR | TNFR1 FADD- *0 | TNFR1 FADD- *3 | TNFR1 FADD- *6 |
| 2 | - | 3.01 | 6.37 | 7.09 | 7.80 |
| 3 | - | 1.82 | 3.73 | 3.89 | 4.08 |
| 4 | - | 1.80 | 3.71 | 3.05 | 2.59 |
| 5 | 176.40 | 5.60 | 8.61 | 9.28 | 10.01 |
| 6 | 84.40 | 4.04 | 6.80 | 7.38 | 8.21 |
| 7 | 26.96 | 10.83 | 17.85 | 21.87 | 26.02 |
In vitro, the artificial over-expression of FADD- was shown to block
TNF-induced apoptosis [Chinnaiyan et al., 1995; Liu et al., 1996]. Similarly, the computational data reveal
that introducing an increasing number of FADD- proteins increases the
kinase/protease ratio recruited by the TNFR1 receptors for every cluster
size in Tab. 3 except n = 4, where the ratio decreases. The initial medium
was composed of 3 proteins of each of the death factors.
In Tab. 3, the last three columns give the ratios for a medium containing
no FADD- protein, 3 FADD- proteins, and then 6 FADD- proteins.
The Combinatorial Model and the Virtual Laboratory
The combinatorial model presented here is a refinement and
a simplification of [Bergeron et al., 1997]. The first modification is to consider a more
general binding relation which allows us to adjust the
affinities between domains. This, in turn, influences the
probability of formation of a given cluster. We also simplified
the recruitment operation, allowing a protein to link to a cluster
with only one domain.
The Binding Relation
Given a set T of interaction domains, the binding relation
is given by a symmetric function:
$$\mathrm{Aff} : T \times T \rightarrow \mathbb{R}_{\geq 0}, \qquad \mathrm{Aff}(d_1, d_2) = \mathrm{Aff}(d_2, d_1)$$
which assigns to a pair of domains (d1, d2) a real number Aff(d1, d2) called the affinity of (d1, d2). This value is set to 0 if the domains do not interact, and to a positive value if the domains are known to interact.
In the experiments presented in section 3, we treated with special care the binding of DD domains. Theoretically, any two DD domains can interact, but DD domains on different proteins can have different composition, and some attractions are stronger than others. For example, it is known [Hsu et al., 1995] that the protein TRADD has a high affinity for the DD domain of the receptor TNFR1, but a low affinity for the DD domains of FAS. On the other hand, the FADD protein binds strongly to FAS but weakly to TNFR1 [Chinnaiyan et al., 1995].
We arbitrarily assigned values from 1 to 4 to distinguish between
reported low to
high affinities. For the DD domains present on the two receptors
TNFR1 and FAS, and in the four death factors TRADD, FADD, RIP and
RAIDD, we have the following matrix for the binding relation:
| | TNFR1 | FASR | TRADD | FADD | RIP | RAIDD |
| TNFR1 | 0 | 0 | 4 | 1 | 2 | 2 |
| FASR | | 0 | 1 | 4 | 2 | 2 |
| TRADD | | | 1 | 1 | 1 | 1 |
| FADD | | | | 1 | 1 | 1 |
| RIP | | | | | 4 | 4 |
| RAIDD | | | | | | 1 |
All other interactions among the factors involve domains
other than DD. Their affinities were set to 4.
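The DD binding relation can be held as a small symmetric lookup. The sketch below is one possible Python encoding, assuming the matrix above is read as upper-triangular (each row starting at its own diagonal entry) and completed by symmetry; the names `DD_AFFINITY` and `dd_affinity` are illustrative and do not come from the original program.

```python
# Upper triangle of the DD-DD affinity matrix, keyed by the molecules carrying the domains.
DD_AFFINITY = {
    ("TNFR1", "TNFR1"): 0, ("TNFR1", "FASR"): 0, ("TNFR1", "TRADD"): 4,
    ("TNFR1", "FADD"): 1, ("TNFR1", "RIP"): 2, ("TNFR1", "RAIDD"): 2,
    ("FASR", "FASR"): 0, ("FASR", "TRADD"): 1, ("FASR", "FADD"): 4,
    ("FASR", "RIP"): 2, ("FASR", "RAIDD"): 2,
    ("TRADD", "TRADD"): 1, ("TRADD", "FADD"): 1, ("TRADD", "RIP"): 1,
    ("TRADD", "RAIDD"): 1,
    ("FADD", "FADD"): 1, ("FADD", "RIP"): 1, ("FADD", "RAIDD"): 1,
    ("RIP", "RIP"): 4, ("RIP", "RAIDD"): 4,
    ("RAIDD", "RAIDD"): 1,
}


def dd_affinity(owner_a, owner_b):
    """Symmetric lookup of the affinity between the DD domains of two molecules."""
    if (owner_a, owner_b) in DD_AFFINITY:
        return DD_AFFINITY[(owner_a, owner_b)]
    # Only the upper triangle is stored, so fall back to the mirrored pair.
    return DD_AFFINITY[(owner_b, owner_a)]


print(dd_affinity("TRADD", "TNFR1"), dd_affinity("TRADD", "FASR"))
print(dd_affinity("FADD", "FASR"), dd_affinity("FADD", "TNFR1"))
```

The two printed lines reproduce the qualitative statements in the text: TRADD binds TNFR1 strongly and FAS weakly, while FADD does the opposite.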
Modeling Cluster Formation
In the virtual laboratory, a protein is modeled as the (multi) set of its domains. For example, the protein TRAF-2 is described by:
TRAF-2 = {Ring, T', Traf}
A cluster is a set of proteins, together with a set
of free domains. For example, the cluster depicted in
Figure 2 is represented by:
Proteins = {RAIDD, TRADD, TRAF-2, MAPKKK}
Domains = {DD, P2, Traf}
A cluster C can recruit a protein P
with the link (d1, d2) if d1 is a free domain of C, d2 is a domain of P, and Aff(d1, d2) > 0.
The resulting cluster is obtained by adding the protein P to cluster C, removing d1 from the free domains of C, and adding to them any domain of P other than d2.
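The recruitment step can be sketched with multisets. This Python example is illustrative only: it assumes that a link (d1, d2) is allowed when d1 is free on the cluster, d2 belongs to the protein, and their affinity is positive, and it uses a simplified `toy_affinity` built from Tab. 1 with the value 4 throughout (the text assigns different values to DD pairs). The function names are hypothetical.

```python
from collections import Counter

# Self-complementary domains according to Tab. 1.
SELF_COMPLEMENTARY = {"DD", "Traf", "Ring"}


def toy_affinity(d1, d2):
    """Simplified affinity (assumption): 4 for any pair allowed by Tab. 1, else 0."""
    if d1 == d2:
        return 4 if d1 in SELF_COMPLEMENTARY else 0
    return 4 if d1 == d2 + "'" or d2 == d1 + "'" else 0


def recruit(proteins, free_domains, name, domains, link, affinity=toy_affinity):
    """Return the (proteins, free domains) of the cluster after recruiting `name`."""
    d1, d2 = link
    if free_domains[d1] == 0:
        raise ValueError(f"{d1} is not a free domain of the cluster")
    if domains[d2] == 0:
        raise ValueError(f"{d2} is not a domain of {name}")
    if affinity(d1, d2) <= 0:
        raise ValueError(f"{d1} and {d2} do not interact")
    new_proteins = proteins + Counter([name])
    # Remove one copy of d1 from the cluster, add every domain of P except one copy of d2.
    new_free = (free_domains - Counter([d1])) + (domains - Counter([d2]))
    return new_proteins, new_free


# The cluster of Figure 2 recruits one more TRAF-2 through the link (Traf, Traf).
proteins = Counter(["RAIDD", "TRADD", "TRAF-2", "MAPKKK"])
free = Counter(["DD", "P2", "Traf"])
traf2 = Counter(["Ring", "T'", "Traf"])
new_proteins, new_free = recruit(proteins, free, "TRAF-2", traf2, ("Traf", "Traf"))
print(sorted(new_free.elements()))
```

`Counter` is used so that a protein or a cluster may carry several copies of the same domain, matching the "(multi) set" wording of the text.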
Given a cluster C and a medium M which contains proteins that can
be recruited, the probability that C recruits a protein P ∈ M
with the link (d1, d2) is given by:
where:
Statistics on Pop_n
Starting with an initial medium M, and an initial cluster representing the domains of a receptor, the laboratory computes successively the sets of different clusters containing 1, 2, 3, ... proteins. The set of clusters containing n proteins is called Pop_n. We consider two clusters to be equal if they contain the same proteins, and have the same free domains.
Each cluster in Pop_n is obtained by a sequence of recruitments,
thus we can compute its probability of formation. Given these
probabilities, it is possible to compute the expected level of
protease and kinase of a given population [Bergeron et al., 1997].
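The successive populations can be generated level by level, merging clusters that have the same proteins and the same free domains. The Python sketch below shows only this enumeration and leaves probabilities out. The proteins "A" and "B", their domains, and the medium are hypothetical toy values, not the death factors of Fig. 1, and `can_bind` is a simplified reading of Tab. 1.

```python
from collections import Counter

SELF_COMPLEMENTARY = {"DD", "Traf", "Ring"}  # self-complementary domains of Tab. 1


def can_bind(d1, d2):
    """True if Tab. 1 allows the pair: self-complementary, or D with D'."""
    if d1 == d2:
        return d1 in SELF_COMPLEMENTARY
    return d1 == d2 + "'" or d2 == d1 + "'"


def freeze(proteins, free):
    """Hashable key: clusters are equal if proteins and free domains both match."""
    return (frozenset(proteins.items()), frozenset(free.items()))


def next_population(population, medium, domains_of):
    """Build Pop_n from Pop_(n-1) by trying every admissible recruitment."""
    result = {}
    for proteins, free in population.values():
        for name, quantity in medium.items():
            if proteins[name] >= quantity:
                continue  # no copy of this protein is left in the medium
            p_domains = domains_of[name]
            for d1 in free:
                for d2 in p_domains:
                    if can_bind(d1, d2):
                        new_proteins = proteins + Counter([name])
                        new_free = (free - Counter([d1])) + (p_domains - Counter([d2]))
                        result[freeze(new_proteins, new_free)] = (new_proteins, new_free)
    return result


def populations(initial_free, medium, domains_of, max_size):
    """Return [Pop_0, Pop_1, ..., Pop_max_size] as dicts keyed by frozen cluster."""
    start = (Counter(), Counter(initial_free))
    pops = [{freeze(*start): start}]
    for _ in range(max_size):
        pops.append(next_population(pops[-1], medium, domains_of))
    return pops


# Hypothetical toy medium: two copies of A = {DD, T} and one copy of B = {T', Traf}.
toy_domains = {"A": Counter(["DD", "T"]), "B": Counter(["T'", "Traf"])}
toy_medium = Counter({"A": 2, "B": 1})
pops = populations(["DD", "DD", "DD"], toy_medium, toy_domains, 4)
print([len(p) for p in pops])
```

In this toy medium the two clusters of Pop_2 lead to the same three-protein cluster, which is stored once because of the equality rule; a full implementation would also accumulate the formation probability along each recruitment path.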
The Virtual Laboratory
The virtual laboratory is a program written in Java which can be used to study any process involving cluster formation. The experimenter must provide a description of the proteins involved: the domains on each factor, the quantity of each factor, and the rules of interaction between domains (affinities).
An experiment is initiated by giving an initial cluster, which is simply the list of its free domains. Several parameters can be adjusted to control the growth of clusters. For example, in the present experiments, we chose to stop the growth of clusters as soon as one of the proteins MAPKKK, FLICE, or ICE was recruited.
For the experiments described in this paper, we had an initial
medium of about 30 proteins (3 of each of the death factors). The
computations typically took a few minutes to generate all possible
clusters. Once the populations are known and stored, the program can
compute statistics on the composition of clusters.
As shown in the present experiments, the use of computational tools can provide guidelines to experiments on the cellular response upon stimulation in different cellular contexts. Although presently limited, our virtual laboratory could be improved to include parameters such as geometrical localization of the protein domains, and the relative affinity of domain interactions based on experimental values. These values could be derived from experimental systems such as the yeast two-hybrid system [Fields et al., 1989]. In this system, different protein domains can be linked, using genetic engineering, to different chimeric proteins. When two of these chimeric proteins are artificially expressed in yeast cells, the binding affinity between the two domains under scrutiny directly correlates with the intensity of an indicator biochemical reaction. Thus, it is possible to systematically measure the two-by-two interactions of a set of domains, and to construct the weight matrix of their relative affinities.
Further implementations will
also include serial connections between virtual laboratories.
For example, the output of a first experiment on a signaling pathway
can be fed to a second one, modifying accordingly the composition of
transcription factors. Such simulations are expected
to mimic more closely the sequence of cellular events leading to
the decision of a cell to divide, or to commit suicide. We also plan to construct
biological databases of regulatory proteins in terms of their binding
domains, or binding partners. Such protein mappings will be useful
to rapidly construct virtual experiments based on protein interactions.