|In Silico Biology 5, 0046 (2005); ©2005, Bioinformation Systems e.V.|
IMGT, the international ImMunoGeneTics information system®
Université Montpellier II, Laboratoire d'ImmunoGénétique Moléculaire LIGM
UPR CNRS 1142, Institut de Génétique Humaine IGH
141 rue de la Cardonille
34396 Montpellier Cedex 5, France
Phone: +33-4-99 61 99 65, Fax: +33-4-99 61 99 01
Institut Universitaire de France
Edited by E. Wingender; received September 02, 2005; revised and accepted September 18, 2005; published October 20, 2005
One of the key elements in the adaptive immune response is the presentation of peptides by the major histocompatibility complex (MHC) to the T cell receptors (TR) at the surface of T cells. The characterization of the TR/peptide/MHC trimolecular complexes (TR/pMHC) is crucial to the fields of immunology, vaccination and immunotherapy. In order to facilitate data comparison and cross-referencing between experiments from different laboratories whatever the receptor, the chain type, the domain, or the species, IMGT, the international ImMunoGeneTics information system® (http://imgt.cines.fr), has developed IMGT-ONTOLOGY, the first ontology in immunogenetics and immunoinformatics. In IMGT/3Dstructure-DB, the IMGT three-dimensional structure database, TR/pMHC molecular characterization and pMHC contact analysis are made according to the IMGT Scientific chart rules, based on the IMGT-ONTOLOGY concepts. IMGT/3Dstructure-DB provides the standardized IMGT gene and allele names (CLASSIFICATION), the standardized IMGT labels (DESCRIPTION) and the IMGT unique numbering (NUMEROTATION). As the IMGT structural unit is the domain, amino acids at conserved positions always have the same number in the IMGT databases, tools and Web resources. For the TR alpha and beta chains, the amino acids in contact with the peptide/MHC (pMHC) are defined according to the IMGT unique numbering for V-DOMAIN. The MHC cleft that binds the peptide is formed by two groove domains (G-DOMAIN), each one comprising four antiparallel beta strands and one alpha helix. The IMGT unique numbering for G-DOMAIN applies both to the first two domains (G-ALPHA1 and G-ALPHA2) of the MHC class I alpha chain, and to the first domain (G-ALPHA and G-BETA) of the two MHC class II chains, alpha and beta. Based on the IMGT unique numbering, we defined eleven contact sites for the analysis of the pMHC contacts. 
The TR/pMHC contact description, based on the IMGT numbering, can be queried in the IMGT/StructuralQuery tool, at http://imgt.cines.fr.
Availability: IMGT/3Dstructure-DB is freely available at http://imgt.cines.fr.
Keywords: IMGT, T cell receptor, TR, major histocompatibility complex, MHC, pMHC, TR/peptide/MHC complex, TR/pMHC, three-dimensional structure, 3D structure, contact analysis, IMGT/3Dstructure-DB, IMGT/StructuralQuery, immunoinformatics, immunogenetics, immune system
T cells are involved in the specific immune response against a stress of viral, bacterial, fungal or tumoral origin. They identify antigenic peptides presented by the major histocompatibility complex (MHC) cell surface glycoproteins. The recognition is carried out by the T cell receptor complex (TcR), a multisubunit transmembrane surface complex made up of a T cell receptor (TR) and the CD3 chains, which is associated, in the immunological synapse, with the CD4 or CD8 coreceptors, the CD28 and CTLA-4 costimulatory proteins, the CD2 adhesion molecule and intracellular kinases. The TR directly binds the peptide/MHC complex (pMHC), and activates the T cell through interactions with the CD3 and other components of the TcR [2, 3, 4]. Three-dimensional (3D) structures of the TR, pMHC and TR/pMHC complexes provide an atomic description of their interactions [5, 6].
Since 1989, IMGT, the international ImMunoGeneTics information system® [7, 8, 9, 10], http://imgt.cines.fr, created by Marie-Paule Lefranc, Laboratoire d'ImmunoGénétique Moléculaire (LIGM) (Université Montpellier II and CNRS) at Montpellier, France, has offered standardized genetic and structural data on immunoglobulins (IG), TR and MHC, and on related proteins of the immune system (RPI) that belong to the immunoglobulin superfamily (IgSF) and to the MHC superfamily (MhcSF). In order to facilitate data comparison and cross-referencing between experiments from different laboratories whatever the receptor, the chain type, the domain, or the species, IMGT developed IMGT-ONTOLOGY, the first ontology in immunogenetics and immunoinformatics.
Based on the IMGT-ONTOLOGY concepts, the IMGT Scientific chart provides the controlled vocabulary and the annotation rules necessary for the identification, the description, the classification and the numbering of the IG, TR, MHC and RPI. The IDENTIFICATION concept refers to the IMGT standardized keywords indispensable for the sequence and 3D structure assignments. The DESCRIPTION concept provides the IMGT standardized labels used to describe structural and functional regions that compose IG, TR, MHC and RPI sequences and 3D structures. Standardized labels have also been defined to characterize the three-dimensional assembly of domains and chains. The CLASSIFICATION concept provides immunologists and geneticists with a standardized nomenclature per locus and per species. The human IG and TR gene nomenclature elaborated by IMGT was approved by the Human Genome Organisation (HUGO) Nomenclature Committee, HGNC, in 1999. The mouse IG and TR gene names with IMGT reference sequences were provided by IMGT to HGNC and to the Mouse Genome Database (MGD), in July 2002. The NUMEROTATION concept provides the IMGT unique numbering for the IG and TR V-DOMAIN and V-LIKE-DOMAIN of the IgSF proteins other than IG or TR, and for the IG and TR C-DOMAIN and C-LIKE-DOMAIN of the IgSF proteins other than IG or TR. An IMGT unique numbering has also been set up for the MHC G-DOMAIN and G-LIKE-DOMAIN of the MhcSF proteins other than MHC.
The IMGT standardization has made it possible to build a unique framework for the comparison of the TR, peptide and MHC interactions in the different resources provided by the information system. IMGT/3Dstructure-DB, the IMGT structural database, is used with the IMGT sequence databases (IMGT/LIGM-DB [7, 8] and IMGT/MHC-DB), the IMGT gene database (IMGT/GENE-DB), the IMGT tools for sequence analysis (IMGT/V-QUEST, IMGT/JunctionAnalysis) and the IMGT tool for 3D structure analysis (IMGT/StructuralQuery), to explore the TR and MHC conserved structural features. In this paper, we describe the molecular characterization and standardized contact analysis of the TR/pMHC complexes in IMGT/3Dstructure-DB. Coordinate files are from IMGT/3Dstructure-DB, http://imgt.cines.fr, with original crystallographic data from the Protein Data Bank PDB. Eleven IMGT pMHC contact sites (C1 to C11) were defined, which can be used to compare pMHC interactions. We provide the description of the interactions of the TR V-ALPHA and TR V-BETA with MHC and the peptide using the IMGT unique numbering for V-DOMAIN and the IMGT unique numbering for G-DOMAIN, which makes it possible, for the first time, to compare interaction data whatever the TR gene group (TRAV, TRBV), the MHC class (MHC-I, MHC-II) and the species (Homo sapiens, Mus musculus).
The T cell receptor (TR) is made of two chains: an alpha chain (TR-ALPHA) and a beta chain (TR-BETA) for the TR-ALPHA_BETA receptor, or a gamma chain (TR-GAMMA) and a delta chain (TR-DELTA) for the TR-GAMMA_DELTA receptor. Each complete TR chain comprises an extracellular region made up of a variable domain V-DOMAIN (for instance, V-ALPHA for the alpha chain) and a constant domain C-DOMAIN (for instance, C-ALPHA for the alpha chain), a connecting region, a transmembrane region and a very short intracytoplasmic region (Table 1, Figure 1).
|Table 1:||IMGT standardized labels for the DESCRIPTION of the T cell receptors, chains, domains and regions.|
|IMGT receptor labels||IMGT chain labels||IMGT domain labels||IMGT region labels|
|C-ALPHA||Part of C-REGION (1)|
|C-BETA||Part of C-REGION (1)|
|C-GAMMA||Part of C-REGION (1)|
|C-DELTA||Part of C-REGION (1)|
|(1) The TR chain C-REGION also includes the CONNECTING-REGION, the TRANSMEMBRANE-REGION and the CYTOPLASMIC-REGION which are not present in the 3D structures (Correspondence between labels for IG and TR domains in IMGT/3Dstructure-DB and IMGT/LIGM-DB, IMGT Scientific chart).|
The MHC-I is formed by the association of a heavy chain (I-ALPHA) and a light chain (beta-2-microglobulin B2M) (Table 2, Figure 1). The MHC-II is a heterodimer formed by the association of an alpha chain (II-ALPHA) and a beta chain (II-BETA). The I-ALPHA chain of the MHC-I, and the II-ALPHA and II-BETA chains of the MHC-II, comprise an extracellular region made of three domains for the MHC-I and of two domains for the MHC-II chains, a connecting region, a transmembrane region and an intracytoplasmic region.
|Figure 1: T cell receptor/peptide/MHC complexes with MHC class I (TR/pMHC-I) and MHC class II (TR/pMHC-II). [D1], [D2] and [D3] indicate the domains. (A) 3D structures of TR/pMHC-I (1oga) and TR/pMHC-II (1j8h). The figure was generated with Pymol, http://pymol.sourceforge.net. (B) Schematic representation of TR/pMHC-I and TR/pMHC-II. The TR (TR-ALPHA and TR-BETA chains), the MHC-I (I-ALPHA and beta-2-microglobulin B2M chains) and the MHC-II (II-ALPHA and II-BETA chains) are shown with the extracellular domains (V-ALPHA and C-ALPHA for the TR-ALPHA chain; V-BETA and C-BETA for the TR-BETA chain; G-ALPHA1, G-ALPHA2 and C-LIKE for the I-ALPHA chain; C-LIKE for B2M; G-ALPHA and C-LIKE for the II-ALPHA chain; G-BETA and C-LIKE for the II-BETA chain), and the connecting, transmembrane and cytoplasmic regions. Arrows indicate the peptide localization in the G-DOMAIN groove. The MHC G-DOMAINs and TR V-DOMAINs are likely to be in a diagonal rather than in a vertical position relative to the cell surface [24, 25].|
The I-ALPHA chain comprises two groove domains (G-DOMAIN), G-ALPHA1 [D1] and G-ALPHA2 [D2], and one C-LIKE domain [D3]. The B2M corresponds to a single C-LIKE domain. The II-ALPHA chain and the II-BETA chain each comprise two domains: G-ALPHA [D1] and C-LIKE [D2] for II-ALPHA, and G-BETA [D1] and C-LIKE [D2] for II-BETA (Table 2). Only the extracellular region that corresponds to these domains has been crystallized (Figure 1).
|Table 2:||IMGT standardized labels for the DESCRIPTION of the MHC receptors, chains, domains and domain numbers.|
|IMGT receptor labels||IMGT chain labels||IMGT domain labels||Domain numbers|
|(1) The I-ALPHA, II-ALPHA and II-BETA chains include, at the C-terminal end of the C-LIKE-DOMAIN, the CONNECTING-REGION, the TRANSMEMBRANE-REGION and the CYTOPLASMIC-REGION, which are not present in the 3D structures.|
The TR V-DOMAINs and MHC G-DOMAINs that are directly involved in the TR/pMHC interactions are described in the next sections.
The V-DOMAINs have an immunoglobulin fold, that is, an antiparallel beta sheet sandwich structure with 9 strands [14, 21], the A, B, E and D strands being on one sheet, and the G, F, C, C' and C" strands on the other sheet. These strands are indicated in the IMGT Colliers de Perles (Figure 2), which are IMGT 2D graphical representations based on the IMGT unique numbering for V-DOMAINs. IMGT Colliers de Perles of the V-ALPHA and V-BETA domains from 1ao7 are shown as examples in Figure 2.
The V-ALPHA and V-BETA domains share the main conserved characteristics of the V-DOMAIN, which are the disulfide bridge between cysteine 23 (1st-CYS) and cysteine 104 (2nd-CYS), and the other hydrophobic core residues tryptophan 41 (CONSERVED-TRP) and leucine (or another hydrophobic residue) 89 (Figure 2). The A strand comprises positions 1 to 15, B strand positions 16 to 26, C strand positions 39 to 46, C' strand positions 47 to 55, C" strand positions 66 to 74, D strand positions 75 to 84, E strand positions 85 to 96, F strand positions 97 to 104, and G strand positions 118 to 128. Compared to the general V-DOMAIN 3D structure, the V-ALPHA domains have shorter C" and D strands at the C"D turn (with 7 gaps at positions 71 to 77) and, in contrast, longer D and E strands at the DE turn (with additional positions at 84A, 84B and 84C).
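The strand boundaries listed above lend themselves to a simple lookup. The following is a minimal sketch in Python, not part of the IMGT tools: the names `V_DOMAIN_STRANDS` and `strand_of` are hypothetical, and letter-suffixed positions such as 84A are assigned to the strand of their base number, which is a simplification made for this illustration.

```python
import re

# Strand boundaries of the V-DOMAIN as listed in the text
# (IMGT unique numbering, inclusive ranges).
V_DOMAIN_STRANDS = [
    ("A", 1, 15),
    ("B", 16, 26),
    ("C", 39, 46),
    ("C'", 47, 55),
    ('C"', 66, 74),
    ("D", 75, 84),
    ("E", 85, 96),
    ("F", 97, 104),
    ("G", 118, 128),
]


def strand_of(position):
    """Return the strand label for an IMGT position, or None when the
    position lies outside the listed strands (for example in a CDR-IMGT).

    A letter suffix such as the 'A' in '84A' is ignored: this is an
    assumption of the sketch, not an IMGT rule.
    """
    match = re.fullmatch(r"(\d+)[A-Z]?", str(position))
    if match is None:
        raise ValueError(f"not an IMGT position: {position!r}")
    number = int(match.group(1))
    for label, start, end in V_DOMAIN_STRANDS:
        if start <= number <= end:
            return label
    return None
```

With this lookup, the conserved 1st-CYS at position 23 falls in the B strand and the 2nd-CYS at position 104 closes the F strand, while positions 27 to 38 return `None` because they belong to a loop rather than a strand.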
|Figure 2: IMGT Collier de Perles of the V-ALPHA and V-BETA domains from 1ao7 (IMGT/3Dstructure-DB, http://imgt.cines.fr), (A) on one layer, (B) on two layers. Amino acids are shown in the one-letter abbreviation. Hydrophobic amino acids (hydropathy index with positive value) and tryptophan (W) found at a given position in more than 50% of analyzed IG and TR sequences are shown. The CDR-IMGT are limited by amino acids shown in squares, which belong to the neighbouring FR-IMGT and represent anchor positions. The CDR3-IMGT extend from position 105 to 117. Hatched circles correspond to missing positions according to the IMGT unique numbering. Arrows indicate the direction of the beta sheets and their different designations in 3D structures.|
The three hypervariable loops or complementarity determining regions (CDR-IMGT) of each V-DOMAIN are involved in the pMHC recognition. The CDR1-IMGT comprises positions 27 to 38, the CDR2-IMGT positions 56 to 65 and the CDR3-IMGT positions 105 to 117. The CDR3-IMGT corresponds to the junction resulting from the V-J and V-D-J rearrangement, and is more variable in sequence and length than the CDR1-IMGT and CDR2-IMGT, which are encoded by the V-REGION only. Lengths of the three CDR-IMGT are shown between brackets, separated by dots. For example, 1ao7 [6.5.11] V-ALPHA means that in the V-ALPHA domain of 1ao7, CDR1-IMGT has a length of 6 amino acids, CDR2-IMGT a length of 5 amino acids and CDR3-IMGT a length of 11 amino acids, and 1ao7 [5.6.14] V-BETA means that in the V-BETA domain of 1ao7, CDR1-IMGT, CDR2-IMGT and CDR3-IMGT have a length of 5, 6 and 14 amino acids, respectively.
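The bracketed notation can be read mechanically. The sketch below, in Python, parses a string such as `[6.5.11]` and compares each length with the number of positions reserved for that CDR-IMGT by the ranges quoted above (12, 10 and 13 positions). The function names are hypothetical and the code is an illustration, not an IMGT implementation.

```python
# Position ranges reserved for each CDR-IMGT in the IMGT unique numbering
# for V-DOMAIN, as quoted in the text (inclusive).
CDR_RANGES = {
    "CDR1-IMGT": (27, 38),
    "CDR2-IMGT": (56, 65),
    "CDR3-IMGT": (105, 117),
}


def parse_cdr_lengths(text):
    """Parse a bracketed length string such as '[6.5.11]' into three ints."""
    stripped = text.strip()
    if not (stripped.startswith("[") and stripped.endswith("]")):
        raise ValueError(f"expected a bracketed string, got {text!r}")
    parts = stripped[1:-1].split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three lengths, got {text!r}")
    return tuple(int(part) for part in parts)


def unoccupied_positions(lengths):
    """Return, per CDR-IMGT, reserved positions minus observed length.

    A positive value is the number of gaps; a negative value means the
    loop is longer than the reserved range and needs additional positions.
    """
    result = {}
    for (name, (start, end)), length in zip(CDR_RANGES.items(), lengths):
        result[name] = (end - start + 1) - length
    return result
```

For the 1ao7 V-BETA domain, `unoccupied_positions(parse_cdr_lengths("[5.6.14]"))` gives 7 gaps in CDR1-IMGT, 4 gaps in CDR2-IMGT and -1 for CDR3-IMGT, that is, one position beyond the 13 reserved between 105 and 117.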
|Figure 3: IMGT Collier de Perles of MHC G-DOMAINs. (A) MHC-I G-ALPHA1 and G-ALPHA2 domains, (B) MHC-II G-ALPHA and G-BETA domains. MHC-I G-DOMAINs are from 1ao7 and MHC-II G-DOMAINs are from 1j8h (IMGT/3Dstructure-DB, http://imgt.cines.fr). Amino acid positions are according to the IMGT unique numbering for G-DOMAIN. Positions 61A, 61B and 72A are characteristic of the G-ALPHA2 and G-BETA domains (and are not reported in the G-ALPHA1 and G-ALPHA IMGT Collier de Perles).|
Owing to its standardization, the IMGT unique numbering for G-DOMAIN has made it possible to represent graphically, in the IMGT Colliers de Perles for G-DOMAIN, the MHC amino acid positions that have contacts with the peptide side chains. Eleven IMGT pMHC contact sites (C1 to C11) were defined, which can be used to compare pMHC interactions. Examples of contact sites for an MHC-I binding a 9-amino acid peptide (1ao7), for an MHC-I binding an 8-amino acid peptide (1jtr) and for an MHC-II binding 9 amino acids of the peptide in the groove (1j8h) are shown in Figures 4, 5 and 6, respectively.
In contrast to previous attempts to define pockets, structural data for defining the IMGT pMHC contact sites take into account the length of the peptides and are considered independently of the MHC class and sequence polymorphisms. The interactions between the peptide amino acid side chains and MHC amino acids were computed using an interaction scoring scheme based on true mean energy ratio. The score assigned to each contact is a constant value, independent of the distance between atoms (hydrogen bond 40, water mediated hydrogen bond 20, contact between polar atoms 20, contact between non polar atoms 1). All direct contacts (defined with a cutoff equal to the sum of the atom van der Waals radii and of the diameter of a water molecule) and water mediated hydrogen bonds were taken into account for the definition of the IMGT pMHC contact sites. The analysis was carried out for the pMHC available in IMGT/3Dstructure-DB, http://imgt.cines.fr. One hundred and fourteen 3D structures with peptides of 8, 9 and 10 amino acids bound to MHC-I and forty-four 3D structures of pMHC-II were identified. The contact analysis was performed for the peptide amino acid side chains of the 9 amino acids located in the groove. Results for MHC-I with 8-amino acid peptides (30 pMHC-I 3D structures), MHC-I with 9-amino acid peptides (74 pMHC-I 3D structures), and MHC-II for the 9 amino acids located in the groove (44 pMHC-II 3D structures) are reported in Table 3 (the results for the ten pMHC-I with 10-amino acid peptides are not shown). These "IMGT reference pMHC contact sites" are also available as IMGT Colliers de Perles. They will be updated as the number of 3D structures increases.
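The constant scores and the distance cutoff described above can be illustrated with a short Python sketch. The four score values come from the text; everything else is an assumption made for illustration: the dictionary keys, the function names, the van der Waals radii (commonly quoted Bondi values) and the water diameter of 2.8 Å are not stated in this article.

```python
# Constant score per contact type, as given in the text.
CONTACT_SCORES = {
    "hydrogen_bond": 40,
    "water_mediated_hydrogen_bond": 20,
    "polar": 20,
    "non_polar": 1,
}

# Assumptions for this sketch only (values in angstroms).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
WATER_DIAMETER = 2.8


def is_direct_contact(element_a, element_b, distance):
    """True when two atoms are within the sum of their van der Waals
    radii plus the diameter of a water molecule."""
    cutoff = VDW_RADII[element_a] + VDW_RADII[element_b] + WATER_DIAMETER
    return distance <= cutoff


def score_contacts(contact_types):
    """Sum the constant scores for a list of contact type names.

    The score does not depend on the interatomic distance, in line with
    the scheme described in the text.
    """
    return sum(CONTACT_SCORES[kind] for kind in contact_types)
```

For instance, a peptide side chain making one hydrogen bond, one contact between polar atoms and two contacts between non polar atoms would score 40 + 20 + 1 + 1 = 62 under this scheme.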
IMGT Colliers de Perles for IMGT pMHC contact sites are provided for each individual pMHC and TR/pMHC entry in IMGT/3Dstructure-DB. They make it easy to identify the amino acid contacts between the MHC and the peptide amino acid side chains and to compare them with the "IMGT reference pMHC contact sites".
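Such a comparison with the reference sites amounts to a set comparison per G-DOMAIN. The Python sketch below uses the C1 row for 9-amino acid peptides from Table 3 as reference, under the assumption that the first group of positions in that row belongs to G-ALPHA1 and the second to G-ALPHA2. The function names are hypothetical, and the observed contacts used in any call are the caller's own data, not values from this article.

```python
import re

# C1 reference row for MHC-I with 9-amino acid peptides (Table 3).
# Assumption: first group of positions is G-ALPHA1, second is G-ALPHA2.
REFERENCE_C1_9MER = {
    "G-ALPHA1": {"5", "59", "62", "63", "66"},
    "G-ALPHA2": {"73", "77", "81"},
}


def position_key(position):
    """Sort key for IMGT positions, so that '61A' follows '61'."""
    match = re.fullmatch(r"(\d+)([A-Z]?)", position)
    if match is None:
        raise ValueError(f"not an IMGT position: {position!r}")
    return int(match.group(1)), match.group(2)


def compare_with_reference(observed, reference):
    """Compare observed contact positions with a reference contact site.

    Both arguments map a G-DOMAIN label to a set of IMGT positions
    (kept as strings because of positions such as '61A').
    """
    report = {}
    for domain in sorted(set(observed) | set(reference)):
        obs = set(observed.get(domain, ()))
        ref = set(reference.get(domain, ()))
        report[domain] = {
            "shared": sorted(obs & ref, key=position_key),
            "only_observed": sorted(obs - ref, key=position_key),
            "only_reference": sorted(ref - obs, key=position_key),
        }
    return report
```

Keeping positions as strings rather than integers is a deliberate choice here, since the IMGT unique numbering for G-DOMAIN includes positions such as 61A, 61B and 72A.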
|Table 3:||IMGT reference pMHC contact sites. (A) MHC-I, (B) MHC-II.|
|8-amino acid peptides|
|C1||1||59 62 63 66||73 77 81|
|C3||2||7 24 45||9|
|C4||3||9 24 63 66 67 70|
|C6||5||7 9 22 70 74||7 9 24 26|
|C9||6||59 61A 63 66|
|C10||7||77 73 76|
|C11||8||77 80 81 84||5 26 33 34 55 59|
|9-amino acid peptides|
|C1||1||5 59 62 63 66||73 77 81|
|C3||2||7 9 22 24 34 45 63 66 67 70|
|C4||3||7 9 24 66 67 70|
|C6||5||70 73 74||7 26 66 67|
|C8||6||66 69 70 73 74||7 24 62 66|
|C9||7||7 24 59 61A 63 66|
|C10||8||72 73 76 80||58|
|C11||9||77 80 81 84||5 26 33 34 55 59|
|C1||1||26 33 34 47 60 61 62||77 80 81 82 84 85|
|C2||2||72A 73 76|
|C3||3||7 24 62 63 66 67 69|
|C4||4||7||9 11 22 24 66 67 70 73 74|
|C6||6||9 69 70 73 74||7 26|
|C9||7||24 26 45 59 63 66|
|C11||9||77 80 81 84||5 33 55|
|(A) IMGT reference pMHC contact site results from one hundred and four pMHC-I 3D structures (30 with 8-amino acid peptides and 74 with 9-amino acid peptides). (B) IMGT reference pMHC contact site results from forty-four pMHC-II 3D structures with 9 amino acids in the groove.|
|Figure 4: IMGT pMHC contact sites of human HLA-A*0201 MHC-I and the side chains of a 9-amino acid peptide (1ao7). Upper section: 3D structure of the human HLA-A*0201 groove. Lower section: IMGT pMHC contact sites IMGT Collier de Perles. Both views are from above the cleft, with G-ALPHA1 on top and G-ALPHA2 on bottom. In the box, C1 to C11 refer to contact sites, 1 to 9 refer to the numbering of the peptide amino acids P1 to P9. There are no C2 and C7 in MHC-I 3D structures with 9-amino acid peptides. There is no C5 in this 3D structure as P4 does not contact MHC amino acids (4G is shown between parentheses in the box).|
C1 to C11 refer to the eleven contact sites. 1 to 9 refer to the numbering of the peptide amino acids in the groove. The peptide binding mode to MHC-I is characterized by the N and C peptide ends docked deeply in the C1 and C11 contact sites, which correspond to the two conserved pockets A and F, and by the peptide length that mechanically constrains the peptide conformation in the groove. There are no C2, C7 and C8 contact sites for MHC-I with 8-amino acid peptides, and no C2 and C7 for MHC-I with 9-amino acid peptides. In contrast, for MHC-II, C2 is present but there are no C7 and C8. Whereas C1 and C11 correspond to the conserved pockets A and F, respectively, the correspondence between the other contact sites and the previously defined pockets is more approximate. For MHC-I with a peptide of 8 amino acids, C3, C4, C6 and C9 correspond roughly to the B, D, C and E pockets, and for MHC-I with a peptide of 9 amino acids, C3, C4 and C9 correspond to the B, D and E pockets.
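The presence and absence of contact sites per category can be captured in a small table. In the Python sketch below, the absent sites are those stated above; pairing the remaining sites, in order, with the peptide positions is an inference that agrees with the rows of Table 3 and with the legends of Figures 4 to 6, not a rule quoted from the text. The category keys and function name are hypothetical.

```python
ALL_SITES = [f"C{i}" for i in range(1, 12)]  # C1 to C11

# Contact sites absent in each category, as stated in the text.
ABSENT_SITES = {
    "MHC-I/8": {"C2", "C7", "C8"},
    "MHC-I/9": {"C2", "C7"},
    "MHC-II/9": {"C7", "C8"},
}


def sites_by_peptide_position(category):
    """Map peptide positions (1-based) to the contact sites that remain
    once the absent sites are removed.

    Assumption: the remaining sites pair with peptide positions in order.
    A given 3D structure may still lack a site (for example C5) when the
    corresponding side chain does not contact the MHC.
    """
    present = [site for site in ALL_SITES if site not in ABSENT_SITES[category]]
    return {position: site for position, site in enumerate(present, start=1)}
```

Under this pairing, the C-terminal peptide position always maps to C11: position 8 for an 8-amino acid peptide and position 9 in the two other categories.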
|Figure 5: IMGT pMHC contact sites of mouse H2-K1*01 MHC-I and the side chains of an 8-amino acid peptide (1jtr). Upper section: 3D structure of the mouse H2-K1*01 groove. Lower section: IMGT pMHC contact sites IMGT Collier de Perles. Both are views from above the cleft with G-ALPHA1 on top and G-ALPHA2 on bottom. In the box, C1 to C11 refer to contact sites, 1 to 8 refer to the numbering of the peptide amino acids P1 to P8. There are no C2, C7 and C8 in MHC-I 3D structures with 8-amino acid peptides. There is no C5 in this 3D structure as P4 does not contact MHC amino acids (4K is shown between parentheses in the box).|
|Figure 6: IMGT pMHC contact sites of human HLA-DRA*0101 and HLA-DRB1*0401 MHC-II and the peptide side chains (9 amino acids located in the groove) (1j8h). Upper section: 3D structure of the human HLA-DRA*0101 and HLA-DRB1*0401 groove. Lower section: IMGT pMHC contact sites IMGT Collier de Perles. Both are views from above the cleft with G-ALPHA on top and G-BETA on bottom. In the box, C1 to C11 refer to contact sites, 1 to 9 refer to the numbering of the peptide amino acids 1 to 9 located in the groove. There are no C7 and C8 in MHC-II 3D structures with 9 amino acids of the peptide located in the groove. There is no C5 in this 3D structure as position 5 does not contact MHC amino acids (5N is shown between parentheses in the box).|
Eighteen TR/pMHC 3D structures are available in IMGT/3Dstructure-DB  (Table 4). Fourteen 3D structures (twelve TR/pMHC-I and two TR/pMHC-II) comprise the complete extracellular region of the alpha-beta TR (TR-ALPHA_BETA) whereas four 3D structures comprise a Fv variable fragment (FV-ALPHA_BETA).
|Table 4:||T cell receptor/peptide/MHC (TR/pMHC) complexes in IMGT/3Dstructure-DB, http://imgt.cines.fr.|
|Code||Ref||Name||Sp||V-DOMAIN genes||CDR-IMGT||Sequence||Length||Sp||Gene and allele||R(Å)|
|Sp: species, Hs: Homo sapiens, Mm: Mus musculus, R(Å): Crystallographic resolution in angstrom. Twelve 3D structures (10 TR/pMHC-I and 2 TR/pMHC-II) correspond to "complete" TR receptors (TR-ALPHA_BETA). Four 3D structures (1d9k, 1fo0, 1kj2 and 1nam) correspond to an Fv variable fragment (FV-ALPHA_BETA). Gene and allele names are according to IMGT/GENE-DB for human and mouse TR, to IMGT/HLA-DB for human MHC, and to MGD and IMGT for mouse MHC. Amino acid sequences of the TR V-DOMAINs and MHC G-DOMAINs are reported in Figure 7 and Figure 8, respectively. H2-K1*01 encodes H2-K1b, H2-AB*02 and H2-AA*02 encode I-Abk and I-Aak, respectively. Between brackets, lengths of the CDR-IMGT are according to Lefranc et al. 2005.|
The IMGT Protein display (Figure 7) shows the amino acid sequences of the different V-ALPHA and V-BETA domains found in the crystallized TR/pMHC. Lengths of the V-DOMAIN CDR-IMGT from available TR/pMHC 3D structures are reported in Table 4, together with the names of the V, D and J genes. For example, the 1ao7 V-ALPHA [6.5.11] results from the TRAV12-2–TRAJ24 rearrangement, and the 1ao7 V-BETA [5.6.14] results from the TRBV6-5–TRBD2–TRBJ2-7 rearrangement. The amino acid sequences of the different G-DOMAINs found in the crystallized TR/pMHC are shown in the IMGT Protein display (Figure 8).
|Figure 7: IMGT Protein display of the TR V-ALPHA and V-BETA domains found in the TR/pMHC complexes in IMGT/3Dstructure-DB, http://imgt.cines.fr. Amino acid sequences and gaps (shown by dots) are according to the IMGT unique numbering for V-DOMAIN. The three additional positions in the CDR3-IMGT are 111.1, 112.2 and 112.1. Potential N-glycosylation sites are underlined. Assignments of the V, D and J genes are shown in Table 4.|
|Figure 8: Protein display of the G-DOMAINs found in the TR/pMHC complexes in IMGT/3Dstructure-DB, http://imgt.cines.fr. Amino acid sequences and gaps (shown by dots) are according to the IMGT unique numbering for G-DOMAIN. Amino acids resulting from the splicing with the preceding exon are shown within parentheses. Potential N-glycosylation sites are underlined. The gap in 54A corresponds to a position that is characteristic of the MhcSF G-ALPHA1-LIKE domain. Positions 61A, 61B and 72A are characteristic of the G-ALPHA2 and G-BETA domains. The corresponding gaps in G-ALPHA1 and G-ALPHA, shown in this IMGT Protein display, are not reported in the IMGT Colliers de Perles as these gaps are shared by those two domains. H2-K1*01 encodes H2-K1b, H2-AB*02 and H2-AA*02 encode I-Abk and I-Aak, respectively.|
The analysis of the pairwise contacts that occur at the TR/MHC and TR/peptide interfaces was carried out using the IMGT unique numbering for V-DOMAINs for the TR, and the IMGT unique numbering for G-DOMAINs for the MHC. Table 5 shows the interactions of the TR V-ALPHA and TR V-BETA with MHC-I and the peptide, in nine TR/pMHC-I 3D structures. Table 6 shows the interactions of the TR V-ALPHA and TR V-BETA with MHC-II and the peptide, in two TR/pMHC-II 3D structures. These tables provide for the first time the contacts using the IMGT unique numbering for V-DOMAIN and the IMGT unique numbering for G-DOMAIN, making it possible to compare data whatever the gene group (TRAV, TRBV), the MHC class (MHC-I, MHC-II) and the species (Homo sapiens, Mus musculus).
The results show that positions implicated in the binding are well conserved but not the pairwise interactions. The MHC contact positions belong to the G-DOMAIN helices. The TR positions that are involved in the contacts belong mostly to the CDR-IMGT and to anchor positions (shown by squares in Figure 2). The FR-IMGT positions involved in the contacts are positions 84 and 84A that are located at the DE turn (designated as "hypervariable 4" or HV4).
The contact analysis confirms that the V-ALPHA CDR2-IMGT sits on top of the G-ALPHA2 (MHC-I) or G-BETA (MHC-II) helices, and that the V-BETA CDR2-IMGT sits on top of the G-ALPHA1 (MHC-I) or G-ALPHA (MHC-II) helices (Tables 5 and 6). This agrees with data from Lau and Karplus 1994, who showed that most of the TR/MHC specificity comes from the CDR1 and CDR2 because mutations in these CDRs are able to change specificity between MHC-I and MHC-II. V-ALPHA and V-BETA CDR3-IMGT usually follow the same G-DOMAIN contact preference as the CDR2-IMGT but they can also have contacts with the other G-DOMAINs. For example, in the 1oga 3D structure, position 66 of G-ALPHA2 is contacted by the V-ALPHA CDR3-IMGT but also by the V-BETA CDR3-IMGT.
|Table 5:||TR V-ALPHA and V-BETA CDR interactions with pMHC-I. (A) V-ALPHA CDR-IMGT interactions, (B) V-BETA CDR-IMGT interactions, (C) V-ALPHA and V-BETA FR-IMGT interactions.|
|TR positions in bold indicate hydrogen bonds. Three-dimensional (3D) structures are from IMGT/3Dstructure-DB, http://imgt.cines.fr. Lengths of the CDR-IMGT are shown within brackets. Amino acids are shown in the one-letter code. Sequences of the peptides are reported in Table 4, sequences of the TR V-ALPHA and V-BETA domains in Figure 7 and sequences of the MHC-I G-ALPHA1 and G-ALPHA2 in Figure 8.|
|Table 6:||V-ALPHA and V-BETA CDR interactions with MHC-II. (A) V-ALPHA CDR-IMGT interactions, (B) V-BETA CDR-IMGT interactions, (C) V-ALPHA and V-BETA FR-IMGT interactions.|
|TR positions in bold indicate hydrogen bonds. Three-dimensional (3D) structures are from IMGT/3Dstructure-DB, http://imgt.cines.fr. Lengths of the CDR-IMGT are shown within brackets. Amino acids are shown in the one-letter code. Sequences of the peptides are reported in Table 4, sequences of the TR V-ALPHA and V-BETA domains in Figure 7, and sequences of the MHC-II G-ALPHA and G-BETA in Figure 8.|
The diagonal orientation of the TR/pMHC complex puts the TR in a globally conserved position to "read out" the peptide. V-ALPHA is on top of the peptide N terminus while V-BETA is on top of the peptide C terminus. TR positions implicated in the peptide recognition are in the CDR3-IMGT and, generally to a lesser extent, in the V-ALPHA CDR1-IMGT (Tables 5 and 6). Nearly every 3D structure shows a different CDR3 conformation and binding mode. In the JM22/peptide/HLA-A complex (1oga), the V-BETA CDR3-IMGT extensively contacts the peptide and G-ALPHA2 through hydrogen bonds (Table 5), by inserting itself between the peptide and the G-ALPHA2. In contrast, the 2C/peptide/H2-K1 complex (1jtr) has comparatively fewer contacts between the V-BETA CDR3-IMGT and the peptide and the MHC; however, the V-BETA CDR1-IMGT has more contacts and hydrogen bonds with the peptide and G-ALPHA2.
The TR LC13 and 2C were crystallized both alone and in complex with a pMHC. Structural superimposition of the alpha carbons of the V-DOMAIN scaffolds reveals large movements of the CDR3 for LC13 and of the CDR1 for 2C. The V-ALPHA domains of LC13, in the 1mi5 and 1kgc 3D structures, show a root mean square (RMS) deviation of 3.5 Å between their CDR3. The V-ALPHA domains of 2C, in the 2ckb and 1tcr 3D structures, show an RMS deviation of 2.3 Å between their CDR1. The TR A6 was crystallized in complex with the same MHC but with different peptides. In these structures, the V-BETA CDR3 adopts different conformations to adapt to the different peptides. The CDR3 conformational change does not increase the binding surface but gives a better shape complementarity to the interface.
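The RMS values quoted above are root mean square deviations between matched atoms after superimposition. As a minimal sketch in Python, assuming the two coordinate sets have already been superimposed on the V-DOMAIN scaffold and list the same atoms in the same order, the deviation over a loop can be computed as follows. The function name and the input format (lists of x, y, z tuples in Å) are hypothetical, and the superimposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) coordinates, assumed to be already superimposed and to
    list equivalent atoms in the same order."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

In practice the coordinates passed in would be restricted to the loop of interest, for example the alpha carbons at CDR3-IMGT positions 105 to 117, so that the value reflects loop movement rather than differences in the scaffold.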
The 3D structure of the MHC main chain is well conserved and the peptide binding groove specificity is due to the physicochemical characteristics of the side chains. Both MHC-I and MHC-II grooves have pockets where side chains of bound peptides may anchor, the specificity of a peptide for a given MHC being controlled by the physicochemical properties of the pockets. Conversely, comparison of peptide sequence alignments and pMHC 3D structures has revealed that some anchored peptide positions with conserved properties are needed to bind a particular MHC allele. Several databases, SYFPEITHI, JenPep and MHCpep, provide peptide sequences associated with MHC alleles together with anchor positions and experimental data on affinity. These observations have been used extensively in peptide/MHC binding prediction [46, 47, 48] (a list of prediction programs and servers is available at "The IMGT Immunoinformatics page", http://imgt.cines.fr). Nevertheless, exceptions have been found [49, 50, 51] and it has been noted that only 30% of peptides with the expected pattern really bind, whereas some peptides without the expected pattern do bind. Peptide/MHC binding prediction and epitope prediction remain a big challenge. In order to compare interactions between MHC domains of classes I and II and with peptides of different lengths, we have defined eleven IMGT pMHC contact sites which are based on the IMGT unique numbering for G-DOMAIN and G-LIKE-DOMAIN. IMGT contact sites allow comparison either with the IMGT reference pMHC contact sites, or with other IMGT contact sites. They also make it possible to highlight the impact of mutations in altered peptides, such as the ones observed in the altered Tax peptides in 1qsf and 1qse. IMGT pMHC contact sites are available for all the pMHC and TR/pMHC in IMGT/3Dstructure-DB, http://imgt.cines.fr.
With only 18 TR/pMHC 3D structures, the atomic details of TR/pMHC interactions already show a great deal of variability. IMGT standardization is a step towards a better understanding of the mechanisms ruling TR/pMHC recognition. It will help to compare new experimentally resolved 3D structures with published data. However, the TR/pMHC interactions are far from being unravelled and the study of the TR/pMHC interactions with the other proteins of the immunological synapse will be crucial. For example, the interaction between an MHC and the CD4 considerably enhances the pMHC/TR sensitivity [53, 54]. The early events of T cell triggering are the subject of active studies.
Although the TR/pMHC binding represents a necessary step for the TR recognition, many other factors, such as the TR affinity for the pMHC and the relocation of surface proteins such as CD4 or CD8 in the immunological synapse, are necessary for generating the T cell activation signal. Each of these steps needs to be described and characterized so that data from different experiments can be integrated. IMGT standardization will be further extended on the IMGT Web site at http://imgt.cines.fr as new parameters become available.
Users are requested to cite reference 5 and this article, and to quote the IMGT home page URL, http://imgt.cines.fr.
We are grateful to the IMGT team for helpful discussion. K. Q. was the recipient of a doctoral grant from the Ministère de l'Education Nationale, de l'Enseignement Supérieur et de la Recherche (MENESR) and is currently supported by a grant from the Association pour la Recherche sur le Cancer (ARC). IMGT is a registered Centre National de la Recherche Scientifique (CNRS) mark. IMGT has been a National RIO Bioinformatics Platform since 2001 (CNRS, INSERM, CEA, INRA). IMGT was funded in part by the BIOMED1 (BIOCT930038), Biotechnology BIOTECH2 (BIO4CT960037) and 5th PCRDT Quality of Life and Management of Living Resources (QLG2-2000-01287) programmes of the European Union and received subventions from ARC and from the Génopole-Montpellier-Languedoc-Roussillon. IMGT is currently supported by the CNRS, the MENESR (Université Montpellier II Plan Pluri-Formation, BIOSTIC-LR2004 Région Languedoc-Roussillon and ACI-IMPBIO IMP82-2004), and GIS-AGENAE. Part of this work was carried out in the frame of the European Science Foundation Scientific Network Myelin Structure and its role in autoimmunity (MARIE).