Understanding the repertoire of αβ T cells that responds to different tumor vaccines has become an integral part of our research. We wish to understand these repertoires, in part, by analyzing the gene segments that encode the TCRs of the responding cells. We are taking advantage of new high-throughput sequencing technologies (such as the Roche 454 technology) for this purpose, and it has become equally important to update our TCR sequence analysis tools. Here we provide a program that can analyze up to 250,000 different sequences of 400 bases each, provided in FASTA format. The results include the frequency of the different variable (V) and joining (J) gene segments and the frequency with which they pair with each other. In addition, the analysis gives the CDR3 lengths.
Mouse TCR alpha sequence analysis tool:
Mouse TCR beta sequence analysis tool:
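As an illustration of the kind of summary described above, the following Python sketch reads FASTA records and tallies V usage, J usage, V-J pairing and CDR3 lengths. It is not the code behind the tools linked above. The function names (`read_fasta`, `summarize`, `toy_assign`), the header layout used in the demo, and the J names `J_A` and `J_B` are all made up for the example, and the step that actually identifies the V and J segments in a read is replaced by a placeholder.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(records, assign):
    """Tally V usage, J usage, V-J pairs and CDR3 lengths.

    `assign` is a caller-supplied function (hypothetical here) that maps a
    record to (v_name, j_name, cdr3_length), or None if no call can be made.
    """
    v_counts, j_counts = Counter(), Counter()
    pair_counts, cdr3_lengths = Counter(), Counter()
    total = 0
    for header, seq in records:
        call = assign(header, seq)
        if call is None:
            continue  # unassigned reads are left out of every tally
        v, j, cdr3_len = call
        v_counts[v] += 1
        j_counts[j] += 1
        pair_counts[(v, j)] += 1
        cdr3_lengths[cdr3_len] += 1
        total += 1
    return {"total": total, "v": v_counts, "j": j_counts,
            "pairs": pair_counts, "cdr3": cdr3_lengths}


def toy_assign(header, seq):
    """Placeholder assigner: reads the call from a made-up 'id|V|J|len' header.

    A real assigner would align `seq` against reference V and J segments.
    """
    parts = header.split("|")
    if len(parts) != 4:
        return None
    return parts[1], parts[2], int(parts[3])


# Invented demo reads; J_A and J_B are placeholder names, not real segments.
demo = StringIO(
    ">r1|TCRBV8S3|J_A|12\nACGTACGT\nACGT\n"
    ">r2|TCRBV8S3|J_B|11\nACGTTTGA\n"
    ">r3|TCRBV8S3|J_A|12\nGGGTACGT\n"
    ">r4 unassigned\nNNNN\n"
)
result = summarize(read_fasta(demo), toy_assign)
for (v, j), n in result["pairs"].most_common():
    print(v, j, n, f"{n / result['total']:.2f}")
```

Passing the assigner in as a function keeps the counting separate from the question of how a read is matched to a gene segment, which is the part that depends on the reference sequences and the nomenclature discussed below.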
The TCR gene segment nomenclature is confusing!
In 1995 the “WHO-IUIS Nomenclature Sub-Committee on TCR Designation” agreed on a TCR nomenclature system. For the mouse αβ TCR molecules, the 75 known α chains and the 23 β chains were aligned and grouped into subfamilies; sequences within a subfamily were 75% similar. The TCRs were named so that they could be referred to easily in electronic files, without Greek letters, periods, hyphens, asterisks, or any distinction between upper- and lower-case characters. For example, TCRBV8S3 is the name of the segment referred to as Vβ8.3 in text. Many TCRs have been well-characterized in the literature using these names.
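The relationship between the file-friendly name and the text name in the example above can be sketched in a few lines of Python. This is only an illustration based on the single example TCRBV8S3: the helper names `to_text_name` and `to_file_name` are made up, the alpha-chain form `TCRAV…` is assumed by analogy, and names that lack a subfamily member number are not handled.

```python
import re

# Assumed layout, inferred from TCRBV8S3: TCR + chain (A or B) + V +
# subfamily number + S + member number. Case is ignored, as in the text.
_FILE_NAME = re.compile(r"^TCR([AB])V(\d+)S(\d+)$", re.IGNORECASE)
# Text form such as Vβ8.3; plain "a"/"b" are accepted because Greek
# letters are often lost in electronic files.
_TEXT_NAME = re.compile(r"^V([αβab])(\d+)\.(\d+)$")

_TO_GREEK = {"A": "α", "B": "β"}
_TO_LATIN = {"α": "A", "a": "A", "β": "B", "b": "B"}


def to_text_name(file_name):
    """Convert e.g. 'TCRBV8S3' to 'Vβ8.3'."""
    match = _FILE_NAME.match(file_name.strip())
    if match is None:
        raise ValueError(f"unrecognized name: {file_name!r}")
    chain, subfamily, member = match.groups()
    return f"V{_TO_GREEK[chain.upper()]}{int(subfamily)}.{int(member)}"


def to_file_name(text_name):
    """Convert e.g. 'Vβ8.3' (or 'Vb8.3') to 'TCRBV8S3'."""
    match = _TEXT_NAME.match(text_name.strip())
    if match is None:
        raise ValueError(f"unrecognized name: {text_name!r}")
    chain, subfamily, member = match.groups()
    return f"TCR{_TO_LATIN[chain]}V{int(subfamily)}S{int(member)}"


print(to_text_name("TCRBV8S3"))
print(to_file_name("Vb8.3"))
```

Raising an error on anything that does not fit the pattern is deliberate: with gene names this easy to confuse, a silent guess would be worse than a refusal.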
The ImMunoGeneTics (IMGT) database also developed a nomenclature (one scientist served on both committees). The two nomenclatures largely do not overlap. At one point, the gene segments in the IMGT database were numbered in the same order as they appear on the chromosome, but new alpha genes were then identified, and it was discovered that the entire alpha locus has been duplicated in some inbred strains of mice (129) and triplicated in others (C57BL/6)! (K/M discussions.) Thus, the numbering is no longer in order. In addition, investigators have identified differences in sequences that were assumed to be the same gene segment. It is unknown whether these differences are sequencing errors, polymorphisms, or different genes resulting from duplication. The Vβ8.3 gene of the first nomenclature is referred to as TRBV13-1 by IMGT, and differences like this lead to mistakes and confusion.
IMGT provides a program on their website that analyzes one sequence at a time according to their nomenclature. To serve different purposes, we have tried to incorporate the different nomenclatures and to include the corrections that we are aware of (navigate to the specific V and J sequences from the frequently asked questions). This tool can also compare many sequences to each other. Currently the program includes the mouse TCR gene segments, but depending on the interest and the direction of our research, we may include human gene segments and immunoglobulin in the future.
This program was developed by Jonathan Sprague, M.D., while he was a student at the University of Colorado School of Medicine. He received help and input from Kimberly Jordan, Ph.D., who provided most of the data analyzed during development. We also received significant input from Philippa Marrack, Ph.D., and John Kappler, Ph.D., particularly in deciphering the alpha genes, and from James Scott-Brown, Ph.D., in determining what features should be incorporated into the program.
IMGT, the international ImMunoGeneTics database. MP Lefranc, V Giudicelli, C Ginestoux, J Bodmer, W Müller, R Bontrop, M Lemaitre, A Malik, V Barbié, D Chaume. Nucleic Acids Res. 27(1):209-12 (1999).