We strongly encourage anyone who is considering nonhuman primate MHC allele discovery to use one of the full-length allele discovery protocols described below.

2015 - present : Pacific Biosciences full-length transcript sequencing

Pacific Biosciences long-read single-molecule sequencing enables full-length MHC class I and class II transcript sequencing. We are transitioning our allele discovery efforts to this platform. In our early studies, we are extending partial-length MHC transcripts to full-length transcripts and identifying many novel full-length sequences. Within a few years, we expect full-length transcript sequencing will result in allele databases that contain only full-length sequences; the existing criteria for naming MHC alleles based on complete exon 2/3 sequences will be retired. Our protocols for MHC class I and class II transcript sequencing are rapidly evolving. We are currently performing our Pacific Biosciences sequencing at the University of Washington PacBio Sequencing Service.

2012 - 2014 : Illumina MiSeq MHC class I transcript reconstruction

For three years, our primary method for MHC class I allele discovery involved PCR amplifying a full-length cDNA transcript, fragmenting the amplicon by tagmentation, and sequencing the fragments on the Illumina MiSeq. The MiSeq generates at most 600 base pairs of sequence per fragment (2 x 300 base pair paired-end reads), which is shorter than a full-length MHC class I transcript. Full-length transcripts therefore have to be stitched together from overlapping reads. In the absence of PCR artifacts (substitutions, chimeras) and sequencing artifacts, these assemblies would be straightforward. In practice, such artifacts are hard to distinguish from genuine differences between closely related alleles, so it is very difficult to accurately and confidently assemble overlapping reads to reconstruct full-length sequences.
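To illustrate why stitching is fragile, here is a minimal Python sketch of joining two reads on an exact overlap. It is not the assembly method we used; the function name merge_overlapping, the min_overlap parameter, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from typing import Optional


def merge_overlapping(left: str, right: str, min_overlap: int = 20) -> Optional[str]:
    """Join two reads when a suffix of `left` exactly matches a prefix of `right`.

    Hypothetical helper for illustration only. Returns the merged sequence,
    or None when no exact overlap of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first, then shrink.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


# Toy reads (assumed data) that share an 8-base overlap.
clean = merge_overlapping("ACGTACGTGGCCTTAA", "GGCCTTAACCGGTTAA", min_overlap=8)

# The same reads with one substitution inside the overlap no longer merge.
with_error = merge_overlapping("ACGTACGTGGCCTTAA", "GGCCTAAACCGGTTAA", min_overlap=8)
```

In this sketch a single substituted base inside the overlap is enough to block the merge. Tolerating mismatches would let the reads join, but it would also risk joining reads that come from two different, closely related alleles.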

2012 - 2014 : Roche/454 MHC class II transcript pyrosequencing

Before Pacific Biosciences sequencing was accurate enough for MHC allele discovery, Roche/454 pyrosequencing provided the best platform for 700-800 base pair long-read sequencing. We developed a protocol for MHC class II -DPA, -DPB, -DQA, -DQB, -DRA, and -DRB transcript sequencing (MHC class II transcripts are shorter than MHC class I transcripts). We performed these experiments on the Roche/454 GS FLX+ at the University of Illinois Urbana-Champaign, though they have since retired this platform.

pre-2012 : Sanger MHC class I and class II transcript sequencing

Prior to the advent of deep sequencing, MHC transcripts were sequenced by first generating a full-length cDNA PCR amplicon or preparing a cDNA library, cloning individual molecules, and Sanger sequencing individual plasmid clones. Screening 96-192 clones per animal typically detected major, transcriptionally abundant MHC sequences but did not consistently detect less abundant transcripts. We no longer perform Sanger sequencing of MHC transcripts ourselves.

Protocols for nameable MHC allele discovery

Per the Immuno Polymorphism Database (IPD), MHC class I transcripts can receive an official name if exons 2 and 3 are completely sequenced. MHC class II transcripts can be named if exon 2 is completely sequenced. We have developed (and subsequently retired) a series of protocols that can be used to generate nameable MHC sequences. We discourage investigators from using these protocols and recommend using the full-length transcript sequencing approaches described above instead. Within the next few years, we expect full-length transcripts to become the minimum standard for naming a new allele.
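The naming criteria above can be written as a small check. This is a sketch in Python that encodes only the rule as stated in this section; the function name is_nameable and the REQUIRED_EXONS table are hypothetical and are not part of any IPD tool or interface.

```python
from typing import Dict, Set

# Exons that must be completely sequenced before a transcript can be named,
# as described in the paragraph above (hypothetical lookup table).
REQUIRED_EXONS: Dict[str, Set[int]] = {
    "I": {2, 3},   # MHC class I: exons 2 and 3
    "II": {2},     # MHC class II: exon 2
}


def is_nameable(mhc_class: str, complete_exons: Set[int]) -> bool:
    """Return True when every required exon for this class is completely sequenced."""
    return REQUIRED_EXONS[mhc_class] <= complete_exons


# A class I transcript covering only exon 2 does not qualify,
# while a class II transcript covering exon 2 does.
class_i_partial = is_nameable("I", {2})
class_ii_exon2 = is_nameable("II", {2})
```

If full-length transcripts become the minimum standard as expected, a check like this would instead require every exon of the transcript to be complete.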