Several nonhuman primate MHC genotyping technologies are currently in use. All are designed to determine which MHC alleles are present in a given animal, but they differ in sensitivity, cost, ease of use, and throughput. Protocols for performing these assays and software for analyzing genotyping data are available, along with a guide to the high-value alleles that are most relevant to the majority of genotyping users. A brief overview of the different genotyping approaches, with example results, is presented below.

Deep sequencing of MHC class I and class II exon 2 genomic DNA or cDNA


This method focuses on sequencing the highly polymorphic exon 2 portions of MHC class I and class II alleles. Primers are designed in conserved regions flanking exon 2 such that essentially all MHC alleles encoded by an animal can be detected with as few as 7 sets of primers (one for all MHC class I loci, six for the MHC class II loci). The exon 2 amplicons can be generated from either genomic DNA or cDNA templates. All amplicons are molecularly barcoded with a unique sequence tag for each sample, and multiplex pools of samples are deep sequenced. The resulting reads are then compared against a species-specific database of known exon 2 allele sequences to determine genotypes. Exon 2 sequencing is sufficient for lineage-level resolution of MHC alleles.
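The barcode-then-compare logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the published analysis software: the function name `genotype_exon2`, the 4-base barcodes, the `min_reads` threshold, and the toy sequences and allele names are all hypothetical, and exact string matching stands in for whatever read-mapping step a real pipeline would use.

```python
from collections import Counter, defaultdict

def genotype_exon2(reads, barcodes, allele_db, barcode_len=4, min_reads=2):
    """Assign reads to samples by barcode, then count exact matches
    to known exon 2 sequences (toy model, hypothetical parameters)."""
    # Several alleles can share one exon 2 sequence, so map sequence -> names.
    seq_to_alleles = defaultdict(list)
    for name, seq in allele_db.items():
        seq_to_alleles[seq].append(name)

    counts = defaultdict(Counter)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag)
        if sample is None:
            continue  # barcode not assigned to any sample in this pool
        for name in seq_to_alleles.get(insert, []):
            counts[sample][name] += 1

    # Report only alleles supported by at least min_reads reads.
    return {
        sample: sorted(a for a, n in c.items() if n >= min_reads)
        for sample, c in counts.items()
    }

# Toy data: placeholder sequences, not real MHC alleles.
allele_db = {"allele_A": "ACGTACGT", "allele_B": "ACGTTCGT", "allele_C": "TTTTACGT"}
barcodes = {"AAAC": "animal_1", "GGGT": "animal_2"}
reads = (
    ["AAAC" + "ACGTACGT"] * 3
    + ["AAAC" + "ACGTTCGT"] * 2
    + ["GGGT" + "ACGTACGT"] * 2
    + ["GGGT" + "TTTTACGT"]      # single read, below the threshold
    + ["CCCC" + "ACGTACGT"]      # unknown barcode, ignored
)

genotypes = genotype_exon2(reads, barcodes, allele_db)
print(genotypes)
# {'animal_1': ['allele_A', 'allele_B'], 'animal_2': ['allele_A']}
```

Because two alleles that share an exon 2 sequence map to the same key in `seq_to_alleles`, both names are reported together, which mirrors the lineage-level resolution noted above.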

Deep sequencing of MHC class I and class II exon 2 templates is relatively low-cost, easy to perform, and very high-throughput with the Illumina MiSeq instrument. Since genotypes are determined from direct sequence analysis, this method is more replicable and less ambiguous than the high-throughput methods that predate deep sequencing.

Deep sequencing of MHC-enriched genomic DNA


This method incorporates comprehensive MHC genotyping into whole exome sequencing. It supplements the human whole exome (VCRome 2.1) and rhesus macaque exon (rheMac2) capture probe sets designed by the Baylor Human Genome Sequencing Center with a minimal series of MHC target capture probes developed in conjunction with Roche/Nimblegen. These MHC probes are designed to capture the exons, introns, 3' UTR, and 1 kb of 5' upstream promoter sequence of MHC class I and class II loci. The combined probe sets, which improve capture of rhesus exomes (or Rhexomes), are used in a target capture process to generate an exome library enriched with MHC sequences for sequencing on an Illumina HiSeq instrument. Genotypes are determined by comparing the MHC sequencing reads against databases of known macaque alleles, focusing on exons 2, 3, and 4 for class I and exons 2 and 3 for class II.

Deep sequencing of full-length MHC-enriched genomic DNA typically provides a higher level of allelic resolution than the exon 2 genomic deep sequencing method, since additional exons are examined. However, it is a lower-throughput, higher-cost method, since it essentially provides MHC genotypes as a supplement to a whole exome analysis.
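The gain in resolution from examining additional exons can be illustrated with a small Python sketch. The function `ambiguity_groups`, the allele names, and the four-base "exons" are hypothetical placeholders chosen only to show the principle: alleles that look identical across the exons examined cannot be told apart.

```python
from collections import defaultdict

def ambiguity_groups(alleles, exons):
    """Group alleles that are indistinguishable across the given exons."""
    groups = defaultdict(list)
    for name, exon_seqs in alleles.items():
        key = tuple(exon_seqs[e] for e in exons)
        groups[key].append(name)
    return sorted(sorted(g) for g in groups.values())

# Toy data: allele_A and allele_B differ only in exon 3.
alleles = {
    "allele_A": {2: "ACGT", 3: "GGCC", 4: "TTAA"},
    "allele_B": {2: "ACGT", 3: "GGCA", 4: "TTAA"},
    "allele_C": {2: "ACTT", 3: "GGCC", 4: "TTAA"},
}

exon2_only = ambiguity_groups(alleles, exons=[2])
exons_2_to_4 = ambiguity_groups(alleles, exons=[2, 3, 4])

print(exon2_only)    # [['allele_A', 'allele_B'], ['allele_C']]
print(exons_2_to_4)  # [['allele_A'], ['allele_B'], ['allele_C']]
```

With exon 2 alone, `allele_A` and `allele_B` collapse into one ambiguous group; adding exons 3 and 4 separates them.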

Deep sequencing of full-length MHC cDNA transcripts


This method mirrors the full-length allele discovery protocols. Full-length MHC class I or class II amplicons are generated from cDNA and sequenced using the PacBio RS II sequencing system. PacBio sequencing produces long reads from single molecules, so it is possible to sequence a full-length ~1.1 kb MHC class I or ~800 bp MHC class II cDNA transcript in its entirety from a single sequencing reaction. The system does not require cloning to sequence a single allele in isolation. PacBio sequences are processed first to characterize any novel nonhuman primate alleles, which are then added to the existing database of known alleles for a population. The genotype of each sample sequenced is determined by comparing all raw PacBio reads against the updated database of alleles to discern the complement of alleles observed in each animal.
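The two-pass workflow described above (discover novel alleles first, then genotype against the updated database) might be sketched as follows. This is an assumption-laden toy: `discover_and_genotype`, the `min_support` threshold, the `novel_N` naming scheme, and the eight-base "transcripts" are all hypothetical, and exact matching of whole reads ignores the sequencing errors a real pipeline must handle.

```python
from collections import Counter

def discover_and_genotype(sample_reads, known_alleles, min_support=3):
    """Pass 1: add well-supported unknown sequences to the database.
    Pass 2: genotype every sample against the updated database."""
    db = dict(known_alleles)
    known_seqs = set(db.values())

    # Pass 1: pool reads from all samples and look for recurring
    # sequences that are absent from the known-allele database.
    pooled = Counter(r for reads in sample_reads.values() for r in reads)
    novel_index = 1
    for seq, n in sorted(pooled.items()):
        if seq not in known_seqs and n >= min_support:
            db[f"novel_{novel_index}"] = seq  # hypothetical naming scheme
            novel_index += 1

    # Pass 2: compare each sample's reads against the updated database.
    seq_to_name = {seq: name for name, seq in db.items()}
    genotypes = {
        sample: sorted({seq_to_name[r] for r in reads if r in seq_to_name})
        for sample, reads in sample_reads.items()
    }
    return db, genotypes

# Toy data: placeholder sequences, not real transcripts.
known_alleles = {"allele_A": "AAAACCCC"}
sample_reads = {
    "animal_1": ["AAAACCCC"] * 2 + ["AAAAGCCC"] * 3 + ["AAAATCCC"],
    "animal_2": ["AAAACCCC"],
}

updated_db, genotypes = discover_and_genotype(sample_reads, known_alleles)
print(genotypes)
# {'animal_1': ['allele_A', 'novel_1'], 'animal_2': ['allele_A']}
```

The singleton read `AAAATCCC` falls below the support threshold, so it is neither added to the database nor reported in a genotype.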

Deep sequencing of full-length MHC transcripts provides the least ambiguous genotyping results, since all alleles are resolved to the highest possible resolution for the coding region. It is relatively easy to perform and is intermediate in terms of throughput: fewer samples are included per multiplex sequencing pool than in the exon 2 genomic deep sequencing method, to ensure that enough transcripts are detected per sample to produce an accurate genotype (the PacBio sequencer produces ~1000-fold fewer reads per cell than the Illumina MiSeq). The cost of this method is higher than that of exon 2 genomic DNA deep sequencing.

Allele-specific PCR


This method detects the presence or absence of specific alleles. Sequence-specific primers (SSP) are designed so that the 3'-most base of the primer corresponds to a single-nucleotide polymorphism that can uniquely identify a particular allele (such as a position where the allele of interest contains a C, but every other allele contains a T). A polymerase lacking proofreading capabilities is used for the PCR-SSP reaction so that a mismatched 3' base is not removed and the primers are extended into a product only in the presence of that specific allele. PCR-SSP products are run on a gel, and any samples producing a band are presumed to contain that allele (samples without product bands are presumed negative for that allele). The major limitation of this method is that the full database of MHC alleles is unknown for most nonhuman primate populations. PCR-SSP primers can be designed for polymorphic sites that distinguish one allele in a population from all other known alleles, but it is impossible to know whether another as-yet-unidentified allele in that population shares the same polymorphism. This can lead to false positives. Sequence-verifying PCR-SSP products can reduce false positive calls, but cannot distinguish between closely related alleles that differ only outside the span of the PCR-SSP amplicon.
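The primer-design criterion, and the reason an incomplete allele database undermines it, can be shown with a short Python sketch. The function `discriminating_positions` and the six-base toy alleles are hypothetical; the sketch assumes the sequences are already aligned and of equal length, and it reports 0-based positions.

```python
def discriminating_positions(target, alleles):
    """Return 0-based positions where the target allele carries a base
    found in no other allele (candidate 3' ends for an SSP primer).
    Assumes all sequences are aligned and of equal length."""
    target_seq = alleles[target]
    others = [seq for name, seq in alleles.items() if name != target]
    return [
        i for i, base in enumerate(target_seq)
        if all(seq[i] != base for seq in others)
    ]

# Toy database of known alleles (placeholder sequences).
known = {
    "allele_A": "ACGCAT",
    "allele_B": "ACGTAT",
    "allele_C": "ATGTAT",
}
print(discriminating_positions("allele_A", known))  # [3]

# A previously unidentified allele that shares the C at position 3
# removes the only discriminating site for allele_A.
with_new = dict(known, allele_D="ACGCAA")
print(discriminating_positions("allele_A", with_new))  # []
```

A primer ending at position 3 looks specific for `allele_A` against the known database, yet it would also amplify `allele_D`, which is the false-positive scenario described above.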

Allele-specific PCR is easy to perform and high-throughput. It is relatively low-cost, but sequence verification of positive amplicons adds some cost beyond the basic assay. However, gel bands can sometimes be faint and difficult to interpret, and the method is prone to false positives, especially in populations with incomplete MHC allele databases.

Microsatellite haplotyping


This method is based on size analysis of short tandem repeat (STR) segments of the genome. STRs are simple repeats of typically 2-4 nucleotides (such as AG, repeated as AGAGAG) with a high degree of variation in the number of times the repeating unit occurs. For instance, an individual may have one chromosome on which AG repeats 10 times (a run of 20 nucleotides) and another on which AG repeats 18 times (a run of 36 nucleotides). Primers are designed to flank these STRs, creating variable-sized fragments when amplified. A profile of multiple STR markers can distinguish one individual from another. Because it is extremely rare for unrelated individuals to share a profile of STR markers, microsatellite haplotyping is typically of limited usefulness outside of groups of related individuals; within a pedigree, however, a child shares half of its marker sizes with its mother and half with its father. Microsatellite haplotyping of the MHC region can therefore be used in pedigreed colonies of nonhuman primates, but cannot be used for most outbred populations. One exception is cynomolgus macaques from the Indian Ocean island of Mauritius: this geographically isolated population arose from a very small group of founding animals, and the MHC complement of the entire population is limited to seven haplotypes. A panel of 16 STR markers was developed that spans the ~5 Mb MHC genomic region (class I and class II). A different STR profile is associated with each of the seven haplotypes, so the complement of MHC alleles expressed by a Mauritian cynomolgus macaque can be inferred from its STR profile.

Microsatellite haplotyping is relatively low-cost, easy to perform, and high-throughput. However, it is of limited usefulness outside of Mauritian cynomolgus macaques and pedigreed colonies. It can also be difficult to infer expressed MHC alleles when there are chromosomal recombination events.
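Inferring haplotypes from an STR profile amounts to finding the pair of reference haplotypes whose marker sizes together explain the two fragment sizes observed at every marker. The Python sketch below illustrates this under loud assumptions: the haplotype labels (`H1` to `H3`), marker names, and fragment sizes are invented placeholders, only three markers are shown rather than the 16-marker panel, and `infer_haplotype_pairs` is a hypothetical helper rather than the published method.

```python
from itertools import combinations_with_replacement

def repeat_run_length(unit, copies):
    """Length in nucleotides of a run of `copies` repeats of `unit`."""
    return len(unit) * copies

def infer_haplotype_pairs(profile, haplotypes):
    """Return every pair of reference haplotypes whose marker sizes
    match the two observed fragment sizes at every STR marker."""
    matches = []
    for h1, h2 in combinations_with_replacement(sorted(haplotypes), 2):
        if all(
            sorted((haplotypes[h1][marker], haplotypes[h2][marker])) == sorted(sizes)
            for marker, sizes in profile.items()
        ):
            matches.append((h1, h2))
    return matches

# Hypothetical reference table: fragment size (bp) per marker per haplotype.
haplotypes = {
    "H1": {"STR_1": 120, "STR_2": 204, "STR_3": 310},
    "H2": {"STR_1": 138, "STR_2": 204, "STR_3": 316},
    "H3": {"STR_1": 120, "STR_2": 210, "STR_3": 316},
}

# Observed profile for one animal: two fragment sizes per marker.
profile = {"STR_1": (120, 138), "STR_2": (204, 204), "STR_3": (310, 316)}

print(repeat_run_length("AG", 10), repeat_run_length("AG", 18))  # 20 36
print(infer_haplotype_pairs(profile, haplotypes))  # [('H1', 'H2')]
```

In this toy model, a profile that matches no pair returns an empty list, which is one way a recombinant chromosome would show up and why recombination complicates the inference.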