SNPs used in forensic analyses are of four main categories: Identity Informative (IISNP), Ancestry Informative (AISNP), Phenotype Informative (PISNP), and Lineage Informative (LISNP). LISNPs found in the Y chromosome and mitochondrial genome may be useful in kinship analyses but are not widely implemented in commercial forensic assays. The remaining three SNP categories are described in detail below.
The three parts of a forensic SNP typing methodology are the SNP panel, the genotyping platform, and the interpretation model (analysis software).
It is important for users to understand the research behind the panels, both for selecting a platform and for validation and training purposes. The genotyping platform is independent of the SNP application (Identity, Ancestry, Phenotype). Any commercial sequencing platform (e.g. MiSeq or S5) should be capable of providing correct genotypes for the SNP loci. At NIST, we have observed rare discordances when running the same SRM components across multiple platforms (see SNP Information Values for NIST SRM 2391c). This information may provide insight into designing validation experiments and into the utility of SRM samples in such validations.
The SNP panels and interpretation models are specific to the application (Identity, Ancestry, Phenotype), as summarized in the following table and detailed below, by application.
| Application | Product Name | Number of SNPs | SNP Panel | Genotyping Platform | Analysis Software |
|---|---|---|---|---|---|
| Identity | Precision ID Identity Panel | 83 | SNPforID and Kidd | Ion Torrent S5/S5XL | Converge |
| Identity | Qiagen | 140 | SNPforID and Kidd | MiSeq | GeneGlobe |
| Identity | ForenSeq | 94 | SNPforID and Kidd | MiSeq FGx | ForenSeq UAS |
| Ancestry | Precision ID Ancestry Panel | 165 | Seldin and Kidd | Ion Torrent S5/S5XL | Converge |
| Ancestry | ForenSeq | 56 | Kidd | MiSeq FGx | ForenSeq UAS |
| Phenotype | Precision ID Ampliseq DNA Phenotyping Panel | 24 | HIrisPlex | Ion Torrent S5/S5XL | Converge |
| Phenotype | ForenSeq | 24 | HIrisPlex | MiSeq FGx | ForenSeq UAS |
Ideal IISNPs have high heterozygosity across worldwide populations, which means they likely originated in early humans or higher primates, and they have a negligible effect on the fitness of an individual. IISNPs are used in a capacity similar to autosomal forensic STRs, for comparisons of reference samples to unknown samples. IISNPs may be of particular use in samples with degraded DNA, as the single base target means the amplicon can be designed much smaller than an STR target.
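To make "high heterozygosity" concrete, the short sketch below computes the expected heterozygosity of a biallelic SNP, 2p(1-p), under the assumption of Hardy-Weinberg equilibrium. The population names and allele frequencies are hypothetical and chosen only for illustration; they are not taken from any published panel.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity of a biallelic SNP under Hardy-Weinberg: 2p(1-p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * p * (1.0 - p)


# Hypothetical frequencies of one allele of a single SNP in three populations.
example_freqs = {"pop_A": 0.50, "pop_B": 0.45, "pop_C": 0.05}

for pop, p in example_freqs.items():
    print(pop, round(expected_heterozygosity(p), 3))
```

Heterozygosity peaks at 0.5 when both alleles are equally common, so a good identity SNP would have allele frequencies near 0.5 in every population of interest. In this toy example, the low value in pop_C would make the SNP a poor identity marker for that population.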
IISNP Panels:
The three commercially available assays containing IISNPs (ForenSeq, Precision ID, and QIAGEN) all contain combinations of SNPs found in publications from the SNPforID Consortium (Sanchez 2006) and the Kidd laboratory (Pakstis 2010).
IISNP Interpretation models and options:
Identity SNP markers for single source profiles can be interpreted similarly to STR loci, using random match probabilities. The following graph shows random match probabilities obtained from degraded samples sequenced for SNPs and typed in CE-based assays (adapted from Gettings 2015).
NIST and other laboratories are currently assessing the implications of combining panels of SNPs and STR markers into statistical analyses, particularly in cases where markers are closely located on chromosomes. It is expected that researchers and vendors will develop methods for mixture interpretation including both SNP and STR data; such methods are not yet widely available.
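As a rough illustration of the single-source random match probability mentioned above, the sketch below multiplies Hardy-Weinberg genotype frequencies across SNPs. It assumes the loci are independent (the product rule) and applies no population substructure correction; the SNP names and allele frequencies are hypothetical placeholders, not values from any commercial panel.

```python
def genotype_frequency(n_ref, p_ref):
    """Hardy-Weinberg genotype frequency for a biallelic SNP.

    n_ref is the number of reference alleles in the genotype (0, 1, or 2);
    p_ref is the population frequency of the reference allele.
    """
    q = 1.0 - p_ref
    return (q * q, 2.0 * p_ref * q, p_ref * p_ref)[n_ref]


def random_match_probability(profile, ref_allele_freqs):
    """Product of genotype frequencies over the typed SNPs (assumes independence)."""
    rmp = 1.0
    for snp, n_ref in profile.items():
        rmp *= genotype_frequency(n_ref, ref_allele_freqs[snp])
    return rmp


# Hypothetical reference-allele frequencies and profiles.
freqs = {"snp1": 0.5, "snp2": 0.5, "snp3": 0.5}
full_profile = {"snp1": 1, "snp2": 2, "snp3": 0}
partial_profile = {"snp1": 1, "snp2": 2}  # e.g. snp3 dropped out in a degraded sample

print(random_match_probability(full_profile, freqs))
print(random_match_probability(partial_profile, freqs))
```

Each SNP that drops out removes one factor from the product, so a partial profile yields a larger (less discriminating) random match probability than the full profile.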
SNPs for a universal individual identification panel
A multiplex assay with 52 single nucleotide polymorphisms for human identification
Performance of a next generation sequencing SNP assay on degraded DNA
AISNPs are often monomorphic for one allele (e.g. the “C” allele) in one population and monomorphic for the other allele (e.g. the “T” allele) in all other populations. This is the opposite of the characteristic sought in IISNPs. Some AISNPs appear to have arisen by chance through population bottlenecks, while others conferred a distinct advantage, such as rs2814778, which became fixed in sub-Saharan Africa because this variant reduces an individual’s susceptibility to malaria.
AISNP panels:
The ForenSeq system uses Ken Kidd’s panel of 55 AISNPs (Kidd 2015), and the Precision ID-S5 system uses both Kidd’s 55 and a panel of 128 SNPs from the Seldin Laboratory (Kosoy 2009). “Kidd 55” was developed specifically to identify the geographic/ethnic origin of an unknown sample as a forensic investigative tool, whereas the “Seldin 128” panel was meant to provide a method for determining and quantifying differences in continental populations using SNPs commonly found in arrays circa 2008. The following figures are adapted from each publication and demonstrate the goal of each panel: Seldin 128 provides continental-level ancestry, and Kidd 55 provides a more refined prediction.

ForenSeq and Precision ID also have eye color and hair color SNPs, known as Phenotype Informative SNPs, discussed below.
AISNP Interpretation models and options:
The ForenSeq Universal Analysis Software uses Principal Component Analysis (PCA) to plot the questioned sample alongside 1000 Genomes training data. The plot is limited to the first two components of the PCA. As shown below, this may make it challenging to distinguish, for example, a Hispanic individual (left image) from a half-European, half-East Asian individual (right image).
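The limitation of a two-component plot can be sketched with a toy PCA. The example below is not the ForenSeq UAS implementation; it is a minimal pure-Python illustration that assumes genotypes are coded as the count of one allele (0, 1, or 2) and uses invented training data for three hypothetical populations. Because projection is linear, a profile that is the midpoint of two parental profiles plots exactly midway between them.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def principal_axes(rows, k=2, iters=500):
    """Top-k principal axes by power iteration with deflation (toy data only)."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, means)] for r in rows]
    n_snps = len(means)
    # Scatter matrix X^T X of the centered genotype matrix.
    scatter = [[sum(r[i] * r[j] for r in centered) for j in range(n_snps)]
               for i in range(n_snps)]
    rng = random.Random(0)
    axes = []
    for _ in range(k):
        v = [rng.random() for _ in range(n_snps)]
        for _ in range(iters):
            w = [dot(row, v) for row in scatter]
            for a in axes:  # keep the new axis orthogonal to earlier ones
                overlap = dot(w, a)
                w = [x - overlap * y for x, y in zip(w, a)]
            norm = dot(w, w) ** 0.5
            v = [x / norm for x in w]
        axes.append(v)
    return means, axes


def project(sample, means, axes):
    """Coordinates of one genotype vector on the fitted axes."""
    shifted = [x - m for x, m in zip(sample, means)]
    return [dot(shifted, a) for a in axes]


# Invented training genotypes (4 SNPs) for three hypothetical populations.
training = {
    "pop_A": [[2, 2, 0, 0], [2, 1, 0, 0]],
    "pop_B": [[0, 0, 2, 2], [0, 0, 2, 1]],
    "pop_C": [[2, 0, 2, 0], [1, 0, 2, 0]],
}
rows = [g for pop in training.values() for g in pop]
means, axes = principal_axes(rows)

parent_a, parent_b = [2, 2, 0, 0], [0, 0, 2, 2]
admixed = [1, 1, 1, 1]  # heterozygous at every SNP
coords_a = project(parent_a, means, axes)
coords_b = project(parent_b, means, axes)
coords_mix = project(admixed, means, axes)
print(coords_a, coords_b, coords_mix)
```

In a real plot, the region midway between two continental clusters may already be occupied by a different reference population, which is one way two quite different ancestries can appear similar when only two components are shown.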

The Precision ID software uses the same method implemented in Dr. Kidd’s website, FROG-kb (http://frog.med.yale.edu). This method provides numerical likelihoods of the AISNP profile being found in each reference population present in Dr. Kidd’s database, ALFRED. In the partial output shown below, a Hispanic individual most closely groups with populations found in Europe, Asia and the Americas.
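The general idea of a per-population likelihood can be sketched as follows. This is an illustration in the spirit of the approach described above, not the FROG-kb or Precision ID code: it assumes Hardy-Weinberg genotype frequencies and independent SNPs, and the population names, SNP names, and allele frequencies are all hypothetical.

```python
import math


def genotype_frequency(n_ref, p_ref):
    """Hardy-Weinberg genotype frequency; n_ref = count of reference alleles."""
    q = 1.0 - p_ref
    return (q * q, 2.0 * p_ref * q, p_ref * p_ref)[n_ref]


def log10_likelihoods(profile, population_freqs):
    """log10 likelihood of the profile in each reference population."""
    return {
        pop: sum(math.log10(genotype_frequency(n_ref, freqs[snp]))
                 for snp, n_ref in profile.items())
        for pop, freqs in population_freqs.items()
    }


# Hypothetical reference-allele frequencies for three reference populations.
population_freqs = {
    "pop_X": {"snp1": 0.9, "snp2": 0.8, "snp3": 0.1},
    "pop_Y": {"snp1": 0.1, "snp2": 0.2, "snp3": 0.9},
    "pop_Z": {"snp1": 0.5, "snp2": 0.5, "snp3": 0.5},
}
profile = {"snp1": 2, "snp2": 2, "snp3": 0}

loglik = log10_likelihoods(profile, population_freqs)
ranked = sorted(loglik, key=loglik.get, reverse=True)
for pop in ranked:
    # Difference from the best population is the log10 likelihood ratio.
    print(pop, round(loglik[pop], 3), round(loglik[pop] - loglik[ranked[0]], 3))
```

Working in log10 avoids numerical underflow when many SNPs are multiplied together. An allele frequency of exactly 0 or 1 in a reference population would need special handling, which this sketch omits.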

ForenSeq data could be interpreted with both the 1000 Genomes method in the Verogen software and with the method behind the Precision ID software, by entering the genotypes into the Kidd 55 AISNP page on FROG-kb (NOTE: there are strand reporting differences between Verogen and Dr. Kidd’s database, which are particularly challenging for CG or AT SNPs). Such analyses would provide a point of comparison, and a laboratory could implement whichever interpretation method is more accurate or useful for the populations of interest. Ideally, the laboratory would test many individuals representative of the primary populations of interest and understand how those individuals would be predicted, in order to establish reporting guidelines.
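The strand issue noted above can be sketched in a few lines. The helper names below are hypothetical and do not correspond to any vendor or FROG-kb function; the sketch only shows why C/G and A/T SNPs are a special case: their alleles are each other's complement, so the reported strand cannot be inferred from the alleles alone.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """True for A/T and C/G SNPs, whose two alleles are complementary."""
    return COMPLEMENT[allele1] == allele2


def flip_strand(genotype):
    """Report each allele of a genotype string on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in genotype)


def harmonize(genotype, target_alleles):
    """Return the genotype on the strand used by the target database.

    target_alleles is the pair of alleles the target database lists for the SNP.
    """
    if is_strand_ambiguous(*target_alleles):
        raise ValueError("A/T or C/G SNP: strand must be confirmed from documentation")
    if set(genotype) <= set(target_alleles):
        return genotype
    flipped = flip_strand(genotype)
    if set(flipped) <= set(target_alleles):
        return flipped
    raise ValueError("alleles do not match the target on either strand")


print(harmonize("AG", ("T", "C")))  # reported on the opposite strand
print(harmonize("CC", ("C", "T")))  # already on the target strand
print(is_strand_ambiguous("C", "G"))
```

For the ambiguous SNPs, the strand would need to be established from each system's documentation or from a sample with a known genotype before entering data.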
A customized interpretation model could also be developed using an application such as Snipper (as initially described in Phillips 2007, http://mathgene.usc.es/snipper) and published allele frequency data from any population of interest. However, the panel of SNPs would also need to be capable of differentiating the populations of interest; therefore, a customized interpretation approach may also necessitate a customized SNP panel.
Progress toward an efficient panel of SNPs for ancestry inference
Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America
Inferring ancestral origin using a single multiplex assay of ancestry-informative marker SNPs
PISNPs in forensic applications typically refer to markers predictive of pigmentation levels in eyes, hair, and skin.
PISNP Panel:
The two commercially available assays containing PISNPs (ForenSeq and Precision ID) both include the SNPs found in the HIrisPlex system (Walsh 2013).
PISNP Interpretation:
Precision ID directs users to the HIrisPlex website for interpretation of results (https://hirisplex.erasmusmc.nl/), whereas the ForenSeq UAS interface incorporates the HIrisPlex data and model. ForenSeq users can also enter data into the HIrisPlex website for comparison/validation. Because the UAS model is fixed and the HIrisPlex website is updated periodically with additional data, minor differences in prediction may be noted.
The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA