
SNPs used in forensic analyses fall into four main categories: Identity Informative (IISNP), Ancestry Informative (AISNP), Phenotype Informative (PISNP), and Lineage Informative (LISNP). LISNPs found in the Y chromosome and mitochondrial genome may be useful in kinship analyses but are not widely implemented in commercial forensic assays. The remaining three SNP categories are described in detail below.

Tags: SNP, Identity, Ancestry, Phenotype, Pigmentation, Lineage


Overview
A Single Nucleotide Polymorphism (SNP) is a single-base change in a DNA sequence. SNPs are typically biallelic (e.g. some people have a “C” allele where other people have a “T” allele), and they have lower mutation rates than STR alleles. Some SNPs are pathogenic, particularly in coding regions of the genome when they result in the incorporation of a different amino acid during translation. Deleterious changes are often subject to negative selective pressure, meaning they are less likely to be carried forward into future generations. Other SNPs may offer advantages, e.g. disease resistance, in which case they may be subject to positive selective pressure, increasing their likelihood of becoming fixed in the population. The majority of SNPs in the human genome are in non-coding regions and appear to have minimal or no effect. These SNPs are said to be under neutral selection and are fixed in or lost from a population by chance.

Methods

The three parts of a forensic SNP typing methodology are:

  1. Panel of SNPs
  2. Genotyping platform
  3. Interpretation model

It is important for users to understand the research behind the panels, both for selecting a platform and for validation/training purposes. The genotyping platform is independent of the SNP application (Identity, Ancestry, Phenotype). Any commercial sequencing platform (e.g. MiSeq or S5) should be capable of providing correct genotypes for the SNP loci. At NIST, we have observed rare discordances when running the same SRM components across multiple platforms (see SNP Information Values for NIST SRM 2391c). This information may provide insight into the design of validation experiments and the utility of SRM samples in such validations.
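As a rough illustration of how a cross-platform concordance check might be set up during validation, the following sketch compares genotype calls for one sample typed on two platforms. The function names, locus identifiers, and genotypes are hypothetical placeholders, not SRM 2391c values, and the sketch assumes both platforms report alleles on the same strand.

```python
def normalize(genotype: str) -> str:
    """Sort the two allele letters so 'TC' and 'CT' compare as equal."""
    return "".join(sorted(genotype.upper()))


def find_discordances(calls_a: dict, calls_b: dict) -> dict:
    """Return {locus: (call_a, call_b)} for loci typed on both platforms
    whose genotypes differ after allele-order normalization."""
    shared = calls_a.keys() & calls_b.keys()
    return {
        locus: (calls_a[locus], calls_b[locus])
        for locus in sorted(shared)
        if normalize(calls_a[locus]) != normalize(calls_b[locus])
    }


# Hypothetical calls for one sample on two platforms (illustration only).
platform_1 = {"rs0000001": "CT", "rs0000002": "AA", "rs0000003": "GG"}
platform_2 = {"rs0000001": "TC", "rs0000002": "AG", "rs0000003": "GG"}

discordant = find_discordances(platform_1, platform_2)
print(discordant)
```

Only loci typed on both platforms are compared, so panels with different SNP content can still be checked on their shared markers.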

The SNP panels and interpretation models are specific to the application (Identity, Ancestry, Phenotype), as summarized in the following table and detailed below, by application.

Application | Product Name | Number of SNPs | SNP Panel | Genotyping Platform | Analysis Software
Identity | Precision ID Identity Panel | 83 | SNPforID and Kidd | Ion Torrent S5/S5XL | Converge
Identity | Qiagen | 140 | SNPforID and Kidd | MiSeq | GeneGlobe
Identity | ForenSeq | 94 | SNPforID and Kidd | MiSeq FGx | ForenSeq UAS
Ancestry | Precision ID Ancestry Panel | 165 | Seldin and Kidd | Ion Torrent S5/S5XL | Converge
Ancestry | ForenSeq | 56 | Kidd | MiSeq FGx | ForenSeq UAS
Phenotype | Precision ID Ampliseq DNA Phenotyping Panel | 24 | HIrisPlex | Ion Torrent S5/S5XL | Converge
Phenotype | ForenSeq | 24 | HIrisPlex | MiSeq FGx | ForenSeq UAS

Identity Informative SNPs

Ideal IISNPs have high heterozygosity across worldwide populations, which means they likely originated in early humans or higher primates, and they have a negligible effect on the fitness of an individual. IISNPs are used in a capacity similar to autosomal forensic STRs, for comparisons of reference samples to unknown samples. IISNPs may be of particular use in samples with degraded DNA, as the single base target means the amplicon can be designed much smaller than an STR target.
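To make "high heterozygosity" concrete: for a biallelic SNP, and assuming Hardy-Weinberg proportions, the expected heterozygosity is 2pq, which peaks at 0.5 when both alleles have a frequency of 0.5. The sketch below uses an illustrative function name and arbitrary example frequencies.

```python
def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2pq for a biallelic SNP (Hardy-Weinberg assumed)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * p * (1.0 - p)


# Arbitrary example frequencies: the closer p is to 0.5, the more informative
# the SNP is for distinguishing individuals.
for p in (0.5, 0.3, 0.05):
    print(f"p = {p:.2f}  ->  H = {expected_heterozygosity(p):.3f}")
```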

IISNP Panels:
The three commercially available assays containing IISNPs (ForenSeq, Precision ID, and QIAGEN) all contain combinations of SNPs found in publications from the SNPforID Consortium (Sanchez 2006) and the Kidd laboratory (Pakstis 2010).
IISNP Interpretation models and options:
Identity SNP markers for single source profiles can be interpreted similarly to STR loci, using random match probabilities. The following graph shows random match probabilities obtained from degraded samples sequenced for SNPs and typed in CE-based assays (adapted from Gettings 2015).

[Figure: Degraded DNA Study, Random Match Probability graph]
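For a single-source profile, a random match probability can be sketched as the product of per-locus genotype frequencies. The example below is a minimal illustration that assumes Hardy-Weinberg proportions and independent loci, with no correction for population substructure; the locus names, allele frequencies, and profile are hypothetical.

```python
from math import prod


def genotype_frequency(genotype: str, allele_freqs: dict) -> float:
    """Hardy-Weinberg genotype frequency: p^2 for homozygotes, 2pq for heterozygotes."""
    a, b = genotype
    p, q = allele_freqs[a], allele_freqs[b]
    return p * p if a == b else 2.0 * p * q


def random_match_probability(profile: dict, frequencies: dict) -> float:
    """Multiply per-locus genotype frequencies (assumes the loci are independent)."""
    return prod(
        genotype_frequency(genotype, frequencies[locus])
        for locus, genotype in profile.items()
    )


# Hypothetical allele frequencies and profile, for illustration only.
frequencies = {
    "snp1": {"C": 0.5, "T": 0.5},
    "snp2": {"A": 0.4, "G": 0.6},
    "snp3": {"G": 0.7, "T": 0.3},
}
profile = {"snp1": "CT", "snp2": "GG", "snp3": "GT"}

rmp = random_match_probability(profile, frequencies)
print(f"RMP = {rmp:.4f}")
```

Because each biallelic SNP contributes a relatively large genotype frequency, many loci are needed to reach the discrimination of a typical STR profile, and a degraded sample that drops loci loses discrimination accordingly.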

NIST and other laboratories are currently assessing the implications of combining panels of SNPs and STR markers into statistical analyses, particularly in cases where markers are located close together on a chromosome. It is expected that researchers and vendors will develop methods for mixture interpretation that include both SNP and STR data; such methods are not yet widely available.


Ancestry Informative SNPs

AISNPs are often monomorphic for one allele (e.g. the “C” allele) in one population and monomorphic for the other allele (e.g. the “T” allele) in all other populations. This is the opposite of the characteristic sought in IISNPs, which are highly heterozygous across populations. Some AISNPs appear to have arisen by chance through population bottlenecks, while others conferred a distinct advantage, such as rs2814778, which became fixed in sub-Saharan Africa because this variant reduces an individual’s susceptibility to malaria.
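One simple way to express this contrast numerically is the absolute difference in allele frequency between two populations (often called delta). The sketch below is illustrative only: the function name and the frequencies are hypothetical, and delta is just one of several possible informativeness measures.

```python
def allele_frequency_delta(p_pop1: float, p_pop2: float) -> float:
    """Absolute difference in the frequency of one allele between two populations."""
    return abs(p_pop1 - p_pop2)


# Hypothetical frequencies of the same allele in two populations.
identity_like = allele_frequency_delta(0.50, 0.50)  # similar everywhere: poor ancestry marker
ancestry_like = allele_frequency_delta(0.99, 0.01)  # nearly fixed for opposite alleles

print(f"identity-like SNP delta = {identity_like:.2f}")
print(f"ancestry-like SNP delta = {ancestry_like:.2f}")
```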

AISNP panels:
The ForenSeq system uses Ken Kidd’s panel of 55 AISNPs (Kidd 2015), and the Precision ID-S5 system uses both Kidd’s 55 and a panel of 128 SNPs from the Seldin Laboratory (Kosoy 2009). The “Kidd 55” panel was developed specifically to identify the geographic/ethnic origin of an unknown sample as a forensic investigative tool, whereas the “Seldin 128” panel was meant to provide a method for determining and quantifying differences in continental populations using SNPs commonly found in arrays circa 2008. The following figures are adapted from each publication and demonstrate the goals of each panel: Seldin 128 to provide continental-level ancestry and Kidd 55 to provide a more refined prediction.

[Figures: Seldin 128 and Kidd 55 panel results, adapted from Kosoy 2009 and Kidd 2015]

ForenSeq and Precision ID also have eye color and hair color SNPs, known as Phenotype Informative SNPs, discussed below.
AISNP Interpretation models and options:
The ForenSeq Universal Analysis Software uses Principal Component Analysis (PCA) to plot the questioned sample alongside 1000 Genomes training data. The plot is limited to the first two components of the PCA. As shown below, this may make it challenging to distinguish, e.g., a Hispanic individual (left image) from a half European/half East Asian individual (right image).

[Figures: PCA plots for a Hispanic individual (left) and a half European/half East Asian individual (right)]
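The general idea of placing a questioned sample on the first two principal components of a training set can be sketched as follows. This is not the ForenSeq UAS implementation, whose internals are not described here; it is a minimal pure-Python PCA (power iteration with deflation) on hypothetical genotypes coded as allele counts. All function names and data are assumptions made for illustration.

```python
from math import sqrt


def center(rows, means):
    """Subtract the reference column means from every genotype vector."""
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (SNP x SNP) of centered genotype vectors."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols] for ci in cols]


def leading_component(matrix, iterations=200):
    """Power iteration: (eigenvalue, unit eigenvector) for the largest eigenvalue."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    value = sum(v * sum(m * w for m, w in zip(row, vec)) for v, row in zip(vec, matrix))
    return value, vec


def first_two_components(matrix):
    """Top two principal axes, found by power iteration plus deflation."""
    val1, pc1 = leading_component(matrix)
    deflated = [[m - val1 * a * b for m, b in zip(row, pc1)] for row, a in zip(matrix, pc1)]
    _, pc2 = leading_component(deflated)
    return pc1, pc2


def project(sample, means, axes):
    """Coordinates of one genotype vector on the given principal axes."""
    centered = [x - m for x, m in zip(sample, means)]
    return [sum(c * a for c, a in zip(centered, axis)) for axis in axes]


# Hypothetical training genotypes coded as copies (0, 1, 2) of one allele.
# SNP 3 duplicates SNP 1 on purpose so the main axis is easy to check by hand.
reference = [[0, 0, 0], [0, 2, 0], [2, 0, 2], [2, 2, 2]]
means = [sum(col) / len(reference) for col in zip(*reference)]
axes = first_two_components(covariance(center(reference, means)))

questioned = [2, 0, 2]
pc1_score, pc2_score = project(questioned, means, axes)
print(f"PC1 = {pc1_score:.3f}, PC2 = {pc2_score:.3f}")
```

Keeping only two coordinates discards whatever variation lies on the remaining axes, which is one reason samples of different origin can land in the same region of a two-dimensional plot.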

The Precision ID software uses the same method implemented in Dr. Kidd’s website, FROG-kb (http://frog.med.yale.edu). This method provides numerical likelihoods of the AISNP profile being found in each reference population present in Dr. Kidd’s database, ALFRED. In the partial output shown below, a Hispanic individual most closely groups with populations found in Europe, Asia and the Americas.

[Figure: partial FROG-kb output for a Hispanic individual]
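A likelihood-based comparison of this kind can be sketched by computing, for each reference population, the probability of the observed profile from that population's allele frequencies and then ranking the populations. The sketch below assumes Hardy-Weinberg proportions and independent SNPs, which may differ from the exact calculation used by FROG-kb; the population names and frequencies are hypothetical and are not ALFRED data.

```python
from math import log


def genotype_probability(count: int, p: float) -> float:
    """Probability of carrying `count` copies (0, 1, or 2) of an allele with frequency p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)[count]


def profile_log_likelihood(profile: list, freqs: list) -> float:
    """Sum of log genotype probabilities across SNPs (independence assumed)."""
    return sum(log(genotype_probability(c, p)) for c, p in zip(profile, freqs))


# Hypothetical allele frequencies at three SNPs for three made-up populations.
# Frequencies of exactly 0 or 1 are avoided so that every log is defined.
populations = {
    "Pop A": [0.9, 0.8, 0.1],
    "Pop B": [0.1, 0.2, 0.9],
    "Pop C": [0.5, 0.5, 0.5],
}
profile = [2, 2, 0]  # copies of the tracked allele at each SNP

ranked = sorted(
    populations,
    key=lambda name: profile_log_likelihood(profile, populations[name]),
    reverse=True,
)
for name in ranked:
    print(name, round(profile_log_likelihood(profile, populations[name]), 3))
```

Working with log likelihoods avoids numerical underflow when many SNPs are multiplied together, and differences between log likelihoods correspond to likelihood ratios between populations.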

ForenSeq data could be interpreted both with the 1000 Genomes method in the Verogen software and with the method behind the Precision ID software, by entering the genotypes into the Kidd 55 AISNP page on FROG-kb (NOTE: there are strand reporting differences between Verogen and Dr. Kidd’s database, which are particularly challenging for C/G or A/T SNPs). Such analyses would provide a point of comparison, and a laboratory could implement whichever interpretation method is more accurate/useful for the populations of interest. Ideally, the laboratory would test many individuals representative of the primary populations of interest and learn how those individuals are predicted, in order to establish reporting guidelines.
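The strand issue noted above can be illustrated with a short sketch. Converting a genotype to the opposite strand means complementing each allele; for C/G and A/T SNPs the two alleles are complements of each other, so a flipped genotype still looks like a valid call and the mismatch cannot be detected from the alleles alone. The function names below are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(genotype: str) -> str:
    """Report a genotype on the opposite strand by complementing each allele."""
    return "".join(COMPLEMENT[allele] for allele in genotype)


def is_strand_ambiguous(allele_1: str, allele_2: str) -> bool:
    """True for C/G and A/T SNPs, whose two alleles are each other's complement."""
    return COMPLEMENT[allele_1] == allele_2


# An A/G SNP reported on the other strand shows alleles T/C, so a strand
# difference is obvious. A C/G SNP shows alleles G/C, which is easily missed.
print(flip_strand("AG"), is_strand_ambiguous("A", "G"))
print(flip_strand("CC"), is_strand_ambiguous("C", "G"))
```

For ambiguous SNPs, the reporting strand has to be confirmed from each source's documentation or from flanking sequence rather than inferred from the genotype itself.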

A customized interpretation model could also be developed using an application such as Snipper (as initially described in Phillips 2007, http://mathgene.usc.es/snipper) and published allele frequency data from any population of interest. However, the panel of SNPs would also need to be capable of differentiating the populations of interest; therefore, a customized interpretation approach may also necessitate a customized SNP panel.


Phenotype Informative SNPs

PISNPs in forensic applications typically refer to markers predictive of pigmentation levels in eyes, hair, and skin.

PISNP Panel:
The two commercially available assays containing PISNPs (ForenSeq and Precision ID) both include the SNPs found in the HIrisPlex system (Walsh 2013).

PISNP Interpretation:
Precision ID directs users to the HIrisPlex website for interpretation of results (https://hirisplex.erasmusmc.nl/), whereas the ForenSeq UAS interface incorporates the HIrisPlex data and model. ForenSeq users can also enter data into the HIrisPlex website for comparison/validation. Because the UAS model is fixed and the HIrisPlex website is updated periodically with additional data, minor differences in prediction may be noted.

