Lecture 26: Beyond CSI II: Genetic Markers for Ancestry, Hair and Eye color

(version 21 April 2008)

This material is copyrighted and MAY NOT be used for commercial purposes


Causes of population divergence and differentiation

Suppose we have a series of populations that come from a common ancestor, but are subsequently isolated. Over time, these populations start to diverge from each other. Why?

  • Founder effects: If a few individuals migrate out to form a new population, random sampling can result in this new population differing from the main population.

    Modern humans arose from Africa, with a subset of the African founder population migrating out to Europe, Asia, and (eventually) the Americas.

    As a result, African populations show much more genetic diversity than do European and Asian populations, which were derived from small samples of the African population.

  • Genetic Drift and Mutation: over time, allele frequencies can randomly change, with the rate of change faster in smaller populations. Likewise, new mutations that appear following isolation are restricted to the population in which they arose, unless there are high levels of migration.

  • Selection: Certain alleles can be favored by selection in different environments. For example, in malarial regions, sickle-cell alleles are rather common. In heterozygotes, these alleles provide some protection against malarial infections, especially in children. However, the homozygous condition is lethal, with sickle-cell homozygotes typically dying in early adulthood.
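The claim that drift acts faster in smaller populations can be checked with a small simulation. The sketch below assumes a Wright-Fisher model (random mating, non-overlapping generations, no selection, mutation, or migration), which these notes do not specify; the function name, population sizes, and number of generations are illustrative choices only.

```python
import random


def drift_variance(pop_size, generations, start_freq=0.5, replicates=200, seed=1):
    """Variance of the final allele frequency across replicate populations.

    Each replicate is an isolated diploid population of `pop_size`
    individuals that starts at `start_freq` and drifts for `generations`.
    """
    rng = random.Random(seed)
    n_copies = 2 * pop_size  # diploid: two allele copies per individual
    finals = []
    for _ in range(replicates):
        p = start_freq
        for _ in range(generations):
            # Each copy in the next generation is drawn at random
            # from the current gene pool (binomial sampling).
            count = sum(1 for _ in range(n_copies) if rng.random() < p)
            p = count / n_copies
        finals.append(p)
    mean = sum(finals) / len(finals)
    return sum((x - mean) ** 2 for x in finals) / len(finals)


small = drift_variance(pop_size=10, generations=20)
large = drift_variance(pop_size=200, generations=20)
print("variance among small populations:", round(small, 4))
print("variance among large populations:", round(large, 4))
```

Replicate populations of 10 individuals end up far more spread out in allele frequency than populations of 200. A founder event behaves like a single generation of this sampling with a very small `pop_size`.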

    Shared variation and divergence in humans

    While we often focus on differences between humans whose ancestors are from different parts of the world, in reality only around 3-5 percent of genetic variation is specific to certain populations.

    For example, although there are small differences in allele frequencies across groups in the CODIS markers, one cannot predict ancestral origins from a CODIS profile.

    However, there are genetic markers whose allele frequencies differ dramatically between populations. These are called AIMs, for Ancestry Informative Markers.


    An example of an AIM is the Duffy gene (a blood group marker), which has a null (negative) allele with a frequency close to 100 percent in individuals of Sub-Saharan ancestry, but which is very rare in individuals whose ancestors lived outside of this region. Thus, if we find a Duffy null homozygote in a crime sample, the person contributing the sample is likely of Sub-Saharan ancestry.
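The strength of the Duffy inference can be put into rough numbers. The sketch below assumes Hardy-Weinberg proportions, so that the frequency of null homozygotes is the square of the null allele frequency. The allele frequencies (0.95 and 0.01) and the prior are hypothetical stand-ins for "close to 100 percent" and "very rare", not measured values.

```python
def homozygote_freq(allele_freq):
    """Expected frequency of homozygotes under Hardy-Weinberg proportions."""
    return allele_freq ** 2


def likelihood_ratio(freq_pop_a, freq_pop_b):
    """How much more likely a null homozygote is in population A than in B."""
    return homozygote_freq(freq_pop_a) / homozygote_freq(freq_pop_b)


def posterior_prob(lr, prior):
    """Posterior probability of population A, given a prior and the likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical null-allele frequencies, chosen only for illustration.
Q_SUB_SAHARAN = 0.95
Q_ELSEWHERE = 0.01

lr = likelihood_ratio(Q_SUB_SAHARAN, Q_ELSEWHERE)
# Hypothetical prior: 10 percent of possible contributors have Sub-Saharan ancestry.
post = posterior_prob(lr, prior=0.10)
print("likelihood ratio:", round(lr))
print("posterior probability of Sub-Saharan ancestry:", round(post, 4))
```

Even with a modest prior, a single marker with such an extreme frequency difference makes the inference very strong. This is what separates an AIM from a typical marker.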

    As we have seen, Y chromosome haplotypes (collections of linked markers) and mtDNA haplotypes are also AIMs, providing information on the ancestral origin of the male (Y) and female (mtDNA) lines.

    For example, an A Y chromosome haplotype implies a Sub-Saharan origin, a Q haplotype a Native American origin, and an R haplotype a European or North African origin.

    The Search for AIMs

    As mentioned, most genetic markers do not show very large differences between populations, while a small fraction do.

    There has been a very active search for such AIMs, following a paper from Marc Feldman's group at Stanford showing that around 300 markers, many with only modest differences between populations, can be sufficient to classify individuals as to their ancestral origins (e.g., 30% African, 60% European, 5% Native American, 5% Asian). While these numbers sound precise, in reality there is a high degree of uncertainty in many of these estimates.

    The company DNAPrint offers tests for both private individuals and law enforcement to ascertain ancestry.
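To see why many markers with modest differences can still be informative, consider a deliberately simplified version of the problem: assigning an individual to one of two populations, rather than estimating admixture proportions among four. The sketch below is not the method used in the Stanford paper. It assumes independent markers, Hardy-Weinberg proportions, and hypothetical allele frequencies of 0.45 and 0.55 at every marker.

```python
import math
import random


def log_likelihood(genotype, freqs):
    """Log-probability of per-marker allele counts (0, 1 or 2) in a population.

    Assumes independent markers in Hardy-Weinberg proportions.
    """
    total = 0.0
    for count, p in zip(genotype, freqs):
        total += (math.log(math.comb(2, count))
                  + count * math.log(p)
                  + (2 - count) * math.log(1.0 - p))
    return total


def simulate_genotype(freqs, rng):
    """Draw two allele copies per marker from the given allele frequencies."""
    return [sum(1 for _ in range(2) if rng.random() < p) for p in freqs]


def assignment_accuracy(n_markers, n_individuals=200, seed=7):
    """Fraction of simulated population-A individuals assigned back to A."""
    rng = random.Random(seed)
    freqs_a = [0.45] * n_markers  # hypothetical frequencies in population A
    freqs_b = [0.55] * n_markers  # hypothetical frequencies in population B
    correct = 0
    for _ in range(n_individuals):
        genotype = simulate_genotype(freqs_a, rng)
        if log_likelihood(genotype, freqs_a) > log_likelihood(genotype, freqs_b):
            correct += 1
    return correct / n_individuals


acc_10 = assignment_accuracy(n_markers=10)
acc_300 = assignment_accuracy(n_markers=300)
print("accuracy with 10 markers: ", acc_10)
print("accuracy with 300 markers:", acc_300)
```

Each marker on its own is nearly useless, but the log-likelihoods add across markers, so a few hundred of them separate the two populations well. Estimating admixture fractions for a single individual is a harder problem, which is one reason the reported percentages carry substantial uncertainty.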

    Wired article: The Inconvenient Science of Racial DNA Profiling

    Their forensic product, DNAWitness, offers to predict the ancestry of the individual who contributed a crime sample. It has been used in a couple of high-profile cases:

  • Derrick Todd Lee

  • Diego Olmos-Alcalde

    The on-going hunt for the Minstead Rapist

    A very interesting, on-going case is the use of AIMs in the hunt for the Minstead Rapist (see the Wiki article).

    Since 1992, a burglar/rapist has been operating in South East London, with over 90 attacks. He is called the Minstead Rapist and the hunt for him is the largest and most complex rape investigation ever undertaken by the London police. He is also called the Night Stalker by the press as the attacks mainly take place at night.

    He targets older women, and has broken into the homes of over 90 elderly women aged between 68 and 93. He is positively linked to four reported rapes and around 30 other sexual assaults.

    The Minstead Rapist is thought to be forensically aware, since he has never left a fingerprint at any scene. However, he does not use condoms, and his DNA has been recovered from attacks going back to 1992. Recall that, since 1995, the British national DNA database has collected DNA from any suspect arrested. His profile has not been found in it. If the rapist has ever been arrested for burglary or a related offence, it must have been some time before 1995, when police began to routinely gather DNA samples from prisoners.

    DNAWitness results pointed towards a north Caribbean ethnic origin for the rapist (the tests revealed that the DNA contained alleles characteristic of Native American, European and sub-Saharan African ancestry, a combination found only in the Caribbean), and the police have identified around 21,000 possible suspects who fit such a profile.

    In March 2004, Operation Minstead detectives hand-delivered a letter to hundreds of black men in South London, asking them to voluntarily provide a DNA sample for elimination purposes. Police explained that they desperately needed to reduce the vast number of suspects in the operation and that this was the best way to do so. Volunteers were assured that their DNA sample would be destroyed as soon as it was confirmed not to match the rapist's DNA. The majority of those potential suspects were eager to help if it would assist police in catching the rapist. However, 125 men initially refused to provide a sample, believing the request was discriminatory and breached their human rights. Police brought pressure to bear on those who refused, explaining that their behaviour could be construed as suspicious.

    Although they were able to reduce the list of potential suspects from 21,000 to 1,000, police have now resigned themselves to only being able to obtain the DNA of certain suspects still on the list if and when they are arrested for an unrelated offence.

    The Minstead Rapist has struck as recently as 15 November 2007.

    Molecular Photofitting: Predicting Appearance from a DNA Sample

    The term Molecular Photofitting was coined by Tony Frudakis (chief scientist of DNAPrint), and refers to providing some description of what a contributor to a DNA sample looks like.

    Some features are very complex, and unlikely to be predicted from a DNA sample (such as general features of the face).

    However, some features, such as hair and eye color, are determined by just a few genes and hence may be easier to predict. Indeed, Murray Brilliant's lab here at Arizona (Department of Pediatrics) has been able to use just 4-6 markers to explain around 50-70 percent of the variance in eye and hair color.

    Human height should, in principle, be even easier to predict, as around 90 percent of the variance in this trait is additive genetic variance.
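The "variance explained" figures above can be translated into how much uncertainty remains after a prediction. The sketch below uses the standard relationship that a predictor explaining a fraction R² of the variance leaves a residual standard deviation of sqrt(1 - R²) times the original. The function name is illustrative, and the height figure is an upper bound that assumes every contributing variant were known and weighted correctly.

```python
import math


def residual_sd_fraction(variance_explained):
    """Fraction of the trait's standard deviation left after prediction."""
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must be between 0 and 1")
    return math.sqrt(1.0 - variance_explained)


# Variance-explained values quoted in the notes.
examples = [
    ("eye/hair color, low estimate", 0.50),
    ("eye/hair color, high estimate", 0.70),
    ("height, if all additive variance were captured", 0.90),
]
for label, r2 in examples:
    print(label, "->", round(residual_sd_fraction(r2), 2))
```

Even when 90 percent of the variance is explained, about a third of the original standard deviation remains, so a DNA-based estimate of height would still be a range rather than a single value.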