Lecture 19: DNA Mixtures

(version 24 March 2008)

This material is copyrighted and MAY NOT be used for commercial purposes


Suppose you observe the following data for three markers in a crime sample

Marker   Sample
1        11, 12, 14
2        14, 16
3        13, 15, 16, 17

Clearly, since there are more than two alleles at some markers, this is a mixture of two (or more) DNA samples.

Now suppose you know the victim's DNA,

Marker   Sample           Victim
1        11, 12, 14       11, 12
2        14, 16           14, 14
3        13, 15, 16, 17   13, 15

How do we analyze this data?

Again, two issues:

While both of these questions are straightforward for a single-contributor sample, there are a number of complications with a mixture!

When can we exclude someone as a contributor?

Suppose there are two suspects, Bill and Ted. Their genotypes are

Marker   Bill's genotype   Ted's genotype
1        14, 14            12, 14
2        14, 16            14, 14
3        12, 18            16, 17

Can either (or both) be excluded? Let's look at the data again.

Marker   Sample           Victim
1        11, 12, 14       11, 12
2        14, 16           14, 14
3        13, 15, 16, 17   13, 15

Note that Marker 3 excludes Bill, but Ted is consistent as a contributor to all three markers.
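The check just described (does the person carry any allele that is absent from the sample?) can be sketched in a few lines of Python. This is only an illustration using the genotypes from the tables above; the function name excluding_markers is made up for this example.

```python
# Alleles seen in the crime sample, keyed by marker number (from the table above).
sample = {1: {11, 12, 14}, 2: {14, 16}, 3: {13, 15, 16, 17}}

# Suspect genotypes, keyed by marker number (from the table above).
suspects = {
    "Bill": {1: (14, 14), 2: (14, 16), 3: (12, 18)},
    "Ted": {1: (12, 14), 2: (14, 14), 3: (16, 17)},
}

def excluding_markers(genotype, sample):
    """Return the markers where the person carries an allele not in the sample."""
    return [m for m in sorted(sample) if not set(genotype[m]) <= sample[m]]

for name, genotype in suspects.items():
    print(name, excluding_markers(genotype, sample))
# Bill [3]
# Ted []
```

An empty list means that every allele the person carries shows up in the sample, so this check alone cannot exclude them.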

Now suppose Ted's genotype at marker 3 is 16,16. Can we exclude Ted?

YES: IF we assume only two contributors to the mixture (Ted does not have a 17, so the 17 in the sample would be left unexplained)

NO: IF we assume there may be more than two contributors.
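The marker 3 reasoning can be sketched as follows: list the sample alleles that neither the victim nor the suspect carries. If any remain, someone else must have contributed them. The helper name unexplained_alleles is hypothetical, and the example looks only at marker 3.

```python
def unexplained_alleles(sample_alleles, *genotypes):
    """Sample alleles that none of the given genotypes can account for."""
    carried = set()
    for genotype in genotypes:
        carried.update(genotype)
    return set(sample_alleles) - carried

marker3_sample = {13, 15, 16, 17}
victim3 = (13, 15)

# Ted's original genotype: victim plus Ted explain every allele.
print(unexplained_alleles(marker3_sample, victim3, (16, 17)))  # set()

# Ted as 16,16: the 17 is left over, so a third contributor would be required.
print(unexplained_alleles(marker3_sample, victim3, (16, 16)))  # {17}
```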

Two-Person Mixtures

If we assume only a SINGLE other person besides the victim contributed to the DNA, then their possible genotypes are

Marker   Sample           Victim   Contributor
1        11, 12, 14       11, 12   11,14 OR 12,14 OR 14,14
2        14, 16           14, 14   14,16 OR 16,16
3        13, 15, 16, 17   13, 15   16,17
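As a rough sketch, the contributor column can be generated mechanically: take every genotype that can be built from the sample alleles and keep those that supply all the alleles the victim lacks. The function name second_contributor_genotypes is an assumption made for this example.

```python
from itertools import combinations_with_replacement

def second_contributor_genotypes(sample_alleles, victim):
    """Genotypes for the single non-victim contributor in a two-person mixture."""
    sample_alleles = set(sample_alleles)
    # Alleles in the sample that the victim cannot account for.
    must_supply = sample_alleles - set(victim)
    return [
        genotype
        for genotype in combinations_with_replacement(sorted(sample_alleles), 2)
        if must_supply <= set(genotype)
    ]

print(second_contributor_genotypes({11, 12, 14}, (11, 12)))
# [(11, 14), (12, 14), (14, 14)]
print(second_contributor_genotypes({14, 16}, (14, 14)))
# [(14, 16), (16, 16)]
print(second_contributor_genotypes({13, 15, 16, 17}, (13, 15)))
# [(16, 17)]
```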

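Let's compute the probability of failure to exclude (assuming a two-person mixture). Suppose the allele frequencies at these three markers are (values as used in the calculations below)

Marker   Allele (frequency)
1        11 (0.05), 12 (0.1), 14 (0.2)
2        14 (0.1), 16 (0.3)
3        13 (0.1), 15 (0.01), 16 (0.3), 17 (0.2)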

The probability of a random person matching (for marker one) is

Pr(11,14) + Pr(12,14) + Pr(14,14) = 2*0.05*0.2 + 2*0.1*0.2 + 0.2*0.2 = 0.10

The probability of a match at marker 2 is

Pr(14,16) + Pr(16,16) = 2*0.1*0.3 + 0.3*0.3 = 0.15

The probability of a match at marker 3 is

Pr(16,17) = 2*0.3*0.2 = 0.12

Hence, the probability that a random person would fail to be excluded (assuming a two-person mixture) is the product of these three,

Pr(Failure to exclude) = 0.10*0.15*0.12 = 0.0018 or roughly 1/556
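The arithmetic above can be checked with a short script. This is a sketch under stated assumptions: the allele frequencies are the ones appearing in the calculations above, genotype probabilities use the same 2*p*q (heterozygote) and p*p (homozygote) forms as the text, and the function names are made up for this example.

```python
from itertools import combinations_with_replacement
from math import prod

# Allele frequencies read off the calculations in the text, keyed by marker.
# The keys of each inner dict are exactly the alleles seen in the sample.
freqs = {
    1: {11: 0.05, 12: 0.1, 14: 0.2},
    2: {14: 0.1, 16: 0.3},
    3: {13: 0.1, 15: 0.01, 16: 0.3, 17: 0.2},
}
victim = {1: (11, 12), 2: (14, 14), 3: (13, 15)}

def genotype_prob(a, b, f):
    """p*p for a homozygote, 2*p*q for a heterozygote."""
    return f[a] * f[b] if a == b else 2 * f[a] * f[b]

def two_person_match_prob(f, victim_genotype):
    """Chance a random person could be the one non-victim contributor."""
    must_supply = set(f) - set(victim_genotype)
    return sum(
        genotype_prob(a, b, f)
        for a, b in combinations_with_replacement(sorted(f), 2)
        if must_supply <= {a, b}
    )

per_marker = {m: two_person_match_prob(freqs[m], victim[m]) for m in freqs}
total = prod(per_marker.values())

for m, p in per_marker.items():
    print(f"marker {m}: {p:.2f}")
print(f"overall: {total:.4f}, roughly 1/{round(1 / total)}")
```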

As a point of reference, Freq(Victim) = [ 2*0.05*0.1 ] * [ 0.1*0.1 ] * [ 2*0.1*0.01 ] = 0.01 * 0.01 * 0.002 = 1/5,000,000

Key point: Because several genotypes can be consistent with a mixture (even a two-person mixture where one genotype is known), the probability of failure to exclude is much higher with a mixture than with a single-contributor sample. Put another way, the evidence against a suspect is weaker when it comes from a mixture.

Arbitrary Number of Contributors to a Mixture

IF we know the EXACT number of contributors to a mixture, we can extend the above approach to three, four, five (or more) contributors.

A major problem is how do we know the number of contributors?

We can set a lower limit on the number of contributors, as 3 alleles at a marker means AT LEAST two contributors, 5 alleles AT LEAST three, 7 alleles AT LEAST four, etc. The problem is that if there are (say) three contributors, there is some chance that none of the markers in the mixture would show more than four alleles, in which case we might take it to be a two-person mixture.

If, however, we wish to make no assumptions as to the number of contributors, then the possible genotypes look something like
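The lower bound is simple to state in code: each person carries at most two alleles per marker, so the marker with the most alleles sets the minimum. A minimal sketch (the function name is hypothetical):

```python
from math import ceil

def min_contributors(alleles_per_marker):
    """Lower bound on the number of contributors: each person has at most two alleles per marker."""
    return ceil(max(alleles_per_marker) / 2)

# The crime sample above shows 3, 2 and 4 alleles at markers 1, 2 and 3.
print(min_contributors([3, 2, 4]))  # 2
print(min_contributors([5]))        # 3
print(min_contributors([7]))        # 4
```

This is only a lower bound; as noted, the true number of contributors can be larger.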

Marker   Sample           Potential Contributors
1        11, 12, 14       11,11 OR 11,12 OR 11,14 OR 12,12 OR 12,14 OR 14,14
2        14, 16           14,14 OR 14,16 OR 16,16
3        13, 15, 16, 17   13,13 OR 13,15 OR 13,16 OR 13,17 OR 15,15 OR 15,16 OR 15,17 OR 16,16 OR 16,17 OR 17,17
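With no assumption about the number of contributors, the potential genotypes at a marker are simply all unordered pairs of sample alleles. A short sketch of how one might list them (the function name is made up for this example):

```python
from itertools import combinations_with_replacement

def potential_genotypes(sample_alleles):
    """Every unordered pair of alleles that can be drawn from the sample."""
    return list(combinations_with_replacement(sorted(sample_alleles), 2))

sample = {1: {11, 12, 14}, 2: {14, 16}, 3: {13, 15, 16, 17}}
for marker, alleles in sample.items():
    genotypes = potential_genotypes(alleles)
    print(marker, len(genotypes), genotypes)
```

With n alleles at a marker there are n*(n+1)/2 such genotypes, which gives 6, 3 and 10 for the three markers here.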

Notice that we DID NOT use any information on the genotype of the victim (who is typically a known contributor).

Notice the increase in complications that arise with an unknown number of contributors.

Computing the failure to exclude probability

While one could compute all of the possible genotype probabilities for each marker, there is a nice short-cut.

Consider marker 1. For a failure to exclude, the first allele must be either an 11, 12, or 14. This occurs with probability Freq(11) + Freq(12) + Freq(14). Likewise, the second allele must also be an 11, 12, or 14. Putting these together,

Prob(fail to exclude at marker 1) = [ Freq(11) + Freq(12) + Freq(14) ]^2 = (0.05 + 0.1 + 0.2)^2 = 0.1225

Likewise, for markers 2 and 3 we have

Prob(fail to exclude at marker 2) = (0.1 + 0.3)^2 = 0.16

Prob(fail to exclude at marker 3) = (0.1 + 0.01 + 0.3 + 0.2)^2 = 0.3721

Thus, the probability that a random individual is not excluded as a contributor is

0.1225*0.16*0.3721 = 0.0073, or roughly 1/137

This type of calculation is also called a "random man not excluded" calculation.
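As a check on the short-cut, the sketch below computes the failure-to-exclude probability both ways: by squaring the summed allele frequencies, and by adding up every possible genotype probability. The frequencies are the ones used in the calculations above, and the function names are assumptions made for this example.

```python
from itertools import combinations_with_replacement
from math import isclose, prod

# Frequencies of the alleles seen in the sample, keyed by marker.
freqs = {
    1: {11: 0.05, 12: 0.1, 14: 0.2},
    2: {14: 0.1, 16: 0.3},
    3: {13: 0.1, 15: 0.01, 16: 0.3, 17: 0.2},
}

def not_excluded_shortcut(f):
    """Both alleles must come from the set seen in the sample."""
    return sum(f.values()) ** 2

def not_excluded_enumerated(f):
    """Sum p*p over homozygotes and 2*p*q over heterozygotes."""
    return sum(
        f[a] * f[b] if a == b else 2 * f[a] * f[b]
        for a, b in combinations_with_replacement(sorted(f), 2)
    )

for marker, f in freqs.items():
    assert isclose(not_excluded_shortcut(f), not_excluded_enumerated(f))
    print(f"marker {marker}: {not_excluded_shortcut(f):.4f}")

total = prod(not_excluded_shortcut(f) for f in freqs.values())
print(f"overall: {total:.4f}, roughly 1/{round(1 / total)}")
```

The two methods agree because (p + q + r)^2 expands to exactly the sum of the homozygote and heterozygote terms.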

A Final Complication: Allelic Dropout

Very often, one can use the peak height in an electropherogram to estimate the number of contributors.

However, if an individual is a very minor contributor to a mixture, then their alleles will have very small peak heights in the electropherogram.

This can lead to allelic dropout, where the alleles from a very minor contributor may be too rare relative to the major contributor to give a significant peak height.

These loci (showing only victim's DNA) are then discarded. The result is that the 13 CODIS markers may only have (say) 5-6 that give a mixture signal, which results in a much larger random-match probability.

A more problematic case is where some minor alleles amplify and others do not. Suppose a suspect has a 14,15 genotype at marker one. If both alleles amplify, we exclude this individual (as there is no 15 in the sample).

However, if the 14 allele is just significant, the 15 allele may not amplify, giving the mixture we see.
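To see how dropout weakens exclusion, here is a deliberately simplified sketch. It is not a forensic model: it just counts how many of a person's alleles are missing from the sample and compares that with the number of dropped alleles we are willing to allow. The function name and the max_dropped parameter are assumptions made for this illustration.

```python
def excluded(genotype, sample_alleles, max_dropped=0):
    """Exclude only if more alleles are missing than dropout is allowed to explain."""
    missing = set(genotype) - set(sample_alleles)
    return len(missing) > max_dropped

marker1_sample = {11, 12, 14}
suspect = (14, 15)

# No dropout allowed: the missing 15 excludes the suspect.
print(excluded(suspect, marker1_sample))                 # True

# One dropped allele allowed: the suspect can no longer be excluded.
print(excluded(suspect, marker1_sample, max_dropped=1))  # False
```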