(version 24 March 2008)

* This material is copyrighted and MAY NOT be used for commercial purposes*

Suppose you observe the following data for three markers in a crime sample

| Marker | Sample |
|---|---|
| 1 | 11, 12, 14 |
| 2 | 14, 16 |
| 3 | 13, 15, 16, 17 |

Clearly, since there are **more than two alleles at some markers**, this is a **mixture** of two (or more) DNA samples.

Now suppose you know the victim's DNA:

| Marker | Sample | Victim |
|---|---|---|
| 1 | 11, 12, 14 | 11, 12 |
| 2 | 14, 16 | 14, 14 |
| 3 | 13, 15, 16, 17 | 13, 15 |

Again, two issues:

- Under what conditions can an individual be **excluded** as a contributor?
- Given a failure to exclude, how likely is this?

While both of these questions are straightforward for a single-contributor sample, there are a number of complications with a mixture!

Suppose there are two suspects, Bill and Ted. Their genotypes are

| Marker | Bill's genotype | Ted's genotype |
|---|---|---|
| 1 | 14, 14 | 12, 14 |
| 2 | 14, 16 | 14, 14 |
| 3 | 12, 18 | 16, 17 |

Can either (or both) be excluded? Let's look at the data again.

| Marker | Sample | Victim |
|---|---|---|
| 1 | 11, 12, 14 | 11, 12 |
| 2 | 14, 16 | 14, 14 |
| 3 | 13, 15, 16, 17 | 13, 15 |

Note that Marker 3 excludes Bill (neither of his alleles, 12 and 18, appears in the sample), but Ted is consistent as a contributor at all three markers.
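The exclusion check used here can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a forensic tool: the function name `excluding_markers` and the dictionary layout are assumptions made for this example. A suspect is excluded at a marker if they carry any allele that does not appear in the sample.

```python
# Alleles observed in the mixed crime sample at each marker (from the table above).
sample = {1: {11, 12, 14}, 2: {14, 16}, 3: {13, 15, 16, 17}}

# Suspect genotypes (from the table above).
suspects = {
    "Bill": {1: (14, 14), 2: (14, 16), 3: (12, 18)},
    "Ted": {1: (12, 14), 2: (14, 14), 3: (16, 17)},
}


def excluding_markers(genotype, sample):
    """Return the markers where the genotype has an allele absent from the sample."""
    return [
        marker
        for marker, alleles in sorted(genotype.items())
        if not set(alleles) <= sample[marker]
    ]


for name, genotype in suspects.items():
    bad = excluding_markers(genotype, sample)
    print(name, "is excluded at markers:", bad if bad else "none")
```

An empty list means the suspect cannot be excluded on the basis of these markers. It does not mean the suspect is a contributor.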

Now suppose Ted's genotype at marker 3 is 16,16. Can we exclude Ted?

**YES:** IF we assume only two contributors to the mixture (as Ted does not have a 17).

**NO:** IF we assume there may be more than two contributors.
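The difference between the two answers can be made concrete with a small Python sketch. The function names are hypothetical, chosen for this example. Under the two-person assumption, every sample allele not carried by the victim must be carried by the suspect. Without that assumption, the suspect only needs to carry no allele that is missing from the sample.

```python
def consistent_two_person(sample_alleles, victim, suspect):
    """Victim plus suspect must account for every allele in the sample."""
    unexplained = set(sample_alleles) - set(victim)
    return set(suspect) <= set(sample_alleles) and unexplained <= set(suspect)


def consistent_any_number(sample_alleles, suspect):
    """With extra unknown contributors, the suspect only needs alleles seen in the sample."""
    return set(suspect) <= set(sample_alleles)


marker3_sample = {13, 15, 16, 17}
victim3 = (13, 15)
ted3 = (16, 16)  # Ted's hypothetical genotype at marker 3

print("two-person mixture:", consistent_two_person(marker3_sample, victim3, ted3))
print("unknown number of contributors:", consistent_any_number(marker3_sample, ted3))
```

With Ted as 16,16 the 17 allele is left unexplained, so the two-person check fails while the unrestricted check passes.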

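Assuming a two-person mixture with the victim as one contributor, the genotypes possible for the second contributor are:

| Marker | Sample | Victim | Contributor |
|---|---|---|---|
| 1 | 11, 12, 14 | 11, 12 | 11, 14 OR 12, 14 OR 14, 14 |
| 2 | 14, 16 | 14, 14 | 14, 16 OR 16, 16 |
| 3 | 13, 15, 16, 17 | 13, 15 | 16, 17 |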
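The contributor column can be reproduced by brute force. The sketch below assumes exactly two contributors, one of whom is the victim; the function name `second_contributor_genotypes` is made up for this example. It lists every pair of sample alleles that covers all alleles the victim does not explain.

```python
from itertools import combinations_with_replacement

sample = {1: {11, 12, 14}, 2: {14, 16}, 3: {13, 15, 16, 17}}
victim = {1: (11, 12), 2: (14, 14), 3: (13, 15)}


def second_contributor_genotypes(sample_alleles, victim_genotype):
    """Genotypes built from sample alleles that cover every allele the victim lacks."""
    unexplained = set(sample_alleles) - set(victim_genotype)
    return [
        genotype
        for genotype in combinations_with_replacement(sorted(sample_alleles), 2)
        if unexplained <= set(genotype)
    ]


candidates = {m: second_contributor_genotypes(sample[m], victim[m]) for m in sample}
for marker, genotypes in candidates.items():
    print(marker, genotypes)
```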

Let's compute the probability of failure to exclude (assuming a two-person mixture). Suppose the allele frequencies at these three markers are

- Marker 1: freq(11) = 0.05, freq(12) = 0.1, freq(14) = 0.2
- Marker 2: freq(14) = 0.1, freq(16) = 0.3
- Marker 3: freq(13) = 0.1, freq(15) = 0.01, freq(16) = 0.3, freq(17) = 0.2

The probability of a random person matching (for marker one) is

**Pr(11,14) + Pr(12,14) + Pr(14,14) = 2*0.05*0.2 + 2*0.1*0.2 + 0.2*0.2 = 0.10**

The probability of a match at marker 2 is

**Pr(14,16) + Pr(16,16) = 2*0.1*0.3 + 0.3*0.3 = 0.15**

The probability of a match at marker 3 is

**Pr(16,17) = 2*0.3*0.2 = 0.12**

Hence, the probability that a random person (in a two-person mixture) would fail to be excluded is the product of these three,

**Pr(Failure to exclude) = 0.10*0.15*0.12 = 0.0018 or roughly 1/556**
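The same arithmetic can be checked with a short Python sketch. It assumes a two-person mixture with the victim as a known contributor, treats the alleles listed in the frequency table as the alleles seen in the sample, and uses the genotype frequencies from the calculation above (2pq for a heterozygote, p^2 for a homozygote). The function names are illustrative only.

```python
from itertools import combinations_with_replacement
from math import prod

# Allele frequencies for the alleles observed in the sample, per marker.
freqs = {
    1: {11: 0.05, 12: 0.1, 14: 0.2},
    2: {14: 0.1, 16: 0.3},
    3: {13: 0.1, 15: 0.01, 16: 0.3, 17: 0.2},
}
victim = {1: (11, 12), 2: (14, 14), 3: (13, 15)}


def genotype_freq(a, b, f):
    """p^2 for a homozygote, 2pq for a heterozygote."""
    return f[a] * f[a] if a == b else 2 * f[a] * f[b]


def match_prob(f, victim_genotype):
    """Total frequency of genotypes that explain every allele the victim lacks."""
    unexplained = set(f) - set(victim_genotype)
    return sum(
        genotype_freq(a, b, f)
        for a, b in combinations_with_replacement(sorted(f), 2)
        if unexplained <= {a, b}
    )


per_marker = {m: match_prob(freqs[m], victim[m]) for m in freqs}
overall = prod(per_marker.values())

for marker, p in per_marker.items():
    print(f"marker {marker}: {p:.2f}")
print(f"overall: {overall:.4f} (about 1 in {1 / overall:.0f})")
```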

As a point of reference, Freq(Victim) = [ 2*0.05*0.1 ] * [ 0.1*0.1 ] * [ 2*0.1*0.01 ] = 0.0000002, or 1/5,000,000

**Key point:** Because several genotypes can be consistent with a mixture (even with a two-person mixture where one genotype is known), the probability of failure to exclude is much higher with a mixture than with a single-contributor sample. Put another way, *the strength of evidence is not as great against a suspect with a mixture.*

IF we know the EXACT number of contributors to a mixture, we can use the above approach and extend it to three, four, five (or more) contributors.

A major problem is **how do we know the number of contributors?**

We can set a **lower limit** to the number of contributors, as 3 alleles means AT LEAST two, 5 alleles AT LEAST three, 7 alleles AT LEAST four, etc. The problem is that if there are (say) three contributors, there is some chance that none of the markers in the mixture would show more than four alleles, in which case we might mistake it for a two-person mixture.
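The lower limit follows from each person carrying at most two distinct alleles per marker, so a marker with n alleles needs at least n/2 contributors, rounded up. A minimal Python sketch (the function name `min_contributors` is an assumption for this example):

```python
from math import ceil


def min_contributors(sample):
    """Lower bound on contributors: the largest allele count at any marker, halved and rounded up."""
    return max(ceil(len(alleles) / 2) for alleles in sample.values())


sample = {1: {11, 12, 14}, 2: {14, 16}, 3: {13, 15, 16, 17}}
print("at least", min_contributors(sample), "contributors")
```

This is only a lower bound. As noted above, a three-person mixture can easily show four or fewer alleles at every marker.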

If, however, we wish to make **no assumptions** as to the number of contributors, then the possible genotypes look something like

| Marker | Sample | Potential Contributors |
|---|---|---|
| 1 | 11, 12, 14 | 11,11 OR 11,12 OR 11,14 OR 12,12 OR 12,14 OR 14,14 |
| 2 | 14, 16 | 14,14 OR 14,16 OR 16,16 |
| 3 | 13, 15, 16, 17 | 13,13 OR 13,15 OR 13,16 OR 13,17 OR 15,15 OR 15,16 OR 15,17 OR 16,16 OR 16,17 OR 17,17 |

Notice that we DID NOT use any information on the genotype of the victim (who is typically a known contributor).

Notice the increase in complications that arise with an unknown number of contributors.
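To see how quickly the list of potential contributors grows, the genotypes can be enumerated directly. This sketch simply lists every unordered pair of alleles seen at a marker; with n alleles there are n(n+1)/2 such genotypes.

```python
from itertools import combinations_with_replacement

sample = {1: {11, 12, 14}, 2: {14, 16}, 3: {13, 15, 16, 17}}

# Every unordered pair of observed alleles is a possible contributor genotype.
potential = {
    marker: list(combinations_with_replacement(sorted(alleles), 2))
    for marker, alleles in sample.items()
}

for marker, genotypes in potential.items():
    print(f"marker {marker}: {len(genotypes)} genotypes: {genotypes}")
```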

**Computing the failure to exclude probability**

While one could compute all of the possible genotype probabilities for each marker, there is a nice short-cut.

Consider marker 1. For a match, the first allele must be either an 11, 12, or 14. This occurs with probability **Freq(11) + freq(12) + freq(14)**. Likewise, for a failure to exclude the second allele must also be an 11, 12, or 14. Putting these together,

**Prob(fail to exclude at marker 1) = [ Freq(11) + freq(12) + freq(14) ]^2 = (0.05 + 0.1 + 0.2)^2 = 0.1225**

Likewise, for markers 2 and 3 we have

**Prob(fail to exclude at marker 2) = (0.1 + 0.3)^2 = 0.16**

**Prob(fail to exclude at marker 3) = (0.1 + 0.01 + 0.3 + 0.2)^2 = 0.3721**

Thus, the probability that a random individual is not excluded as a contributor is

**0.1225*0.16*0.3721 = 0.0073 or roughly 1/137**
This type of calculation is also called a **random man not excluded** calculation.
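As a check on the short-cut, the sketch below computes the per-marker probability both ways: as the squared sum of the allele frequencies, and as the sum over all possible genotypes (p^2 for homozygotes, 2pq for heterozygotes). The two must agree, since squaring the sum expands to exactly those terms. The function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod

freqs = {
    1: {11: 0.05, 12: 0.1, 14: 0.2},
    2: {14: 0.1, 16: 0.3},
    3: {13: 0.1, 15: 0.01, 16: 0.3, 17: 0.2},
}


def rmne_shortcut(f):
    """Both alleles must come from the observed set: (sum of frequencies) squared."""
    return sum(f.values()) ** 2


def rmne_enumerated(f):
    """Same quantity, summed genotype by genotype."""
    total = 0.0
    for a, b in combinations_with_replacement(sorted(f), 2):
        total += f[a] ** 2 if a == b else 2 * f[a] * f[b]
    return total


per_marker = {m: rmne_shortcut(f) for m, f in freqs.items()}
overall = prod(per_marker.values())

for marker, p in per_marker.items():
    print(f"marker {marker}: shortcut {p:.4f}, enumerated {rmne_enumerated(freqs[marker]):.4f}")
print(f"overall: {overall:.4f} (about 1 in {1 / overall:.0f})")
```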

Very often, one can use the peak height in an electropherogram to estimate the number of contributors.

However, if an individual is a very minor contributor to a mixture, then their alleles will have very small peak heights in the electropherogram.

This can lead to **allelic dropout**, where the alleles from a very minor contributor may be too rare relative to the major contributor to give a significant peak height.

These loci (showing only the victim's DNA) are then discarded. The result is that the 13 CODIS markers may only have (say) 5-6 that give a mixture signal, which results in a much larger random-match probability.

A more problematic case is where some minor alleles amplify and others do not. Suppose a suspect has a 14,15 genotype at marker one. If both alleles amplify, we exclude this individual (as there is no 15 in the sample).

However, if the 14 allele peak is only just significant, the 15 allele may not amplify, giving the mixture we see.
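A toy Python illustration of this scenario, under loudly stated assumptions: the `amplified` dictionary is a made-up dropout pattern, not real data, and the mixture is taken to be the victim plus this one suspect. It shows that a true 14,15 contributor whose 15 allele drops out produces exactly the observed alleles at marker one, yet would be excluded by the simple allele-matching rule.

```python
victim = (11, 12)
suspect = (14, 15)

# Hypothetical dropout pattern: the suspect's 14 amplifies, the 15 does not.
amplified = {14: True, 15: False}

# Alleles that would actually show up in the electropherogram.
observed = set(victim) | {allele for allele in suspect if amplified[allele]}

# Simple exclusion rule: any suspect allele missing from the sample excludes.
naively_excluded = not set(suspect) <= observed

print("observed alleles:", sorted(observed))
print("suspect excluded by the simple rule:", naively_excluded)
```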