(version 27 August 1999)
This material is copyrighted and MAY NOT be used for commercial purposes
This lecture is designed for projection in 18 pt type and has several Greek characters embedded as 18 pt figures. Hence, it may look rather odd when viewed at other font sizes. You may instead wish to view the PDF version using the Adobe Acrobat feature of many browsers.
For example, the probability of no successes in $n$ independent trials, each with success probability $p$, is $(1-p)^n$.
Likewise, the probability of all successes is $p^n$.
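What is the probability of one success in five trials where Prob(success) = $p$? Let 1 = success, 0 = failure. The single success can fall on any one of the five trials (10000, 01000, 00100, 00010, 00001), and each such outcome has probability $p(1-p)^4$.
Thus, the overall probability is $5\,p\,(1-p)^4$.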
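More generally, the probability of obtaining $k$ successes in $n$ total trials is
$$\Pr(k \text{ successes}) = \binom{n}{k}\, p^k (1-p)^{n-k}, \qquad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$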
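The count of $k$ successes in $n$ trials can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard binomial formula $\binom{n}{k} p^k (1-p)^{n-k}$; the function name `binomial_pmf` and the value `p = 0.25` are choices made here, not part of the lecture.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p = 0.25  # illustrative success probability (an assumption)

# One success in five trials should match 5 * p * (1 - p)**4.
one_in_five = binomial_pmf(1, 5, p)
by_hand = 5 * p * (1 - p) ** 4

# The probabilities over k = 0..5 should sum to one.
total = sum(binomial_pmf(k, 5, p) for k in range(6))

print(one_in_five, by_hand, total)
```

The `comb(n, k)` factor counts the orderings in which the successes can occur, which is where the factor of 5 in the five-trial example comes from.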
Letting $\lambda$ = expected number of successes, then the probability that we observe $k$ successes is
$$\Pr(k) = \frac{\lambda^k e^{-\lambda}}{k!}.$$
In particular, the probability of no successes is $\Pr(0) = e^{-\lambda}$.
The Poisson and binomial are connected in that $\lambda = np$, so that if $n$ is large (i.e., $n \gg \lambda$), the Poisson provides a quick approximation of the binomial.
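A quick numerical check of this approximation, as a sketch only: the values `n = 1000` and `p = 0.002` are hypothetical, chosen so that $n$ is much larger than $\lambda = np = 2$, and the helper names are introduced here for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k successes with expected count lam."""
    return lam**k * exp(-lam) / factorial(k)


n, p = 1000, 0.002  # hypothetical values with n >> lambda
lam = n * p

# Largest pointwise disagreement between the two distributions for k = 0..10.
max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11)
)

for k in range(5):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))
print("largest gap:", max_gap)
```

Raising `p` while holding `n` fixed (so that $\lambda$ is no longer small relative to $n$) should make the gap grow, which is the sense in which the approximation needs $n \gg \lambda$.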
The Poisson distribution follows by assuming that the expected number of events occurring in some small interval of time $\Delta t$ is $\lambda\,\Delta t$, so that the rate of events per time unit is $\lambda$.
Hence, the Poisson assumes that there is a constant rate of successes occurring, which is akin to assuming independent trials.
| Distribution | Known parameters |
|---|---|
| Binomial | Number of trials, $n$; probability of success per trial, $p$ |
| Poisson | Expected number of successes, $np = \lambda$ |
Note that if $n \gg \lambda$, we can set $\lambda = np$ and use the Poisson to approximate the binomial.
How long until the first success?
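Let $p$ = probability of a success. If all trials are independent, then the first success occurs on trial $k$ only if the first $k-1$ trials all fail, so
$$\Pr(\text{first success on trial } k) = (1-p)^{k-1}\,p.$$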
Hence, while the average number of trials required is 1/p, there is a distribution about this average value.
Suppose both parents are Aa, where the aa genotype displays a horrific disease, and as a consequence the family stops having children after the first such child appears. What is the probability that they will have 1, 2, 3, or 4 children? Since Pr(aa) = 1/4,
| Family size | Probability | Cumulative probability |
|---|---|---|
| 1 | $p$ = 0.250 | 0.250 |
| 2 | $p(1-p)$ = 0.188 | 0.438 |
| 3 | $p(1-p)^2$ = 0.141 | 0.578 |
| 4 | $p(1-p)^3$ = 0.105 | 0.684 |
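The family-size table can be reproduced with a short sketch. It assumes the geometric form $p(1-p)^{k-1}$ shown in the table rows, with $p = 1/4$; the name `geometric_pmf` is introduced here for illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """Probability that the first success occurs on trial k."""
    return (1 - p) ** (k - 1) * p


p = 0.25  # Pr(aa) when both parents are Aa

rows = []
cumulative = 0.0
for size in range(1, 5):
    prob = geometric_pmf(size, p)
    cumulative += prob
    rows.append((size, prob, cumulative))

for size, prob, cum in rows:
    print(f"{size}  {prob:.3f}  {cum:.3f}")
```

Extending `range(1, 5)` to `range(1, 13)` gives the values for families of up to 12 children.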
Here is the distribution for up to 12 children
The exponential distribution gives the waiting time to the first success when time is continuous (e.g., how long until your light bulb fails?). Here the distribution parameter is $\lambda$, the success rate per time interval. Hence, the mean time until a success is $1/\lambda$.
The exponential distribution is very closely related to the geometric, with $\lambda$ and $p$ having essentially the same role. Noting that $1 - p \approx e^{-p}$ for small $p$, the geometric probability $p(1-p)^{k-1}$ can be approximated by $p\,e^{-p(k-1)}$.
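Under the exponential distribution, the waiting time probabilities are given by the appropriate area under the curve $f(t) = \lambda e^{-\lambda t}$.
Thus, the probability that the first success occurs at, or before, time $T$ is
$$\Pr(t \le T) = \int_0^T \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda T}.$$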
Suppose you have a constant risk of 0.01 per year of getting cancer. This gives an average age at onset of 1/0.01 = 100 years. What is the probability you are cancer-free at ages 20, 40, 60, and 80?
$$\Pr(\text{cancer-free at age } T) = 1 - \left(1 - e^{-0.01\,T}\right) = e^{-0.01\,T}$$
giving
| Age | Probability cancer-free |
|---|---|
| 20 | 0.819 |
| 40 | 0.670 |
| 60 | 0.549 |
| 80 | 0.449 |
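The table values follow from evaluating $e^{-0.01\,T}$ at each age, as in this minimal sketch. The function name `prob_event_free` is introduced here for illustration.

```python
from math import exp


def prob_event_free(t: float, rate: float) -> float:
    """Probability that no event has occurred by time t at a constant rate."""
    return exp(-rate * t)


rate = 0.01  # constant yearly risk from the example

table = {age: prob_event_free(age, rate) for age in (20, 40, 60, 80)}

for age, prob in table.items():
    print(f"{age}  {prob:.3f}")
```

Passing a longer sequence of ages, for example `range(0, 201, 10)`, traces the curve out to age 200.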
Here is the distribution out to age 200

The parameters for the normal distribution are the mean $\mu$ and the variance $\sigma^2$ (a measure of the spread). The square root of the variance, $\sigma$, is referred to as the standard deviation.
As the figure shows, the mean corresponds to the peak of the distribution, while the variance measures the spread. The larger the variance, the more spread out the distribution is.
The probability of a particular event is just given by the area under the normal (bell-shaped) curve.
Under the normal distribution, 95 percent of all values lie within 1.96 standard deviations of the mean, that is, within the interval $\mu \pm 1.96\,\sigma$.
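The 95 percent figure can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are chosen here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Multiplier that leaves 2.5% in each tail, i.e. 95% in the middle.
z = standard_normal.inv_cdf(0.975)

# Area under the curve within 1.96 standard deviations of the mean.
coverage = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(round(z, 3), round(coverage, 4))
```

Because the area depends only on the number of standard deviations from the mean, the same multiplier applies for any $\mu$ and $\sigma$.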
For $n$ large and $p$ moderate, the number of successes roughly follows a normal distribution with mean $\mu = np$ and variance $\sigma^2 = np(1-p)$.
Thus, approximately 95% of the values fall within the interval
$$np \pm 1.96\sqrt{np(1-p)}.$$
The probability that an individual has a certain genotype is expected to be 0.05. Should we be suspicious if we observe 70 such genotypes in a population of 1000 individuals? Using the normal approximation with $n = 1000$ and $p = 0.05$, $\mu = np = 50$ and $\sigma^2 = np(1-p) = 47.5$. Since the upper 95% limit is $\mu + 1.96\,\sigma = 63.5$, on average we expect such an excess of genotypes to occur less than 5% of the time. Hence, we should indeed be suspicious.
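The arithmetic of this example can be sketched as follows. The upper-tail probability at the end is an addition for illustration, computed from the same normal approximation with `statistics.NormalDist`; it is not a figure quoted in the lecture.

```python
from math import sqrt
from statistics import NormalDist

n, p, observed = 1000, 0.05, 70  # values from the genotype example

mean = n * p
variance = n * p * (1 - p)
sd = sqrt(variance)

# Upper end of the approximate 95% interval.
upper_limit = mean + 1.96 * sd

# How many standard deviations the observation lies above the mean,
# and the normal-approximation probability of a count at least that large.
z = (observed - mean) / sd
upper_tail = 1 - NormalDist().cdf(z)

print(round(mean, 1), round(variance, 1), round(upper_limit, 1))
print(round(z, 2), upper_tail)
```

An observed count above `upper_limit` is what flags the sample as suspicious under this approximation.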