Lecture 7: DNA: Structure, Sequencing and PCR

(version 7 Jan 2008)

This material is copyrighted and MAY NOT be used for commercial purposes


DNA Structure

DNA, or deoxyribonucleic acid, is the molecule that makes up the genetic material (Mendel's discrete particles).

The structure of DNA was worked out by James Watson and Francis Crick in 1953. They were awarded the Nobel Prize in 1962 for this work.

In a classic understatement, they noted: "This structure has novel features which are of considerable biological interest."

Interesting reading: Did Rosalind Franklin miss out on the Nobel Prize for her contribution to DNA structure?

Indeed, the structure of DNA immediately told us how it might be replicated and eventually this allowed us to sequence DNA.

DNA is a polymer of four possible nucleotides, where each nucleotide consists of a sugar (the deoxyribose part), a phosphate group, and one of four possible bases: A, C, G, or T.

A string of nucleotides is joined together to make up one strand of a DNA molecule, for example AGGGAGGGTCCA.

  • Each strand of DNA thus consists of a backbone to which is attached a sequence of bases, each one of the four possible bases.

  • Each strand has a polarity (basically, a top and a bottom), giving a 3' (three prime) end and a 5' (five prime) end.

    The 5' end carries a phosphate group, the 3' end a hydroxyl (-OH) group.

    A DNA molecule consists of two complementary strands interwoven in the form of a helix (hence the term double helix).

    They are complementary because of base pairing, wherein an A on one strand always pairs with a T on the complementary strand, and likewise, a G always pairs with a C.

    Hence, if one strand is

    5' A-A-G-G-C-C-T-T- 3'

    The other strand is

    3' T-T-C-C-G-G-A-A- 5'
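The pairing rule can be sketched in a few lines of Python. This is only an illustration: the names `PAIRS`, `complement`, and `reverse_complement` are made up for this sketch, and it assumes an uppercase sequence containing only A, C, G, and T.

```python
# Base-pairing rule: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the paired base for each position.

    If `strand` is read 5'->3', the result is the other strand read 3'->5'.
    """
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the other strand written in the 5'->3' direction."""
    return complement(strand)[::-1]


top = "AAGGCCTT"                  # 5' A-A-G-G-C-C-T-T 3'
print(complement(top))            # other strand, read 3'->5'
print(reverse_complement(top))    # other strand, read 5'->3'
```

For this particular example the two strands happen to read the same in the 5' to 3' direction, which is not true of most sequences.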

    Because of this complementarity, the DNA sequence for a gene is simply the sequence of one of the strands, e.g. 5' A-A-G-G-C-C-T-T 3' for the pair above.


    One chromosome is a single DNA molecule. A typical length (number of nucleotides) for a chromosome is in the hundreds of millions of bases.

    Given these large numbers, we speak in terms of thousands of bases (kb, or kilobases), so that 4.5 kb = 4,500 bases, and also in terms of millions of bases (Mb, or megabases), so that 4.5 Mb = 4,500,000 bases.

    Total number of bases in the human genome = about 3 billion (3,000 million, or 3,000 megabases).
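The unit arithmetic is simple enough to check directly. A minimal sketch, where the helper names `to_kb` and `to_mb` are hypothetical:

```python
def to_kb(bases: int) -> float:
    """Convert a number of bases to kilobases (thousands of bases)."""
    return bases / 1_000


def to_mb(bases: int) -> float:
    """Convert a number of bases to megabases (millions of bases)."""
    return bases / 1_000_000


print(to_kb(4_500))            # 4.5 kb
print(to_mb(4_500_000))        # 4.5 Mb
print(to_mb(3_000_000_000))    # human genome, roughly 3000 Mb
```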

    DNA replication

    DNA replicates by having the helix unwind, with each strand serving as a template for a new strand.

    Base pairing is the key, as an A on a template strand gives a T on the newly synthesized strand, and so forth.

    Polarity is important, as new nucleotides are added so that the new strand grows in the 5' to 3' direction (the start has a 5' phosphate, and the new nucleotides are added to the current 3'-OH group).

    DNA replication is started by a primer, a short piece of RNA (or DNA) that provides a free 3'-OH group for the chain to grow on.
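Template-directed growth from a primer can be sketched as follows. This is a toy model under stated assumptions, not a description of the real enzyme: the function name `extend_primer` is hypothetical, the primer is taken to be DNA (an RNA primer would use U in place of T), and the template is supplied in the 3' to 5' direction so that the new strand comes out 5' to 3'.

```python
# Base-pairing rule: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer: str) -> str:
    """Grow a new strand 5'->3' along a template written 3'->5'.

    The primer must pair with the start of the template. Each step then
    adds, at the 3' end of the growing strand, the base that pairs with
    the next template base.
    """
    if len(primer) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    for template_base, primer_base in zip(template_3to5, primer):
        if PAIRS[template_base] != primer_base:
            raise ValueError("primer does not pair with the template")

    new_strand = primer
    for template_base in template_3to5[len(primer):]:
        new_strand += PAIRS[template_base]   # added to the free 3'-OH end
    return new_strand


# Template 3' T-T-C-C-G-G-A-A 5' with a three-base primer.
print(extend_primer("TTCCGGAA", "AAG"))
```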

    DNA sequencing

    By a simple manipulation of how DNA replicates itself, we can also sequence DNA.

    First, because DNA carries a negative charge (it is an acid), DNA fragments placed in an electric field move (migrate) toward the positive electrode. In a gel, fragments of different sizes migrate at different speeds, with shorter fragments moving faster than longer ones.

    Thus, if we run a solution of DNA fragments of different sizes on a gel in an electric field, the result is a series of bands. DNA molecules that differ in size by even a single nucleotide will run at different speeds, and hence give different bands.

    Chain termination sequencing works by letting DNA replicate, but replacing a very small fraction of one of the nucleotides (A, T, C, or G) in the solution with a modified version that stops the chain (it has no 3'-OH group). Thus, when a growing chain incorporates one of these chain-terminators, it stops.

    Thus, suppose we use an A chain-terminator and, after running the fragments out on a gel, find fragments of lengths 1, 6, 9, 11, and 15. This means that there is an A at positions 1, 6, 9, 11, and 15 of the newly synthesized strand.

    Doing the same for all of the bases generates a series of fragments which we can separate out in an electric current, generating a DNA sequence.
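Reading a sequence off the four sets of fragment lengths can be sketched as below. The function name `read_gel` is hypothetical, and only the A fragment lengths (1, 6, 9, 11, 15) come from the example above; the C, G, and T lengths are invented to complete the illustration.

```python
def read_gel(lanes: dict) -> str:
    """Rebuild a sequence from chain-terminated fragment lengths.

    `lanes` maps each base to the fragment lengths seen with that base's
    chain-terminator. A fragment of length n in the lane for base X means
    that position n of the new strand is X.
    """
    base_at = {}
    for base, lengths in lanes.items():
        for n in lengths:
            if n in base_at:
                raise ValueError(f"position {n} appears in two lanes")
            base_at[n] = base
    # Shorter fragments run faster, so reading bands from fastest to
    # slowest is the same as reading positions in increasing order.
    return "".join(base_at[n] for n in sorted(base_at))


lanes = {
    "A": [1, 6, 9, 11, 15],   # from the example in the text
    "C": [2, 5, 12],          # made-up lengths for illustration
    "G": [3, 7, 10, 14],      # made-up lengths for illustration
    "T": [4, 8, 13],          # made-up lengths for illustration
}
print(read_gel(lanes))
```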

    This is now typically done by a machine, such as an ABI automated capillary sequencer.

    The output is known as an electropherogram.

    PCR, the Polymerase Chain Reaction

    Invented by Kary Mullis, who won the 1993 Nobel Prize in Chemistry for this method.

    By using short primers on the two different strands, any particular region from even a single DNA molecule can be amplified up into millions of copies.

    Each cycle of PCR doubles the current amount of DNA.

    PCR allows even the tiniest trace amounts of DNA to be amplified up into as much DNA as is needed for analysis.
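The doubling arithmetic can be made concrete with a short sketch. It assumes ideal conditions in which every cycle exactly doubles the number of copies, which real reactions only approximate; the function names `copies_after` and `cycles_needed` are hypothetical.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Copies present after a number of ideal doubling cycles."""
    return starting_copies * 2 ** cycles


def cycles_needed(target_copies: int, starting_copies: int = 1) -> int:
    """Smallest number of ideal cycles that reaches the target."""
    cycles = 0
    copies = starting_copies
    while copies < target_copies:
        copies *= 2      # one PCR cycle doubles the DNA
        cycles += 1
    return cycles


print(copies_after(20))            # from one molecule, 20 cycles
print(cycles_needed(1_000_000))    # cycles to reach a million copies
```

Under this idealization, a single starting molecule passes one million copies after 20 cycles and one billion after 30.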

    PCR Flash Animation