Lecture 24: Gene Structure and Evolution
(version 15 April 2003)
Gene Structure and Evolution
A deep understanding of evolution requires an understanding of how genes are structured and regulated.
Prokaryotic Gene Structure
Bacteria, chloroplast and mitochondrial genomes
Basic Gene Structure
Key features of prokaryotic gene structure and regulation:
- Promoter region (transcriptional regulation)
- Positive control: Something (protein/protein complex) must bind to promoter, otherwise gene is off.
- Negative control: Gene is on unless something binds to promoter
- Untranslated region (UTR) -- 5' and 3' regions
- Sequences in these regions can still regulate gene expression, for example by affecting the stability (half-life) of the mRNA.
- Ribosome binding sequence (RBS).
Single RNA polymerase used for all genes, but with different co-factors (sigma factors)
- Operons = polycistronic mRNAs (multiple genes ("cistrons") coded in a single transcript)
- polyproteins -- single large polypeptide that is later processed proteolytically to yield a series of individual polypeptides.
Coupled transcription and translation.
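- This allows for several interesting avenues to control gene expression. One mechanism, attenuation, causes the RNA polymerase to terminate transcription prematurely, depending on how quickly the ribosome translates a leader sequence. In the classic trp operon, the polymerase falls off the DNA when the leader is translated without stalling, and reads through when the ribosome stalls for lack of tryptophan.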
Essentially no mRNA processing
Eukaryotic Gene Structure
Three basic types of eukaryotic genes
- Pol I genes (rRNA)
- Pol II genes (protein coding genes, some small RNAs)
- Pol III genes (small RNAs such as tRNAs)
Pol II genes
Key differences from prokaryotic genes:
- Monocistronic mRNAs --- one polypeptide (protein) per mRNA
- Uncoupled transcription and translation
- Very extensive mRNA processing
- mRNA cap added to 5' end (replaces the role of the RBS)
- Poly-A tail added to 3' end of message (AAUAAA is the poly-A processing signal)
- Removal of Introns (more on this later)
- RNA editing
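As a small illustration of the poly-A processing signal, the sketch below scans an RNA sequence for the canonical hexamer AAUAAA. The function name and the example sequence are hypothetical, invented for this illustration; real signal recognition also depends on downstream elements and on signal variants, which are ignored here.

```python
def find_polya_signals(mrna, signal="AAUAAA"):
    """Return 0-based start positions of every occurrence of the signal.

    DNA-style input is accepted: T is converted to U before searching.
    """
    mrna = mrna.upper().replace("T", "U")
    positions = []
    start = mrna.find(signal)
    while start != -1:
        positions.append(start)
        # Step forward by one so overlapping matches are not missed.
        start = mrna.find(signal, start + 1)
    return positions


# Hypothetical 3' UTR fragment, made up for illustration only.
utr = "GCUAGCAAUAAAGGCUUCAGUCA"
print(find_polya_signals(utr))  # [6]
```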
Very important control elements for regulation may be very far upstream (5' of gene) or downstream (3' of gene).
Alternative Splicing of exons allows one gene to make several different mRNAs, depending on which exons are included in the final message. Hence, one "gene" may code for a large number of different products. Some genes are known to make close to 100 different transcripts based on splicing patterns.
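The combinatorics behind alternative splicing can be sketched in a few lines. The example below assumes the simplest possible model, in which each optional ("cassette") exon is independently either included or skipped; real genes have additional constraints (mutually exclusive exons, alternative splice sites), so this is an upper-bound illustration rather than a description of any particular gene. The exon names are hypothetical.

```python
from itertools import product


def enumerate_isoforms(exons, optional):
    """List every exon combination under an independent-cassette model.

    exons    -- exon names in genomic order
    optional -- set of exon names that may be included or skipped
    Exons not in `optional` are constitutive and always kept.
    """
    optional_names = [name for name in exons if name in optional]
    isoforms = []
    for choices in product([True, False], repeat=len(optional_names)):
        included = dict(zip(optional_names, choices))
        # Constitutive exons are absent from `included`, so default to True.
        isoforms.append([e for e in exons if included.get(e, True)])
    return isoforms


exons = ["E1", "E2", "E3", "E4"]
isoforms = enumerate_isoforms(exons, optional={"E2", "E3"})
for iso in isoforms:
    print("-".join(iso))
print(len(isoforms))  # 4
```

Under this model n cassette exons give 2^n possible messages, so only seven independent cassette exons (2^7 = 128) would already be enough to account for on the order of 100 transcripts from one gene.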
Pol I genes
For ribosomal RNA (rRNA) genes
- Single RNA transcript processed to yield the three final rRNA products: 28S, 5.8S, 18S
- 5' upstream control elements
- rRNA genes (each gene = a transcriptional unit yielding one 28S, one 5.8S, and one 18S rRNA) are typically arranged in tandem clusters, with hundreds to thousands of copies arrayed end-to-end.
Pol III genes
- tRNAs, other small RNA genes
- Key feature: Pol III genes have Internal promoters
Four Families of Introns
Nuclear tRNA introns
- Small (5-20 bases) introns found in the anti-codon loop of some tRNA genes in the nucleus of some eukaryotes
Group I introns
- Mitochondria and plastid genomes of plants and protists (rRNA, tRNA and mRNA genes)
- Nucleus of certain protists, fungi and lichens (rRNA genes)
- Eubacteria (tRNA genes) & phages
- Metazoans - only in mitochondrial genes of a few anthozoans (e.g., sea anemone)
- Some conserved sequence blocks
- Self-splicing. The 3'-OH of a free guanosine (G) cofactor attacks the 5' splice site of the RNA.
Group II introns
- Mitochondrial and plastid genomes of plants and protists (rRNA, tRNA and mRNA genes)
- Some Eubacteria
- No (functioning) ones in nucleus
- Some conserved sequence blocks
- Self-splicing in some cases, enzymatic help in others. Lariat structure intermediate.
Pol II introns
This is the group that most people think of when speaking about "introns". These are best viewed as a degenerate case of Group II introns.
- Protein-coding (Pol II) genes of many eukaryotes
- small conserved sequences at junction (5' GT , 3' AG), other small blocks
- A complex of proteins and snRNAs (the spliceosome) is required. Lariat structure intermediate.
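The conserved junction sequences (5' GT, 3' AG) can be illustrated with a minimal sketch that checks whether a candidate intron obeys the GT-AG rule and then joins the flanking exons. The function names, the coordinate convention (0-based, half-open) and the toy sequences are assumptions made for this example; the sketch ignores the branch point and the other small conserved blocks, and says nothing about how the spliceosome actually recognises the sites.

```python
def follows_gt_ag_rule(intron):
    """True if the intron (DNA, sense strand) starts with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(gene, introns):
    """Remove introns from a gene sequence and join the exons.

    introns -- list of (start, end) pairs, 0-based and half-open.
    Raises ValueError if an intron does not follow the GT-AG rule.
    """
    pieces = []
    prev = 0
    for start, end in sorted(introns):
        if not follows_gt_ag_rule(gene[start:end]):
            raise ValueError("intron %d-%d does not follow GT-AG" % (start, end))
        pieces.append(gene[prev:start])  # exon preceding this intron
        prev = end
    pieces.append(gene[prev:])  # final exon
    return "".join(pieces)


# Toy gene, made up for illustration: exon, intron, exon.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTTTCAG", "GGATAA"
gene = exon1 + intron + exon2
coords = [(len(exon1), len(exon1) + len(intron))]
print(splice(gene, coords))  # ATGGCCGGATAA
```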
Evolution of (Pol II) Introns: Introns Early vs. Introns Late
Big debate as to whether Pol II introns arose early or late: were they initially present in bacteria but later lost (introns early), or did they arise only in the eukaryotes (introns late)?
Support for the "early" view: exons roughly correlated with protein domains.
The idea is that primitive genes were put together by exon shuffling of functional domains.
- The idea was that the original genome (the progenote) was likely an RNA-based collection of minigenes, which were the forerunners of the earliest exons.
- Problem: Poor statistical correlation with domains.
Support for Introns late
Group I and Group II introns are mobile, and have infected certain genes/genomes. Pol II introns seem to be degenerate Group II introns.
Very clear evidence of exon shuffling in many (higher) eukaryote-specific genes. Hence, higher eukaryotes do indeed seem to use some exons as functional units.