Introductory Molecular Genetics

 

Dr. James McInerney

Part 1 - DNA and RNA Structures

DNA is the hereditary material that carries the genetic information a cell needs to reproduce itself. DNA is a polymer consisting of a long chain of monomers called nucleotides, so the DNA molecule is said to be a polynucleotide. Each nucleotide has 3 parts: a sugar group, a nitrogen-containing ring structure called a base, and a phosphate group.

The Sugar:
The sugar present in DNA is a 5 carbon sugar (pentose) called 2'-deoxyribose.

The carbon atoms of the sugar are numbered 1', 2', 3', 4' and 5'.

The Bases:
Nucleotides contain 1 of 4 bases: adenine, guanine, thymine or cytosine.

These are complex molecules containing carbon and nitrogen ring structures.

Adenine and Guanine contain 2 carbon-nitrogen rings and are called purines.

Cytosine and Thymine contain a single ring and are called pyrimidines.

The Phosphate group:
A nucleoside is called a nucleotide when a phosphate group (PO4) is attached.

A nucleotide may exist in a cell as an individual molecule (nucleoside triphosphates play an important role as energy carriers in cells), or it may be polymerized as a nucleic acid, which is DNA or RNA.

A sugar + base is called a nucleoside.
A sugar + base + phosphate group is called a nucleotide.

DNA Structure
DNA molecules have a very distinct structure known as a double-helix. The structure of DNA was proposed in 1953 by Watson and Crick on the basis of X-ray diffraction data.

DNA exists as 2 polynucleotide chains wrapped around each other to form the double-helix.

The sugar-phosphate part of the molecule forms a spine or backbone which is on the outside of the helix. The bases, which are flat molecules, face inwards towards the centre of the helix and are stacked on each other like a pile of plates.

The double-helix is said to be antiparallel, i.e. 1 strand runs in the 5' to 3' direction and the other strand runs in the 3' to 5' direction.

                    5' ----------------------------> 3'
                    3' <---------------------------- 5'

 

The double-helix is right-handed. This means that if the double-helix were a spiral staircase and you were climbing upwards, the sugar-phosphate backbone would be on your right.

The bases of the 2 polynucleotide chains interact with each other. The space between the polynucleotides is such that a 2-ring purine interacts with a single-ring pyrimidine. Therefore Thymine always interacts with Adenine, and Guanine with Cytosine. Hydrogen bonds form between these bases and help to stabilize the interaction. 2 bonds form between A and T and 3 bonds form between G and C.

Because G must always bond to C and A to T, the sequences of the 2 strands are related to each other and are said to be complementary, with the sequence of one strand predicting and determining the sequence of the other.
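The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the strand is assumed to contain only the letters A, T, G and C.

```python
# Watson-Crick partners as stated in the text: A-T and G-C.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds per base pair as stated in the text: 2 for A-T, 3 for G-C.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand_5to3: str) -> str:
    """Return the partner strand, also written 5' to 3'.

    The strands are antiparallel, so the partner is the complement
    read in reverse order.
    """
    return "".join(PARTNER[base] for base in reversed(strand_5to3))


def hydrogen_bonds(strand_5to3: str) -> int:
    """Count the hydrogen bonds holding the duplex together."""
    return sum(H_BONDS[base] for base in strand_5to3)


print(complementary_strand("GTAAGC"))  # GCTTAC
print(hydrogen_bonds("GTAAGC"))        # 15
```

Applying complementary_strand twice returns the original sequence, which is another way of saying that each strand determines the other.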
 

RNA Structure
The main difference between RNA and DNA is in the sugar: ribose replaces deoxyribose.

Another important difference lies in the bases: Uracil, which base pairs with Adenine, replaces Thymine.

                    DNA                          RNA

                    Cytosine   C                 Cytosine   C

                    Thymine    T                 Uracil     U

                    Adenine    A                 Adenine    A

                    Guanine    G                 Guanine    G

RNA is a transient intermediate which is relatively unstable. It has a half-life of about 3 minutes in prokaryotic cells, whereas DNA is very stable, as it generally doesn't decay.
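To get a feel for what a 3 minute half-life means, here is a rough sketch. It assumes simple first-order (exponential) decay, which is an assumption of this example and not something stated in the notes; the function name is also just illustrative.

```python
def fraction_remaining(minutes: float, half_life_min: float = 3.0) -> float:
    """Fraction of an RNA population left after `minutes`.

    Assumes first-order decay: the amount halves every `half_life_min`.
    """
    return 0.5 ** (minutes / half_life_min)


for t in (3, 6, 15):
    print(t, fraction_remaining(t))
```

Under this assumption only about 3% of a batch of RNA would be left after 15 minutes (five half-lives).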

RNA is also much shorter than DNA: DNA is a polynucleotide that can be 1 million to 3 billion nucleotides in length, whereas RNA can be anywhere from 7 bases to 100,000 bases in length.

Also, RNA molecules mainly exist as a single polynucleotide strand and do not form a double-helix.

There are a number of types of RNA species in the cell:

(1) mRNA (messenger RNA)
      This is an intermediate that contains the gene sequence.

(2) tRNA (transfer RNA)
      This is usually about 70 nucleotides in length. It has a cloverleaf structure.

      tRNA carries an amino acid when charged.

(3) rRNA (ribosomal RNA)
      rRNA and protein make up ribosomes. Prokaryotic ribosomes are 70S, with 50S and 30S subunits. They contain 3 rRNAs (23S, 16S and 5S).

      The "S" is the Svedberg unit, which indicates how fast a particle sediments in an ultracentrifuge.

 
 



 
 
Part 2 - Transcription and Translation

The central dogma of molecular biology is that the information in DNA (deoxyribonucleic acid) is copied into RNA (ribonucleic acid), which is then used to make protein. This requires 2 processes called transcription and translation.

 

     DNA --(transcription)--> RNA --(translation)--> Protein
 

Transcription

Introduction

Transcription is the process by which the base sequence in one strand of the DNA is copied into a complementary sequence in a strand of RNA.

RNA synthesis always occurs in the 5' to 3' direction. Since double-stranded nucleic acids are always antiparallel, this implies that the DNA strand being copied is read in the 3' to 5' direction.
Only 1 strand is copied into mRNA for any one gene. The strand that is copied is called the template strand; the other strand, whose sequence matches the mRNA (with T in place of U), is called the coding strand.
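As a small illustration of the copying rule, the sketch below turns a DNA template written 3' to 5' into the RNA written 5' to 3'. The function name and the example sequence are hypothetical and chosen only for this example.

```python
# DNA template base -> RNA base. Uracil (U) takes the place of thymine in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Copy a DNA template strand (written 3' to 5') into RNA (written 5' to 3').

    Because the two strands of a duplex are antiparallel, reading the
    template 3' -> 5' gives the RNA in its direction of synthesis, 5' -> 3'.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


template = "TACAAACGAATT"    # hypothetical template strand, written 3' -> 5'
print(transcribe(template))  # AUGUUUGCUUAA
```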

Transcription can be summarized under 3 headings:
Initiation

Elongation

Termination

 
 

(i) Initiation

Bases are located on the inside of the double-helix, where they are H-bonded to bases on the complementary strand.

*Remember: A pairs with T by 2 H-bonds
                   G pairs with C by 3 H-bonds

The first step in transcription must therefore be to unwind the double-helix and separate the complementary strands.
The enzyme RNA Polymerase plays a very important role in this step.

 

 

RNA Polymerase
RNA Polymerase is a protein with a molecular weight of 500,000 daltons. The core enzyme has 4 subunits (two α, β and β') and has sigma (σ) as its 5th subunit. The sigma-factor is important because it recognizes the start of a gene sequence. RNA polymerase explores the double-helix by binding weakly and then releasing at different positions. It does this until it finds a promoter sequence. The sigma-factor enables the RNA Polymerase to bind tightly at this promoter. The promoter itself is not transcribed; instead it directs RNA Polymerase to a specific sequence downstream.

 

 
 
                    Upstream  5' ----------------------------> 3'  Downstream
 

 

The sequence at promoters is conserved (stays the same) throughout evolution, which shows the importance of these sequences. The exact sequences can vary slightly but all conform to an overall pattern known as a consensus sequence, and it is this consensus sequence that is conserved. In E. coli there are 2 consensus sequences recognized by RNA Polymerase, known as the -10 sequence and the -35 sequence. The Pribnow Box (TATAAT) is at about -10 b.p. (i.e. 10 b.p. upstream of the site of initiation of transcription) and the sequence TGTTGACA is at about -35 b.p. (i.e. 35 b.p. upstream of the site of initiation of transcription).

Transcription starts at a purine (A or G) about 10 b.p. downstream of the Pribnow Box.
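Because real promoters only approximate the consensus, one simple way to look for them is to slide along the sequence and count mismatches against each consensus. The sketch below does this. The function names and the test sequence are invented for illustration, and the mismatch-counting approach is just one plain way to do it, not a description of how RNA polymerase or any particular software finds promoters.

```python
def mismatches(window: str, consensus: str) -> int:
    """Number of positions at which the window differs from the consensus."""
    return sum(1 for a, b in zip(window, consensus) if a != b)


def consensus_hits(seq: str, consensus: str, max_mismatches: int = 1):
    """Return (start index, mismatch count) for every window close to the consensus."""
    width = len(consensus)
    hits = []
    for start in range(len(seq) - width + 1):
        n = mismatches(seq[start:start + width], consensus)
        if n <= max_mismatches:
            hits.append((start, n))
    return hits


# Hypothetical promoter region built from the two consensus sequences in the text.
promoter = "CC" + "TGTTGACA" + "ATTAATCATCGGCTCG" + "TATAAT" + "GTGTGGAATTG"

print(consensus_hits(promoter, "TGTTGACA", max_mismatches=0))  # [(2, 0)]
print(consensus_hits(promoter, "TATAAT", max_mismatches=0))    # [(26, 0)]
```

Raising max_mismatches lets the search accept sequences that vary slightly from the consensus, at the cost of more chance matches.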

 
 

(ii) Elongation

In order to synthesize a polymer of RNA, nucleotides must be added onto a growing chain of nucleotides to give a polymer several hundred to several thousand nucleotides long. This process is called elongation.
The 5' end of the chain has a free triphosphate group and the 3' end has a free hydroxyl group (OH). Each new nucleotide is joined to the 3' OH, so covalent phosphodiester bonds form between adjacent nucleotides in the RNA chain.

Transcription proceeds by RNA polymerase unwinding and melting the double-helix of DNA (the template) and copying it into an RNA chain. The sigma factor is needed for initiation and is released from the RNA polymerase after a few nucleotides have been copied. The RNA being synthesized is hydrogen bonded to the DNA template over about 10-12 nucleotides at any one time. The amount of DNA unwound at any one time is about 17 nucleotide pairs.

 
 

(iii) Termination

RNA synthesis must stop eventually, otherwise the cell would waste energy making RNA which is not needed for protein synthesis.
Termination occurs when RNA synthesis stops, hydrogen bonds reform and the DNA strands rewind. RNA polymerase, and finally RNA, are released from the DNA template.

There are 2 types of terminator:
 

(a) Rho-independent terminators
      These have a region of dyad symmetry (think of the word NAVAN) centred 15-20 nucleotides before the end of the RNA coding region.

      When this region is transcribed the RNA can fold back on itself to form a hairpin loop, and RNA within the hairpin can no longer hydrogen bond to the DNA template strand.

                                                                GTAAGC--GCTTAC

                                                                     (dyad symmetry)

      There is also a run of about 6 A's in the DNA which are transcribed into U's at the end of the RNA. As there are only 2 hydrogen bonds between A and T (or U) and 3 between G and C, the bonding between a string of A's and a string of U's is weak, and natural breakage of H-bonds occurs at the end of the RNA/DNA hybrid.

 

(b) Rho-dependent terminators

      These terminators lack the run of A's and often have regions of poor dyad symmetry.

      Termination here requires the participation of another protein called rho (ρ).

      RNA polymerase pauses even at weak hairpins and probably binds to ρ. ρ hydrolyses ATP and uses the energy generated to release the RNA and the enzyme from the DNA strand.
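The two sequence features of a rho-independent terminator, dyad symmetry and a run of A's, are easy to check for in a short sequence. The following is a minimal sketch with made-up function names; the second test sequence is hypothetical.

```python
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Complement of a DNA sequence, read in the opposite direction."""
    return "".join(PARTNER[base] for base in reversed(dna))


def has_dyad_symmetry(left_arm: str, right_arm: str) -> bool:
    """True if the right arm is the reverse complement of the left arm,
    so that the two arms could base-pair with each other in a hairpin."""
    return right_arm == reverse_complement(left_arm)


def longest_run(seq: str, base: str = "A") -> int:
    """Length of the longest uninterrupted run of `base` in `seq`."""
    best = current = 0
    for b in seq:
        current = current + 1 if b == base else 0
        best = max(best, current)
    return best


print(has_dyad_symmetry("GTAAGC", "GCTTAC"))  # True (the example in the text)
print(longest_run("GCTTACAAAAAAG"))           # 6
```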

 

Conclusion

Transcription is the process by which a DNA sequence is copied into RNA. Transcription proceeds with the assistance of RNA polymerase, which unwinds the double-helix of the DNA template and copies it into an RNA polymer. The sigma-factor is needed for initiation and is released from RNA polymerase after a few nucleotides have been copied. The RNA being synthesized is H-bonded to the DNA template over about 10-12 nucleotides at any one time. The amount of the DNA double-helix unwound at any one time is about 17 b.p. At termination RNA synthesis stops, the helix is reformed and the new RNA strand is released. Termination may be either Rho-dependent or Rho-independent. Rho-independent terminators have regions of dyad symmetry and weak H-bonds; Rho-dependent terminators use another protein, rho (ρ), which hydrolyses ATP and uses the energy generated to release the RNA and the enzyme from the DNA strand.



   
Translation  
 

Introduction

Translation is the process of decoding the base sequence of an mRNA into the amino acid sequence of a protein. The base sequence in mRNA determines which amino acid is incorporated into protein. The sequence in mRNA doesn't directly recognize amino acids. Instead, adapter molecules recognize specific amino acids as well as specific triplets (codons) in the mRNA. These adapter molecules are called transfer RNA molecules (tRNA). They are short RNA molecules about 70-80 nucleotides long. There are 30-40 different types of tRNA in bacteria. Each one recognizes one or more of the codons that specify a given amino acid.
 

tRNA Structure
The structure consists of a single strand of RNA folded in a cloverleaf structure. The 4 stems are stabilized by base-pairing. The 2-D structure is then folded into an L-shaped 3-D form. Constant features of tRNAs are the dihydrouridine loop, the pseudouridine loop and the anticodon loop.

Three bases (called the anticodon) in the anticodon loop form hydrogen bonds with complementary bases in a triplet in the mRNA. Perfect base-pairing isn't required. Some tRNAs recognise more than 1 of the 61 codons that specify amino acids. This non-standard base-pairing is called wobble. Flexibility occurs between the 3rd (3') position of the codon and the 1st (5') position of the anticodon.

e.g. codon in mRNA:             5' UUU 3'

      anticodons in tRNA
      which will bind
      the above codon:          3' AAA 5'
                                3' AAG 5'
                                3' AAI 5'

      (Note the flexibility in the 1st (5') position of the anticodon; this is called the wobble position.)

 

The effect of wobble may be to speed up protein synthesis because different tRNAs can be used.
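The wobble example can be written as a small lookup. In the sketch below, the fact that A, G and I in the wobble position can all read U comes from the example above; the remaining entries in the table are the commonly quoted wobble rules and are an addition to these notes. The names are illustrative only.

```python
# Codon 3rd-position bases that each base in the anticodon's 1st (5')
# position can pair with. A, G and I reading U matches the example in
# the text; the other entries are the commonly quoted wobble rules and
# are an assumption added for this sketch.
WOBBLE = {
    "A": {"U"},
    "C": {"G"},
    "G": {"C", "U"},
    "U": {"A", "G"},
    "I": {"U", "C", "A"},  # I = inosine
}
# Standard pairing, used for the other two positions.
STRICT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_reads(anticodon_3to5: str, codon_5to3: str) -> bool:
    """True if the anticodon can pair with the codon.

    The anticodon is written 3' -> 5' so that it lines up base by base
    with the codon written 5' -> 3'. Its last letter is therefore the
    wobble position.
    """
    strict_ok = all(
        STRICT.get(a) == c for a, c in zip(anticodon_3to5[:2], codon_5to3[:2])
    )
    wobble_ok = codon_5to3[2] in WOBBLE.get(anticodon_3to5[2], set())
    return strict_ok and wobble_ok


for anticodon in ("AAA", "AAG", "AAI"):
    print(anticodon, anticodon_reads(anticodon, "UUU"))  # True for all three
```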
 
 

Recognition of amino acids by tRNAs
The recognition of amino acids by tRNAs is made specific with the help of enzymes called aminoacyl-tRNA synthetases. There are at least 20 of these enzymes, which recognize the 20 amino acids and their compatible tRNAs. Each enzyme can recognise just one amino acid but one or more tRNAs with different anticodons.

Ribosomes
Protein synthesis would be very slow without ribosomes. Ribosomes are large complexes of RNA and protein to which the components of protein synthesis bind.

23S rRNA + 5S rRNA + 31 proteins --------> 50S large ribosomal subunit

16S rRNA + 21 proteins --------> 30S small ribosomal subunit

30S + 50S subunits --------> 70S complete ribosome

(S = the Svedberg unit, i.e. a measure of the rate of sedimentation in a centrifugal field)
 

Translation takes place in the same direction as transcription (5'->3' along the mRNA), so in prokaryotes transcription and translation can and do take place at the same time. The amino terminus of a protein is synthesized first and the carboxyl terminus is synthesized last.
In prokaryotes several genes can be transcribed into a single mRNA which is translated into several proteins, i.e. mRNAs can be polycistronic. Translation may be described under 3 headings (the same as for transcription). These are:

Initiation
Elongation

Termination

 
 

(i) Initiation

When they are not actively involved in translation, ribosomes exist as separate large and small subunits. The first step in translation involves the binding of the small ribosomal subunit to the mRNA. Translation usually begins at the sequence AUG, which encodes methionine and is known as the translation initiation codon. The small subunit binds to the mRNA at a specific point upstream of the AUG. In prokaryotes this is the Shine-Dalgarno sequence (5' AGGAGGU 3') found near the start of the mRNA. Once bound, the small subunit migrates in a 3' direction along the mRNA until it finds the AUG, usually about 10 nucleotides downstream. A tRNA charged with methionine binds to the AUG located by the small ribosomal subunit. The methionine is modified by the addition of a formyl group (CHO) in place of one of the hydrogens of the amino group. This blocks the amino group, so polymerization of the polypeptide can only occur in the amino to carboxy direction. The combination of the mRNA, the small ribosomal subunit and the tRNA carrying the formyl methionine is called the initiation complex.
       A number of accessory proteins called initiation factors are required for initiation. Bacteria have 3, known as IF1, IF2 and IF3. Initiation begins with binding of IF1 and IF3 to the small ribosomal subunit. This helps to prevent binding of the large subunit before the mRNA has bound. Next, IF2 complexed with GTP binds the small subunit; its purpose is to assist binding of the initiator tRNA.

The small subunit then binds the mRNA and locates the AUG codon, and IF3 is released. This marks the end of initiation.
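The search for the start codon can be mimicked with simple string matching: find the Shine-Dalgarno sequence, then look for the first AUG downstream of it. This is a sketch only. It assumes an exact match to AGGAGGU, the function name is made up, and the example mRNA is hypothetical.

```python
SHINE_DALGARNO = "AGGAGGU"  # sequence given in the text, written 5' -> 3'
START_CODON = "AUG"


def find_start_codon(mrna: str):
    """Return the index of the first AUG downstream of the Shine-Dalgarno
    sequence, or None if either signal is missing.

    Assumes an exact copy of the Shine-Dalgarno sequence is present.
    """
    sd = mrna.find(SHINE_DALGARNO)
    if sd == -1:
        return None
    start = mrna.find(START_CODON, sd + len(SHINE_DALGARNO))
    return None if start == -1 else start


mrna = "GGAUUCAGGAGGUAACCAUAUGUUUGCUUAA"  # hypothetical mRNA, 5' -> 3'
start = find_start_codon(mrna)
print(start)                  # 19
print(mrna[start:start + 3])  # AUG
```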

 
 

(ii) Elongation

Elongation begins with binding of a large ribosomal subunit to the initiation complex to form a complete ribosome. This is accompanied by the release of IF1 and IF2 and the hydrolysis of GTP.
The complete ribosome contains two binding sites for tRNA molecules. The 1st site is the P-site or peptidyl site and the 2nd site is the A-site or aminoacyl site. The P-site is occupied by the initiating tRNA, which is base-paired to the AUG. The A-site is positioned over the second codon. Elongation begins when a tRNA enters the A-site and base-pairs with the second codon. With both sites occupied by charged tRNAs, the attached amino acids are placed in close contact and a peptide bond forms between the carboxyl group of the methionine and the amino group of the second amino acid. The reaction is catalysed by a complex enzyme called peptidyl transferase which probably contains several different ribosomal proteins. Peptidyl transferase works in conjunction with another enzyme called tRNA deacylase, which breaks the link between the methionine and its tRNA after formation of the peptide bond. Since new amino acids enter at the A-site, the peptide must shift position before the next amino acid enters; therefore translocation of the peptide from the A-site to the P-site occurs. At the same time the mRNA moves in order to place the next codon in position opposite the A-site, and the next tRNA moves into the A-site, forming base-pairs with the next codon on the mRNA.
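Reading an mRNA one codon at a time can be sketched as follows. The codon table is the standard genetic code, which is not listed in these notes, and it is built here from the usual 64-letter summary string (bases in the order U, C, A, G). The function name and the example mRNA are illustrative.

```python
# Standard genetic code (an addition to the notes), one letter per codon,
# with the three codon positions each cycling through U, C, A, G.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna: str, start: int = 0) -> str:
    """Decode codons from `start` until a stop codon or the end of the mRNA.

    Returns one-letter amino acid codes, amino terminus first. The
    formylation of the first methionine is not represented.
    """
    peptide = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":  # stop codon: no tRNA reads it
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("AUGUUUGCUUAA"))  # MFA
```

Each pass through the loop corresponds to one round of elongation: a new codon is placed in the A-site, one amino acid is added, and the ribosome moves on by 3 bases.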

 

(iii) Termination

Protein synthesis terminates when one of the 3 stop codons is reached on the mRNA. These are UAA, UGA and UAG. No tRNAs have anticodons for these 3 codons. Instead a release factor binds, which causes the peptidyl transferase to transfer the peptide chain to water. E. coli, for example, has 3 release factors.

In E. coli:
RF1 recognizes UAG and UAA.

RF2 recognizes UGA and UAA.

RF3 binds GTP and stimulates the release reaction.
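The release factor assignments above amount to a small lookup table. A minimal sketch follows; the dictionary and function names are made up for this example, and RF3 is left out because it does not recognize a codon.

```python
# Stop codons recognized by each E. coli release factor, as listed above.
RELEASE_FACTOR_CODONS = {
    "RF1": {"UAG", "UAA"},
    "RF2": {"UGA", "UAA"},
}


def release_factors_for(codon: str):
    """Names of the release factors that recognize `codon` (empty if none)."""
    return sorted(
        rf for rf, codons in RELEASE_FACTOR_CODONS.items() if codon in codons
    )


print(release_factors_for("UAA"))  # ['RF1', 'RF2']
print(release_factors_for("UGA"))  # ['RF2']
print(release_factors_for("AUG"))  # []
```

Note that UAA is the only stop codon recognized by both factors.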

There are approximately 35 amino acids of the growing chain buried within the ribosome at any one time. During synthesis the peptide chain begins to assume secondary structure and also to bind other subunits. The average protein is synthesized by translation in 10-20 seconds.

Conclusion

The small ribosomal subunit binds to the Shine-Dalgarno sequence, which is upstream of the start codon AUG. A tRNA charged with methionine binds to the AUG located by the small ribosomal subunit. The methionine is modified by the addition of a formyl group (CHO). The combination of the formyl methionine, the small subunit and the mRNA is called the initiation complex. The initiation factors IF1 and IF3 help prevent the binding of the large subunit before the mRNA has bound. IF2 complexed with GTP assists binding of the initiator tRNA.

Elongation begins with binding of the large ribosomal subunit to the initiation complex to form a complete ribosome. The complete ribosome contains 2 sites for binding tRNA molecules, called the P and A sites. Elongation begins when a tRNA enters A and base-pairs with the second codon. With both sites occupied by charged tRNAs, the attached amino acids are placed in close contact and a peptide bond forms between the carboxyl group of the methionine and the amino group of the second amino acid. The reaction is catalysed by a complex enzyme called peptidyl transferase. Since new amino acids enter at A, the peptide must shift position before the next amino acid enters; therefore translocation of the peptide from A to P occurs. At the same time the mRNA moves in order to place the next codon in position opposite A, and the next tRNA moves into A, forming base-pairs with the next codon on the mRNA.

Termination occurs when a stop codon (UAA, UAG, UGA) enters A, because no tRNAs have anticodons for these 3 codons. Instead a release factor binds, which causes the peptidyl transferase to transfer the peptide chain to water.
