4.3 The flow of biological information

Julian Pakay; Hendrika Duivenvoorden; Thomas Shafee; Kaitlin Clarke

4.3 The flow of biological information

Information flows from genes to proteins but not in reverse.

The gene expression factory

You can think of a cell as a factory. In the middle of the factory is the head office (nucleus) with the instructions to make everything in the factory, including the machinery (proteins), workers (proteins), and even some other rooms (organelles). When something new needs to be made, the big manual of instructions (DNA) in the head office is sorted through and the specific instructions (one or more genes) are found. The instructions are not allowed to leave the head office, so must be photocopied (transcribed) into a single copy (messenger RNA or mRNA). This single copy leaves the head office, heads out to the factory floor (cytoplasm) and is read by a machine (ribosome). The ribosome translates the photocopy into a different language entirely – a chain (of amino acids) – and when this translation is finished the chain can contort into a new shape and becomes a functioning piece of machinery or worker that might be used in the factory or in a neighbouring factory (exported) or back in the head office (nucleus).

This analogy explains gene expression, when we go from the original gene to a functional ‘product’. Watch this video for a detailed visualisation of the process from DNA to protein.

From DNA to protein – 3D

Schematic depicting the cell as a factory as described in the text, with rooms as head office, initial manufacturing, processing and refinement. Products in different states of construction are moved around the cell, with some later incorporated into the walls or exported outside. — **Figure 4.2 The cell as a factory:** In some ways, a cell is like an automated factory. Instructions (DNA) stored in the head office (nucleus) are photocopied (transcription), with the instruction copy (mRNA) sent to initial manufacturing (endoplasmic reticulum) to be used as the blueprints for building (translation) a specified product (protein). Sometimes this product undergoes further refinement and processing (Golgi), with final products being stored, used in the factory itself or its walls, or exported outside. Illustration by Thomas Shafee.

DNA vs RNA

There are two main differences between deoxyribonucleic acid (DNA) and ribonucleic acid (RNA):

DNA is normally double stranded and RNA is normally single stranded (in humans anyway; other organisms are different).
DNA is made up of adenine (A), thymine (T), cytosine (C) and guanine (G) bases. Each base is attached to a phosphate and a sugar molecule, which together make a nucleotide. RNA is made up of A, C and G nucleotides and instead of thymine, it has uracil (U).

In DNA, A always pairs with T, and C always pairs with G (the straight letters stick together and the curvy letters stick together) to create a base pair. In RNA, A always pairs with U. These pairings are how complementary strands ‘stick together’ and how by looking at only one strand you can replicate the DNA (if you want to pass it onto a daughter cell) or transcribe it into RNA.

**Figure 4.3 The nucleotides of RNA and DNA:** RNA and DNA are commonly shown as ribbons, where the sugar-phosphate backbone forms a helix and the nucleobases point inwards. RNA and DNA share three of their bases: adenine, cytosine and guanine. The fourth base is uracil for RNA and thymine for DNA, which differ by a single methyl group. In the double helix of DNA, the base-pairing of these nucleotides (A-T and C-G) hold together the familiar double helix structure. Image: Difference DNA RNA-EN by sponk from Wikimedia Commons used under CC BY-SA 3.0.

Transcription

There is different machinery involved in the process from DNA to protein. During transcription, RNA polymerase finds the gene that needs to be transcribed and unwinds (‘unzips’) the double DNA strand at this location (Figure 4.4). The RNA polymerase reads the template strand of the DNA and brings in the necessary nucleotides to build a complementary mRNA strand (with uracil instead of thymine) (Figure 4.5A). When it has finished transcribing the gene (following cues from within the DNA code), the RNA is released and the RNA polymerase ‘zips’ the double-stranded DNA back together. The mRNA requires some processing before it can leave the nucleus: introns (segments of DNA or RNA that do not code for proteins) are removed through splicing, the poly-A tail (a long sequence of ‘AAAAAA’) is added to the 3’ end (three prime end) to increase the stability of the molecule and a methylated ‘cap’ is added to the 5’ end (five prime end) to help initiate protein synthesis and protect the mRNA from being degraded. In the opposite of what you might think from the names, introns are removed and exons (segments of DNA or RNA containing coding information) stay in, because intron is derived from ‘intragenic’ (inside a gene) and exons are ‘expressed’.

Displays the process of the template strand of DNA being transcribed by RNA polymerase to produce the complementarity mRNA molecule. — **Figure 4.4 Transcription by RNA polymerase:** RNA polymerase transcribing the template strand of DNA to produce an mRNA molecule, complimentary to the template strand of DNA. Image: RNA polymerase transcribing template strand DNA by Kep17 from Wikimedia Commons used under CC BY-SA 4.0.

Perhaps confusingly, the ‘template’ strand of the DNA is called the non-coding strand, and the non-template strand is called the coding strand. This is because the DNA non-template strand will have the same sequence as the mRNA strand transcribed from this gene, except T’s will be U’s. In Figure 4.4 you can see that the mRNA molecule has the same sequence (apart from U/T) as the top coding strand, and these are both complementary to the template strand.

‘Transcription’ outside of biology means to convert audio or speech into written text in the same language. Therefore, as DNA and RNA use the same ‘language’ of base pairs, this process is called transcription.

Translation

Now that the ‘photocopy’ of the gene has been made, the mRNA can leave the nucleus through a little pore and find a ribosome. The ribosome reads the RNA in codons (three nucleotides in a row) and recruits transfer RNA (tRNA) with a complementary anti-codon to the RNA sequence. The tRNAs have a specific anti-codon at their ‘feet’, and carry a specific amino acid on their ‘head’. Within the ribosome, when the tRNA anti-codon matches up with the codon in the mRNA the amino acid is released from the head of the tRNA and joins the growing amino acid chain. This process continues until all the mRNA codons have been matched with tRNA anti-codons and a ‘stop’ codon is reached; then the amino acid chain is complete. This amino acid chain may then undergo further processing, fold into a certain shape or join with other amino acid subunits to make a functional protein. Note that some genes do not undergo translation, they are destined to remain as RNA such as ribosomal or transfer RNAs (see Figure 4.5 B, C).

As the RNA and amino acids are in different ‘languages’, this process is called translation.

Space filling models of the molecular structures of different RNAs. A) An extended messenger RNA is produced out of a polymerase protein from a long DNA template. B) Many short ribosomal RNAs and ribosomal proteins combine into a precise ribosome structure. C) A transfer RNA folds alone into a complex shape. — **Figure 4.5 Different examples of RNA:** A: Messenger RNA (mRNA) in pink is the best known type of RNA, produced as a copy of a stretch of DNA to take information out of the nucleus. B: Ribosomes contain multiple small ribosomal RNAs (rRNA) in red and multiple proteins in blue in order to translate mRNAs into their encoded proteins. C: Transfer RNA (tRNA) is a key intermediate in protein production, binding to each codon in the mRNA messenger and carrying the relevant amino acid to the ribosome. Images: Adapted from 2014: Tour of the Protein Data Bank from PDB-101 by Goodsell used under CC BY 4.0.

Transcription is copying (with some modification) DNA into RNA.
Translation is decoding RNA into protein.

The central dogma of molecular biology

The central dogma of molecular biology is an explanation of the flow of genetic information in a biological system. It was first stated by Francis Crick, who together with James Watson and Maurice Wilkins was awarded a Nobel prize in 1962 for explaining the helical structure of DNA. The dogma is often stated as:

DNA is transcribed to RNA which is translated to protein.

Alt text: An overview of the basic central dogma of molecular biochemistry. DNA is self-replicated by DNA polymerase. DNA is also transcribed into RNA by RNA polymerase. RNA is then translated into protein by ribosomes. — **Figure 4.6 Overview of the central dogma:**
Depicting the one-way flow of biological information from DNA to RNA to protein and identifying the key enzymes and machinery involved at each step. Image: ‘Central Dogma of Molecular Biochemistry with Enzymes’ by Daniel Horspool from Wikimedia Commons used under CC BY-SA 3.0.

Decoding: understanding the triplet code

As we’ve explained, the four little letters of the DNA alphabet encode all life. Ribosomes read the mRNA in groups of three nucleotides called codons, and the specific order of the nucleotides in the codon determine which amino acid joins the chain. The amino acid codon table (Figure 4.7) shows how to determine which amino acid is encoded for by the codon in the mRNA. The next video illustrates how to use this table.

How to read a codon chart

A table used to identify which amino acids genetic codons translate to. — **Figure 4.7 Amino acid codon table:** mRNA nucleotides are read in groups of three nucleotides, called codons. This table reveals which amino acids are encoded by the nucleotide codons. The order of the first, second and third base is important for determining the amino acid it encodes. Image: ‘Amino Acid Codon Table’ by Scott Henry Maxwell from Wikimedia Commons used under CC BY-SA 4.0.

There are four important codons highlighted in the table to keep in mind:

AUG is known as the start codon. Ribosomes know to look for it as it signals where to start translating the mRNA and also establishes the reading frame. AUG encodes for methionine, which is almost always the first amino acid in the chain but can also be inserted in other locations along the chain.
UAA, UAG and UGA are known as stop codons. They do not encode for an amino acid and instead at this position the amino acid strand will be released from the ribosome and translation is complete.

Codon redundancy

Studying the amino acid codon table (Figure 4.7) we can see that there is some redundancy – for example, UCU, UCC, UCA and UCG all encode for the amino acid serine, while CGU, CGC, CGA, CGG, AGA and AGG encode for arginine. This makes sense. There are only 20 amino acids, but 64 possible combinations of the four nucleotides (4 × 4 × 4). It’s remarkable that all life uses this same coding system to build proteins from the genes we inherit.

This redundancy is also useful if there is a mutation at the third base of a codon (e.g. a C mutated to an A in UCC) – the amino acid encoded would not be different; we would still get serine added to the amino acid chain. You might think a single amino acid change isn’t too much of an issue, but it could have detrimental effects. A single change could alter the shape of the final protein, alter the active site of an enzyme or change the hydrophobic/hydrophilic nature of the protein so that it no longer sits in the plasma membrane. This might not sound problematic, but if the enzyme has an important job or the plasma membrane protein pumps a molecule that your cell can’t live without then a small amino acid change could be fatal. Imagine if you had another accelerator pedal in your car instead of a brake. It looks like a trivial change (you have the same number of pedals), but it’s likely to be seriously detrimental. It is important to keep in mind that a lot of mutations in DNA code are not ‘solved’ by this codon redundancy though.

Limitations to information flow

As we’ve discussed, DNA is transcribed into RNA and RNA is translated into protein (figures 4.8 and 4.9). If you were given a string of DNA letters, you would be able to transcribe this yourself into RNA (knowing what bases/nucleotides complement each other) and then, using the amino acid coding table, decode the order of amino acids that would result if this mRNA was translated. Note that this flow of information only goes one way. Protein cannot be turned back into RNA or DNA.

Schematic illustrating the process of DNA to RNA to protein with an eight codon/amino acid sequence. — **Figure 4.8 DNA to RNA to protein showing the triplet code:** Double-stranded DNA is transcribed to single-stranded RNA. The RNA is read in codons (illustrated in different colours) which translates to amino acids to build a protein. This illustration is of the first few amino acids for the alpha subunit of haemoglobin (without the start codon). Image: ‘Genetic code’ by Madprime (Madeline Price Ball) from Wikimedia Commons.

Protein cannot revert to RNA or DNA; biological information only flows in one direction.

License

Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

Threshold Concepts in Biochemistry Copyright © 2023 by La Trobe University is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.