
Some Definitions


Genome: One complete set of genes in an organism (a haploid set).

Except for occasional unrepaired damage to its DNA (= mutations), the genome is fixed.


Transcriptome: All the messenger RNA (mRNA) molecules transcribed from the genome.

Varies with the differentiated state of the cell and the activity of the transcription factors that turn gene transcription on (and off).


Metabolome: Two popular definitions:


All the metabolic machinery present in a cell at a given time.

Varies with the differentiated state of the cell and its current activities.

The Proteome

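The proteome is the protein complement of the genome. It is quite a bit more complicated than the genome because a single gene can give rise to a number of different proteins through processes such as alternative splicing of its pre-mRNA and chemical modification of the protein after translation.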

While we humans may turn out to have only 25 to 30 thousand genes, we probably make at least 10 times that number of different proteins. More than 50% of our genes produce pre-mRNAs that are alternatively spliced.

The study of proteomics is important because proteins are responsible for both the structure and the functions of all living things. Genes are simply the instructions for making proteins. It is proteins that make life.

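The set of proteins within a cell varies from one differentiated cell type to another and from moment to moment, depending on the current activities of the cell.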

How Is the Proteome Studied?

  1. Isolate a homogeneous population of cells (e.g., yeast cells that have just been switched from glucose to galactose as their energy source).
  2. Extract the contents of the cells and separate the mix of proteins from other components.
  3. Separate the proteins in the mix by two-dimensional (2D) gel electrophoresis. This separates the proteins
    • in one dimension by their electrical charge;
    • in the second dimension by their size.
    (The procedure is analogous to that used in paper chromatography.)
  4. Stain the gel to visualize the various spots of protein.
  5. Punch out a spot.
  6. Add a protease (e.g., trypsin) to digest the protein in that spot into a mix of peptides.
  7. Run the mix through a mass spectrometer, which will separate the peptides into sharply defined peaks, one for each peptide mass.
  8. Compare the resulting peptide masses with a database of all known proteins (each digested, on the computer, with the same enzyme) to see if you can find a match.
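
The database comparison in step 8 can be sketched as follows. This is only a minimal illustration, assuming that trypsin cuts after every lysine (K) or arginine (R) not followed by proline (P) and that the spectrometer reports neutral monoisotopic peptide masses. The two-entry `database`, its sequences, and the function names are made up for the example; real search programs score matches far more carefully.

```python
# Monoisotopic residue masses (in daltons) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def trypsin_digest(protein):
    """Cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, protein, tolerance=0.01):
    """Number of observed masses explained by the protein's tryptic peptides."""
    theoretical = [peptide_mass(p) for p in trypsin_digest(protein)]
    return sum(
        any(abs(m - t) <= tolerance for t in theoretical) for m in observed
    )


def best_match(observed, database, tolerance=0.01):
    """Name of the database protein that explains the most peaks, or None."""
    scores = {
        name: count_matches(observed, seq, tolerance)
        for name, seq in database.items()
    }
    name = max(scores, key=scores.get)
    return name if scores[name] > 0 else None


# Hypothetical database and a simulated spectrum of "protein_A".
database = {"protein_A": "AAKGGRPLK", "protein_B": "MSTKWWYR"}
observed = [peptide_mass(p) for p in trypsin_digest(database["protein_A"])]
print(best_match(observed, database))
```

If no database protein explains any peak, `best_match` returns `None`, which corresponds to the "no match" case discussed next.
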
What if there is no match; that is, you have stumbled on an unknown protein?
  1. Isolate individual peptides from your mix and run one through a mass spectrometer that has been modified to
    • first randomly break the peptide into a mix of fragments containing one, two, etc. amino acids
    • then measure the mass of each fragment.
  2. Enter the resulting data into a database that matches the mass data with known pairs, triplets, etc. of amino acids.
  3. With the aid of overlaps, assemble the fragments to reveal the entire sequence of the peptide.
  4. "Back-translate" the amino acid sequence to determine what sequence of nucleotides in DNA could encode that peptide.
  5. Search the genome database for an open reading frame (ORF) that contains that sequence.
  6. Translate that ORF to get the entire amino acid sequence of your protein.
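
Steps 4 to 6 can be sketched in a few lines. This is a simplified illustration, not the way any particular genome browser works: it assumes the standard genetic code, an uppercase DNA string containing only A, C, G and T, and a gene on the forward strand without introns. The function names and the toy `genome` string are invented for the example.

```python
import re
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Reverse lookup: amino acid -> every codon that encodes it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def back_translate(peptide):
    """Regular expression matching every DNA sequence that could encode the peptide."""
    return "".join("(?:" + "|".join(CODONS_FOR[aa]) + ")" for aa in peptide)


def translate(dna):
    """Translate codon by codon until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_protein(genome, peptide):
    """Find an ORF containing the peptide and return its full translation."""
    # The lookahead lets overlapping candidate sites be examined too.
    pattern = re.compile("(?=(" + back_translate(peptide) + "))")
    for match in pattern.finditer(genome):
        start = None
        # Walk upstream in the same reading frame until a stop codon,
        # remembering the most upstream ATG seen on the way.
        for i in range(match.start(), -1, -3):
            codon = genome[i:i + 3]
            if CODON_TABLE[codon] == "*":
                break
            if codon == "ATG":
                start = i
        if start is not None:
            return translate(genome[start:])
    return None


# Toy example: the ORF ATG GCT AAA GGT TGG CGT TAA sits after a stop codon.
genome = "CCTAG" + "ATGGCTAAAGGTTGGCGTTAA" + "CC"
print(find_protein(genome, "GW"))
```

Because most amino acids are encoded by several codons, back-translation yields a family of possible DNA sequences rather than a single one, which is why the sketch searches with a pattern instead of a fixed string.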

What Does This New Protein Do?

Some proteins act alone, and the function of many of these has been known for years. But probably the majority of the proteins in a cell act in concert with others.


How to Find Proteins that Function Together

1. Affinity Chromatography

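For this procedure, the target protein is fixed to the matrix of a chromatography column, a mixture of cell proteins is passed through the column, and any proteins that bind to the target are retained and can later be eluted.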

2. The Yeast Two-Hybrid System

The budding yeast, Saccharomyces cerevisiae, provides an excellent tool for discovering protein partners.

The two-hybrid system also takes advantage of the fact that transcription factors (proteins) usually contain two separable domains: a DNA-binding domain and an activation domain.

The Method

  1. Using recombinant DNA methods, create a plasmid containing
    • the DNA encoding the DNA-binding domain of a transcription factor needed to turn on expression of a "reporter gene" (such as the lacZ gene, which encodes the enzyme β-galactosidase), coupled to
    • the DNA encoding the "target" protein; that is, the protein whose possible partners you wish to identify.
    Insert the plasmid into living haploid yeast of one mating type (e.g., a).
  2. Using the same methods, create many different plasmids, each containing
    • the DNA encoding the activation domain of the transcription factor, coupled to
    • the DNA encoding a possible partner ("bait") protein. (With the help of automated equipment, you can even make a plasmid for each of the roughly 6,000 genes in the yeast genome.)
    Insert each of these plasmids into α yeast cells and grow them as separate clones.
  3. Mate each α clone with the target clone (a).
  4. If the fusion protein produced by the transcription and translation of a "bait"-containing plasmid can bind to the fusion protein containing the target, the two domains of the transcription factor can interact to turn on expression of the reporter gene (lacZ in our case).
  5. Grown on an indicator substrate, these colonies will turn blue.
  6. The DNA in these colonies can then be isolated and sequenced.
  7. The result: identification of the proteins that can associate with the target protein.
Using the two-hybrid method, it has been possible to identify many sets of interacting proteins in yeast and other organisms. (The 23 September 2005 issue of Cell reports the identification of over 3000 interactions among pairs of human proteins.)
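
The logic of the screen can be modeled in a few lines. This is a deliberately simplified sketch: the protein names `T`, `P1`, `P2`, `P3` and the `interactions` set are hypothetical placeholders, and the model assumes the reporter is switched on if and only if the two fusion proteins bind each other.

```python
def reporter_on(target, candidate, interactions):
    """lacZ is expressed only when the two fusion proteins bind each other,
    bringing the DNA-binding domain and the activation domain together."""
    return frozenset((target, candidate)) in interactions


def two_hybrid_screen(target, library, interactions):
    """Return the candidates whose mated colonies would turn blue."""
    return [c for c in library if reporter_on(target, c, interactions)]


# Hypothetical set of true pairwise interactions (unknown to the experimenter).
interactions = {frozenset(("T", "P2")), frozenset(("P1", "P3"))}

# One α clone per candidate, each mated with the a clone carrying the target.
library = ["P1", "P2", "P3"]
print(two_hybrid_screen("T", library, interactions))
```

Only the clone carrying a true partner of the target activates the reporter; candidates that merely bind one another (here `P1` and `P3`) are not detected, because only one of the two transcription-factor domains would be present at the reporter gene.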

3. Phage Display

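This method exploits the ability of bacteriophages to express a piece of foreign protein as part of one of their coat proteins, so that it is displayed on the surface of the virion, together with affinity chromatography.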

The method:

  1. Transform bacteriophages with a
    • random mix of DNA from the organism you are interested in coupled to
    • the DNA encoding one of the viral coat proteins.
  2. Infect E. coli with these phages.
  3. As the viruses replicate, they will not only propagate the recombinant gene but also express it as a coat protein.
  4. Both will be incorporated into new virions.
  5. Harvest the mix of viruses.
  6. Pass the mixture through an affinity chromatography column to which your "target" protein has been fixed.
  7. Those viruses that display a piece of foreign protein (peptide) that can bind to the target will stick to it.
  8. Elute the bound phage with a buffer.
  9. Repeat steps 6–8 to further enrich your binders.
  10. Infect E. coli.
  11. Grow separate colonies (clones).
  12. Sequence the coat protein gene to find the sequence of the foreign DNA inserted in it.
  13. Using the codon table, determine the amino acid sequence of the peptide.
  14. Search databases for a protein containing this sequence.
  15. Result: another protein that associates with your target protein.
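
Why repeating steps 6–8 helps can be seen with a little arithmetic. The sketch below is an illustration under stated assumptions, not a description of any real experiment: it assumes that in each round a binding phage is retained on the column with probability `p_binder`, a non-binding phage sticks nonspecifically with the smaller probability `p_background`, and any amplification between rounds preserves the proportions. All numbers are invented.

```python
def enrich(fraction, p_binder, p_background, rounds):
    """Fraction of binders in the phage pool before and after each round."""
    history = [fraction]
    for _ in range(rounds):
        kept_binders = fraction * p_binder
        kept_others = (1.0 - fraction) * p_background
        fraction = kept_binders / (kept_binders + kept_others)
        history.append(fraction)
    return history


# Hypothetical values: one binder per million phages at the start.
for round_number, f in enumerate(enrich(1e-6, 0.5, 0.001, 4)):
    print(round_number, f)
```

Under these assumptions each round multiplies the odds of binders to non-binders by `p_binder / p_background`, so even a very rare binder can come to dominate the pool after a few rounds.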

Phage display is also used to make monoclonal antibodies (without the need for mice).

4. Protein Chips

Protein chips work on much the same principle as DNA chips.

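Although simple in principle, protein chips are far more difficult to work with than DNA chips because proteins vary widely in their chemical properties and bind their partners through many different kinds of interactions.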

Fragments of DNA, in contrast, vary only in their nucleotide sequence and all bind their partners by simple Watson-Crick base pairing.

Three-Dimensional (3D) Structure

The clearest picture of how different proteins interact with one another to form functional complexes will come from determining the 3D structure of the complex. There are two methods: X-ray crystallography and NMR spectroscopy.

X-ray crystallography requires that you be able to crystallize the protein. This is often a difficult task and especially difficult for complexes of two or more proteins.

Here are some links to 3D images of proteins.

Note that although in both cases the proteins are binding to DNA, they are also binding to each other (as homodimers).

NMR spectroscopy has been especially useful in producing 3D images of proteins that cannot be crystallized.


9 October 2005