Resistance Gene Homologues: a Shortcut Strategy for Marker Assisted Breeding
David N. Kuhn, J. Steve Brown, Maria Heath and Raymond J. Schnell


Theobroma cacao is an economically important tropical tree crop whose seeds are used to produce cocoa and chocolate. Many commercial cultivars are susceptible to major diseases and pests. Thus, the potential for crop loss due to disease, or even a pandemic, is a real concern to cocoa producers. This situation is currently exemplified by the drastic reduction in cocoa production in Brazil due to infection with Witches’ Broom Disease (Crinipellis perniciosa). Breeding of tree crops for disease resistance requires large populations segregating for the desired character and an effective means of assaying disease resistance. If trees cannot be assayed until they produce pods, the assay could be delayed for up to five years. Thus, molecular markers are sought that could speed the selection and propagation of disease resistant cultivars. A normal molecular genetic approach to this problem would be to screen hundreds of genetic markers (RFLP, SSR, allozymes, AFLP) for segregation with the desired phenotype. We are proposing a shortcut to this approach by looking for resistance gene homologues (RGHs). RGHs have been identified for many plant species by using degenerate PCR primers designed to a highly conserved region of the nucleotide binding site of plant resistance genes (Shen et al., 1998; Aarts et al., 1998). The RGHs map to the same locations as known resistance genes in Arabidopsis and lettuce. Thus, RGHs may make better molecular markers for disease resistance in cocoa because resistance genes are often physically clustered in plants. By identifying putative members of the clusters, we may be able to rapidly identify useful genetic markers which will show a stronger linkage to the disease resistance phenotype.


Cocoa is an important agronomic crop but, because it is a tree crop, it has a number of problems with regard to improvement through traditional breeding methods:

  • Trees raised from seed may not produce fruit until they are three to five years old.
  • Many cultivars are self-incompatible, which makes it difficult to produce F2 populations.
  • Large areas are needed to plant thousands of trees.
  • Because of the length of time and large areas needed for such breeding experiments, maintenance of families can be affected by political and programmatic changes.
  • Even the hybrid cocoa grown today has been reproduced in seed gardens and its pedigree and genotype may be uncertain or unknown.
  • An obstacle to disease resistance breeding is the inability to study different races of the pathogens in commercial production areas due to limitations on movement of the pathogens into these areas.
  • Many of the existing F1 families have been produced to study segregation of agronomic characters, rather than disease resistance.
  • Finally, disease resistance in T. cacao may be multigenic, making discovery of individual genes responsible for resistance unlikely.

    Marker assisted breeding is meant to overcome some of these concerns by developing molecular markers associated with the desired phenotype, in this case, disease resistance. These markers can be used to preselect material for assay of disease resistance or for breeding before the phenotype is expressed. For example, by identifying a molecular marker associated with resistance to pod rot, one could identify material for breeding without having to wait until it had actually produced pods for assay of disease resistance. Such measures should speed up selection of agronomically useful clones for assay or propagation.

    A number of disease resistance genes have been isolated from plants by traditional genetic means. Sequencing of these genes revealed that most of the products of resistance genes are involved in pathogen recognition and signal transduction. Comparison of resistance gene sequences from a number of plant sources has identified a few highly conserved sequence motifs, such as the nucleotide binding site (NBS), usually toward the 5’ end of the gene, and a leucine rich repeat (LRR) toward the 3’ end of the gene. In addition, mapping experiments in Arabidopsis and lettuce have shown that resistance genes are clustered physically on the chromosomes, apparently without regard to the type of pathogen. Thus, a gene for resistance to nematodes could be adjacent to a gene for resistance to rust. Finally, by analysing EST libraries of Arabidopsis for the presence of sequences similar to the known NBS motif, it has been estimated that 1% of all Arabidopsis genes (approximately 200 of 21,000) are disease resistance genes.

    Based on these facts, it seems possible to identify portions of resistance gene homologues in T. cacao using primers to the conserved NBS site. If these RGHs are polymorphic, they should allow mapping of resistance gene clusters in T. cacao. Even if the RGHs are not the actual genes for resistance to a particular disease, they may be closely linked to those genes due to resistance gene clustering. Identification of RGHs associated with a specific disease resistance will depend on the availability of large enough families segregating for disease resistance.

    Materials and methods

    Degenerate primers were designed from highly conserved regions in the nucleotide binding site (NBS) domain of known plant disease resistance genes by Shen et al. (1998) and Aarts et al. (1998). Aarts et al. (1998) designed RG1 (GGIATGGGIGGIGTIGGIAARACNACN) (GMGGVGKTV) and RG2 (ICCIAGIACYTTIARIGCIARIGGIARWCC) (GLPLALKVLG). Shen et al. (1998) designed PLOOPGA (GAAUCGGNGTNGGNAPUA.GACAAC) (EFGVGKTT) and GLPL6 (GTCGACftANGCCAANGGCMTCC) (GLPLALS). The primer sets were similar, as can be seen from the amino acid translations, but differed in their degeneracy. RG1 was 1024-fold degenerate and RG2 was 8192-fold degenerate, assuming twofold degeneracy for each I. PLOOPGA was 64-fold degenerate and GLPL6 was 16-fold degenerate. We used both sets of primers to amplify genomic DNA from seven different cultivars: SCA6, EET400, P12, GS46, Amelonado, lOSR and IMC67xSCA12, a clonal selection from a hybrid family. Amplicons from all amplification reactions were mixed and cloned into the pCR-4Topo plasmid (Invitrogen). Plasmid DNA was isolated from each of the 650 colonies and, initially, 40 candidates were sequenced.
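The degeneracy figures quoted for RG1 and RG2 can be checked by multiplying, position by position, the number of bases each primer symbol stands for. The following is a minimal sketch, not the authors' procedure: it counts inosine (I) as twofold, the assumption stated in the text, and uses the standard IUPAC ambiguity codes for the other symbols. The names IUPAC_CHOICES and degeneracy are illustrative only.

```python
# Number of bases each primer symbol can stand for (IUPAC ambiguity codes).
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
    "I": 2,  # inosine: counted as twofold, following the assumption in the text
}


def degeneracy(primer: str) -> int:
    """Return the fold degeneracy of a primer as the product over its positions."""
    fold = 1
    for symbol in primer.upper():
        fold *= IUPAC_CHOICES[symbol]
    return fold


# Primer sequences as given for Aarts et al. (1998).
RG1 = "GGIATGGGIGGIGTIGGIAARACNACN"
RG2 = "ICCIAGIACYTTIARIGCIARIGGIARWCC"

print(degeneracy(RG1))  # 1024 (five I, one R, two N)
print(degeneracy(RG2))  # 8192 (eight I, one Y, three R, one W)
```

Counting I as fourfold instead would give much larger values, so the quoted figures depend on that assumption.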

    Results and discussion

    Nucleotide sequences were compared against the non-redundant GenBank databases using BLAST (Altschul et al. 1997) at the NIH-NCBI website. Of the first 40 sequences, 18 had a 350 bp insert that was identical to chloroplast DNA. All 650 colonies were therefore grown on LB agar in a matrix and screened by colony hybridisation with the chloroplast sequence as a probe. Colonies hybridising to the probe were not analysed further. The 350 remaining colonies were amplified individually using the M13 forward and reverse primers and categorised by size by agarose gel electrophoresis. Shen et al. (1998) and Aarts et al. (1998) had observed that inserts of approximately 530 nucleotides that contained a continuous open reading frame including the two primers were usually RGH sequences. Thus, we sequenced 64 amplicons of approximately that size and compared their nucleotide sequences to the GenBank database. Typical results for a clone are shown in

    The largest fragments that produced a match at the nucleotide level were 50-80 nucleotides long, or only 10-15% of the total length of the query sequence. E values, calculated to indicate the probability of producing a match due to chance, were not particularly small (1e-05) considering the shortness of the match sequence and the vast size of the GenBank database. Although the top matches were indeed for plant disease resistance genes, high scores were also found for sequences from humans. Nucleotide sequences of putative RGHs were translated into amino acid sequences and used as the query sequence for a blastp search of the nonredundant (nr) GenBank database. This search compares the predicted protein sequence of the query against the predicted protein sequence in all possible frames of all nucleotide sequences in the database. Typical results are shown in Table 2. The fragments matched along their entire length (173 amino acids). The E values reflect the very small likelihood that such matches occurred by chance. All 100 of the top matches were for plant disease resistance genes or RGHs from other plants, with the highest E value at 5e-11. Thus, we believe we have identified RGHs from T. cacao. BLAST analysis of the 64 amplicons identified 51 as RGHs. Alignment of the nucleotide sequences using Pileup (GCG) identified 10 sequence clusters, with only two of the 51 sequences identical. Five of the clusters contained between 6 and 11 sequences and five clusters contained between 1 and 3 sequences. Variation among sequences within a cluster may be due to differences between cultivars or differences between homologues within a cultivar. Such sequence differences are being analysed to design PCR primers that will allow us to distinguish between categories of RGH by length differences. Thus, we hope to analyse the RGHs in cocoa populations in a manner similar to SSR markers.
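The screening criterion described earlier (an insert of roughly 530 nucleotides with an uninterrupted reading frame) and the translation step that precedes a protein-level search can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: it uses the standard genetic code, it expects the insert sequence to be oriented on the coding strand, and the 50-nucleotide size tolerance and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate one forward frame; '*' marks a stop, 'X' an ambiguous codon."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def open_frames(seq: str) -> list[int]:
    """Return the forward frames (0, 1, 2) that contain no stop codon."""
    return [frame for frame in range(3) if "*" not in translate(seq, frame)]


def is_rgh_candidate(seq: str, target: int = 530, tolerance: int = 50) -> bool:
    """Size near the target (hypothetical tolerance) and at least one open frame."""
    return abs(len(seq) - target) <= tolerance and bool(open_frames(seq))


print(translate("ATGGGTTAA"))    # MG*
print(open_frames("ATGGGTTAA"))  # [1, 2]
```

A clone passing this filter would still need the database comparison described above, since an open reading frame of the right size does not by itself establish similarity to resistance genes.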

    The successful development of the RGHs as markers for disease resistance gene clusters requires families of trees that can be analysed. At the USDA National Germplasm Repository (NGR) facility in Mayaguez, Puerto Rico, four families of trees planted in 1989 are available for mapping of RGHs. In each family, there are 10 tree genotypes per replication, eight replications per location and two locations, giving a total of 160 trees per family. The four crosses are UF658 x P7, IMC67 x UF613, EET400 x SCA12, and SCA6 x EET62. SCA12 and SCA6 are both resistant to witches’ broom disease, while EET400 and EET62 are susceptible. Although disease resistance data have not been collected for the families because they are planted in the NGR, other agronomic data have been collected. The parents of these families will be analysed for the presence of specific RGH categories, and inheritance of RGH genes will be mapped to linkage groups by analysis of co-segregation with SSR markers.


    Aarts M.G.M., B.T.L. Hekkert, E.B. Holub, J.L. Beynon, W.J. Stiekema and A. Pereira. 1998. Identification of R-Gene Homologous DNA Fragments Genetically Linked to Disease Resistance Loci in Arabidopsis thaliana. Molecular Plant-Microbe Interactions 11:251-258.
    Altschul S.F., T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller and D.J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25:3389-3402.
    Shen K.A., B.C. Meyers, M.N. Islam-Faridi, D.B. Chin, D.M. Stelly and R.W. Michelmore. 1998. Resistance Gene Candidates Identified by PCR with Degenerate Oligonucleotide Primers Map to Clusters of Resistance Genes in Lettuce. Molecular Plant-Microbe Interactions 11:815-823.