(American Journal of Botany. 2002;89:211-219.)
© 2002 Botanical Society of America, Inc.


Genetics and Molecular Biology

Evolutionary implications of substitution patterns in prolamin genes of Oryza glaberrima (African rice, Poaceae) and related species¹

Khidir W. Hilu and Lioudmila V. Sharova

Department of Biology, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061 USA

Received for publication March 23, 2001. Accepted for publication August 16, 2001.


    ABSTRACT
Patterns of sequence variation of nuclear genes encoding 10-kDa and 16-kDa prolamin seed storage proteins were examined in Oryza glaberrima (African rice, Poaceae) and O. barthii and compared to available sequences for the genus to assess potential application of these gene families in evolutionary studies. Sequence variation among species in 10-kDa genes was very low. In contrast, the 16-kDa genes have undergone rapid evolution, displaying a larger number of length and point mutations that in some cases result in frame shift or produce truncated protein or pseudogenes. The proportion of nonsynonymous substitution is high in both genes. Although nonsynonymous mutations did not alter the overall profile of the protein, pronounced shifts in proportions of some amino acids were evident and could have systematic application. The data provide support for a proposed direct evolution of the Asian (O. sativa) and African rice from O. rufipogon and O. barthii, respectively. Patterns of amino acid frequencies of the 10-kDa genes show the distinctness of O. rufipogon and O. longistaminata from the other species. The study underscores the potential application of the prolamin genes as markers from the nuclear genome for evolutionary studies in grasses at different taxonomic levels.

Key Words: molecular evolution • Oryza • Oryza barthii • Oryza glaberrima • Poaceae • prolamin genes • protein


    INTRODUCTION
Prolamin is a class of seed storage protein encoded by multigene families and is considered unique to the Poaceae (grass family). It constitutes the major component of storage proteins in the grain of most grasses (Shewry, Napier, and Tatham, 1995). The relative molecular masses of major prolamin polypeptides in grasses vary from 10 to ~100 kDa (kilodaltons). Hilu and Esen (1988) and Esen and Hilu (1989) have demonstrated that prolamin size and polypeptide diversity fall into three size and immunological classes: 10–16, 20–35, and 35–100 kDa. These prolamin classes correspond, respectively, to three major lineages in the Poaceae: oryzoids-bambusoids, panicoids-arundinoids-chloridoids (PACC group), and pooids (Hilu, 2000). The oryzoid and bambusoid grasses are considered among the basal groups in most of the recently constructed grass phylogenies (Barker, Linder, and Harley, 1995; Clark, Zhang, and Wendel, 1995; Hilu, Alice, and Liang, 1999). Their species contain three multigene families that encode the S-rich 10-kDa, the 13-kDa, and 16-kDa prolamins (Masumura et al., 1989, 1990; Shewry, Napier, and Tatham, 1995; Hilu and Sharova, 1998).

The molecular diversity and patterns of evolution of the 10–16-kDa gene families are not well studied in the oryzoid-bambusoid grasses in spite of their possible ancestral status due to the basal position of the species in which they occur. Published work on the structure of the 13- and 16-kDa prolamin genes is limited to the Asian rice species O. sativa L. and the 10-kDa genes to O. sativa, its proposed ancestor O. rufipogon Griff., and related O. longistaminata Chev. et Roehr. The 10–16-kDa genes are simpler in structure than most of the other prolamin gene families and encode low molecular mass (LMM) prolamin. Due to their small size and ancestral status, accumulation of sequence data on genes encoding the LMM prolamins could provide valuable information on the evolution of other prolamin multigene families, the phylogeny of the groups possessing them, and the controversial issue of early evolution in the grass family.

Oryza includes ~22 species distributed in Asia, Australia, Africa, and the Americas (Vaughan, 1994). Two of these species are the domesticated Asian rice (O. sativa) and African rice (O. glaberrima). The former is globally the second most important crop. Both rice species are diploids (x = 12) and are considered to possess the AA genome (Tsunoda and Takahashi, 1984). It has been proposed that the Asian rice evolved from O. rufipogon and the African rice from O. barthii, two wild species that are widely distributed in Asia and Africa, respectively (Vaughan, 1994). Oryza sativa and O. rufipogon represent the sativa complex, whereas O. glaberrima and O. barthii represent the glaberrima complex (Cordess, Second, and Delsney, 1990). The evolutionary relationship between the two groups has been disputed. Nayar (1973) proposed a recent (10th century AD) and direct origin of O. glaberrima from O. sativa plants introduced to Africa. Chang (1976) indicated a common origin for the Asian and African rice groups from different populations of the O. perennis complex (to which O. longistaminata belongs) in Asia and Africa. In contrast, Cordess, Second, and Delsney (1990) proposed a distinct evolution for the two complexes, where the Asian complex evolved in Africa from a common ancestor with the African species O. longistaminata, followed by dispersal to Asia, while the African glaberrima complex had an independent evolution sharing a "more ancient ancestor" with the other complex. The data generated in this study will be used to shed some light on the evolution of the sativa and glaberrima groups.

This study explores the degree and patterns of substitution in the 10-kDa and 16-kDa prolamin genes in closely related species as a first step in assessing their utility in evolutionary studies and in providing an insight into the evolution of these multigene families. Consequently, domesticated African rice Oryza glaberrima Steud. and related wild species O. barthii A. Chev. were examined and contrasted with available sequences for the genus.


    MATERIALS AND METHODS
Material
Seeds of O. glaberrima and O. barthii from two USDA accessions (PI369447 and PI236393, respectively) were used in this study. The seeds were grown in pots in the greenhouse, and leaves from individual 5-wk-old seedlings were harvested and stored at –80°C for DNA isolation.

DNA isolation and PCR amplification
Whole genomic DNA was isolated from fresh leaves following Hilu (1994) and used as template in the polymerase chain reaction (PCR). The design of the primers used for the amplification of the 10-kDa prolamin genes was based on the cDNA sequence of Barbier and Ishihama (1990) to complement two relatively conserved sectors flanking most of the coding region. The primers match the 5' region leader sequence and a unique sequence at the 3' end. Five primers were designed for the amplification of the 16-kDa prolamin genes (Fig. 1) because our alignment of available sequences shows that the genes fall into two groups that differ by an internal 15-bp insertion and a number of mutations at the 3' end. The sequences, relative positions, and sources of the primers are noted in Fig. 1. Primers OSa and OSb amplify the whole coding region of the "rice major" prolamin genes (Kim and Okita, 1988a; GenBank number X60979) as they complement sequences close to the 5' and 3' ends. To amplify the whole coding region of the 16-kDa long form (the form containing the 15-bp insertion), the OSa primer designed for the 5' end of rice major was used because of the conserved nature of the region in both forms. However, a new primer, OSc (Fig. 1), was designed for the 3' end based on the RICPROL17A clone (Kim and Okita, 1988a; GenBank number AA751313). For specific amplification of the long form, two internal primers, OSdF and OSdR (Fig. 1), were designed to complement the 15-bp insertion at positions 232–246 unique to this form of the 16-kDa genes. The latter two primers were designed to amplify in different directions when used in combination with the primers designed for the 5' and 3' regions (primers OSa and OSc). BamHI and EcoRI sites were added to all primers to improve efficiency of DNA cloning. DNA synthesis was carried out using 2.5 units of Taq DNA polymerase, 0.2 µg total DNA, 10 pmol/L of each primer, and 2.5 mmol/L MgCl2 in a final volume of 50 µL. Thirty cycles of amplification were carried out in a PTC-100 thermal cycler (MJ Research, Watertown, Massachusetts, USA) under the conditions of 1 min denaturation at 95°C, 1 min annealing at 45°C, and 3 min DNA synthesis at 72°C.



Fig. 1. A diagram illustrating the open reading frame (ORF) of the gene encoding the 16-kDa prolamin in Oryza and the locations of primers used in its amplification. Primers OSa (TCGTCTTTGCTCTCCTTGCTA) and OSb (AGGTACTATGGTGCACCCAGT) were used to amplify the rice major form, whereas primers OSa and OSc (GTGTCTGGTACTGAATTGTAAGGATCCACGT) were used for the long form (GenBank RICPROL17A). Primers OSdF (ACGTGAATTCATGCAGCAACAGTGTTGC) and OSdR (ATGCAGCAACAGTGTTGCGGATCCACGT) were designed to complement a 15-bp insertion specific to the long form. Primers OSa, OSc, and OSdF-OSdR complement nt 45–65, 488–508, and 257–254 of the GenBank RICPROL17A sequence. A restriction site and hanging bases were added to some of the primers. Primer OSb was designed after Kim and Okita (1988a).
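The statement that restriction sites were added to the primers can be checked directly against the sequences printed in the Fig. 1 caption. The following is a minimal sketch, not the authors' procedure; the function names are hypothetical, and only the standard recognition sequences of BamHI (GGATCC) and EcoRI (GAATTC) are assumed. As printed, the scan finds a site in OSc, OSdF, and OSdR but not in OSa or OSb, which agrees with the caption's wording that a site was added to some of the primers.

```python
# Primer sequences copied from the Fig. 1 caption (written 5'->3').
PRIMERS = {
    "OSa": "TCGTCTTTGCTCTCCTTGCTA",
    "OSb": "AGGTACTATGGTGCACCCAGT",
    "OSc": "GTGTCTGGTACTGAATTGTAAGGATCCACGT",
    "OSdF": "ACGTGAATTCATGCAGCAACAGTGTTGC",
    "OSdR": "ATGCAGCAACAGTGTTGCGGATCCACGT",
}

# Recognition sequences of the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def primer_site_report(primers=PRIMERS):
    """Scan every primer and return {primer name: sites found}."""
    return {name: find_sites(seq) for name, seq in primers.items()}


if __name__ == "__main__":
    for name, hits in primer_site_report().items():
        print(name, hits if hits else "no BamHI/EcoRI site")
```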

 
Southern blot analysis
Southern blot was used to positively identify the amplification products prior to sequencing and to estimate the number of gene copies. Polymerase chain reaction products were run on 0.8% agarose gel, and the DNA was transferred to Zeta-probe membrane via the alkaline method (Reed and Mann, 1985). A 30-bp oligonucleotide that covers the insertion at positions 232–246 characteristic of the 16-kDa long form was labeled with [γ-32P]ATP following the procedure recommended for T4 kinase (Promega, Madison, Wisconsin, USA) and used as a hybridization probe. For the estimation of gene copy number, EcoRI and HindIII genomic DNA digests for the two species were used in a Southern blot following the procedure cited above. The 10-kDa and 16-kDa rice prolamin cDNA inserts were labeled with 32P using the Random Primers DNA Labeling System (Promega, Madison, Wisconsin, USA) and used as hybridization probes. The membranes were hybridized with the radioactive probe overnight at 65°C in 3x SSC (0.15 mol/L NaCl, 0.015 mol/L sodium citrate), 20 mmol/L phosphate buffer pH 7.0, 7% SDS, 10x Denhardt's solution, and 100 µg/mL salmon sperm DNA. Membranes were washed twice in 2x SSC for 15 min each and once each in 1x and 0.5x SSC for 10 min, and were then exposed to Kodak XAR-5 film.

Cloning, nucleotide sequencing, and data analysis
Prolamin-positive PCR products (verified by the Southern blot assay) were isolated by the low-melt agarose gel purification and phenol-chloroform extraction method of Appels and Lagudah (unpublished data) as described in Hilu and Stalker (1995). The purified inserts were cloned into the BamHI/EcoRI site of the pUC18R vector for further mapping and sequencing. Clones with inserts were identified using the blue-white colony screening method (Sambrook, Fritsch, and Maniatis, 1989). Positive clones were sequenced on an ABI Prism™ 377 Automated DNA Sequencer using the Taq polymerase DyeDeoxy™ terminator cycle sequencing method (PE Biosystems, Foster City, California) with pUC primers. Sequences were aligned, statistical data were generated, and amino acid composition was deduced using various options of the DNAStar computer program (DNASTAR, Madison, Wisconsin, USA). Phylogenetic analyses of the 10-kDa sequences, rooted with the Phyllostachys aurea sequence (Hilu and Sharova, 1998), and of the 16-kDa sequences of the glaberrima group, rooted with rice major 16, were computed with Fitch parsimony (MP) as implemented in PAUP*4.0b5 (Swofford, 2001) using heuristic searches consisting of 1000 replicates of random stepwise addition of taxa with MULPARS on and tree-bisection-reconnection (TBR) branch swapping. To take into consideration the different patterns of evolution in the 16-kDa genes, the data were also analyzed with Maximum Likelihood (ML) using the HKY85 model (Hasegawa, Kishino, and Yano, 1985) and assuming empirical base frequencies, a transition/transversion ratio of 1.26 (estimated from MP trees), a proportion of invariable sites of 0.45, and equal rates for all sites. Heuristic search settings were similar to those of the MP analysis.
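The transition/transversion ratio supplied to the HKY85 model can be illustrated with a simple pairwise count. The sketch below is only an illustration under stated assumptions: the value of 1.26 reported here was estimated from the MP trees in PAUP*, not by a pairwise comparison, and the function name count_ti_tv and the toy sequences are hypothetical. A transition is a change within purines (A, G) or within pyrimidines (C, T); any other change is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ti_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences.

    Columns containing a gap or an ambiguous base in either sequence are skipped.
    Returns a (transitions, transversions) tuple.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, not prolamin data).
    ti, tv = count_ti_tv("ACGTAC", "GCTTAA")
    print("transitions:", ti, "transversions:", tv)
```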


    RESULTS
Different forms of the genes encoding the 10- and 16-kDa prolamin were found, displaying sequence variation both at the nucleotide and amino acid levels (Figs. 2–4).



Fig. 2. Alignment of the 10-kDa gene sequences obtained for O. glaberrima and O. barthii as well as GenBank and published sequences for three other Oryza species. Abbreviations of clone names: rice10 refers to GenBank clone OS10KDPRO, O. rufipog to O. rufipogon GenBank clone ORPROLA1, O. longistam to O. longistaminata GenBank clone OLPROLA1, Og- to O. glaberrima, and Ob- to O. barthii

 
The 10-kDa prolamin genes
The two primers for the 10-kDa prolamin gene amplified 400-bp DNA fragments. The PCR products were cloned, and a total of seven clones were sequenced for O. glaberrima and O. barthii. Sequences of 397 bp and 287 bp were obtained (Fig. 2). Alignment of the sequences of these two lengths indicates that an internal BamHI site is present in the PCR products and that the size difference is due to incomplete digestion (Fig. 2). The PCR products amplified by the 10-kDa primers were all digested with BamHI and EcoRI prior to cloning to generate sticky ends for ligation because we designed the primers with restriction sites for these two endonucleases. To further confirm this point, the GenBank sequences of 10-kDa genes were checked for the presence of a BamHI restriction site. We found that this site exists in all the sequences of Oryza (O. sativa, O. rufipogon, and O. longistaminata). The BamHI site is also found in the bambusoid Phyllostachys aurea (Hilu and Sharova, 1998).
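The effect of an internal BamHI site on insert length can be sketched with a toy digestion. This is an illustration only: the sequence below is hypothetical (not a prolamin sequence), the function names are invented for the example, and the only outside fact assumed is that BamHI cuts after the first G of GGATCC. Complete digestion cuts the internal site and yields shorter fragments, whereas incomplete digestion leaves the full-length insert intact, which is the explanation offered above for the two sequence lengths.

```python
def cut_positions(seq, site="GGATCC", offset=1):
    """Return 0-based cut coordinates for each occurrence of `site`.

    The default offset of 1 models BamHI, which cuts G^GATCC.
    """
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, site="GGATCC", offset=1):
    """Return the fragments produced when every site is cut (complete digestion)."""
    fragments = []
    previous = 0
    for cut in cut_positions(seq, site, offset):
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


# Toy insert (hypothetical) carrying a single internal BamHI site.
toy_insert = "ATGCA" * 4 + "GGATCC" + "TTGCA" * 6

complete = digest(toy_insert)   # internal site cut: two shorter fragments
incomplete = [toy_insert]       # internal site left uncut: full-length insert

if __name__ == "__main__":
    print("complete digestion:", [len(f) for f in complete])
    print("incomplete digestion:", [len(f) for f in incomplete])
```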

Two long and one short 10-kDa sequence were generated for O. barthii (Fig. 2). The two long sequences (Ob10-8-10-4 and Ob10-8-10-9) were identical, while the short sequence differed from the long sequences in the presence of three mutations. The DNA sequences were translated into 132 amino acids for the long sequences and 93 for the short one. The amino acid composition of these genes clearly reflects their sulfur-rich structure (Fig. 4). The three mutations in the short sequence of O. barthii did not interrupt the reading frame, but two of them were nonsynonymous, resulting in mutations at the amino acid level from glutamine to arginine and leucine to histidine. The four clones sequenced for O. glaberrima represent two long and two short sequences (Fig. 2). Three of the sequences of O. glaberrima (one long and two short) were 99–100% similar to each other at the nucleotide and amino acid levels. One of these short sequences was identical to the long form, while the other differed from them by two nucleotide substitutions that resulted in synonymous mutations at the amino acid levels. These three sequences of O. glaberrima were quite similar (98.6–99.7%) to the 10-kDa genes sequenced from O. barthii. The fourth (long) sequence (clone Og10-8-10-8 of O. glaberrima), however, was relatively divergent, showing 92.5–93.5% similarity to the other three sequences of O. glaberrima. This distinct sequence also had a unique TGA insertion that corresponded to one we found in the GenBank sequence of O. longistaminata (Fig. 2). The deduced amino acid sequence of this distinct form was 90% similar to the long sequence and its identical form of the short sequence, only 84% similar to the second short sequence, and 84–89% similar to the 10-kDa prolamins of O. barthii. Phylogenetic analysis of the 10-kDa sequences resulted in an almost complete polytomy due to the extremely low sequence variation, and, thus, it is not discussed further.
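The distinction between synonymous and nonsynonymous substitutions can be made concrete with the standard genetic code. The sketch below is a hedged illustration: the codon pairs are hypothetical examples chosen to reproduce the kinds of change described above (glutamine to arginine, leucine to histidine), since the actual codons involved are not stated in the text, and the function name is invented for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_change(codon_a, codon_b):
    """Classify a codon difference and return (kind, amino acid a, amino acid b)."""
    codon_a, codon_b = codon_a.upper(), codon_b.upper()
    aa_a, aa_b = CODON_TABLE[codon_a], CODON_TABLE[codon_b]
    if codon_a == codon_b:
        return "identical", aa_a, aa_b
    kind = "synonymous" if aa_a == aa_b else "nonsynonymous"
    return kind, aa_a, aa_b


if __name__ == "__main__":
    # Hypothetical codon pairs, for illustration only.
    for pair in [("CAG", "CGG"), ("CTC", "CAC"), ("CTT", "CTC")]:
        print(pair, classify_codon_change(*pair))
```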



Fig. 4. Amino acid composition of (A) 10-kDa and (B) 16-kDa prolamin of O. barthii (Ob) and O. glaberrima (Og) deduced from the sequences generated in this study. Deduced 10-kDa prolamins for Asian rice (O. sativa; rice major and rice sequences), O. rufipogon, and O. longistaminata (O. longistam) were generated from GenBank and published sequences and used in the interspecific comparison

 
The 16-kDa prolamin genes
Sequences were obtained from five O. barthii and six O. glaberrima clones (Fig. 3). Except for two clones, DNA sequences of 411 bp were obtained. The two exceptions were O. barthii clone Ob16-9-4-4 that was 387 bp long (Fig. 3). The deduced proteins for all clones except for the latter were 137 amino acids long. The 16-kDa sequences were quite variable in both species. The O. barthii clones were 88–95% similar at the DNA level and 67–84% at the amino acid level. Likewise, the six clones of O. glaberrima were 88–96% similar at the DNA level and slightly less diverse than the O. barthii clones (79–91%) at the amino acid level. The similarities between the 16-kDa sequences for O. barthii and O. glaberrima ranged from 79 to 96% at the DNA level, but exhibited lower similarities (66–93%) at the amino acid level. Seven of the 11 clones from both species gave sequences with a deduced open reading frame (ORF) for the whole sequence. These seven clones appear to represent functional genes. The other clones had very specific point mutations at positions 239 and 311 (Fig. 3) that resulted in stop codons (C→T produces TAG stop codons in both positions). The MP analysis of the 16-kDa sequences resulted in six most parsimonious trees that are 109 steps long and have consistency and retention indices of 0.91 and 0.89, respectively. Of the 486 aligned positions, 89 (18%) were variable, and 48 (54%) of those are parsimony informative. The consensus tree is shown in Fig. 5. The ML analysis resulted in a single tree (not shown) that is similar in topology to one of the MP trees and differs from the consensus in resolving all the O. glaberrima clones in a single lineage.
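The counts of variable and parsimony-informative sites reported above can be illustrated with a short sketch. It is not the PAUP* implementation: the function names and the toy alignment are hypothetical, and gaps and ambiguous bases are simply ignored here, which is an assumption. A site is counted as variable if it shows at least two different bases, and as parsimony informative if at least two bases each occur in two or more sequences. The same block reproduces the arithmetic behind the reported proportions (89 of 486 sites, and 48 of those 89).

```python
from collections import Counter


def site_summary(alignment):
    """Return (variable sites, parsimony-informative sites) for aligned sequences.

    Gaps and ambiguous bases are ignored (an assumption of this sketch).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    variable = informative = 0
    for column in zip(*(seq.upper() for seq in alignment)):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each found in two or more sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    # Toy alignment (hypothetical, not prolamin data).
    toy = ["ACGTAC", "ACGTTC", "ATGTTC", "ATGAAC"]
    print(site_summary(toy))
    # Proportions for the counts reported in the text.
    print(percent(89, 486), percent(48, 89))
```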



Fig. 3. Alignment of the 16-kDa gene sequences obtained for O. glaberrima and O. barthii as well as GenBank and published sequences for Oryza species. Abbreviations of clone names: rice16 refers to GenBank clone RICPROL17A, rice16major to the sequence in Kim and Okita (1988b) , Og to O. glaberrima, and Ob to O. barthii

 


Fig. 5. A consensus gene tree of five most parsimonious trees based on sequences of genes encoding the 16-kDa prolamin of O. glaberrima and O. barthii. The tree is rooted with a sequence of rice major16. Bootstrap values are indicated above branches

 
Although the amino acid composition for the 16-kDa was typical for prolamin (high content of glutamine, leucine, alanine, and phenylalanine), variability within and between species in the frequency of these amino acids was evident (Fig. 4). For instance, compared with Asian rice, African rice's 16-kDa proteins have a higher content of alanine, lysine, and asparagine, and a lower content of cysteine, methionine, glutamate, and threonine. Variation in glycine, isoleucine, and valine was also evident in Asian and African rice.
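Amino acid composition of the kind summarized in Fig. 4 can be computed from a deduced protein sequence as simple residue percentages. The sketch below is a minimal illustration, not the DNAStar routine used in this study; the function name and the toy peptide are hypothetical.

```python
from collections import Counter


def amino_acid_composition(protein):
    """Return {residue: percent of total residues} for a one-letter protein string.

    Stop symbols (*) and gap characters (-) are skipped.
    """
    residues = [aa for aa in protein.upper() if aa.isalpha()]
    total = len(residues)
    if total == 0:
        return {}
    counts = Counter(residues)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


if __name__ == "__main__":
    # Toy peptide (hypothetical), rich in glutamine as prolamins are.
    toy_peptide = "MQQLLAAFCQ*"
    for residue, pct in amino_acid_composition(toy_peptide).items():
        print(residue, f"{pct:.1f}%")
```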

To demonstrate whether or not O. glaberrima and O. barthii have the long form of the 16-kDa prolamins reported for O. sativa, PCR amplification was carried out on genomic DNA using primer combinations OSa-OSdR and OSdF-OSc specific for this form. A single band of ~250 bp was amplified for each of the OSa-OSdR and OSdF-OSc primer combinations. The ~250 bp products represent the 5' and 3' sections of the gene. The oligonucleotide probe specific to the insertion unique to the long form hybridized strongly to the two products, confirming the presence of the long 16-kDa form in O. barthii and O. glaberrima. Figure 6 shows the amplification and Southern blot results for the 5' section of the coding region.



Fig. 6. (A) The PCR-amplified product obtained using primers OSdF and OSc, which are complementary to the insertion characteristic of the long form of the 16-kDa prolamin genes and to the 3' end of the coding region, respectively, and (B) a Southern blot showing the hybridization of these products to a 30-bp oligonucleotide probe that corresponds to the insertion of the 16-kDa long form. Lanes 1 and 2 represent O. barthii and O. glaberrima, respectively, and lane "a" contains the Lambda HindIII molecular weight DNA marker.

 

    DISCUSSION
Of the two major gene families examined for the African rice and its proposed putative ancestor O. barthii, the 10-kDa family appears to be quite uniform at the DNA and deduced amino acid levels. The sequence similarities in 10-kDa prolamin genes are close to 99% except for clone Og10-8-10-8 of O. glaberrima, which displays ~83–89% similarity. Our alignment of the 10-kDa GenBank sequences of rice, O. sativa, and its proposed progenitor O. rufipogon resulted in 100% identity. These sequence similarity values can be contrasted with the relatively lower similarity values (~87–96%) calculated between O. longistaminata and the other four species. The two rice crops, thus, appear to have 10-kDa gene sequences almost identical to their proposed respective wild progenitors. These data present further evidence in support of the proposed direct evolution of the Asian rice from O. rufipogon and the African rice from O. barthii (Oka, 1974; Chang, 1976). The O. longistaminata 10-kDa GenBank nucleotide sequence was equally similar to both species complexes and, thus, could not verify Cordess, Second, and Delsney's (1990) proposed hypothesis of the common ancestry between the sativa complex and O. longistaminata. However, at the amino acid level, the O. longistaminata and O. rufipogon sequences were quite similar to each other, particularly in alanine, phenylalanine, isoleucine, serine, and threonine content (Fig. 4A). Although the amino acid data underscore some affinities between the latter two species, sequences of other species of Oryza, particularly the perennis complex, are essential for a comprehensive phylogenetic assessment.
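The similarity percentages discussed above are pairwise comparisons of aligned sequences. A minimal sketch of one common definition, percent identity over gap-free columns, follows. The exact formula used by the DNAStar software is not stated in the text, so the treatment of gaps here is an assumption, and the function name and toy sequences are hypothetical.

```python
def percent_identity(seq1, seq2, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Columns with a gap in either sequence are excluded (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences (hypothetical, not prolamin data).
    print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
    print(percent_identity("ACG-TA", "ACGTTT"))
```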

Compared to available sequences for 10-kDa prolamin genes in Oryza, the sequence obtained for the O. glaberrima clone Og10-8-10-8 appeared distinct, differing from the other sequences obtained for this species by 14 point mutations and 11 amino acid substitutions. The clone had an in-codon TGA insertion that was translated to an additional methionine and changed the next amino acid from lysine to asparagine. Thus, clone Og10-8-10-8 of O. glaberrima represents a distinct form of the 10-kDa prolamin gene family. The distribution and potential evolutionary implication of this form in oryzoid grasses need to be examined.

In contrast with the 10-kDa, the 16-kDa gene sequences were considerably more variable (Table 1; Fig. 3). The sequence variability was higher among the O. glaberrima as they form a polytomy at the base of the MP tree (Fig. 5). Although the ML resolved the O. glaberrima in one lineage, the branch leading to it was short. In contrast, the O. barthii sequences appeared in a clade supported by 80% bootstrap (Fig. 5). The 16-kDa sequences display unique mutations characteristic of an individual sequence or a group of sequences (Fig. 3). For instance, clones Ob16-9-4-8 and Obarh16-9-4-3 were 99% similar and appeared in a clade supported by 100% bootstrap (Table 1, Fig. 5) and possess unique mutations that are not found in the other clones of O. barthii, O. glaberrima, or O. sativa (Fig. 3). Clone Og16-9-4-4 has a number of indels at the 3' end (starting at position 370; Fig. 3), changing the reading frame and making the coding region much shorter than the other 16-kDa genes characterized in Oryza. This gene is distinct and produces a truncated protein. Therefore, it appears that several forms (loci) of the 16-kDa gene family are present.


Table 1. Similarities (in percentages) of DNA (upper) and deduced amino acid (lower) sequences of 16 kDa prolamin genes of clones representing Oryza barthii (Ob), O. glaberrima (Og), and O. sativa (rice). Rice16major: GenBank sequence #X60979; Rice16: clone RICPROL17A, GenBank sequence #AA751313; Kim and Okita (1988a); na: amino acid comparison is not applicable due to presence of internal stop codon in the open reading frames

 
Of the six 16-kDa clones of O. glaberrima, three have stop codons within the ORF and appear to represent pseudogenes (clones Og16-9-2-1, Og16-9-3-2, and Og16-9-3-4), while the other three gave open reading frames for the whole length of the sequences. The nucleotide sequence similarity among the latter three genes is 96%, compared with 98–99.5% among the pseudogenes. The pseudogene sequences formed a clade in both MP and ML analyses with strong support (100% bootstrap) in the MP (Fig. 5). These pseudogenes have 13 unique point mutations at corresponding positions, two of which are stop codon mutations. Consequently, it is very likely that these genes have a common origin. The pseudogenes displayed relatively low sequence similarities (78–93%) to the other 16-kDa genes, indicating that they have diverged by accumulating neutral mutations, a situation typical of pseudogenes (Gojobori, Li, and Graur, 1982).
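Detecting a premature stop codon in an otherwise open reading frame, the criterion used above to flag putative pseudogenes, can be sketched as a scan of in-frame codons. The example is hypothetical: the toy sequences are not taken from the clones, the function name is invented, and the input is assumed to be an ungapped coding sequence starting in frame. It shows how a single C→T change can turn a glutamine codon (CAG) into a TAG stop.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stops(cds, frame=0):
    """Return 0-based nucleotide positions of in-frame stop codons.

    The final complete codon is excluded, since a stop there is the normal terminator.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(frame, len(cds) - 2, 3)]
    return [frame + 3 * i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


if __name__ == "__main__":
    # Toy coding sequences (hypothetical): ATG CAG CAA CAG TGT TGC TAA
    intact = "ATGCAGCAACAGTGTTGCTAA"
    # Same sequence with a C->T change at position 3 (CAG -> TAG).
    mutant = "ATGTAGCAACAGTGTTGCTAA"
    print("intact:", internal_stops(intact))
    print("mutant:", internal_stops(mutant))
```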

Comparative sequence analysis among 10-kDa and 16-kDa genes in Oryza
To understand the patterns of nucleotide substitutions and their impact on amino acid composition, sequences generated here and others obtained from GenBank and other published works (Kim and Okita, 1988a, b; Masumura et al., 1989; Barbier and Ishihama, 1990; Feng et al., 1990) were aligned (Figs. 2–3). Among the 10-kDa sequences, clone Og10-8-10-8 of O. glaberrima remains quite distinct from the others by 13 point mutations and the three-nucleotide insertion (Fig. 2). The insertion was shared only with another African species, O. longistaminata, presenting a synapomorphy of potential geographic and taxonomic significance. The remaining clones were either identical to one another or differed by 1–2 mutations (Fig. 2). These mutations were nonsynonymous for the most part, resulting in seven amino acid substitutions and pointing to relatively relaxed selectional constraints at the protein level. In fact, the fluctuation in amino acid composition was, by far, more pronounced in the 10-kDa than the 16-kDa genes (Fig. 4). The overall low number of nucleotide substitutions observed in the different forms of the 10-kDa genes underscores the conserved nature of this gene family and points to their potential utility in understanding the phylogenetic relationships above the species level.

Compared with the 10-kDa genes, the pattern of variation in the 16-kDa gene is more complex. A comparative alignment of the 16-kDa sequences from O. glaberrima, O. barthii, and O. sativa showed a range of similarities from 76.0 to 100% at the nucleotide level and 74.5–100% at the amino acid level (Table 1). The two sequences chosen to represent the diversity of the 16-kDa genes in O. sativa, the RICPROL17A sequence and the rice major prolamin (Kim and Okita, 1988a), were different from each other. The RICPROL17A sequence is 18 bp and six amino acids longer than the rice major prolamin (Fig. 3), contains an insertion at positions 232–246, and has two indels at positions 250–251 and 265. These indels represent a frame shift followed soon by a restoration. A closer look at the carboxy termini of the RICPROL17A and the rice major form showed that they are different in structure, while the amino termini were identical. This hypothesis was confirmed when primer combinations OSa/OSb and OSa/OSdR produced high PCR yield while OSdF/OSb failed. Thus, the two represent different families of 16-kDa prolamin genes. We will refer to the RICPROL17A as the "16-kDa long form." In spite of the presence of mutational differences between the two 16-kDa sequences of O. sativa, they differ from the glaberrima group (O. glaberrima and O. barthii) by two major indels (positions 362–367, 448–486) and several synapomorphic mutations (Fig. 3).

The O. barthii and O. glaberrima 16-kDa sequences displayed considerably higher sequence similarities to the 16-kDa rice major (84–94%) than to the RICPROL17A clone (72%), revealing the same deletion found in the former. The presence of the long form of the 16-kDa prolamin in O. barthii and O. glaberrima was evident when primer combinations OSa/OSc, OSa/OSdR, and OSdF/OSc produced high PCR yield. This was further confirmed when a Southern blot of the PCR products was probed with a 5'-end-labeled oligonucleotide specific to the unique region found in the GenBank sequences. The probe hybridized strongly with the PCR-amplified products obtained from using the primers specific to the long form (Fig. 6).

Variation in the 10-kDa and 16-kDa genes also corresponds to variation in amino acid composition (Fig. 4). The most variable amino acids were alanine, cysteine, glycine, isoleucine, valine, serine, and threonine. Although variation in amino acid frequencies was noticeable in both gene families and the 16-kDa genes showed overall higher rates of nonsynonymous substitutions, the magnitude of variation in the 10-kDa genes was quite pronounced in alanine, isoleucine, serine, and threonine (Fig. 4). It is worth noting that the latter two amino acids and the variable tyrosine are involved in protein phosphorylation. In spite of this variation, the overall amino acid profiles were prolamin specific (Fig. 3), i.e., rich in hydrophobic and uncharged amino acids and poor in charged amino acids.

The overall intragenomic homogenization of these multigene families can be achieved through concerted evolution (Zimmer et al., 1980) operating through molecular-drive mechanisms such as biased gene conversion and unequal crossing over, among others (reviewed in Elder and Turner, 1995). However, the presence of a large copy number of those genes per genome may allow for a certain number of mutations that are not functionally deleterious to be fixed and also for the presence of pseudogenes at relatively low frequencies. Such diversity within a gene family underscores the need to differentiate between orthologous and paralogous genes (Sanderson and Doyle, 1992; Buckler, Ippolito, and Holtsford, 1997), a prerequisite for segregating species trees from gene trees in molecular evolutionary studies based on multigene families. Due to the high sequence similarities among most of the 16-kDa clones (excluding rice 16), the emergence of O. barthii and O. glaberrima in distinct clades does not appear to be based on paralogous loci.

The two multigene families encoding the 10- and 16-kDa prolamins appear to evolve at different rates. The 10-kDa genes are conserved at the intrageneric level but display an appreciable degree of divergence among genera (Hilu and Sharova, 1998). The large number of substitutions (26%) and length mutations underscores the potential application of the 16-kDa gene family in assessing variation and resolving evolutionary patterns at the population level. The ML as well as MP analyses underscore the utility of these genes among closely related species (Fig. 5). These valuable attributes render the prolamin genes potentially useful molecules in evolutionary studies at different taxonomic levels. However, discrimination among families of divergent paralogs is necessary to avoid inaccurate organismal phylogenies (Buckler, Ippolito, and Holtsford, 1997). A notable example here is rice 16, which represents a distinct 16-kDa gene family and thus a paralog; hence it was excluded from the analysis. The maintenance of prolamin-specific amino acid profiles in both gene families in spite of an appreciable number of nonsynonymous mutations reflects a certain degree of selective constraint operating on these genes. Consequently, a study tracing the subsequent evolution of these gene families in the Poaceae is quite conceivable and would have the potential to provide information on evolution of multigene families, a basic feature of plant and animal nuclear genomes.
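
The distinction between synonymous and nonsynonymous change that underlies this argument can be made concrete with a simple codon-by-codon comparison. The following is a naive sketch under stated assumptions: the two coding sequences are in frame, gap-free, and of equal length, each differing codon is scored as a single change, and the standard genetic code applies. It is not a substitution-rate estimator and not the method used in this study; the name `count_codon_changes` and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(cds_a: str, cds_b: str) -> tuple[int, int]:
    """Count (synonymous, nonsynonymous) codon differences between two CDSs.

    Assumption: sequences are aligned in frame without gaps; codons with
    ambiguous bases are skipped.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no change
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # skip codons containing ambiguous symbols
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy (hypothetical) coding fragments: GCT->GCC is silent, TCA->ACA is Ser->Thr.
print(count_codon_changes("ATGGCTTCA", "ATGGCCACA"))
```

A high share of nonsynonymous changes combined with a stable overall amino acid profile is the pattern interpreted above as evidence of constraint.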


    FOOTNOTES
 
1 The authors thank the U.S. Department of Agriculture National Small Grains Collection (NSGC) for providing the seed accessions used in this study and Susana S. Neves for valuable assistance. This study was supported by grants from the Jeffress Foundation and the College of Arts and Sciences, Virginia Tech.


    LITERATURE CITED
 
Barbier, P., and A. Ishihama. 1990. Variation on the nucleotide sequence of a prolamin gene family in wild rice. Plant Molecular Biology 15: 191-195.

Barker, N. P., H. P. Linder, and E. H. Harley. 1995. Polyphyly of Arundinoideae (Poaceae): evidence from rbcL sequence data. Systematic Botany 20: 423-435.

Buckler, E. S., A. Ippolito, and T. P. Holtsford. 1997. The evolution of ribosomal DNA: divergent paralogues and phylogenetic implications. Genetics 145: 821-832.

Chang, T. T. 1976. The origin, evolution, cultivation, dissemination, and diversification of Asian and African rices. Euphytica 25: 425-441.

Clark, L. G., W. Zhang, and J. F. Wendel. 1995. A phylogeny of the grass family (Poaceae) based on ndhF sequence data. Systematic Botany 20: 436-460.

Cordesse, F., G. Second, and M. Delseny. 1990. Ribosomal gene spacer length variability in cultivated and wild rice species. Theoretical and Applied Genetics 79: 81-88.

Elder, J. F., and B. J. Turner. 1995. Concerted evolution of repetitive DNA sequences in eukaryotes. Quarterly Review of Biology 70: 297-320.

Esen, A., and K. W. Hilu. 1989. Immunological affinities among subfamilies of the Poaceae. American Journal of Botany 76: 196-203.

Feng, G., L. Wen, J. K. Huang, B. S. Shorrosh, S. Muthukrishnan, and G. R. Reeck. 1990. Nucleotide sequence of a cloned rice genomic DNA fragment that encodes a 10-kDa prolamin polypeptide. Nucleic Acids Research 18: 683.

Gojobori, T., W.-H. Li, and D. Graur. 1982. Patterns of nucleotide substitution in pseudogenes and functional genes. Journal of Molecular Evolution 18: 360-369.

Hasegawa, M., H. Kishino, and T. Yano. 1985. Dating of the human-ape splitting by a molecular clock of a mitochondrial DNA. Journal of Molecular Evolution 21: 160-174.

Hilu, K. W. 1994. Evidence from RAPD markers in the evolution of Echinochloa millets (Poaceae). Plant Systematics and Evolution 189: 147-157.

Hilu, K. W., L. A. Alice, and H. Liang. 1999. Phylogeny of Poaceae inferred from matK sequences. Annals of the Missouri Botanical Garden 86: 835-851.

Hilu, K. W., and A. Esen. 1988. Prolamin size diversity in the Poaceae. Biochemical Systematics and Ecology 16: 457-465.

Hilu, K. W., and L. Sharova. 1998. Cloning and characterization of two prolamin genes in the bamboo grass Phyllostachys aurea. American Journal of Botany 85: 1033-1037.

Hilu, K. W., and H. T. Stalker. 1995. Genetic relationship between peanuts and wild species of Arachis section Arachis: evidence from RAPD. Plant Systematics and Evolution 198: 167-178.

Kim, W. T., and T. W. Okita. 1988a. Nucleotide and primary sequence of a major rice prolamin. FEBS Letters 231: 308-310.

Kim, W. T., and T. W. Okita. 1988b. Structure, expression, and heterogeneity of the rice seed prolamins. Plant Physiology 88: 649-655.

Masumura, T., T. Hibino, K. Kidzu, N. Mitsukawa, K. Tanaka, and S. Fujii. 1990. Cloning and characterization of a cDNA encoding a rice 13 kDa prolamin. Molecular and General Genetics 221: 1-7.

Masumura, T., D. Shibata, T. Hibino, T. Kato, K. Kawabe, G. Takeba, K. Tanaka, and S. Fujii. 1989. cDNA cloning of an mRNA encoding a sulfur-rich 10 kDa prolamin polypeptide in rice seeds. Plant Molecular Biology 12: 123-130.

Nayar, N. M. 1973. Origin and cytogenetics of rice. In E. W. Caspari [ed.], Advances in genetics, 153–292. Academic Press, London, UK.

Oka, H. I. 1974. Experimental studies on the origin of cultivated rice. Genetics 78: 475-486.

Reed, K. C., and D. A. Mann. 1985. Rapid alkaline transfer of DNA from agarose gels to nylon membranes. Nucleic Acids Research 13: 7207-7221.

Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, USA.

Sanderson, M. J., and J. J. Doyle. 1992. Reconstruction of organismal and gene phylogenies from data on multigene families: concerted evolution, homoplasy, and confidence. Systematic Biology 41: 4-17.

Shewry, P. R., J. A. Napier, and A. S. Tatham. 1995. Seed storage proteins: structures and biosynthesis. Plant Cell 7: 945-956.

Swofford, D. L. 2001. PAUP*: phylogenetic analysis using parsimony, 4.0b5. Sinauer, Sunderland, Massachusetts, USA.

Tsunoda, S., and N. Takahashi. 1984. Biology of rice. Japan Scientific Society Press, Tokyo, Japan.

Vaughan, D. A. 1994. The wild relatives of rice. International Rice Research Institute, Los Banos, Philippines.

Zimmer, E. A., S. L. Martin, S. M. Beverley, Y. W. Kan, and A. C. Wilson. 1980. Rapid duplication and loss of genes coding for the alpha chains of hemoglobin. Proceedings of the National Academy of Sciences, USA 77: 2158-2162.




