(American Journal of Botany. 2001;88:92-102.)
© 2001 Botanical Society of America, Inc.

Evolution of the FAD2-1 fatty acid desaturase 5' UTR intron and the molecular systematics of Gossypium (Malvaceae)1

Qing Liu2, Curt L. Brubaker3, Allan G. Green2, Don R. Marshall4, Peter J. Sharp4 and Surinder P. Singh2

2 CSIRO Plant Industry, GPO Box 1600, Canberra, ACT 2601, Australia; 3 Centre for Plant Biodiversity Research, CSIRO Plant Industry, GPO Box 1600, Canberra, ACT 2601, Australia; and 4 University of Sydney, Plant Breeding Institute, Cobbitty, PMB11, Camden, NSW 2570, Australia

Received for publication September 3, 1999. Accepted for publication March 3, 2000.


    ABSTRACT
 
The FAD2-1 microsomal ω-6 desaturase gene contains a large intron (~1133 bp [base pairs]) in the 5' untranslated region that may participate in gene regulation and, in Gossypium, is evolving at a rate useful for elucidating recently diverged lineages. FAD2-1 is single copy in diploid Gossypium species, and two orthologs are present in the allotetraploid species. Among the diploid species, the D-genome FAD2-1 introns have accumulated substitutions 1.4–1.8 times faster than the A-genome introns. In the tetraploids, the difference between the D-subgenome introns and their A-subgenome orthologs is even greater. The substitution rate of the intron in the D-genome diploid G. gossypioides more closely approximates that of the A genome than that of the other D-genome species, highlighting its unique evolutionary history. However, phylogenetic analyses support G. raimondii as the closest living relative of the D-subgenome donor. The Australian K-genome species diverged 8–16 million years ago into two clades. One clade comprises the sporadically distributed, erect to suberect coastal species; a second clade comprises the more widely spread, prostrate, inland species. A comparison of published gene trees to the FAD2-1 intron topology suggests that G. bickii arose from an early divergence, but that it carries a G. australe-like rDNA captured via a previously undetected hybridization event.

Key Words: cotton • FAD2-1 • fatty acid desaturase • Gossypium • intron • Malvaceae • polyploidy • reticulate evolution


    INTRODUCTION
 
All higher plants contain one or more microsomal ω-6 desaturase(s) that insert a double bond between carbons 12 and 13 of monounsaturated oleic acid to generate polyunsaturated linoleic acid. This enzyme is mainly responsible for the production of the polyunsaturated fatty acids that are integral components of plant cellular membranes and of storage lipids in many vegetable oils (Shanklin and Cahoon, 1998). In Gossypium, the microsomal ω-6 desaturase family comprises at least two distinct members in diploid species and perhaps as many as five in the allotetraploid species (Liu et al., 1996). Of particular interest is one member of this gene family, FAD2-1, which is encoded by at least two copies in the tetraploid cotton species G. barbadense and G. hirsutum (ghFAD2-1), and by a single copy in the diploid cotton species G. arboreum, G. raimondii, and G. robinsonii (Liu et al., 1999).

FAD2-1 is highly expressed and seed-specific and, therefore, is probably the main contributor of the polyunsaturated fatty acids in the seed oil of cultivated cottons (Liu et al., 1999). In addition to the three histidine boxes that are typical of all membrane-bound desaturases, the ghFAD2-1 gene contains a stretch of six contiguous glycine residues in the C-terminus of the open reading frame. Moreover, comparisons of genomic and cDNA clones encoding the ghFAD2-1 gene revealed a single large intron (~1133 bp [base pairs]) in the 5' untranslated region (UTR) located 9 bp upstream from the putative translation start site (Liu et al., 1997). Preliminary examination of the FAD2-1 gene from five species (G. arboreum, G. barbadense, G. hirsutum, G. raimondii, and G. robinsonii) revealed that the size and position of the intron were conserved. Sequence comparisons also suggested that the FAD2-1 intron may be evolving quickly enough to allow inference of evolutionary relationships among recently diverged lineages and, in this regard, could be particularly useful for elucidating evolutionary pathways among the 17 Gossypium species indigenous to Australia, a group whose evolutionary history remains unresolved (Seelanan et al., 1999).

The current evolutionary understanding of the 17 Australian Gossypium species is based on morphological and cytological comparisons, and the phylogenetic analyses of three nucleotide sequences derived from the rpl16 intron (1155 bp), the 18S–26S rDNA internal transcribed spacer (ITS: 688 bp), and a portion of an alcohol dehydrogenase gene (AdhD: 1600 bp) (Seelanan et al., 1999 ). These analyses confirmed hypotheses regarding the basal divergences on the Australian continent but provided little resolution of the evolutionary relationships among the 12 species indigenous to the Kimberley plateau of northwestern Australia. The nuclear gene topologies were also incongruent regarding the evolution of G. bickii, which has a biphyletic ancestry. Gossypium bickii captured a G. sturtianum-like chloroplast earlier in its evolutionary history, but to date there is no evidence that this was accompanied by nuclear introgression (Wendel, Stewart, and Rettig, 1991 ). The topological incongruencies may point to the first evidence that hybridization and introgression also altered the composition of G. bickii's nuclear genome, or that G. bickii experienced a second and heretofore undescribed evolutionary reticulation.

To provide an evolutionary context for investigations into the regulatory role of ghFAD2-1 intron and to refine our understanding of the evolution of the Australian Gossypium species, the FAD2-1 intron was cloned and sequenced from 31 Gossypium species. Each major geographic region within the indigenous range of Gossypium (Africa/Arabia A and E genome; New World D genome) is represented, including all of the Australian C, G, and K species and all the AD genome New World allotetraploids. The resultant data allowed us to address the following questions: (1) Does the intron occur in all the Gossypium FAD2-1 genes? (2) Are the two copies in the allotetraploid species orthologs, inherited from their A and D genome progenitors, respectively? and (3) Does this intron contain sufficient phylogenetic signal to resolve the evolutionary pathways among the Australian Gossypium species and the ambiguities regarding the evolution of G. bickii?


    MATERIALS AND METHODS
 
The microsomal ω-6 desaturase FAD2-1 intron was amplified and cloned from 39 accessions of 31 Gossypium species (Table 1). Multiple accessions of G. australe, G. bickii, G. nelsonii, G. robinsonii, and G. sturtianum were assayed to strengthen inferences regarding the basal divergences among the Australian Gossypium species. Pairs of putatively orthologous clones were sequenced from each of the five tetraploid species. The two A-genome species and five representative D-genome species were included to confirm the inferred subgenomic origin of the clones of the tetraploid species, including the two D-genome species nominated as most likely to be sister to the D-subgenome (Endrizzi, Turcotte, and Kohel, 1985; Wendel and Albert, 1992; Wendel, Schnabel, and Seelanan, 1995).


 
Table 1. Gossypium accessions assayed. Full provenance details available from C. L. Brubaker. Genome designations follow Stewart (1995); taxonomy follows Fryxell (1992)

 
Total genomic DNA was extracted following Paterson, Brubaker, and Wendel (1993) and further purified by CsCl gradients following Sambrook, Fritsch, and Maniatis (1989). The entire 5' UTR intron was amplified using primers that flanked the predicted splice site. The upstream primer (S1: 5'-CCTGGCGTTAAACTGCTTTC-3') is located 44–63 bp downstream of the transcription start site in the 5' UTR, and the downstream primer (A1: 5'-GCATAGGTCATGGACCACGT-3') is located at 239–258 in the coding region (exon 2) of ghFAD2-1 (EMBL accession X97016). The 50-µL polymerase chain reactions (PCRs) contained 200 µmol/L dNTPs, 1× PE Applied Biosystems (Scoresby, VIC, Australia) PCR buffer, 20 pmol of each primer, 10 ng genomic DNA, and 1 unit of Taq DNA polymerase. PCRs started with a 2-min denaturation at 94°C, followed by 30 cycles of 94°C for 1 min, 56°C for 1 min, and 72°C for 1 min, and finished with a 10-min final extension at 72°C. PCR products were purified with the Wizard® PCR Preps DNA Purification System (Promega; Annandale, NSW, Australia) and cloned into T®-vector (Promega) according to the manufacturer's instructions. Plasmids were isolated following Sambrook, Fritsch, and Maniatis (1989), and the DNA sequences were determined using the PRISM™ kit (PE Applied Biosystems) on an ABI373 DNA Sequencer.
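As a quick sanity check on the primer and cycling details above, the following Python sketch computes primer length and GC content, the reverse complement of the antisense primer, and the total programmed hold time of the PCR. The primer sequences and cycling parameters are taken from the text; the function names are illustrative assumptions, and the time estimate ignores thermal ramping.

```python
# Primer sequences are copied from the Methods; helper names are hypothetical.
S1 = "CCTGGCGTTAAACTGCTTTC"  # upstream (sense) primer, 5' UTR
A1 = "GCATAGGTCATGGACCACGT"  # downstream (antisense) primer, exon 2

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cycling_minutes(cycles=30, steps_min=(1, 1, 1), initial_min=2, final_min=10):
    """Total programmed hold time in minutes (ramp times are not modeled)."""
    return initial_min + cycles * sum(steps_min) + final_min


if __name__ == "__main__":
    for name, primer in (("S1", S1), ("A1", A1)):
        print(name, len(primer), "nt, GC fraction", gc_fraction(primer))
    print("A1 as it reads on the sense strand:", reverse_complement(A1))
    print("Programmed hold time (min):", cycling_minutes())
```

Both primers are 20 nucleotides long, which matches the 20-bp coordinate ranges (44–63 and 239–258) given in the text.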

Sequence analysis
Sequences were initially aligned using GCG-pileup (Wisconsin Package Version 9.1, Genetics Computer Group [GCG], Madison, Wisconsin, USA) and then adjusted manually. Individual sequences have been submitted to EMBL (Table 1); the sequence alignment was also submitted to EMBL (Accession DS41945). Mega 1.01 (Kumar, Tamura, and Nei, 1993 ) was used to characterize the sequences and compute pairwise Jukes-Cantor distances (Jukes and Cantor, 1969 ).
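For readers unfamiliar with the Jukes-Cantor correction, the sketch below shows how a pairwise distance can be computed from two aligned sequences: the observed proportion of differing sites p is converted to d = -(3/4) ln(1 - 4p/3). This is a minimal Python illustration, not the Mega 1.01 implementation; skipping any position with a gap or ambiguity code in either sequence is an assumption made here for simplicity.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or a non-ACGT symbol are
    skipped (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(a: str, b: str) -> float:
    """Jukes-Cantor distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(a, b)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(jukes_cantor("ACGTACGTAC", "ACGTACGTAT"))
```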

Topologies were inferred heuristically using the GCG implementation of PAUP (Wisconsin Package Version 9.1, Genetics Computer Group [GCG], Madison, Wisconsin, USA) using parsimony or distance (minimum evolution) as the optimality criterion. In both cases, starting trees were acquired by stepwise addition (simple), ten trees were held at each step, and the TBR algorithm was used for branch swapping using steepest descent. Gaps were treated as missing data, and potentially informative indels were recoded as binary characters and included in some analyses. Distance-optimized topologies were initially inferred using the Jukes-Cantor (Jukes and Cantor, 1969) model of nucleotide substitution; however, because transition/transversion ratios were generally <2 but the frequencies of the four nucleotides deviated substantially from equality, the Tajima-Nei (Tajima and Nei, 1984) estimator was also used (Kumar, Tamura, and Nei, 1993). Sites containing gaps and regions of ambiguous homology were ignored. Negative branch lengths were set to zero. Substitution rate variation across sites was modeled with a gamma distribution (shape parameter set equal to 0.5). As a measure of clade "strength," Autodecay, in association with PAUP 3.1 for Macintosh (Eriksson and Wikström, 1995; Swofford, 1991), was used to determine the length of the shortest tree in which each clade failed to appear (Bremer, 1988; Donoghue et al., 1992). Relative rate tests followed Tajima (1993) using Tajima93 (see Seelanan et al., 1999).
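The Tajima-Nei estimator relaxes the equal-base-frequency assumption of Jukes-Cantor. A minimal Python sketch of the standard formulation follows: d = -b ln(1 - p/b), with b = (1/2)[1 - Σ g_i² + p²/c] and c = Σ_{i<j} x_ij² / (2 g_i g_j), where g_i is the average frequency of base i in the two sequences and x_ij is the relative frequency of sites showing the unordered base pair (i, j). This is an illustration under those stated definitions, not the code the authors ran, and it omits the gamma correction for among-site rate variation mentioned above.

```python
import math
from itertools import combinations

BASES = "ACGT"


def tajima_nei(a: str, b: str) -> float:
    """Tajima-Nei (1984) distance between two aligned sequences.

    Sites with a gap or ambiguity code in either sequence are skipped
    (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in BASES and y in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in pairs) / n
    if p == 0:
        return 0.0
    # g[i]: frequency of base i averaged over both sequences.
    g = {base: sum((x == base) + (y == base) for x, y in pairs) / (2 * n)
         for base in BASES}
    # c: sum over unordered base pairs of x_ij^2 / (2 g_i g_j).
    c = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum((x, y) in ((i, j), (j, i)) for x, y in pairs) / n
        if x_ij:
            c += x_ij ** 2 / (2 * g[i] * g[j])
    b_coef = 0.5 * (1 - sum(f * f for f in g.values()) + p * p / c)
    if p >= b_coef:
        raise ValueError("sequences too divergent for the Tajima-Nei correction")
    return -b_coef * math.log(1 - p / b_coef)
```

When the four bases are equally frequent and all six mismatch types are equally common, b reduces to 3/4 and the estimator coincides with the Jukes-Cantor distance, which provides a convenient check on the implementation.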


    RESULTS
 
Comparison of the cDNA sequence of the G. hirsutum ghFAD2-1 gene and its corresponding genomic clone confirmed the presence of a single large intron in the 5' untranslated region (UTR) (Liu et al., 1997, 1999). The intron is ~1133 bp and is located 9 bp upstream of the translation initiation site in G. hirsutum. This is strikingly similar to the microsomal ω-6 desaturase in Arabidopsis thaliana, which has a 1130-bp intron located 4 bp upstream of the translation initiation site (Okuley et al., 1994). In contrast, the Glycine max FAD2-1 has a much smaller intron of 320 bp located 4 bp downstream of the translation initiation site (Liu et al., 1997). Attempts to align these three intron sequences revealed numerous sequence dissimilarities and no obvious regions of conservation.

Using the primers developed for G. hirsutum, the intron was amplified from the 39 Gossypium accessions (Table 1). The diploid Gossypium species contained a single copy of the FAD2-1 gene, and in each case the gene contained the 5' UTR intron. The five allotetraploid species contained two distinct FAD2-1 genes and each contained an intron (subgenus Karpas; Table 1). All of the introns started with GT and ended with AG, consistent with the plant consensus 5' and 3' exon/intron boundaries (Simpson and Filipowicz, 1996 ). The introns had a mean GC content of 24%, and the 154 bp of exon 2 had a mean GC content of 55%. The length of the introns ranged from 1065 (G. australe-2) to 1166 bp (G. gossypioides). The G-genome species, G. raimondii, and the tetraploid A- and D-subgenome orthologs had mean intron lengths below 1120 bp, while the A-, C-, D-, E-, and K- genome species had mean intron lengths greater than 1130 bp. The introns contained 14 simple sequence repeat regions with more than five repeat units in at least one accession: 12 (T)n, 1 (CT)n, and 1 (AAG)n (individual accession data available from corresponding author).
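The two sequence properties reported above, canonical GT...AG intron boundaries and GC content, are straightforward to compute. The Python sketch below shows one way to do so; the function names and the toy sequence are invented for illustration and do not come from the sequenced accessions.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (gaps and N ignored)."""
    bases = [ch for ch in seq.upper() if ch in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases")
    return 100.0 * sum(ch in "GC" for ch in bases) / len(bases)


def has_canonical_boundaries(intron: str) -> bool:
    """True if the intron starts with GT and ends with AG."""
    intron = intron.upper()
    return intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    toy_intron = "GTAAATTTAG"  # made-up, AT-rich example
    print(has_canonical_boundaries(toy_intron), gc_percent(toy_intron))
```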

Phylogenetic analysis
The final aligned length of the analyzed matrix was 1406 bp. The analyzed sequences start with the first nucleotide of the intron and end with the 154th nucleotide of the second exon. Twenty-eight of the insertion/deletion events (1–57 bp) inferred from this alignment were potentially phylogenetically informative and were coded as binary characters. The 154 nucleotides of exon 2 aligned without gaps. Simple sequence repeat regions were excluded. Homology assessments in several other short regions were ambiguous and also excluded. Of the final aligned length of 1406 bp, 221 bp were excluded from phylogenetic analyses. Considering only the 1185 nucleotide positions used in the phylogenetic analyses, 344 were variable, of which 169 were parsimony informative: 319 variable sites occurred in the intron, of which 158 were parsimony informative; 25 variable sites occurred in 154 bp of the 5' end of the exon, of which 11 were parsimony informative.
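The distinction between variable and parsimony-informative sites can be made concrete with a short Python sketch. A site is counted as variable if more than one base is observed, and as parsimony informative if at least two different bases each occur in at least two sequences. Treating gaps and ambiguity codes as missing data follows the text; the function name and toy alignment are assumptions for illustration.

```python
from collections import Counter


def classify_sites(alignment):
    """Return (variable, parsimony_informative) site counts.

    `alignment` is a list of equal-length aligned strings. Gaps and
    ambiguity codes are treated as missing data.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(ch for ch in column if ch in "ACGT")
        if len(counts) > 1:
            variable += 1
            # Informative: at least two states, each present in >= 2 taxa.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


if __name__ == "__main__":
    toy = ["AAGA", "AAGC", "ACTA", "ACT-"]  # invented four-taxon alignment
    print(classify_sites(toy))
```

The counts reported above are internally consistent: 1406 - 221 = 1185 analyzed positions, 319 + 25 = 344 variable sites, and 158 + 11 = 169 parsimony-informative sites.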

Parsimony-optimized topologies were inferred with and without the coded indels. An heuristic search without the coded indels returned 108 equally parsimonious trees with consistency indices (CI) of 0.692, excluding uninformative characters, and retention indices (RI) of 0.901 (not illustrated). With the 28 indels included, an heuristic search returned 103 equally parsimonious trees (CI = 0.717, excluding uninformative characters; RI = 0.912) (Fig. 1A). Strict consensus trees from analyses with and without the indels differed only in the resolution of the G. hirsutum-A/G. tomentosum-A/G. barbadense-A/G. darwinii-A clade relative to the G. arboreum/G. herbaceum clade. When the indels were excluded, these four A-subgenome introns collapsed to form a polytomy with the G. arboreum–G. herbaceum clade. With the indels included, the A-subgenome intron of these four species appeared as a sister clade to the G. arboreum/G. herbaceum clade (Fig. 1A).



 
Fig. 1. Parsimony (A) and distance (B) optimized topologies of the FAD2-1 intron among 39 accessions of 31 Gossypium species. (A) Strict consensus tree of 103 equally parsimonious trees (517 steps; consistency index = 0.717, excluding uninformative characters; retention index = 0.912). Twenty-eight indels were included in the analysis as binary characters; otherwise gaps were treated as missing data. The number of unambiguous substitutions is indicated above each branch; the decay index for each clade is indicated below each branch. (B) Single most parsimonious distance-optimized (Tajima and Nei, 1984) tree (485 steps; consistency index = 0.688, excluding uninformative characters; retention index = 0.899)

 
Distance-optimized topologies were inferred from the pared sequence matrix using Jukes-Cantor (Jukes and Cantor, 1969 ) or Tajima-Nei (Tajima and Nei, 1984 ) models of nucleotide substitution. Both analyses returned a single and identical tree (CI = 0.688, excluding uninformative characters; RI = 0.899) (Fig. 1B). The distance- and parsimony-optimized topologies are congruent except for the placement of the E-genome clade and G. marchantii (Fig. 1).

In both distance- and parsimony-optimized topologies, all the diploid genome groups are resolved as monophyletic lineages except the Australian G genome. Bremer support for the monophyly of the A-, D-, E-, and K-genome clades is strong, whereas the monophyly of the C genome clade is weakly supported (Fig. 1A). Because no outgroup was included, both topologies are midpoint rooted and no inferences regarding the basal divergence within Gossypium obtain from the figured topologies.

The relationship of the E-genome clade to the other genomes is ambiguous. In parsimony-optimized topologies, it is sister to the A genome with a decay index of one (Fig. 1A). This relationship, however, is inconsistent with the distance-optimized topologies, where the E-genome species are sister to the D-genome species (Fig. 1B). However, the significantly unequal substitution rates among the genome lineages (discussed below) suggest that the parsimony topology may be more reliable. This then would provide weak evidence that the A and E genomes shared a common ancestor more recently than either did with the D genome.

The Australian Gossypium species (C, G, and K genomes) are also resolved as a monophyletic lineage in the parsimony- and distance-optimized topologies (Fig. 1). Within the Australian clade there is a primary divergence into four subclades: (1) G. australe/G. nelsonii, (2) G. bickii, (3) the K genome, and (4) the C genome.

Among the K-genome species, two well-supported clades are evident (decay index of 3; Fig. 1). The first comprises G. species novum, G. costulatum, G. cunninghamii, G. londonderriense, G. marchantii, G. nobile, G. populifolium, and G. pulchellum; the second comprises G. enthyle, G. exiguum, G. pilosum, and G. rotundifolium. Within the former, or K1 clade, two subclades are evident, G. species novum/G. cunninghamii/G. londonderriense, and G. costulatum/G. populifolium. The relationship of G. nobile to these subclades is unresolved, and G. marchantii is weakly supported as sister to the G. costulatum/G. populifolium clade. Gossypium pulchellum is weakly supported as the single extant ancestor of one lineage arising from the basal divergence in the K1 clade. The latter, or K2 clade, sees G. rotundifolium placed sister to an unresolved trichotomy comprising G. enthyle, G. exiguum, and G. pilosum.

The topological placement of G. bickii was unexpected. The two other G-genome species, G. australe and G. nelsonii, are strongly supported as sister species (decay index of 9), but G. bickii resolves as sister to the K-genome clade (Fig. 1). Because this implied that the G genome was paraphyletic, further heuristic searches were undertaken to test the stability of this result. In the first instance, the data matrix was reanalyzed without the E-genome species (because of their own topological instability, discussed above), the C-genome species, or G. australe and G. nelsonii. Subsequently, the E-genome species in combination with the C-genome species or G. australe and G. nelsonii were excluded. In all cases, distance- and parsimony-optimized searches returned consensus topologies congruent with those illustrated in Fig. 1. The same topology was also recovered when the Australian species were analyzed with only the A-genome, D-genome, or the E-genome species.

The topological placement of the tetraploid sequences is consistent with the original assessment that each tetraploid species contained a pair of orthologous loci. The putative D-subgenome sequences from the tetraploid species resolved as a monophyletic clade sister to G. raimondii within a clade of other diploid D-genome species. The putative A-subgenome sequences appear in a strongly supported A-genome clade (decay index of 13), but do not resolve as a monophyletic sublineage. Gossypium mustelinum appears as basal to two subclades: (1) G. arboreum/G. herbaceum and (2) G. barbadense/G. hirsutum/G. darwinii/G. tomentosum.

Relative rates of nucleotide substitution
Because the basal divergence in Gossypium occurred between the ancestor of the A-, D-, E-, and AD-taxa and the ancestor of the C-, G-, and K-genome species (Wendel and Albert, 1992 ; Seelanan et al., 1997, 1999 ), the relative substitution rates (2D test; Tajima, 1993 ) among the Australian species were evaluated using G. somalense, G. raimondii, or G. herbaceum as reference taxa. The pared sequences were used, i.e., areas of ambiguous homology were excluded, and regions with gaps in one or more taxa were excluded from all comparisons. All three analyses demonstrated that nucleotide substitution rates are largely homogeneous among the species. Only G. species novum, G. australe-3, G. londonderriense, and G. rotundifolium returned significant chi-square tests for some species combinations (data not shown). These species were excluded from divergence time estimates (described below).

Conversely, homogeneity of substitution rates among the A-, D-, E-, and AD-genome species was tested using G. costulatum, G. nelsonii, G. robinsonii, or G. sturtianum, respectively, as reference taxa. As above, the pared sequences were used to eliminate ambiguous homology assessments and sites with gaps were excluded from all comparisons. All four tests consistently demonstrated that nucleotide substitution rates are not homogeneous among these clades (Table 2). Particularly notable was that the A- and D-genome lineages, except G. arboreum and G. gossypioides, have accumulated substitutions at significantly different rates.
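The logic of a relative rate test can be sketched in a few lines of Python. The analyses reported here used Tajima's 2D test, which treats transitions and transversions separately; the code below implements the simpler 1D variant from the same paper (Tajima, 1993) purely as an illustration. It counts sites at which only sequence 1 (m1) or only sequence 2 (m2) differs from the reference taxon and evaluates (m1 - m2)² / (m1 + m2) against a chi-square distribution with one degree of freedom. The function name and toy sequences are assumptions.

```python
import math


def tajima_1d(seq1: str, seq2: str, ref: str):
    """Tajima's 1D relative rate test (a simplification of the 2D test).

    Returns (m1, m2, chi_square, p_value). Sites with a gap or ambiguity
    code in any of the three sequences are excluded.
    """
    if not (len(seq1) == len(seq2) == len(ref)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), ref.upper()):
        if a not in "ACGT" or b not in "ACGT" or o not in "ACGT":
            continue
        if a != b and b == o:
            m1 += 1  # change unique to the lineage of sequence 1
        elif a != b and a == o:
            m2 += 1  # change unique to the lineage of sequence 2
    if m1 + m2 == 0:
        return m1, m2, 0.0, 1.0
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    # Upper-tail probability of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m1, m2, chi2, p_value


if __name__ == "__main__":
    # Invented sequences: six unique changes in seq1, two in seq2.
    print(tajima_1d("CCCCCCAAAA", "AAAAAAAACC", "AAAAAAAAAA"))
```

Under equal rates, m1 and m2 are expected to be equal, so a large imbalance between them is what drives a significant result.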


 
Table 2. Relative rate test (2D; Tajima, 1993) of nucleotide substitution rates among the A-, D-, and E-genome species relative to G. costulatum, calculated using Tajima93 (see Seelanan, Schnabel, and Wendel, 1997). (P = 0.05 > * > 0.01 > ** > 0.005 > ***). Exon sequence, regions of dubious homology, and indels were not included in analyses

 
To more fully understand the basis of these results, the mean Tajima-Nei distances between the A- and between the D-genome taxa and the four major Australian clades were calculated (Table 3). Gossypium arboreum and G. gossypioides were considered separately, and G. australe-3, G. species novum, G. londonderriense, and G. rotundifolium were not included in the calculations. These comparisons demonstrate that D-genome introns accumulated ~1.8 times as many substitutions per site as did the A-genome introns. If these comparisons are partitioned into tetraploid subgenomic and diploid components, the A-subgenome taxa had a lower mean distance from the taxa in the four major Australian clades than did G. arboreum and G. herbaceum, in contrast to the D-subgenome taxa, which have a higher mean distance from the taxa in the four major Australian clades than is observed in the D diploid taxa. This suggests that the differences in nucleotide substitution rates between the A- and D-genome lineages have been magnified in the two polyploid lineages. These comparisons are consistent regardless of which of the Australian lineages are used as a point of reference.


 
Table 3. Mean Tajima-Nei distances (Tajima and Nei, 1984) between A or D species and the four Australian lineages. Exon sequence, regions of dubious homology, and indels were not included in the distance estimation

 
The comparisons in Table 3 also illustrate why G. arboreum and G. gossypioides appeared as anomalies in the global relative rate tests (Table 2). Both the A-genome diploids have higher mean Tajima-Nei distances relative to the Australian species than the A-subgenome homologs; however, this difference is most pronounced for G. arboreum, whose distances most closely approach the mean Tajima-Nei distances between the D-genome species and the Australian species. In contrast, the mean Tajima-Nei distances between the Australian species and D-genome diploid species are lower than those for the D-subgenome homologs, and among the D-genome diploids, G. gossypioides has the lowest distance to the Australian species and thus more closely approaches the nucleotide substitution rate of the A-genome species.

Divergence time estimates
Because substitution rates among the Australian species were largely homogeneous, the mean Tajima-Nei distances were used to estimate the time since divergence among the major Australian clades. In each clade, the mean Tajima-Nei distances were calculated from all pairwise comparisons except for the taxa that returned significant relative rate statistics (viz., G. species novum, G. australe-3, G. londonderriense, and G. rotundifolium). The substitution rate was calibrated using Seelanan et al.'s (1999) estimate that the C genome diverged from the K genome 10.5–21 million years before present (mybp). This returned an estimated substitution rate of 0.7–1.7 × 10⁻⁹ substitutions per site per year. Although the accuracy of this estimated substitution rate cannot be independently verified, it permits relative comparisons of divergence times among the major Australian clades derived from the chloroplast gene rpl16 and the nuclear gene AdhD (Seelanan et al., 1999).
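The calibration and dating arithmetic can be illustrated with a short Python sketch. Under a molecular clock, two lineages that diverged T years ago and each accumulate r substitutions per site per year are separated by a distance d = 2rT, so r = d / (2T) and T = d / (2r). The calibration interval of 10.5–21 million years is taken from the text; the distance values used below are hypothetical placeholders, not values from Table 3 or Table 4.

```python
def substitution_rate(distance: float, divergence_years: float) -> float:
    """Rate per site per year implied by a pairwise distance: r = d / (2T)."""
    return distance / (2.0 * divergence_years)


def divergence_time(distance: float, rate: float) -> float:
    """Years since divergence implied by a pairwise distance: T = d / (2r)."""
    return distance / (2.0 * rate)


if __name__ == "__main__":
    calibration_distance = 0.030  # HYPOTHETICAL mean C-K distance
    query_distance = 0.025        # HYPOTHETICAL distance between two other clades

    # A younger calibration date implies a faster rate, and vice versa.
    fast = substitution_rate(calibration_distance, 10.5e6)
    slow = substitution_rate(calibration_distance, 21e6)
    print(f"rate range: {slow:.2e} to {fast:.2e} substitutions/site/year")

    youngest = divergence_time(query_distance, fast) / 1e6
    oldest = divergence_time(query_distance, slow) / 1e6
    print(f"estimated divergence: {youngest:.1f} to {oldest:.1f} mybp")
```

Because both bounds of the calibration interval are carried through, every divergence estimate derived this way is itself an interval whose upper bound is twice its lower bound.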

The estimated divergence times derived from the FAD2-1 intron are mostly congruent with estimates derived from rpl16 and AdhD (Table 4). Comparisons are complicated because Seelanan et al. (1999) considered the G-genome species monophyletic, whereas G. bickii resolved as a clade in its own right here. Nonetheless, the estimated divergence times between the C-genome species and G. bickii (8.9–17.8 mybp) or the G. australe/G. nelsonii clade (9.5–18.9 mybp) overlap the previously estimated divergence between the C- and G-genome species of 8–15 mybp (Table 4). Similarly, the estimated time of divergence between the G. bickii and G. australe/G. nelsonii clades (8.4–16.9 mybp) overlaps the estimate of 8–15 mybp for the earliest divergence among the G-genome species based on analysis of AdhD sequences (Table 4). There are, however, two striking incongruities. FAD2-1-based estimates for the basal divergence within the K genome and between G. robinsonii and G. sturtianum in the C genome are five- to tenfold and threefold older, respectively, than previously estimated (Table 4).


 
Table 4. Estimated divergence times among the Australian clades. A substitution rate of 0.7–1.7 × 10⁻⁹ nucleotide substitutions per site per year was calibrated using the estimated age of divergence (10.5–21 mybp) between the C- and K-genome clades for AdhD reported by Seelanan et al. (1999). Divergence times were estimated by dividing the mean Jukes-Cantor distance by twice the nucleotide substitution rate

 

    DISCUSSION
 
The A- and D-genome FAD2-1 introns are orthologs but are evolving at significantly different rates
In an earlier paper, the size similarity between the two FAD2-1-specific restriction fragments in G. barbadense and in G. hirsutum and the A-genome G. arboreum and the D-genome G. raimondii fragments suggested that the two allotetraploid FAD2-1 genes were A- and D-subgenome orthologs (Liu et al., 1999). The topological placements of the ten intron sequences from the five allotetraploid species support this interpretation (Fig. 1). Five of the allotetraploid sequences, one from each species, resolved within a monophyletic D-genome lineage, sister to G. raimondii; the other five resolved as part of a monophyletic A-genome lineage. Support for these topological placements is strong. The A-subgenome sequences share 13 unambiguous substitutions with the two A-genome diploid species, and the D-subgenome sequences share 33 unambiguous substitutions with the D-genome diploid species.

Relative rate tests demonstrate that, except for G. arboreum and G. gossypioides (discussed below), the D-genome FAD2-1 introns have accumulated 1.8 times as many base substitutions as have the A-genome introns relative to the phylogenetically equidistant Australian Gossypium species (Tables 2, 3). This difference is reflected in the relative branch lengths of the A- and D-genome clades in Fig. 1B. These differences are even greater among the A- and D-subgenome species (Table 3). For example, the mean Tajima-Nei distance of the D-genome diploids from G. bickii is 1.4 and 1.7 times greater than it is for G. herbaceum and G. arboreum, respectively, but the mean Tajima-Nei distance of the D-subgenome intron sequences from G. bickii is 1.9 times greater than for the A-subgenome intron sequences. This pattern is consistent regardless of the reference taxon used (Table 3). Small et al. (1998) made a similar observation based on relative rate tests of AdhC among all five Gossypium allotetraploid species, G. arboreum, and G. raimondii relative to G. robinsonii. Although nucleotide substitution rates for the A- and D-subgenome AdhC introns are roughly equal, the substitution rate in the D-subgenome exons is nearly five times greater than it is in the A subgenome. Overall, the D-subgenome AdhC sequences are accumulating substitutions about 1.8 times faster than their orthologs in the A subgenome (Small et al., 1998). The consequences of this differential rate of sequence evolution in phylogenetic reconstruction are evident in Fig. 1. The A subgenome lacks even one defining synapomorphy and is completely unresolved except for one substitution that differentiates G. mustelinum from the other taxa. In contrast, the D subgenome is defined by 11 substitutions and is more fully resolved.

Small et al. (1998) suggest differential selection between the two subgenomes may partially account for different nucleotide substitution rates between orthologous genes in the same nucleus. While this certainly may play a role, the FAD2-1 intron analysis, which includes both A-genome species and six of the D-genome species, suggests that the underlying mechanistic differences were active in the A- and D-subgenome progenitors and that these lineage differences have been exaggerated in the allopolyploids. In this regard it is notable that the G. arboreum and G. raimondii AdhC sequences are also evolving at significantly different rates, relative to each other, and relative to their D-subgenome and A-subgenome orthologs, respectively (Small et al., 1998). More recent data suggest that the assumption that the A and D genomes are phyletically equidistant from the Australian species may be incorrect (J. F. Wendel, personal communication). However, even if the A-genome and the Australian species share a more recent common ancestor than either does with the D genome, the phyletic distance between the A-genome/Australian and the D-genome lineages would have to be much greater than these data suggest to account for the differences in the absolute number of nucleotide substitutions between the A- and D-genome lineages (Wendel and Albert, 1992; Seelanan, Schnabel, and Wendel, 1997).

The exceptions to this differential rate of nucleotide substitution between the A and D subgenomes are G. arboreum and G. gossypioides. The substitution rates of these two species run counter to those of their genomic compatriots. Gossypium arboreum has the highest substitution rate among the A-genome diploid and allopolyploid species, and G. gossypioides has the lowest rate among the D-genome species. Gossypium gossypioides is accumulating substitutions only 1.2 times faster than G. arboreum, compared to a general difference of 1.8 observed between the A and D genomes (Table 3). This observation is intriguing because both of these species have unique evolutionary histories.

Gossypium arboreum is one of the four domesticated Gossypium species and is the only one for which a wild progenitor has never been identified, existing only as a domesticated cultigen or feral derivatives of domesticated forms (Brubaker, Bourland, and Wendel, 1999). Although it is tempting to attribute some proportion of G. arboreum's anomalous sequence substitution rate to the effects of human domestication, the degree to which human selection has altered rates of gene evolution in G. arboreum cannot be ascertained without the wild progenitor as a point of reference. Nonetheless, it is plausible that human selection has altered rates of sequence evolution in G. arboreum, a hypothesis worth testing in crop species for which clearly identified wild progenitors are available.

Gossypium gossypioides is distinct among the D-genome diploid species because it contains a mosaic genome resulting from an ancient hybridization (Wendel, Schnabel, and Seelanan, 1995). Despite the fact that there are no A-genome species extant in the New World, the G. gossypioides genome contains A-genome-specific dispersed repeats and a mosaic nuclear ribosomal DNA repeat that combines features of the A-genome rDNA repeat with those of the D-genome rDNA (Wendel, Schnabel, and Seelanan, 1995; Zhao et al., 1998). Whether there is a causal link between the G. gossypioides mosaic genome and the tardy rate of sequence evolution in its FAD2-1 intron remains to be determined.

D-genome evolution and the origin of the New World allotetraploids
The topological placement of the D-subgenome intron sequences sister to G. raimondii, while G. gossypioides resolves as basal to the entire D-genome lineage, is worthy of note. Gossypium gossypioides and G. raimondii are traditionally considered to be sister species (Brown and Menzel, 1952a, b; Wendel and Albert, 1992; Wendel, Schnabel, and Seelanan, 1995). Both species have been implicated in the origin of the allopolyploids (Endrizzi, Turcotte, and Kohel, 1985; Wendel, Schnabel, and Seelanan, 1995), and G. gossypioides experienced introgressive hybridization with the progenitor of the allotetraploid A subgenome, either directly or indirectly via the nascent allotetraploid (Wendel, Schnabel, and Seelanan, 1995). Traditionally, G. raimondii is considered to be the extant D-genome species sister to the D-subgenome progenitor. This is based on leaf developmental patterns, seed hair type, vigor of F1 hybrids with A-genome species, chromosome homologies inferred from multivalent frequencies in synthetic hexaploid hybrids, and genetic segregation in synthetic allohexaploids (reviewed by Endrizzi, Turcotte, and Kohel, 1985). As comprehensive as this sounds, G. raimondii is a geographical outlier relative to the other D-genome diploid species. It is indigenous to Peru while the other species are found in Mexico (except for G. klotzschianum in the Galápagos). Thus some question the assumption that G. raimondii is the sister taxon to the D subgenome. This doubt is largely based on the suspicion that the allotetraploid lineage arose in Mexico (specifically the Isthmus of Tehuantepec) rather than South America (Wendel, Schnabel, and Seelanan, 1995). If this is true, G. raimondii is indeed an unlikely candidate despite its genomic and genetic similarities to the D subgenome, and, conversely, its putative sister taxon, G. gossypioides becomes a strong candidate, lying as it does closer to the Isthmus of Tehuantepec than any other D-genome species and carrying genomic evidence of introgression with the A-subgenome progenitor (Wendel, Schnabel, and Seelanan, 1995).

This reasoning benefits from parsimony as it assumes a single A x D interspecific hybridization that resulted in the mosaic G. gossypioides rDNA internal transcribed spacer and the evolution of an allotetraploid lineage rather than two hybridization events between two geographically separated D-genome species and a single A-genome taxon. Wendel, Schnabel, and Seelanan (1995) are also correct in questioning the reliability of multivalent frequencies and segregation ratios in synthetic hexaploids, the strongest but uncorroborated evidence rejecting G. gossypioides as the sister taxon of the D-subgenome progenitor. However, hypotheses that the allotetraploid lineage arose in Mexico and that G. gossypioides and G. raimondii are sister species may be untenable.

Ano et al. (1982) nominated northeastern Brazil as a probable site for the origin of the allotetraploid lineage. Fryxell (1979) noted that all the wild populations of the tetraploid species occupy littoral or littoral-derived habitats. This observation, in concert with a proposed divergence during the Pleistocene (Fryxell, 1965; Phillips, 1963; Wendel, 1989), a period of rapidly changing ocean levels, and the fact that Gossypium seeds are salt-water tolerant (Stephens, 1958; Fryxell, 1979), suggests that prevailing ocean currents would be a primary determinant in the direction of migration of the nascent allopolyploids. The prevailing marine currents move from northeastern Brazil along the northeastern coast of South America, passing through the Caribbean Sea into the Gulf of Mexico. Ano et al.'s (1982) hypothesis suggests that the allotetraploid lineage arose in northeastern Brazil and that subsequent northerly movement of germplasm with the prevailing marine currents along the coast of South America led to the colonization of northern coastal South America, Gulf coastal Mexico, and the Islands of the Antilles. Under this scenario, G. mustelinum represents the resident descendent of the nascent allopolyploid, consistent with its basal position in gene topologies, while the colonial populations diverged into the other four allotetraploid species (see also Wendel, Rowley, and Stewart, 1994).

The convention that G. gossypioides and G. raimondii are sister species is based on gross comparative morphology and the interfertility of the two species (Brown and Menzel, 1952a, b). Although this conclusion is supported by phylogenetic analyses of chloroplast restriction site mutations (Wendel and Albert, 1992; Seelanan et al., 1997), it is not unassailable. In the first place, gross morphological comparisons are subject to contradictory interpretations. Although Brown and Menzel (1952a, b) concluded that G. gossypioides and G. raimondii are closely related, they also noted that G. gossypioides is intermediate between G. thurberi and G. raimondii in flower size, shape, and color, leaf shape, and stem texture, having "a marked superficial resemblance to the F1 hybrid" between the two (Brown and Menzel, 1952a, p. 120). Hutchinson, Silow, and Stephens (1947) interpreted the situation differently and classified G. gossypioides with G. thurberi and G. trilobum in section "Thurberana." They did this on the basis that the section "Thurberana" species were glabrous (or mostly so) and had lobed leaves and entire or three-toothed epicalyx bracts, in contrast to the section "Klotzschiana" species, including G. raimondii, which were characterized by pubescent leaves and stems, entire leaves, and laciniate epicalyx bracts.

In contrast to the equivocal conclusions afforded by gross morphological comparisons, phylogenetic analyses of the 5S ribosomal DNA (Cronn et al., 1996) and the FAD2-1 intron (Fig. 1) both resolve G. gossypioides as the sole descendent of one lineage from the basal divergence in a monophyletic D-genome lineage. In the case of the 5S ribosomal DNA, a topology rejecting the basal position of G. gossypioides would require another six steps. Although the basal position of G. gossypioides in the FAD2-1 topology is supported by a decay index of only 1, collapsing this branch still would not resolve G. gossypioides and G. raimondii as sister taxa. Conversely, the FAD2-1 topology resolves G. raimondii (decay index of 2) as basal to the D-subgenome monophyletic lineage. Although not all the D-genome diploid species are included in this analysis, G. gossypioides and G. raimondii are the only probable candidates (reviewed by Endrizzi, Turcotte, and Kohel [1985] and Wendel, Schnabel, and Seelanan [1995]). Consequently, the weight of the evidence promotes G. raimondii as the most likely extant taxon that is sister to the allotetraploid D-subgenome progenitor. Furthermore, Stephens (1944a, b), based on a genetic analysis of leaf shape alleles in Gossypium, concluded that genetic control of leaf shape in the allotetraploids most likely reflected the interaction of the leaf shape genes in the Old World A-subgenome progenitor in combination with the leaf shape genes from a D-genome diploid with entire leaves, of which there are only four, viz., G. aridum, G. armourianum, G. klotzschianum, and G. raimondii. Thus, it is worthy of note that in the FAD2-1 topology, G. klotzschianum resolves as basal to the combined G. raimondii-allotetraploid D-subgenome lineage (Fig. 1).
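
The decay indices quoted in this section can be read as simple differences in tree length. The sketch below assumes the usual definition of the decay index (the length of the shortest tree that lacks a clade minus the length of the most parsimonious tree); the function name `decay_index`, the clade labels, and the tree lengths are hypothetical and do not come from the analyses reported here.

```python
def decay_index(shortest_without_clade, most_parsimonious):
    """Number of extra steps needed before a clade is no longer recovered."""
    if shortest_without_clade < most_parsimonious:
        raise ValueError("a constrained tree cannot be shorter than the optimum")
    return shortest_without_clade - most_parsimonious


# Hypothetical tree lengths in steps (illustration only).
mp_length = 200
support = {
    "weakly_supported_clade": decay_index(201, mp_length),
    "strongly_supported_clade": decay_index(206, mp_length),
}
print(support)
```

Under this reading, a clade with a decay index of 1 collapses in trees only one step longer than the optimum, whereas rejecting a clade with an index of 6 costs six additional steps.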

These considerations, however, fail to resolve the evolutionary relationship of G. gossypioides to the other D-genome diploids. Setting aside the mosaic rDNA ITS (Wendel, Schnabel, and Seelanan, 1995), G. gossypioides contains a chloroplast genome sister to G. raimondii (Wendel and Albert, 1992) but contains a 5S ribosomal repeat and a FAD2-1 intron sequence that resolve as basal to the other D-genome diploids assayed (Wendel, Schnabel, and Seelanan, 1995). Cronn et al. (1996) suggest that the 5S ribosomal repeat may also be a mosaic gene and therefore should be expected to resolve as a basal taxon in a phylogenetic analysis but could not identify a single A-genome-specific base pair change. This reasoning may also be applicable to the FAD2-1 intron, but given that it resides in a single copy gene, it is unlikely. Particularly notable are the 33 unambiguous base pair substitutions and indels the G. gossypioides FAD2-1 intron shares with the other D-genome taxa and that differentiate them from the Old World E- and A-genome lineages (Fig. 1). The G. gossypioides FAD2-1 intron contains none of the 13 A-genome-specific substitutions. Thus, chloroplast and nuclear genes place G. gossypioides in incongruent positions within the larger Gossypium topology. This incongruity can be explained by assuming that the chloroplast topology accurately tracks the taxon evolution and that the FAD2-1, the 5S ribosomal repeat, and the ribosomal 18S–26S DNA repeats are mosaics that have recombined with their A-genome orthologs. Alternatively, one can propose that the FAD2-1 and 5S ribosomal genes accurately track the taxon evolution. This implies that G. gossypioides obtained the G. raimondii-like chloroplast via later interspecific introgression rather than by inheritance. At the moment the weight of evidence favors the latter hypothesis. The key may lie in a better understanding of G. raimondii. The chloroplast DNA topology indicates that G. gossypioides and G. raimondii were in physical contact at some point either via interspecific introgression or via a recent common ancestor. Understanding how G. raimondii came to its present geographic isolation from its New World congeners, particularly G. gossypioides, will be vital to finally resolving the incongruence in topological placement of G. gossypioides in chloroplast and nuclear topologies.

Gossypium mustelinum is basal among the five allotetraploid species
The topological placement of both of the G. mustelinum sequences is consistent with the growing consensus that it is the sole descendent of one branch of the earliest divergence within the allotetraploid lineage, the second lineage subsequently diverging into the other four allotetraploid species (Small et al., 1998 ). Within the robustly supported D-subgenome lineage (decay index of 11), the G. mustelinum D-subgenome intron sequence lacks three nonhomoplasious substitutions that unite the remaining four species. The situation is more complex in the A-genome lineage. The A subgenome lacks defining synapomorphies and thus resolves as paraphyletic to a monophyletic A-diploid lineage—probably reflecting the sluggish rate of base pair substitution in the A-genome lineage. Nonetheless, the G. mustelinum intron sequence also resolves as basal relative to the other A-subgenome sequences. Small et al. (1998) obtained an identical result from analyses of the plastid trnT-trnL spacer region and nuclear alcohol dehydrogenase (AdhC) sequences.

Evolution of the Australian Gossypium species
Both the distance- and parsimony-optimized topologies suggest that the original Gossypium entity in Australia very quickly diverged into four main lineages: the C genome, the K genome, G. bickii, and G. australe/G. nelsonii (Fig. 1). Consistent with previous phylogenetic analyses, the FAD2-1 intron topologies support the monophyly of the C- and K-genome lineages (Wendel and Albert, 1992; Seelanan, Schnabel, and Wendel, 1997; Seelanan et al., 1999). Two unambiguous substitutions and a decay index of 1 (Fig. 1) weakly support the C-genome monophyly. This resolution, however, is consistent with topologies derived from rDNA ITS (parsimony optimized) and AdhD sequences (Seelanan, Schnabel, and Wendel, 1997; Seelanan et al., 1999). Chloroplast restriction site topologies resolve G. robinsonii as basal, with G. sturtianum as basal to the remainder of the Australian species (Wendel and Albert, 1992), but when taxa with known reticulate histories are excluded, G. robinsonii and G. sturtianum occur as unresolved basal lineages relative to the other Australian species in strict consensus trees (Seelanan, Schnabel, and Wendel, 1997). Thus, the chloroplast restriction data at least do not contradict the inferences based on the FAD2-1 intron and AdhD sequences. The monophyly of the K genome is strongly supported and is consistent with all previous analyses and with the distinctive suite of morphological features of the K-genome species (Wendel and Albert, 1992; Seelanan, Schnabel, and Wendel, 1997; Seelanan et al., 1999). Based on the estimated age of the basal divergences, the Australian Gossypium clade is probably 9.4–24.1 million years old (Table 4), a figure that coincides with the age of the earliest records of malvaceous pollen in Australia, estimated to have been deposited 12–25 mybp (Muller, 1981, 1984).

One striking difference between the phylogenetic analysis of the FAD2-1 intron and previous cladistic analyses is the level of resolution and the estimated age of the earliest divergence within the K genome. Whereas topologies based on rpl16, ITS, and AdhD sequences produced largely unresolved rakes (Seelanan et al., 1999), the FAD2-1 intron identified two primary sublineages (K1 and K2; Fig. 1), both of which are supported by decay indices of 3. The K1 clade comprises plants with mostly upright habits and sporadic coastal or near-coastal distributions in contrast to the K2 clade, which contains species that are mostly prostrate and have more widespread populations inland (Fryxell, Craven, and Stewart, 1992). Estimates based on the FAD2-1 intron suggest that the first divergence among the K-genome species occurred 8.1–16.2 mybp (Table 4), a figure that stands in stark contrast to earlier estimates of 0.7–3 mybp (Seelanan et al., 1999).
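
Age ranges of this kind depend on the substitution rate that is assumed. As a minimal sketch, the standard molecular-clock relation T = K / (2r) converts a pairwise distance K (substitutions per site) and a rate r (substitutions per site per year) into a divergence time. The function name `divergence_time_my`, the distance, and both rates below are hypothetical values chosen for illustration; they are not the values behind Table 4.

```python
def divergence_time_my(distance, rate_per_site_per_year):
    """Divergence time in millions of years from T = K / (2r)."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_year) / 1e6


# Hypothetical inputs (illustration only).
distance = 0.03                          # substitutions per site between two lineages
slow_rate, fast_rate = 1.0e-9, 2.0e-9    # substitutions per site per year

youngest = divergence_time_my(distance, fast_rate)
oldest = divergence_time_my(distance, slow_rate)
print(f"{youngest:.1f}-{oldest:.1f} million years")
```

Because the time estimate is inversely proportional to the rate, halving the assumed rate doubles the estimated age, so a twofold uncertainty in the rate produces a twofold range in the age.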

The reticulate evolutionary history of G. bickii
Gossypium bickii is traditionally accepted as a member of a monophyletic G-genome clade based primarily on morphological and cytogenetic similarities (Fryxell, 1979, 1992; Stewart, 1995), but recent sequence analysis of a portion of the AdhD gene (Seelanan et al., 1999), in which G. bickii appears as basal within the Australian Gossypium species, undermines this interpretation. This incongruity is particularly interesting because G. bickii is known to have arisen via or experienced a homoploid reticulate event involving the ancestor of G. sturtianum (Wendel, Stewart, and Rettig, 1991). The result is that G. bickii carries a G. sturtianum-like chloroplast, but retains a G. australe–G. nelsonii-like nuclear genome (Wendel and Albert, 1992; Seelanan, Schnabel, and Wendel, 1997; Seelanan et al., 1999). The hybridization event that resulted in the transfer of the chloroplast would have brought the nuclear genomes into recombinational contact before the proto-G. sturtianum chromosomes were purged from the G. bickii genome, yet there is no evidence, to date, of any C-genome-derived nuclear sequences in the G. bickii genome. This raises the question of whether the G. bickii AdhD sequence represents an orphaned G. sturtianum gene or a mosaic gene that combines features of the original G. bickii gene with the G. sturtianum gene (cf. Cronn et al., 1996). If it does not, the incongruity between ITS-derived topologies that place G. bickii sister to G. australe and the AdhD topologies that place it basal to a well-defined G. australe/G. nelsonii lineage may point to a second cryptic homoploid reticulate event in the evolution of G. bickii.

The weight of the evidence favors the latter interpretation. The AdhD sequences place G. bickii basal to the G. australe/G. nelsonii (decay index of 10) and the G. robinsonii/G. sturtianum (decay index of 2) clades (Seelanan et al., 1999 ). The FAD2-1 intron topology also clearly places G. bickii as a basal clade, again lacking the ten substitutions that unite G. australe–G. nelsonii into a single clade (decay index of 9; Fig. 1) and the two base-pair substitutions that unite G. robinsonii and G. sturtianum (decay index of 1; Fig. 1). Similarly, a phylogenetic analysis of allozyme alleles and rDNA 18S–26S restriction site mutations placed G. bickii basal to a strongly supported G. australe-G. nelsonii clade (Wendel, Stewart, and Rettig, 1991 ), and in contrast to the observation that G. australe, G. bickii, and G. sturtianum have 15, 8, and 7 species-specific alleles, respectively, G. bickii shares only three alleles exclusively with G. australe and none with G. sturtianum. Furthermore, if the rDNA 18S–26S restriction site mutations specific to G. australe, G. bickii, and G. nelsonii are mapped onto the topology derived from the ITS sequences in Seelanan, Schnabel, and Wendel (1997) and Seelanan et al. (1999) , the homoplasious loss of the KpnI restriction site in G. australe and G. bickii becomes a synapomorphy uniting the two, and the inferred gain of a BanI restriction site in the immediate ancestor of the G. australe/G. bickii/G. nelsonii clade followed by the immediate loss in the ancestor of G. australe/G. nelsonii becomes a single gain in G. bickii. Thus, of the four nuclear sequences so far analyzed, three (isozyme alleles, AdhD, and FAD2-1) consistently resolve G. bickii as a basal lineage, and only the ITS sequences and restriction site differences suggest that G. bickii shares a more recent ancestor with G. australe than with any other Australian species.

It is highly improbable that this incongruity reflects the presence of intact or mosaic G. sturtianum genes in the G. bickii nucleus. All chloroplast topologies are consistent in suggesting that the transfer of the proto-G. sturtianum chloroplast occurred after the divergence of G. robinsonii and G. sturtianum (Wendel and Albert, 1992 ; Seelanan, Schnabel, and Wendel, 1997 ; Seelanan et al., 1999 ). Thus, if these genomic sequences are intact residual genes from the G. sturtianum genome, they should resolve sister to G. sturtianum, not basal to a monophyletic G. sturtianum–G. robinsonii lineage as they do in the AdhD and FAD2-1 topologies (Fig. 1; Wendel and Albert, 1992 ; Seelanan et al., 1999 ). Furthermore, of the 76 isozyme alleles (at 21 loci) surveyed by Wendel and Albert (1992) , none are shared exclusively by G. bickii and G. sturtianum. The only remaining hypothesis is that the genes sampled to date are G. bickii–G. sturtianum mosaics, but it is unlikely that the AdhD, the FAD2-1, and the eight G. bickii-specific isozyme alleles are all products of intragenic recombination. Thus, based on the data currently at hand, the most probable explanation for the topological inconsistency of G. bickii is that it experienced two homoploid reticulate events, capturing its chloroplast genome from G. sturtianum and its ribosomal DNA repeat from G. australe. It is worthy of note, then, that recent field observations demonstrate that the biological means exist for introgression between G. australe and G. bickii. In 1997, two intermingled populations of G. australe and G. bickii containing fertile hybrids were identified in the Australian central arid zone (unpublished data). This indicates that introgression of the G. australe ribosomal DNA repeat could be recent and may not yet be fixed in G. bickii.

The inference that G. bickii represents a basal lineage among the Australian species may explain the morphological singularities of G. bickii. In gross morphological appearance, particularly pubescence and flower color, it is similar to G. australe and G. nelsonii, with which G. bickii is classified taxonomically in section Hibiscoidea (Fryxell, 1992). However, G. bickii's short stature and multistemmed habit contrast sharply with the taller, less profusely branched habit of G. australe and G. nelsonii and are strongly reminiscent of the erect and suberect habit of many of the K-genome species (Fryxell, 1979, 1992). Gossypium bickii also lacks the stiff spreading seed hairs of G. australe and G. nelsonii, instead possessing a seed vestiture identical to that of G. robinsonii and G. sturtianum.

Does the FAD2-1 intron enhance gene expression?
The taxon topologies inferred from the phylogenetic analysis of the FAD2-1 microsomal ω-6 desaturase large intron sequences, however useful and interesting in reconstructing historical evolutionary events in the genus Gossypium, are merely the first step toward understanding the regulatory role of introns in gene expression in cotton. A positive effect of introns on gene expression has been observed for many plant genes. Expression of reporter genes under the control of the maize Adh1, Sh1, Bx1, or Act promoter is increased up to several hundred fold by the inclusion of an intron (Callis, Fromm, and Walbot, 1987; Maas et al., 1991; Oard, Paige, and Dvorak, 1989; Vasil et al., 1989). Similarly, the Arabidopsis thaliana genes encoding polyubiquitin (Norris, Meyer, and Callis, 1993) and translation elongation factor EF-1α (Currie et al., 1991, 1993) have an intron in the 5' UTR that increases the expression of reporter gene fusions 2.5- to 1000-fold relative to intron-less controls. This enhancement of gene expression has been ascribed to intron splicing (Gidekel, Jimenez, and Herrera-Estrella, 1997). In view of the remarkable conservation of the size and position of the 5' UTR intron of FAD2-1 across all 31 Gossypium species examined and the presence of a 5' UTR intron in other species, such as Arabidopsis thaliana, it is possible that this intron might have an enhancing effect on the expression of FAD2-1. Experiments are currently underway to examine whether the 5' UTR intron on its own or in conjunction with the FAD2-1 promoter contributes to the regulation of expression of FAD2-1 in cotton. These experiments will be all the more informative because they take place within a sound phylogenetic framework.


Table 1. Continued

 

    FOOTNOTES
 
1 This work was supported by grants CSP78C and CSP85C from the Cotton Research & Development Corporation.

5 Author for reprint requests (qliu{at}pi.csiro.au)


    LITERATURE CITED
 
Ano, G., J. Schwendiman, J. Fersing, and M. Lacape. 1982 Les cotonniers primitifs G. hirsutum race yucatanense de la Pointe des Châteaux en Guadeloupe et l'origine possible des cotonniers tétraploïdes du Nouveau Monde. Coton et Fibres Tropicales 37: 327–332

Bremer, K. 1988 The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42: 795–803

Brown, M. S., and M. Y. Menzel. 1952a Additional evidence on the crossing behavior of Gossypium gossypioides. Bulletin of the Torrey Botanical Club 79: 285–292

———, and ———. 1952b The cytology and crossing behavior of Gossypium gossypioides. Bulletin of the Torrey Botanical Club 79: 110–125

Brubaker, C. L., F. M. Bourland, and J. F. Wendel. 1999 The origin and domestication of cotton. In C. W. Smith and J. T. Cothren [eds.], Cotton: origin, history, technology, and production. Wiley, New York, New York, USA

Callis, J., M. Fromm, and V. Walbot. 1987 Introns increase gene expression in cultured maize cells. Genes and Development 1: 1183–1200

Cronn, R. C., X. Zhao, A. H. Paterson, and J. F. Wendel. 1996 Polymorphism and concerted evolution in a tandemly repeated gene family: 5S ribosomal DNA in diploid and allopolyploid cottons. Journal of Molecular Evolution 42: 685–705

Currie, C., M. Axelos, C. Bardet, R. Atanassova, N. Chaubet, and B. Lescure. 1993 Molecular organisation and developmental activity of an Arabidopsis thaliana EF-1α gene promoter. Molecular and General Genetics 218: 78–86

———, T. Liboz, C. Bardet, E. Gander, C. Médale, M. Axelos, and B. Lescure. 1991 Cis- and trans-acting elements involved in the activation of Arabidopsis thaliana A1 gene encoding the translation elongation factor EF-1α. Nucleic Acids Research 19: 1305–1310

Donoghue, M. J., R. G. Olmstead, J. F. Smith, and J. D. Palmer. 1992 Phylogenetic relationships of Dipsacales based on rbcL sequences. Annals of the Missouri Botanical Garden 79: 333–345

Endrizzi, J. E., E. L. Turcotte, and R. J. Kohel. 1985 Genetics, cytology, and evolution of Gossypium. Advances in Agronomy 23: 271–375

Eriksson, T., and N. Wikström. 1995 Autodecay ver. 3.0 (program distributed by the authors). Dept. of Botany, Stockholm University, Stockholm, Sweden

Fryxell, P. A. 1965 Stages in the evolution of Gossypium L. Advancing Frontiers of Plant Sciences 10: 31–55

———. 1979 The natural history of the cotton tribe (Malvaceae, Tribe Gossypieae). Texas A&M Press, College Station, Texas, USA

———. 1992 A revised taxonomic interpretation of Gossypium L. (Malvaceae). Rheedea 2: 108–165

——–, L. A. Craven, and J. McD. Stewart. 1992 A revision of Gossypium sect. Grandicalyx (Malvaceae), including the description of six new species. Systematic Botany 17: 91–114

Gidekel, M., B. Jimenez, and L. Herrera-Estrella. 1997 The first intron of the Arabidopsis thaliana gene coding for elongation factor 1β contains an enhancer-like element. Gene 170: 201–206

Hutchinson, J. B., R. A. Silow, and S. G. Stephens. 1947 The evolution of Gossypium and the differentiation of the cultivated cottons. Geoffrey Cumberlege/Oxford University Press, London, UK

Jukes, T. H., and C. R. Cantor. 1969 Evolution of protein molecules. In H. N. Munro [ed.], Mammalian protein metabolism, 21–132. Academic Press, New York, New York, USA

Kumar, S., K. Tamura, and M. Nei. 1993 MEGA: Molecular evolutionary genetic analysis, version 1.01. Pennsylvania State University, University Park, Pennsylvania, USA

Liu, Q., S. P. Singh, C. L. Brubaker, P. J. Sharp, A. G. Green, and D. R. Marshall. 1996 Isolation and characterisation of two different microsomal ω-6 desaturase genes in cotton (Gossypium hirsutum L.). Proceedings of the 12th International Symposium on Plant Lipids, 7–12 July, 1996, Toronto, Canada. Kluwer Academic Publishers, Dordrecht, The Netherlands

———, ———, ———, ———, ———, and ———. 1997 Characterization of a large intron in 5'UTR of microsomal ω-6 desaturase gene from Gossypium spp. and other plant species. Abstracts of the Fifth International Congress of Plant Molecular Biology, 21–27 September, 1997, Singapore. Plant Molecular Biology Reporter 15(3 supplement), Abstract

———, ———, ———, ———, ———, and ———. 1999 Molecular cloning and expression of a cDNA encoding a microsomal ω-6 fatty acid desaturase from cotton (Gossypium hirsutum). Australian Journal of Plant Physiology 26: 101–106

Maas, C., J. Laufs, S. Grant, C. Korfhage, and W. Werr. 1991 The combination of a novel stimulatory element in the first exon of the maize shrunken-1 gene with the following intron enhances reporter gene expression up to 1000-fold. Plant Molecular Biology 16: 199–207

Muller, J. 1981 Fossil pollen records of extant angiosperms. Botanical Review 47: 1–142

———. 1984 Significance of fossil pollen for angiosperm history. Annals of the Missouri Botanical Garden 71: 419–443

Norris, S. R., S. E. Meyer, and J. Callis. 1993 The intron of Arabidopsis thaliana polyubiquitin genes is conserved in location and is a quantitative determinant of chimeric gene expression. Plant Molecular Biology 21: 895–906

Oard, J. H., D. Paige, and J. Dvorak. 1989 Chimeric gene expression using maize intron in cultured cells of breadwheat. Plant Cell Reports 8: 156–160

Okuley, J., J. Lightner, K. Feldmann, N. Yadav, E. Lark, and J. Browse. 1994 Arabidopsis FAD2 gene encodes the enzyme that is essential for polyunsaturated lipid synthesis. Plant Cell 6: 147–158

Paterson, A. H., C. L. Brubaker, and J. F. Wendel. 1993 A rapid method for extraction of cotton (Gossypium spp.) genomic DNA suitable for RFLP or PCR analysis. Plant Molecular Biology Reporter 11: 122–127

Phillips, L. L. 1963 The cytogenetics of Gossypium and the origin of New World cottons. Evolution 17: 460–469

Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989 Molecular cloning: a laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, New York, New York, USA

Seelanan, T., C. L. Brubaker, J. McD. Stewart, L. A. Craven, and J. F. Wendel. 1999 Molecular systematics of Australian Gossypium section Grandicalyx (Malvaceae). Systematic Botany 24: 183–208

———, A. Schnabel, and J. F. Wendel. 1997 Congruence and consensus in the cotton tribe (Malvaceae). Systematic Botany 22: 259–290

Shanklin, J., and E. B. Cahoon. 1998 Desaturation and related modifications of fatty acids. Annual Review of Plant Physiology and Plant Molecular Biology 49: 611–641

Simpson, G. G., and W. Filipowicz. 1996 Splicing of precursors to mRNA in higher plants: mechanism, regulation and sub-nuclear organisation of the spliceosomal machinery. Plant Molecular Biology 32: 1–41

Small, R. L., J. A. Ryburn, R. C. Cronn, T. Seelanan, and J. F. Wendel. 1998 The tortoise and the hare: choosing between noncoding plastome and nuclear ADH sequences for phylogeny reconstruction in a recently diverged plant group. A