(American Journal of Botany. 2008;95:1166-1176.)
doi: 10.3732/ajb.0800133
© 2008 Botanical Society of America, Inc.

Systematics and Phytogeography

Evolution and phylogenetic utility of the PHOT gene duplicates in the Verbena complex (Verbenaceae): dramatic intron size variation and footprint of ancestral recombination

Yao-Wu Yuan and Richard G. Olmstead

Department of Biology, University of Washington, Seattle, Washington 98195 USA

Received for publication 11 April 2008. Accepted for publication 13 June 2008.

ABSTRACT

A well-resolved species level phylogeny is critically important in studying organismal evolution (e.g., hybridization, polyploidization, adaptive speciation). Lack of appropriate molecular markers that give sufficient resolution to gene trees is one of the major impediments to inferring species level phylogenies. In addition, sampling multiple independent loci is essential to overcome the lineage sorting problem. The availability of nuclear loci has often been a limiting factor in plant species-level phylogenetic studies. Here the two PHOT loci were developed as new sources of nuclear gene trees. The PHOT1 and PHOT2 gene trees of the Verbena complex (Verbenaceae) are well resolved and have good clade support. These gene trees are consistent with each other and previously generated chloroplast and nuclear waxy gene trees in most of the phylogenetic backbone as well as some terminal relationships, but are incongruent in some other relationships. Locus-specific primers were optimized for amplifying and sequencing these two loci in all Lamiales. Comparing intron size in the context of the gene trees shows dramatic variation within the Verbena complex, particularly at the PHOT1 locus. These variations are largely caused by invasions of short transposable elements and frequent long deletions and insertions of unknown causes. In addition, inspection of DNA sequences and phylogenetic analyses unmask a clear footprint of ancestral recombination in one species.

Key Words: ancestral recombination • intron size variation • MITE • PHOT gene duplicates • species-level phylogenetics • Verbena complex

Two major impediments to inferring phylogenetic relationships among recently diverged species are the lack of appropriate molecular markers and the gene tree–species tree discordance caused by incomplete lineage sorting. The fundamental importance of species-level phylogenies has stimulated attention to both problems in recent years (e.g., Small et al., 1998; Shaw et al., 2005, 2007; Hughes et al., 2006, addressing the former problem; and Maddison, 1997; Degnan and Salter, 2005; Degnan and Rosenberg, 2006; Maddison and Knowles, 2006; Ané et al., 2007; Carstens and Knowles, 2007; Edwards et al., 2007; Kubatko and Degnan, 2007; Liu and Pearl, 2007, addressing the latter). However, these studies tend to focus on one or the other of the two problems and do not address both problems in a single system. Nevertheless, in practice these two problems often coexist for many groups of organisms, especially those that have been subject to recent and rapid diversification. To address these issues in an empirical system, we are carrying out a series of studies on the Verbena complex (Verbenaceae). The focus of the first stage is examining the utility of various molecular markers, including fast evolving, noncoding chloroplast DNA, large fragments of low-copy nuclear genes, and short transposable element insertions, and then generating multiple independent gene trees from these markers. In the second stage, methods developed in the aforementioned studies (Maddison and Knowles, 2006; Ané et al., 2007; Carstens and Knowles, 2007; Edwards et al., 2007; Liu and Pearl, 2007) will be incorporated to extract information on species relationships from the independent gene trees.

Besides these two main hurdles, hybridization and introgression can also cause gene tree–species tree discordance, and they add another dimension of complication to species level phylogenetic problems. It could be very challenging to distinguish between hybridization/introgression and incomplete lineage sorting. In fact, our previous study (Yuan and Olmstead, 2008) has revealed two potential chloroplast introgressions in the Verbena complex.

The Verbena complex includes three closely related genera, Verbena, Glandularia, and Junellia, with each genus containing 40–50 species (Botta, 1989; Sanders, 2001). Intriguing evolutionary questions in this group have been summarized in our previous paper (Yuan and Olmstead, 2008), which presented the first results of this series of studies from seven noncoding cpDNA regions and a large fragment of the nuclear waxy locus. Among other implications, this prior study suggests a recent radiation in the Verbena complex. The present paper is focused on the evolution and phylogenetic utility of the two PHOT gene duplicates first used here as new sources of nuclear gene trees.

PHOT genes encode phototropin, a blue and ultraviolet-A light receptor of plants that is responsible for phototropism (Christie et al., 1998), stomatal opening (Kinoshita et al., 2001), and chloroplast relocation (Jarillo et al., 2001; Kagawa et al., 2001). Most angiosperms have two PHOT loci, PHOT1 and PHOT2, from a duplication that occurred before the divergence of monocots and tricolpates (Briggs et al., 2001) and probably before the divergence of all angiosperms (Y.-W. Yuan and R. G. Olmstead, unpublished data). The PHOT gene duplicates were selected in this study for three reasons: (1) Sufficient nucleotide substitutions have accumulated since the ancient duplication such that the two paralogs are easily distinguishable and phylogenetic analyses would not be confounded by the orthology/paralogy issue (Fitch, 1970): the two paralogs are so different in nucleotide sequence that the intron regions are not alignable at all between the two loci. (2) The PHOT genes contain many small relatively conserved exons that are separated by variable introns (e.g., Arabidopsis thaliana PHOT1 and PHOT2 have 21 and 23 exons, respectively). For species level phylogenetic studies, the ratio of information output to effort expended is expected to be high at these loci. (3) They provide an opportunity to compare the mode of intron evolution across closely related species between the two paralogs.

In this paper, the phylogenetic utility of the PHOT loci in the Verbena complex was investigated by examining resolution of gene trees from PHOT1 and PHOT2 sequences and comparing congruence and discrepancy between PHOT1 and PHOT2 and previously generated nuclear waxy and chloroplast gene trees (Yuan and Olmstead, 2008). Comparing intron size in the context of the gene trees shows dramatic intron size variation in the Verbena complex, particularly at the PHOT1 locus. These variations are largely caused by extensive invasion of short transposable elements and frequent long deletions and insertions of unknown causes. In addition, inspection of DNA sequences and phylogenetic analyses unmask a clear footprint of ancestral recombination in one Junellia species.

MATERIALS AND METHODS

Molecular data collection
Forty taxa of the Verbena complex and one outgroup species (Aloysia virgata Juss.) were included in this study. All but three of these taxa had been used in the previous study (Yuan and Olmstead, 2008), where detailed taxon information can be found. Information on the three new taxa is listed in Appendix 1. One of the three taxa is a new accession of Glandularia bipinnatifida Nutt., a species that had been sampled in the previous study. These two accessions are designated as G. bipinnatifida TX1 (the one used in the earlier study) and G. bipinnatifida TX2 (the new accession) in this paper.

Regions from exons 8–14 of both PHOT loci were amplified and sequenced. A pair of degenerate primers (Fig. 1) was initially designed to amplify both loci simultaneously in tricolpates. Once several sequences from the Verbena complex were obtained, locus-specific primers were designed to amplify PHOT1 and PHOT2 separately, and finally a set of semi-universal primers was optimized for amplifying and sequencing Lamiales for each of the PHOT paralogs (Fig. 1). Cloning prior to sequencing was necessary for many taxa due to allelic variation. To reduce erroneous nucleotide incorporation during PCR, we ran the reactions using PfuUltra II Fusion HS DNA Polymerase (Stratagene, La Jolla, California, USA) for amplifying all PHOT2 and some PHOT1 sequences. The other PHOT1 sequences were difficult to amplify using the Pfu system, probably due to complex secondary structure and nucleotide composition. The FailSafe PCR system (Epicentre, Madison, Wisconsin, USA) was used to amplify these difficult sequences. Procedures for DNA extraction, PCR amplification, PCR product purification, cloning, and sequencing essentially followed Yuan and Olmstead (2008). Depending on the ploidy level (see Appendix 1 and Yuan and Olmstead [2008] for chromosome numbers), 8–24 positive clones were screened by sequencing with one primer. Distinct clones were then sequenced for the entire region in both strands. Sequencing the two loci, particularly PHOT1, required several primers in addition to those shown in Fig. 1 (see Appendix 2) because of the large variation of intron size across taxa. Some of these primers were specifically designed for only a few taxa (e.g., NAVF1, NAVF2) or a single taxon (e.g., VlitF, VlitR). Sequences generated in this study have been deposited in GenBank (EU547314–EU547442).


Fig. 1. Schematic representation of the portion of PHOT gene duplicates of the Verbena complex used in this study. White and shaded boxes represent exon and intron regions, respectively. Numbers in the box indicate exons 8–14. Exon–intron boundary was determined by comparison with annotated Arabidopsis homologs (http://www.arabidopsis.org). Both diagrams were drawn on the same scale, but the intron sizes are shown as the average of the Verbena complex because the variation is too great to depict on the scale here. The exons are conserved in size between the two loci as well as across taxa. The size of corresponding introns at the two loci differs substantially. The top two degenerate primers were used to amplify both paralogs simultaneously. A set of semiuniversal primers optimized for amplifying and sequencing each locus in Lamiales were designed from the exon regions. "F" and "R" indicate forward and reverse direction, respectively. Note that exon 9 is so conserved that the same primers can be used to sequence both loci.

 
Sequence alignment, recombination detection, and phylogenetic analysis
Sequence alignments were performed manually using the program Se-Al version 2.0a11 (Rambaut, 1996) based on the similarity criterion (Simmons, 2004). While the structure of PHOT2 sequences is relatively simple and the alignment is straightforward, PHOT1 sequences were interspersed by frequent long insertions (i.e., >100 bp) and numerous microsatellite sites as well as polynucleotide (particularly poly A and poly T) regions. The microsatellite and polynucleotide regions (11 regions, positions 585–653, 719–731, 923–937, 1624–1657, 1936–1952, 3492–3502, 3650–3704, 4494–4502, 6202–6287, 6749–6768, 6785–6814) were excluded from phylogenetic analyses due to difficulty in homology assessment.
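
As an illustration of the exclusion step, the following minimal Python sketch drops the 11 listed regions from an alignment. It assumes that the positions are 1-based, inclusive column numbers in the PHOT1 alignment, which the text does not state explicitly; the function names and the dictionary representation of the alignment are hypothetical and are not part of the original analysis.

```python
# Excluded PHOT1 regions as listed in the text.
# ASSUMPTION: positions are 1-based, inclusive alignment columns.
EXCLUDED_REGIONS = [
    (585, 653), (719, 731), (923, 937), (1624, 1657), (1936, 1952),
    (3492, 3502), (3650, 3704), (4494, 4502), (6202, 6287),
    (6749, 6768), (6785, 6814),
]


def excluded_columns(regions):
    """Return the set of 0-based column indices covered by the regions."""
    cols = set()
    for start, end in regions:
        cols.update(range(start - 1, end))
    return cols


def mask_alignment(alignment, regions):
    """Drop the excluded columns from every sequence.

    `alignment` is a hypothetical dict mapping taxon name -> aligned sequence.
    """
    drop = excluded_columns(regions)
    return {
        name: "".join(base for i, base in enumerate(seq) if i not in drop)
        for name, seq in alignment.items()
    }


# Number of alignment columns removed by the exclusion set.
n_excluded = len(excluded_columns(EXCLUDED_REGIONS))
print(n_excluded)
```

Under this reading, the 11 regions cover 359 columns, so a 6845-column alignment would retain 6486 columns for analysis.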

Both parsimony and Bayesian analyses were performed on the PHOT1 and PHOT2 datasets. Gaps were treated as missing data in these analyses. The analyses of PHOT1 put one particular species in an unexpected position in the gene tree, in conflict with all other evidence, and visual inspection suggested that the sequence of this species was likely a mosaic that resulted from an ancestral recombination between two quite distinct sequences. Two recombination detection algorithms, MaxChi (Smith, 1992) and SiScan (Gibbs et al., 2000) as implemented in the program RDP3 Beta 27 (http://darwin.uvigo.es/rdp/rdp.html; Martin et al., 2005) were employed to verify this initial observation. Both algorithms detected this sequence as a recombinant between two different species when a P value of 0.01 was used as a threshold for significance with the multiple comparison correction option effective. To highlight this point, we partitioned the PHOT1 data set into two parts (the first part designated as partition 1 and the second as partition 2) around the estimated break point of the putative recombination event and performed parsimony analyses on the two partitions.
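
The idea behind a maximum chi-square breakpoint scan can be sketched in a few lines of Python. This is a simplified pairwise illustration only, not the RDP3 implementation used in the study: it compares every ungapped site between two aligned sequences, scans all possible split points, and omits the permutation-based significance test and multiple-comparison correction. The function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]].

    Returns 0.0 when any row or column total is zero.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def max_chi_square(seq_x, seq_y):
    """Scan all split points between two aligned sequences.

    Returns (best_chi2, breakpoint), where `breakpoint` is the number of
    compared sites to the left of the best split (None if no split scores
    above zero).
    """
    # 1 = the sequences differ at this site, 0 = they agree; gapped sites skipped.
    diffs = [int(x != y) for x, y in zip(seq_x, seq_y) if x != "-" and y != "-"]
    total_diff = sum(diffs)
    best = (0.0, None)
    left_diff = 0
    for k in range(1, len(diffs)):
        left_diff += diffs[k - 1]
        left_same = k - left_diff
        right_diff = total_diff - left_diff
        right_same = (len(diffs) - k) - right_diff
        chi2 = chi_square_2x2(left_diff, left_same, right_diff, right_same)
        if chi2 > best[0]:
            best = (chi2, k)
    return best
```

A mosaic sequence that matches one relative on the left of a breakpoint and diverges from it on the right produces a sharp peak in the statistic at that breakpoint, which is the kind of signal that motivates partitioning the data around the estimated break point.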

Parsimony analyses were conducted using the program PAUP* version 4.0b10 (Swofford, 2002). Heuristic searches were performed with 1000 random stepwise addition replicates and tree-bisection-reconnection (TBR) branch swapping with the MULTREES option in effect. Nodal support was determined by bootstrap analyses (Felsenstein, 1985) of 500 replicates, each with 20 random stepwise addition replicates and TBR branch swapping with MULTREES on.
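
The resampling step that underlies a nonparametric bootstrap can be sketched as follows. This is a minimal illustration, not PAUP* code: it only builds one pseudoreplicate alignment by sampling columns with replacement, and the tree search that would be run on each pseudoreplicate is left out. The function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap pseudoreplicate by resampling columns with replacement.

    `alignment` is a hypothetical dict mapping taxon name -> aligned sequence;
    all sequences are assumed to have the same length.
    """
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in cols)
        for name, seq in alignment.items()
    }


# Toy data for illustration only; not sequences from this study.
toy_alignment = {"t1": "ACGT", "t2": "AGGT", "t3": "TCGA"}
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
```

Bootstrap support for a branch is then the percentage of pseudoreplicates (500 in the analyses described above) whose trees contain that branch.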

Bayesian analyses were conducted using the program MrBayes version 3.1.2 (Ronquist and Huelsenbeck, 2003). Akaike information criterion (AIC; Akaike, 1974) implemented in the program MODELTEST version 3.7 (Posada and Crandall, 1998) was used to determine the model of sequence evolution that best fit the data (TVM+G and GTR+G for PHOT1 and PHOT2, respectively). We performed two independent runs of 1 000 000 generations from a random starting tree using the default priors and four Markov chains (one cold and three heated chains), sampling 1 tree every 100 generations. Plots of log likelihood scores were used to determine stationarity and trees from burn-in were discarded.
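
For readers unfamiliar with AIC-based model choice, the criterion is AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the maximized log likelihood; the model with the lowest AIC is preferred. The sketch below is a hedged illustration, not MODELTEST itself. The log-likelihood values are invented for the example and are NOT values from this study, and the parameter counts assume that only substitution-model parameters (not branch lengths) are counted.

```python
def aic(log_likelihood, n_free_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_free_params - 2 * log_likelihood


def best_model(candidates):
    """Return the name of the candidate with the lowest AIC.

    `candidates` maps model name -> (log likelihood, number of free parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


# INVENTED log likelihoods for illustration only; not results from the paper.
# ASSUMED parameter counts exclude branch lengths.
CANDIDATES = {
    "HKY+G": (-10250.0, 5),
    "TVM+G": (-10200.0, 8),
    "GTR+G": (-10199.8, 9),
}

print(best_model(CANDIDATES))
```

With these made-up numbers, the extra parameter of the most complex model does not improve the likelihood enough to offset its penalty, which shows how AIC can favor a slightly simpler model.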

RESULTS

Sequence alignment and intron size variation
PHOT2 sequence length varies from 2230 to 2851 bp across taxa within the Verbena complex in the amplified region (exon 8 to exon 14). PHOT1 is much more variable in the same region, ranging from 1952 to 4167 bp. The exon sizes are conserved between the two loci as well as across taxa (Fig. 1), so the sequence length variation is entirely from introns. Between the two loci, the pattern of intron size distribution differs greatly. For example, intron 13 is the largest among the six introns sequenced in this study at the PHOT1 locus, but it is distinctively small at the PHOT2 locus; intron 12 is the largest at the PHOT2 locus, but quite small at PHOT1 (Fig. 1). When focusing on each locus and examining intron size variation across taxa, PHOT1 intron 13 is the most variable (Fig. 2). It varies from 193 to 2468 bp within the Verbena complex. All other introns, except intron 9, also have some conspicuous variation (Fig. 2). Introns of PHOT2 do not harbor as much size variation as those of PHOT1. Nonetheless, intron 10 varies from 500 to 1080 bp, caused by a deletion of ~240 bp in Glandularia and an insertion of ~270 bp in a group of Verbena species (Fig. 3). Due to the intron size variation and numerous smaller gaps introduced in the alignments, the final alignments are 6845 bp and 3148 bp long for PHOT1 and PHOT2, respectively (Appendices S1, S2 in Supplemental Data with the online version of this article).


Fig. 2. Cladogram showing intron size variation across taxa at the PHOT1 locus. The individual rectangles are arranged to correspond to the terminal branches of the gene tree, so that the variation can be visualized along the phylogeny. This gene tree is the same as the phylogram in Fig. 4.

 

Fig. 3. Cladogram showing intron size variation across taxa at the PHOT2 locus in the same fashion as Fig. 2. The gene tree is the same as the phylogram in Fig. 5.

 
Phylogenetic analysis
Figures 4 and 5 show the PHOT1 and PHOT2 gene trees, respectively. Both gene trees have good and somewhat similar resolution (Table 1). For example, 66% of the maximum number of internal branches in a fully resolved tree have bootstrap (BS) support higher than 80% from the parsimony analyses for both PHOT1 and PHOT2; 76% and 77% of the internal branches have posterior probability (PP) higher than 0.95 from the Bayesian analyses for PHOT1 and PHOT2, respectively (Table 1). Both PHOT1 and PHOT2 gene trees suggest monophyly of Verbena and Glandularia and nonmonophyly of Junellia, but they differ in relationships and phylogenetic positions of the two Junellia groups (Junellia I and II) (Figs. 4, 5). The PHOT1 gene tree recovered a basal grade of Junellia species, and neither Junellia group I nor II is monophyletic. In contrast, the PHOT2 gene tree resolved both Junellia I and II as monophyletic groups, though Junellia I was found at an unexpected position as sister to the genus Verbena.


Fig. 4. One of the two maximum parsimony (MP) trees from PHOT1 sequences. The topology is very similar to the Bayesian consensus tree. Bootstrap values (BS) and Bayesian posterior probabilities (PP) supporting the corresponding branches are shown when BS > 50 or PP > 0.95 (BS/PP). The asterisks indicate that BS < 50 when PP > 0.95 or PP < 0.95 when BS > 50 of the same branch. Clone numbers are designated after species names when the individual sampled is heterozygous at this locus. South American Verbena species are shaded. Verbena halei and V. menthifolia are in boldface. The thickened zigzag line indicates J. uniflora is expected to group with J. seriphioides as suggested by all other evidence. The black bars represent the MITE insertions discussed in the text.

 

Fig. 5. One of the four maximum parsimony (MP) trees from PHOT2 sequences. The topology is very similar to the Bayesian consensus tree. Nodal supports (BS/PP) and clone designations are shown in the same fashion as Fig. 4. South American Verbena are shaded. Verbena halei and V. menthifolia are in boldface.

 

Table 1. Resolution of the PHOT1 and PHOT2 gene trees.

 
Figure 6 represents the results from our parsimony analyses on the partitioned data sets of PHOT1. Partition 1 (Fig. 6A) resolved Junellia uniflora (Phil.) Moldenke with species of the Junellia II group, albeit not well supported, but clearly not with Junellia I (highlighted in shade), whereas partition 2 resolved this species as sister to J. seriphioides in the Junellia I group with strong support (Fig. 6B).


Fig. 6. Parsimony analyses from the PHOT1 partitioned data sets. (A) One of 201 maximum parsimony (MP) trees from partition 1. (B) One of 43 MP trees from partition 2. The two trees are on the same scale. The parts after the zigzag lines are not in proportion to the rest of the figure. Species of the Junellia I group are shaded. Note that J. uniflora (in boldface) was resolved in conflicting positions by the two partitions. Numbers along the branches indicate bootstrap support.

 
Heterozygosity at PHOT1 and PHOT2
Although the same strategy was used for screening clones of both loci, substantially more alleles were recovered from PHOT2 than PHOT1. The heterozygous individuals are indicated by a clone number following the species name in Figs. 4 and 5. While 35% of the individuals sampled are heterozygous at the PHOT1 locus, 68% are heterozygous at PHOT2. What causes this disparity is unclear. One possibility is that the complex structure and nucleotide composition (e.g., many long insertions and deletions, and numerous microsatellite and polynucleotide regions) of PHOT1 sequences induced severe PCR bias: some of the alleles may have been amplified in such small quantity relative to the other alleles that they could not be recovered by our cloning and screening strategy.

DISCUSSION

Phylogenetic utility of the PHOT duplicates
Resolution
Considering that the three genera of the Verbena complex are closely related and that species within each genus are barely distinguishable when compared using the data set for ~5.3-kb noncoding chloroplast DNA (Yuan and Olmstead, 2008), the PHOT gene trees are fairly well resolved (Table 1 and Figs. 4, 5). This well-supported resolution suggests that the PHOT loci can be a good source of data to infer relationships among closely related taxa.

Congruence and incongruence between gene trees
For the backbone of the Verbena complex phylogeny, nuclear PHOT1, PHOT2, and waxy gene trees (for the waxy gene tree, see Yuan and Olmstead, 2008) all support the monophyly of Verbena and Glandularia [excluding G. crithmifolia (Gill. & Hook. ex Hook.) Schnack & Covas], which corroborates our chloroplast transfer hypothesis in explaining the nonmonophyly of these genera recovered from chloroplast DNA sequences (Yuan and Olmstead, 2008). Meanwhile, they also consistently suggest that the genus Junellia is not monophyletic and that Glandularia crithmifolia is more closely related to Junellia species than Glandularia species. However, the three nuclear gene trees are not completely congruent, even on the phylogenetic backbone. While PHOT1, waxy, and chloroplast gene trees suggest that Glandularia and Verbena are more closely related to each other than either to Junellia, PHOT2 gene trees indicate that a Junellia clade (Junellia I, Fig. 5) is sister to Verbena. The most likely explanation for this unexpected position of the Junellia I clade is incomplete lineage sorting, given the inference that the Verbena complex is a recent and rapidly diversifying group (Yuan and Olmstead, 2008). On the other hand, while PHOT2, waxy, and chloroplast gene trees all suggest that both groups of Junellia (Junellia I and II) are monophyletic, the PHOT1 gene tree indicates that neither of these two groups is monophyletic (Fig. 5). The nonmonophyly of Junellia I is due to the unexpected position of J. uniflora (highlighted by a thickened zigzag line in Fig. 4), which can be explained by a putative ancestral recombination (discussed later). The nonmonophyly of Junellia II, however, is most likely to be explained by incomplete lineage sorting again.

On a finer level, incongruence among gene trees is probably due to incomplete lineage sorting and possibly some recent gene flow. However, there are also cases in which different gene trees are congruent. Here we will discuss one exemplar case of each from the genus Verbena. The congruence example comes from a North American (NA) species, V. menthifolia (in boldface in Figs. 4 and 5). Both PHOT gene trees suggest that one of the two alleles of this species is very similar to V. halei, whereas the other allele is somehow quite isolated from the rest of NA species. The waxy gene tree shows the same pattern; cpDNA does not give much resolution with regard to the relationships among these species (Yuan and Olmstead, 2008). These nuclear gene trees consistently indicate that V. menthifolia, at least the individual sampled here, is of hybrid origin, with V. halei as one of the putative parental species. An extensive sampling across the distribution range of this species is necessary to test whether the entire species is of hybrid origin. The incongruence example pertains to the relationship between South American (SA) and NA Verbena. In Figs. 4 and 5, SA Verbena species are shaded; the rest of Verbena are NA species. The PHOT1 gene tree (Fig. 4) resolves both SA and NA Verbena as monophyletic, albeit not well supported due to the apparent sequence similarity between V. litoralis (SA) and the isolated V. menthifolia (NA) allele. In contrast, the PHOT2 gene tree shows that neither NA nor SA Verbena is monophyletic; both are polyphyletic. The waxy gene tree shows an intermediate scenario: the NA group is monophyletic, but the SA group is paraphyletic and forms a basal grade (Yuan and Olmstead, 2008).

The incongruence on some parts of the gene trees highlights one of the two major impediments to inferring phylogenetic relationships among recently diverged species that we mentioned earlier (see Introduction). The fact that, given insufficient evolutionary time, different genes are expected to show different phylogenetic histories in a stochastic fashion due to random lineage sorting has only recently been widely appreciated in the systematics community (Maddison, 1997; Degnan and Salter, 2005; Degnan and Rosenberg, 2006; Maddison and Knowles, 2006; Ané et al., 2007; Carstens and Knowles, 2007; Edwards et al., 2007; Kubatko and Degnan, 2007; Liu and Pearl, 2007). The critical importance of this issue has stimulated active development of analytic methods in recent years that take stochastic lineage sorting into account to infer phylogenies at species or population level using multilocus data (Maddison and Knowles, 2006; Ané et al., 2007; Carstens and Knowles, 2007; Edwards et al., 2007; Liu and Pearl, 2007). The gene trees reported here and in the previous study (Yuan and Olmstead, 2008) are fundamental bases for our future work of inferring the species tree of the Verbena complex.

Is it still necessary to develop general markers for phylogenetic studies?
As we stated in the introduction, one of the goals of this paper is to develop the PHOT gene paralogs as a new source of data to infer nuclear gene trees. But in the "phylogenomic era" (Delsuc et al., 2005), why is it still necessary to develop general markers from only a handful of loci? First, for nonmodel taxa, genomic-level data are still scanty for phylogenetic purposes, particularly in plant phylogenetic studies. The limited plant "phylogenomics" data, essentially all from organelle genomes (i.e., chloroplast and mitochondria) rather than the far more information-rich nuclear genomes, are primarily used to depict the general picture of relationships among major plant groups (e.g., Qiu et al., 2006). Although organelle genome sequencing will undoubtedly become routine for inferring phylogenies in the near future, application of organelle genome data at the species or population level, which often requires sampling a large number of individuals, will probably need to wait for several years. But meanwhile, phylogenetic studies at the species and/or population level are emerging as one of the major interests of the systematics community, so there is still a need for general nuclear gene markers. Second, plant organelle genomes may not possess enough variation for inter- and intraspecies level problems. The fact that ~5.3 kb of relatively rapidly evolving noncoding chloroplast DNA barely gives any resolution within the genus Verbena or Glandularia (Yuan and Olmstead, 2008) makes us suspect the sufficiency of entire chloroplast genomes, comprised mostly of conserved coding sequences, in resolving relationships between closely related taxa. Finally, and most importantly, even when organelle genome sequences become readily available at the species or population level and if these data are sufficient to generate a well-resolved gene tree, the lack of recombination means that the entire organelle genome is just one "coalescence gene" (Hudson, 1990). Multiple loci (i.e., multiple independent gene trees) are essential to increase accuracy of estimating species tree from gene trees (Maddison and Knowles, 2006; Knowles and Carstens, 2007). Therefore, nuclear loci are indispensable sources of data to infer inter- or intraspecies relationships.

Despite the paramount importance of nuclear loci for inferring relationships among closely related taxa, there are few broadly applicable nuclear DNA regions available to plant systematists who study a specific group. Screening appropriate loci often takes much time and is not of interest to most empirical systematists (but see some recent efforts in identifying large numbers of conserved ortholog set [COS] markers; Fulton et al., 2002; Wu et al., 2006). Therefore, beyond our primary focus on the Verbena complex, a set of broadly applicable primers was optimized for each of the PHOT gene duplicates for amplifying and sequencing Lamiales (Fig. 1), an angiosperm clade consisting of some 23 families and 22 000 species, so that systematists who study phylogenetics of closely related taxa in Lamiales can easily access these loci.

Dramatic intron size variation
Nuclear gene intron size evolves rapidly
Five of the six introns of PHOT1 and one of the six introns of PHOT2 sequenced in this study have some substantial size variation; among them intron 13 of PHOT1 represents the most dramatic variation (Figs. 2 and 3). Within such a recently radiated group as the Verbena complex, the size of intron 13 of PHOT1 varies from 193 to 2468 bp—over 12-fold. Even among very closely related species, there is conspicuous variation. For example, one South American Verbena species is 2.5 times longer than other South American Verbena species (638 vs. 250 bp) in intron 8 of PHOT1; one of the two alleles of Glandularia tenera is two times longer than the other allele (939 vs. 493 bp) in intron 12 of PHOT1. These results suggest that nuclear gene intron size evolves rapidly at the PHOT1 locus. Perhaps empirical systematists using nuclear gene sequences need to be cautious when cutting the band of the "expected size" from a gel when the PCR products contain multiple molecules of variable size.
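
The fold differences quoted above follow directly from the reported intron lengths. The short Python sketch below recomputes them from the values given in the text; the function and variable names are hypothetical and serve only to make the arithmetic explicit.

```python
def fold_change(longer_bp, shorter_bp):
    """Ratio of the longer intron length to the shorter one."""
    return longer_bp / shorter_bp


# Intron lengths (bp) as reported in the text: (longer, shorter).
COMPARISONS = {
    "PHOT1 intron 13, range across the Verbena complex": (2468, 193),
    "PHOT1 intron 8, one SA Verbena species vs. the others": (638, 250),
    "PHOT1 intron 12, the two Glandularia tenera alleles": (939, 493),
}

for label, (longer_bp, shorter_bp) in COMPARISONS.items():
    print(f"{label}: {fold_change(longer_bp, shorter_bp):.1f}-fold")
```

The three ratios come to roughly 12.8, 2.6, and 1.9, consistent with the "over 12-fold", "2.5 times", and "two times" figures stated in the paragraph.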

Source of intron size variation
What causes the intron size variation we observed? There appear to be a variety of sources of variation, but extensive invasion of short transposable elements (100–1000 bp) and frequent long deletions and insertions of unknown causes are largely responsible. For example, in PHOT1 intron 8 (Fig. 2), one species has an exceptionally long intron due to a MITE (miniature inverted-repeat transposable element) insertion (MITE1, Fig. 4). In PHOT1 intron 13, a MITE insertion occurred in the common ancestor of the Junellia I, Verbena, and Glandularia clades (MITE2, Fig. 4), which explains why the Junellia II species are shorter than all others (Fig. 2; some Glandularia species had further deletions). Later in the same intron, a long insertion of unknown origin occurred in the common ancestor of Verbena, explaining why Verbena has a bigger intron than others (Fig. 2). Subsequently, a second MITE insertion (MITE3, Fig. 4) occurred in V. litoralis within the originally inserted MITE, and an independent second MITE insertion (MITE4, Fig. 4) occurred in V. hispida within the same original MITE but at a different position. It is notable that these MITEs are all evolutionarily related but are quite different in nucleotide sequence, which suggests there have been active MITE transpositions during the evolution of the Verbena complex. It is also well known that transposable element insertion is a common source of genome size variation (Bennetzen et al., 1998; SanMiguel and Bennetzen, 1998; Vitte and Panaud, 2005; Hawkins et al., 2006). The size reduction of this intron in some Glandularia species is caused by a few independent long deletions of unknown mechanisms (Fig. 2). In PHOT1 intron 12, the insertion found in one of the two alleles of Glandularia tenera resulted from a transfer of a piece of chloroplast psaA gene. In PHOT2 intron 10 (Fig. 3), a long deletion occurred in the common ancestor of Glandularia, resulting in Glandularia having a smaller intron than Junellia and Verbena. In the same intron, an ~280-bp duplication of a direct repeat of the neighboring sequence was found in the common ancestor of a clade of Verbena species that includes V. neomexicana var. hirtella (Fig. 5).

Intron size change as evolutionary signature
Every insertion or deletion event mentioned left a signature that can be used to trace evolutionary history and assess phylogenetic relationships. For instance, the original MITE insertion in PHOT1 intron 13 (MITE2, Fig. 4) is a molecular signature defining the clade consisting of Verbena, Glandularia, and Junellia I, corroborating other evidence that Junellia is not monophyletic. Similarly, the big deletion in PHOT2 intron 10 is a synapomorphy for Glandularia and the ~280-bp insertion in PHOT2 intron 12 is a synapomorphy defining the monophyly of that group of Verbena species that have this insertion.
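One way to make this use of indels concrete is to code each event as a presence/absence character and ask which groups share the derived state. The sketch below is only an illustration at the clade level, using the distributions described in the text; the variable `indel_matrix` and the function `clades_sharing` are hypothetical names, and this is not the character coding used in the study.

```python
# Presence (1) or absence (0) of two indel events, coded per clade
# following the distributions described in the text.
indel_matrix = {
    "MITE2 insertion (PHOT1 intron 13)": {
        "Junellia I": 1, "Junellia II": 0, "Verbena": 1, "Glandularia": 1,
    },
    "long deletion (PHOT2 intron 10)": {
        "Junellia I": 0, "Junellia II": 0, "Verbena": 0, "Glandularia": 1,
    },
}


def clades_sharing(event):
    """Return the clades that carry the derived state (1) for an event."""
    states = indel_matrix[event]
    return sorted(clade for clade, state in states.items() if state == 1)


for event in indel_matrix:
    print(event, "->", clades_sharing(event))
```

Coded this way, the MITE2 insertion groups Junellia I with Verbena and Glandularia while excluding Junellia II, which is the pattern cited as evidence that Junellia is not monophyletic.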

Footprint of ancestral recombination
Phylogenetic analysis of the entire sequence from exons 8–14 of PHOT1 put Junellia uniflora in a very unexpected position (Fig. 4), in conflict with all other evidence that suggests J. uniflora belongs to the monophyletic Junellia I group. Visual inspection of the sequence and algorithm-based detection revealed that the first part (exon 8 to the end of intron 11) is quite similar to the Junellia II sequences, whereas the second part (exons 12–14) is clearly Junellia I type. We have resequenced this taxon to confirm that this "mosaic feature" is not an artifact.

In gene trees from the partitioned data sets, J. uniflora is grouped with Junellia II species using the first part of the sequence (Fig. 6A), but groups with Junellia I species when using the second part (Fig. 6B). Two hypotheses can explain the origin of this mosaic sequence: (1) recent gene flow between J. uniflora (a species belonging to Junellia I clade) and some Junellia II species along with recombination, and (2) ancient recombination between ancestral Junellia alleles followed by lineage sorting.
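The logic of the partitioned comparison can be sketched with a simple distance calculation: split the alignment at the putative breakpoint and ask which reference sequence the query is closest to in each part. The Python example below uses short made-up sequences, an arbitrary breakpoint, and uncorrected p-distances purely for illustration; none of the sequences, names, or the breakpoint position come from the study, and this is not the recombination detection method used there.

```python
# Toy, gap-free aligned sequences (hypothetical, 20 sites each).
# The "mosaic" query matches reference II before the breakpoint
# and reference I after it.
sequences = {
    "Junellia I": "ACGTACGTACGGATCCGGAT",
    "Junellia II": "ACCTATGTTCGGTTCAGGTT",
}
mosaic_query = "ACCTATGTTCGGATCCGGAT"
BREAKPOINT = 10  # hypothetical breakpoint position in the toy alignment


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_reference(query, references, start, end):
    """Name of the reference with the smallest p-distance over [start, end)."""
    return min(
        references,
        key=lambda name: p_distance(query[start:end], references[name][start:end]),
    )


first_part = closest_reference(mosaic_query, sequences, 0, BREAKPOINT)
second_part = closest_reference(mosaic_query, sequences, BREAKPOINT, len(mosaic_query))
print("first partition groups with:", first_part)
print("second partition groups with:", second_part)
```

A query that switches its closest reference between partitions, as the toy sequence does here, is the kind of pattern that partitioned gene trees make visible; real data would call for a proper substitution model and a statistical test of the breakpoint.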

Multiple lines of evidence favor the ancestral-recombination-followed-by-lineage-sorting over the recent-gene-flow-along-with-recombination hypothesis. First, if there is recent gene flow between J. uniflora and some Junellia II species, one would expect to find some Junellia II alleles introduced to J. uniflora at other loci besides PHOT1, but our PHOT2 and waxy data show a clean J. uniflora genetic background. Second, recent gene flow with recombination should result in two reciprocally mosaic alleles instead of only one allele, as recovered in this study after we amplified and sequenced this taxon twice. Furthermore, the Junellia I clade is more closely related to Verbena and Glandularia than to the Junellia II clade. Recent gene flow is less likely between the two Junellia clades than within each clade. On the other hand, an ancient recombination between ancestral alleles before the divergence of Junellia I and II clades could provide sufficient time for the fixation of one of the recombinant alleles in J. uniflora during subsequent lineage sorting. This "ancient" history could also provide sufficient time to accumulate observed sequence divergences between J. uniflora and both Junellia I and Junellia II species represented by branch lengths in the gene trees (Fig. 6). Based on these concerns, we find it is most plausible that the mosaic feature of J. uniflora PHOT1 sequence is a footprint of an ancestral recombination event. It would be interesting to sample more individuals in the future to examine whether this pattern is found in the entire species.

Appendix 1. List of taxa included in this study but not in Yuan and Olmstead (2008) with their sampling locality, voucher information, native continents, and chromosome number counts. Chromosome number counts are from Chromosome numbers of flowering plants (Fedorov, 1969). SA = South America, NA = North America.


Appendix 2. Extra primers, besides the primers in Fig. 1, that were specifically designed for sequencing large introns of the Verbena complex. For sequencing the most variable intron of PHOT1 (intron 13), in13F1, in13R1, in13F2, and in13R2 were used in Verbena, Junellia, and some Glandularia species; in13F3 and in13R3 were used for sequencing a big insertion unique to Verbena; NAVF1, NAVR1, NAVF2, NAVR2, NAVF3, and NAVR3 were mainly used for sequencing through a microsatellite region with tandem AT repeats and a poly A region in some North American Verbena species. VlitF and VlitR were used for sequencing through a MITE (miniature inverted-repeat transposable element) insertion that was only found in V. litoralis. VbonF and VbonR were used to sequence through a MITE insertion that is unique to V. bonariensis. PHOT2 was easier to sequence than PHOT1. Primers in10F and in10R were used to sequence intron 10 of PHOT2, and in12F1, in12R, and in12F2 were used to sequence intron 12.



FOOTNOTES

1 The authors thank M. Simmons and two anonymous reviewers for critical comments on this manuscript. This research was supported by a Graduate Fellowship in Molecular Systematics from the University of Washington Department of Biology, an NSF Grant (DEB-0542493) to R.G.O., and an NSF Doctoral Dissertation Improvement Grant DEB-0710026 to R.G.O. for Y.W.Y.

2 Author for correspondence (e-mail: colreeze{at}u.washington.edu)

LITERATURE CITED

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control 19: 716–723.

Ané, C., B. Larget, D. A. Baum, S. D. Smith, AND A. Rokas. 2007. Bayesian estimation of concordance among gene trees. Molecular Biology and Evolution 24: 412–426.

Bennetzen, J. L., P. SanMiguel, M. S. Chen, A. Tikhonov, M. Francki, AND Z. Avramova. 1998. Grass genomes. Proceedings of the National Academy of Sciences, USA 95: 1975–1978.

Botta, S. M. 1989. Studies in the South American genus Junellia (Verbenaceae, Verbenoideae). I. Delimitation and infrageneric divisions. Darwiniana (San Isidro) 29: 371–396.

Briggs, W. R., C. F. Beck, A. R. Cashmore, J. M. Christie, J. Hughes, J. A. Jarillo, T. Kagawa et al. 2001. The phototropin family of photoreceptors. Plant Cell 13: 993–997.

Carstens, B. C., AND L. L. Knowles. 2007. Estimating species phylogeny from gene-tree probabilities despite incomplete lineage sorting: An example from Melanoplus grasshoppers. Systematic Biology 56: 400–411.

Christie, J. M., P. Reymond, G. K. Powell, P. Bernasconi, A. A. Raibekas, E. Liscum, AND W. R. Briggs. 1998. Arabidopsis NPH1: A flavoprotein with the properties of a photoreceptor for phototropism. Science 282: 1698–1701.

Degnan, J. H., AND N. A. Rosenberg. 2006. Discordance of species trees with their most likely gene trees. PLOS Genetics 2: 762–768.

Degnan, J. H., AND L. A. Salter. 2005. Gene tree distributions under the coalescent process. Evolution; International Journal of Organic Evolution 59: 24–37.

Delsuc, F., H. Brinkmann, AND H. Philippe. 2005. Phylogenomics and the reconstruction of the tree of life. Nature Reviews. Genetics 6: 361–375.

Edwards, S. V., L. Liu, AND D. K. Pearl. 2007. High-resolution species trees without concatenation. Proceedings of the National Academy of Sciences, USA 104: 5936–5941.

Fedorov, A. A. 1969. Chromosome numbers of flowering plants. Nauka, Leningrad, Russia.

Felsenstein, J. 1985. Confidence limits on phylogenies—An approach using the bootstrap. Evolution 39: 783–791.

Fitch, W. M. 1970. Distinguishing homologous from analogous proteins. Systematic Zoology 19: 99–113.

Fulton, T. M., R. Van der Hoeven, N. T. Eannetta, AND S. D. Tanksley. 2002. Identification, analysis, and utilization of conserved ortholog set markers for comparative genomics in higher plants. The Plant Cell 14: 1457–1467.

Gibbs, M. J., J. S. Armstrong, AND A. J. Gibbs. 2000. Sister-Scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics (Oxford, England) 16: 573–582.

Hawkins, J. S., H. Kim, J. D. Nason, R. A. Wing, AND J. F. Wendel. 2006. Differential lineage-specific amplification of transposable elements is responsible for genome size variation in Gossypium. Genome Research 16: 1252–1261.

Hudson, R. R. 1990. Gene genealogies and the coalescent process. In D. Futuyma, and J. Antonovics [eds.], Oxford surveys in evolutionary biology, vol. 7, 1–44. Oxford University Press, Oxford, UK.

Hughes, C. E., R. J. Eastwood, AND C. D. Bailey. 2006. From famine to feast? Selecting nuclear DNA sequence loci for plant species-level phylogeny reconstruction. Philosophical Transactions of the Royal Society, B. Biological Sciences 361: 211–225.

Jarillo, J. A., H. Gabrys, J. Capel, J. M. Alonso, J. R. Ecker, AND A. R. Cashmore. 2001. Phototropin-related NPL1 controls chloroplast relocation induced by blue light. Nature 410: 952–954.

Kagawa, T., T. Sakai, N. Suetsugu, K. Oikawa, S. Ishiguro, T. Kato, S. Tabata et al. 2001. Arabidopsis NPL1: A phototropin homolog controlling the chloroplast high-light avoidance response. Science 291: 2138–2141.

Kinoshita, T., M. Doi, N. Suetsugu, T. Kagawa, M. Wada, AND K. Shimazaki. 2001. phot1 and phot2 mediate blue light regulation of stomatal opening. Nature 414: 656–660.

Kubatko, L. S., AND J. H. Degnan. 2007. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Systematic Biology 56: 17–24.

Liu, L., AND D. K. Pearl. 2007. Species trees from gene trees: Reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Systematic Biology 56: 504–514.

Maddison, W. P. 1997. Gene trees in species trees. Systematic Biology 46: 523–536.

Maddison, W. P., AND L. L. Knowles. 2006. Inferring phylogeny despite incomplete lineage sorting. Systematic Biology 55: 21–30.

Martin, D. P., C. Williamson, AND D. Posada. 2005. RDP2: Recombination detection and analysis from sequence alignments. Bioinformatics (Oxford, England) 21: 260–262.

Posada, D., AND K. A. Crandall. 1998. MODELTEST: Testing the model of DNA substitution. Bioinformatics (Oxford, England) 14: 817–818.

Qiu, Y. L., L. B. Li, B. Wang, Z. D. Chen, V. Knoop, M. Groth-Malonek, O. Dombrovska et al. 2006. The deepest divergences in land plants inferred from phylogenomic evidence. Proceedings of the National Academy of Sciences, USA 103: 15511–15516.

Rambaut, A. 1996. Se-Al: Sequence Alignment Editor. Website http://evolve.zoo.ox.ac.uk/. University of Oxford, Oxford, UK.

Ronquist, F., AND J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics (Oxford, England) 19: 1572–1574.

Sanders, R. W. 2001. The genera of Verbenaceae in the southeastern United States. Harvard Papers in Botany 5: 303–358.

SanMiguel, P., AND J. L. Bennetzen. 1998. Evidence that a recent increase in maize genome size was caused by the massive amplification of intergene retrotransposons. Annals of Botany 82: 37–44.

Shaw, J., E. B. Lickey, J. T. Beck, S. B. Farmer, W. S. Liu, J. Miller, K. C. Siripun et al. 2005. The tortoise and the hare II: Relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. American Journal of Botany 92: 142–166.

Shaw, J., E. B. Lickey, E. E. Schilling, AND R. L. Small. 2007. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: The tortoise and the hare III. American Journal of Botany 94: 275–288.

Simmons, M. P. 2004. Independence of alignment and tree search. Molecular Phylogenetics and Evolution 31: 874–879.

Small, R. L., J. A. Ryburn, R. C. Cronn, T. Seelanan, AND J. F. Wendel. 1998. The tortoise and the hare: Choosing between noncoding plastome and nuclear Adh sequences for phylogeny reconstruction in a recently diverged plant group. American Journal of Botany 85: 1301–1315.

Smith, J. M. 1992. Analyzing the mosaic structure of genes. Journal of Molecular Evolution 34: 126–129.

Swofford, D. L. 2002. PAUP*: Phylogenetic analysis using parsimony (*and other methods), version 4b10. Sinauer, Massachusetts, USA.

Vitte, C., AND O. Panaud. 2005. LTR retrotransposons and flowering plant genome size: emergence of the increase/decrease model. Cytogenetic and Genome Research 110: 91–107.

Wu, F. N., L. A. Mueller, D. Crouzillat, V. Pétiard, AND S. D. Tanksley. 2006. Combining bioinformatics and phylogenetics to identify large sets of single-copy orthologous genes (COSII) for comparative, evolutionary and systematic studies: A test case in the euasterid plant clade. Genetics 174: 1407–1420.

Yuan, Y. W., AND R. G. Olmstead. 2008. A species-level phylogenetic study of the Verbena complex (Verbenaceae) indicates two independent intergeneric chloroplast transfers. Molecular Phylogenetics and Evolution 48: 23–33.

