Systematics and Phytogeography
2Department of Biology, Duke University, Durham, North Carolina 27708 USA; 3Department of Pomology, University of California, One Shields Avenue, Davis, California 95616 USA
Received for publication February 25, 2004. Accepted for publication September 9, 2004.
| ABSTRACT |
|---|
Key Words: biogeography; chloroplast DNA; hybridization; Neillieae; phylogeny; ribosomal DNA; second intron of LEAFY
| INTRODUCTION |
|---|
The tribe Neillieae (Rosaceae), comprising three taxonomically difficult genera, Neillia D. Don, Physocarpus (Cambess.) Raf., and Stephanandra Siebold & Zucc. (Maximowicz, 1879; Schulze-Menz, 1964), is an appropriate system for studying the historical biogeography of the Northern Hemisphere. Neillieae is distributed in eastern Asia and both western and eastern North America. While Neillia, Stephanandra, and P. amurensis are distributed in eastern Asia, P. alternans, P. capitatus, P. malvaceus, and P. monogynus are found in western North America and P. opulifolius occurs in eastern North America. Monophyly of Neillieae has been strongly supported by chloroplast DNA (cpDNA) sequence data, including rbcL (Morgan et al., 1994) and matK and trnL-trnF genes (Potter et al., 2002). Morphologically, members of Neillieae are characterized by lobed leaves with persistent or deciduous stipules and ovoid shiny seeds with copious endosperm (Vidal, 1963). The total number of species of this tribe is relatively small (18 species), making it amenable to analysis using different kinds of character systems and phylogenetic methods. To date, the historical biogeography of Neillieae has not been studied.
In addition, Neillieae has been in need of a comprehensive systematic study using modern methods to analyze both molecular and morphological data. The morphological characters used to distinguish each genus often vary within as well as among genera (Table 1), and because of different interpretations of the morphological variation by many taxonomists, conflicting classification schemes have been proposed. For instance, Bentham and Hooker (1865), Greene (1889), and Jones (1893) treated Physocarpus as part of Neillia, with Stephanandra as a separate genus, whereas Kuntze (1891) classified all species of Neillieae in Physocarpus. Although many modern authors recognize three genera in Neillieae (Rehder, 1940; Schulze-Menz, 1964; Robertson, 1974; Takhtajan, 1997), no comprehensive systematic or phylogenetic study of all species has been made until now. Vidal (1963) and Cullen (1971) published revisionary studies of Neillia and briefly discussed morphological relationships among the three genera. Their studies, however, concentrated on Neillia only, and the characteristics were not evaluated phylogenetically. Both Physocarpus and Stephanandra have been treated in regional floristic manuals (Rydberg, 1908; Ohwi, 1965; Fernald, 1970; Yu and Ku, 1974; Gleason and Cronquist, 1991; Holmgren, 1997).
The sequences used in this study are divided into three groups, here designated molecular character systems. The first of these comprises sequence data of five regions of cpDNA. This includes sequences of the trnL-trnF, trnD-trnT, psbA-trnK, and matK-trnK regions, which have been widely used as a valuable source of data for studying phylogenetic relationships at the specific and generic levels in angiosperms (Mort et al., 2002; Smedmark and Eriksson, 2002; Miller et al., 2003).
The second character system comprises sequence data for the ETS region in addition to ITS of nuclear ribosomal DNA (rDNA). While the ITS region has been widely used for elucidating phylogenetic relationships among closely related species in angiosperms (reviewed by Baldwin et al., 1995), the ETS region, flanked by the nontranscribed spacer (NTS) and the 18S ribosomal gene, has not been used as widely as ITS because general primers are not available in most groups of angiosperms. ETS has, however, been used as a valuable source of data for phylogenetic studies at lower taxonomic levels in Asteraceae (Baldwin and Markos, 1998; Linder et al., 2000; Markos and Baldwin, 2001; Lee et al., 2003; Morgan, 2003; Saar et al., 2003), Cyperaceae (Starr et al., 2003), Fabaceae (Bena et al., 1998), and Malvaceae (Andreasen and Baldwin, 2001). Phylogenetic studies using ETS sequences have shown that the ETS region has a higher percentage of phylogenetically informative characters than ITS and that, when combined with ITS data, ETS data improve phylogenetic resolution and increase bootstrap support compared to a phylogeny based on the ITS region alone (Baldwin and Markos, 1998; Markos and Baldwin, 2001; Morgan, 2003).
The third molecular character system is derived from LEAFY, a nuclear homeotic gene that regulates the establishment of floral meristem identity and flowering time in Arabidopsis (Weigel, 1995; Blázquez et al., 1997). The gene is distributed in all plants including mosses, ferns and "fern allies," gymnosperms, and angiosperms (Frohlich and Parker, 2000; Himi et al., 2001). Phylogenetic analyses of amino acid sequences of LEAFY suggest that the gene was duplicated on the stem lineage leading to seed plants, but that one copy was lost in angiosperms, making it a single-copy gene in diploid angiosperms (Frohlich and Parker, 2000; Himi et al., 2001). The nucleotide sequences of the second intron of the gene have been used in phylogenetic analysis of Amorphophallus (Grob et al., 2004), Fagopyrum (Nishimoto et al., 2003), Gnetum (Won and Renner, 2003), Isoëtes (Hoot and Taylor, 2001), and Sphagnum (Shaw et al., 2003), as well as in our previous analyses of Neillia and Stephanandra (Oh and Potter, 2003).
| MATERIALS AND METHODS |
|---|
Phylogenetic analyses of Rosaceae based on various nucleotide sequence data have not resolved the sister group of Neillieae (Morgan et al., 1994; Potter et al., 2002). We used Lyonothamnus and Vauquelinia as outgroups because sequences of rDNA and cpDNA from these two taxa are easily aligned to those from Neillieae. Lyonothamnus is sister to the large clade in which Neillieae is nested, and Vauquelinia is nested in the sister clade of the tribe Neillieae (Morgan et al., 1994; Potter et al., 2002).
Gene sampling
We examined three regions of DNA: (1) several regions of chloroplast DNA (trnL-trnF, trnD-trnT, matK-trnK, and psbA-trnK), (2) spacer regions of nrDNA (ITS and ETS), and (3) the second intron of LEAFY.
Each region was amplified via polymerase chain reaction (PCR) from total DNA isolated from fresh or silica gel-dried young leaves using a DNeasy Plant Mini kit (Qiagen, Valencia, California, USA). For two accessions (N. sparsiflora and P. alternans 175), we extracted total DNA from herbarium specimens using the CTAB method (Doyle and Doyle, 1987).
Primer sequences, PCR conditions, cloning, and sequencing procedures for most of the regions (trnL-trnF, trnD-trnT, matK-trnK, ITS, and the second intron of LEAFY) are described in Oh and Potter (2003), while those for the ETS region are described separately later. The chloroplast psbA-trnK region was amplified using trn2 and psbA3 primers (Appendix 2, see Supplemental Data accompanying online version of this article) and was sequenced in both directions with the same PCR primer set. All regions were sequenced directly from PCR products except for LEAFY, for which sequences were primarily determined via cloning (Oh and Potter, 2003). Nucleotide sequences of trnL-trnF, trnD-trnT, matK-trnK, ITS, and the second intron of LEAFY from Neillia except for N. sparsiflora, Stephanandra, P. amurensis, and P. capitatus accession 082 were derived from Oh and Potter (2003); all other sequences were determined in this study (GenBank accession numbers are in Appendix 1).
All sequences were determined at the Division of Biological Sciences sequencing facility on the UC Davis campus, which uses an ABI PRISM 377 DNA Sequencer or an ABI PRISM 3100 Genetic Analyzer (PE Biosystems, Foster City, California, USA). Sequences were edited in Sequencher version 4.1 (Gene Codes Corporation, Ann Arbor, Michigan, USA), and IUPAC ambiguity symbols were used for uncertain and polymorphic sites.
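The use of IUPAC ambiguity symbols for polymorphic sites can be illustrated with a minimal Python sketch. The function names and the toy reads are hypothetical and are not part of the editing workflow in Sequencher; the sketch only shows how a set of observed bases maps to a single IUPAC symbol.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_symbol(bases):
    """Return the IUPAC symbol for an iterable of observed bases (A, C, G, T)."""
    return IUPAC_CODES[frozenset(b.upper() for b in bases)]


def consensus_with_ambiguity(reads):
    """Column-wise consensus of equal-length reads (hypothetical helper).

    Sites where the reads disagree are written as an ambiguity symbol.
    """
    return "".join(iupac_symbol(set(column)) for column in zip(*reads))


if __name__ == "__main__":
    # Two toy reads that differ at the third site (G vs. A -> R).
    print(consensus_with_ambiguity(["ACGT", "ACAT"]))
```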
Two divergent types of cloned sequence of the second intron of LEAFY were found in P. opulifolius, but only one of the two types, not both, was discovered in each accession (see Results). The two types differed in the number of XbaI cleavage sites. To test the possibility that both types were present in PCR products but that only one was selected in the cloning procedure, 1 µg of PCR products was digested with XbaI for each accession, and the digested DNA was separated in a 1.5% agarose gel.
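The logic of the restriction assay can be sketched in silico. The example below is an illustration only: the sequences are invented toy strings, not LEAFY intron sequences, and the helper names are hypothetical. It relies on the XbaI recognition sequence TCTAGA, which the enzyme cuts after the first T, and shows how two sequence types with different numbers of sites yield different fragment patterns.

```python
XBAI_SITE = "TCTAGA"  # XbaI recognition sequence; cut occurs after the first T


def xbai_cut_positions(seq):
    """Return 0-based positions at which XbaI would cut a linear sequence."""
    seq = seq.upper()
    cuts = []
    start = seq.find(XBAI_SITE)
    while start != -1:
        cuts.append(start + 1)  # cut falls between T and CTAGA
        start = seq.find(XBAI_SITE, start + 1)
    return cuts


def fragment_lengths(seq):
    """Return the fragment lengths expected after complete digestion."""
    bounds = [0] + xbai_cut_positions(seq) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Toy sequences (assumptions): one site vs. two sites.
    type_one = "GGGGTCTAGACCCC"
    type_two = "GGGGTCTAGACCCCTCTAGAAAAA"
    print(fragment_lengths(type_one))
    print(fragment_lengths(type_two))
```

If both sequence types were present in a PCR product, the gel would be expected to show the union of the two fragment patterns rather than only one of them.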
ETS primers and amplification
Because universal primers for ETS are not available, we followed the general procedure of Baldwin and Markos (1998) to develop ETS primers for Neillieae. The entire intergenic spacer (IGS) of rDNA was amplified using primers IGS3 and IGS8 (Fig. 1; Appendix 2) for Physocarpus capitatus and primers 26S-IGS (Baldwin and Markos, 1998) and IGS88 (Fig. 1; Appendix 2) for Aruncus dioicus (Walter) Fernald, which was included as an exemplar for a distantly related group (Morgan et al., 1994; Potter et al., 2002). The PCR primers in the 18S gene (IGS88 and IGS8) are located ca. 300 base pairs (bp) downstream from the 5' end of the 18S rDNA gene (Fig. 1), which allowed us to determine whether or not we had amplified the desired region by checking sequences for the presence of a portion of the highly conserved 18S gene. PCR amplifications were carried out with the Perkin-Elmer GeneAmp II kit with AmpliTaq Gold DNA polymerase (PE Biosystems) and Taq Extender PCR Additive (Stratagene, La Jolla, California, USA) as follows: a hot start at 95°C for 10 min; 40 cycles of denaturation at 95°C for 30 s, primer annealing at 50°C for 1 min, and primer extension at 72°C for 5 min; followed by a final extension at 72°C for 7 min. The complete IGS sequences were determined using five additional nested sequencing primers (two primers for P. capitatus, three for A. dioicus) as well as PCR primers. The nested sequencing primers are not listed in Appendix 2, but sequences for those primers are available from the first author upon request.
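The cycling program above can be written down as a small data structure, which makes the programmed hold time easy to check. This is a sketch only: the dictionary layout and function name are assumptions, and ramp times between temperatures are ignored.

```python
# IGS amplification program as stated in the text: (temperature in C, seconds).
IGS_PROGRAM = {
    "hot_start": (95, 10 * 60),
    "cycles": 40,
    "cycle_steps": [
        ("denaturation", 95, 30),
        ("annealing", 50, 60),
        ("extension", 72, 5 * 60),
    ],
    "final_extension": (72, 7 * 60),
}


def total_seconds(program):
    """Sum the programmed hold times; ramping between steps is not counted."""
    per_cycle = sum(seconds for _, _, seconds in program["cycle_steps"])
    return (
        program["hot_start"][1]
        + program["cycles"] * per_cycle
        + program["final_extension"][1]
    )


if __name__ == "__main__":
    print(total_seconds(IGS_PROGRAM) / 60, "minutes of programmed hold time")
```

The long 5-min extension step dominates the run, which is consistent with amplifying the entire intergenic spacer in one product.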
Sequence alignments
Sequences were aligned using Clustal X (Thompson et al., 1997) and adjusted manually as needed. All chloroplast sequences were concatenated to make the cpDNA data set. Because the ITS and ETS are parts of the rDNA repeat (Soltis and Soltis, 1998), sequences of these spacer regions were combined in phylogenetic analyses to make the rDNA data set. For the LEAFY data, all variable cloned sequences from each accession were included; however, nucleotide sequences from the two outgroup species were not included because of alignment problems.
A few sequences of cpDNA were not determined due to difficulties with the PCR or to lack of variability of nucleotide sequences across species as found in the preliminary survey, in which not all accessions in the particular species were sequenced. These sequences were treated as missing data. Of the 108 664 cells in the aligned cpDNA data matrix, 3925 (3.6%) cells were scored as missing. There were no missing cells in the rDNA and LEAFY data. Aligned data matrices along with phylogenetic trees were submitted to the TreeBase database (http://www.treebase.org/).
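The bookkeeping for concatenation and missing cells can be sketched as follows. The helper names and the toy partitions are hypothetical; the only figures taken from the text are the 3925 missing cells out of 108 664, which the last lines recompute as a percentage.

```python
def concatenate(partitions, lengths, taxa):
    """Concatenate aligned partitions into one matrix (hypothetical helper).

    partitions: list of dicts mapping taxon name -> aligned sequence.
    lengths:    aligned length of each partition.
    A taxon absent from a partition is padded with '?' (missing data).
    """
    matrix = {}
    for taxon in taxa:
        matrix[taxon] = "".join(
            part.get(taxon, "?" * n) for part, n in zip(partitions, lengths)
        )
    return matrix


def missing_fraction(matrix):
    """Fraction of cells in the matrix scored as '?'."""
    cells = sum(len(seq) for seq in matrix.values())
    missing = sum(seq.count("?") for seq in matrix.values())
    return missing / cells


if __name__ == "__main__":
    # Toy example: taxon B lacks the second region.
    toy = concatenate([{"A": "ACGT", "B": "ACGA"}, {"A": "GG"}], [4, 2], ["A", "B"])
    print(toy, missing_fraction(toy))
    # Figures reported for the cpDNA matrix in the text.
    print(round(100 * 3925 / 108664, 1), "% missing")
```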
Phylogenetic analyses
Separate phylogenetic analyses for each data set were conducted employing maximum parsimony (MP) and Bayesian methods. We used PAUP* version 4.0b10 (Swofford, 2002) for the parsimony analyses. All characters were treated as unordered and weighted equally. Gaps were treated as missing data, and multiple character states at a site were interpreted as uncertainty. Heuristic searches were used in all analyses to find the MP trees with 100 replicates of random taxon addition and tree bisection-reconnection (TBR) branch swapping, saving all of the best trees at each step (MulTrees). Bootstrap analyses (Felsenstein, 1985) with 500 pseudoreplicates were conducted with simple sequence addition and TBR branch swapping. No more than 1000 trees were saved for each pseudoreplicate for cpDNA and rDNA data. In the case of LEAFY data, the "fast" bootstrap option in PAUP* (Swofford, 2002) was used with 10 000 pseudoreplicates. Bayesian phylogenetic analyses were performed with MrBayes 3.0 (Huelsenbeck and Ronquist, 2001). A Metropolis-coupled Markov chain Monte Carlo (MCMCMC) algorithm was employed for 1 000 000 generations, sampling trees every 100 generations, with four independent chains running simultaneously. For the cpDNA and LEAFY data, the general time-reversible model (GTR; Swofford et al., 1996) with six rate parameters and the gamma distribution (Γ) was used, and for the rDNA data, the GTR + Γ model with two rate parameters was used to estimate the likelihood values. These evolutionary models were determined by the hierarchical likelihood ratio test using Modeltest version 3.06 (Posada and Crandall, 1998). In each analysis, all 10 001 resulting trees were imported into PAUP*, and a 50% majority-rule consensus tree was generated after discarding the first 201 trees (20 000 generations). These "burn-in" generations, for which the log-likelihood values had not reached a plateau, were determined by plotting a graph of the log-likelihoods of each generation vs. generation numbers (Huelsenbeck and Ronquist, 2001). Topological incongruence was evaluated based upon relative bootstrap support or Bayesian posterior probability (Mason-Gamer and Kellogg, 1996). We considered topological conflicts among data partitions to be significant if discordant relationships of a given set of taxa were supported with greater than 70% bootstrap support or 95% posterior probabilities.
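The conflict criterion in the last sentence can be expressed as a simple decision rule. The sketch below is an illustration under stated assumptions: the class and function names are hypothetical, both thresholds are read as strict inequalities, and a conflict is flagged only when each of the two discordant clades is itself well supported in its own data set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CladeSupport:
    """Support for one clade in one data partition (hypothetical container)."""
    bootstrap: Optional[float] = None  # percent, 0-100
    posterior: Optional[float] = None  # probability, 0-1


def is_well_supported(support, bs_cutoff=70.0, pp_cutoff=0.95):
    """True if either support measure exceeds its cutoff (strict, by assumption)."""
    bs_ok = support.bootstrap is not None and support.bootstrap > bs_cutoff
    pp_ok = support.posterior is not None and support.posterior > pp_cutoff
    return bs_ok or pp_ok


def significant_conflict(clade_in_a, clade_in_b):
    """Two discordant clades conflict significantly only if both are well supported."""
    return is_well_supported(clade_in_a) and is_well_supported(clade_in_b)


if __name__ == "__main__":
    strong = CladeSupport(bootstrap=92, posterior=1.0)
    weak = CladeSupport(bootstrap=55, posterior=0.80)
    print(significant_conflict(strong, strong))  # both well supported
    print(significant_conflict(strong, weak))    # one side weak
```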
In combined analyses of all three data sets, we concatenated all the sequencing results from all taxa. For LEAFY data, we randomly selected one cloned sequence per accession or used the direct sequencing result, if available. However, in some accessions of P. malvaceus and P. monogynus, two distinct sequence types were found within an accession (see Results). For these sequences, we included two representative cloned LEAFY sequences per accession in the combined data set and duplicated the cpDNA and rDNA sequences.
The combined data set was analyzed employing MP, maximum likelihood (ML), and Bayesian methods. The ML analysis utilized the GTR model with six rate parameters, the proportion of invariable sites (I) = 0.5523, and the shape parameter of the gamma distribution (Γ) = 0.7984, as determined by the hierarchical likelihood ratio test using Modeltest version 3.06 (Posada and Crandall, 1998). Heuristic searches with 100 replicates of random taxon addition, TBR branch swapping, and MulTrees options were used to find MP and ML trees with PAUP*. Reliability of each clade was evaluated by bootstrap proportions and Bayesian posterior probabilities. Bootstrap proportions for each clade were obtained only in the MP analysis, using 500 pseudoreplicates of the data with simple sequence addition, TBR branch swapping, and MulTrees options. Bayesian posterior probabilities were estimated in MrBayes. The MCMCMC algorithm was employed for 1 000 000 generations, sampling trees every 100 generations, with four independent chains running simultaneously. We applied two separate models for different partitions: GTR + Γ with six rate parameters for the cpDNA data and GTR + Γ with two rate parameters for the rDNA and reduced LEAFY data. The first 14 000 generations were eliminated as the "burn-in" generations, and a 50% majority-rule consensus tree was computed from the remaining trees.
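The tree counts implied by the sampling scheme can be checked with a few lines of arithmetic. The sketch assumes that the starting tree at generation 0 is counted as a sample, which reproduces the 10 001 trees and the 201-tree burn-in (20 000 generations) reported for the separate analyses. The corresponding count for the 14 000-generation burn-in is an inference under the same convention, not a figure stated in the text.

```python
def n_sampled_trees(generations, sample_every):
    """Number of sampled trees, counting the starting tree at generation 0."""
    return generations // sample_every + 1


def n_burnin_trees(burnin_generations, sample_every):
    """Number of sampled trees falling within the burn-in, generation 0 included."""
    return burnin_generations // sample_every + 1


if __name__ == "__main__":
    total = n_sampled_trees(1_000_000, 100)
    for burnin in (20_000, 14_000):
        discarded = n_burnin_trees(burnin, 100)
        print(burnin, "generations:", discarded, "discarded,", total - discarded, "retained")
```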
Biogeographic analyses
Ancestral distributions were reconstructed from a reduced species tree from the combined analysis with the DIVA program, version 1.1 (Ronquist, 1997). This dispersal-vicariance analysis assumes that speciation is caused by vicariance and reconstructs the optimal ancestral distribution using a parsimony criterion to minimize the dispersal and extinction events. Current distribution areas for the species of Neillieae were coded in three categories (eastern Asia, eastern North America, and western North America). Because the sister relationship of Neillieae is unclear (Kalkman, 1988; Morgan et al., 1994; Potter et al., 2002), several possible combinations of outgroup distributions were explored in the reconstructions.
Estimation of divergence time
Divergence times of Neillieae were estimated by the penalized likelihood method implemented in the program r8s (Sanderson, 2002), which allows evolutionary rates to vary across a phylogeny. This semiparametric method uses a smoothing parameter that controls the trade-off between rate smoothing and the fit of the data to the saturated model, in which each lineage is permitted to have a unique rate. If the smoothing parameter is set to zero, the method reduces to the saturated model, whereas an extremely high value of the smoothing parameter yields a molecular clock model, in which every lineage of a phylogeny has the same rate of change. The optimal smoothing parameter is chosen from cross-validation analysis of the data (Sanderson, 2002).
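The role of the smoothing parameter can be made explicit with a schematic form of the penalized likelihood objective. The notation here is an assumption introduced for illustration (rates $r_k$ on branches $k$, node times $T$, data $X$, smoothing parameter $\lambda$), and the exact penalty implemented in r8s may differ in detail.

```latex
\Psi(R, T) \;=\; \log P(X \mid R, T) \;-\; \lambda\,\Phi(R),
\qquad
\Phi(R) \;=\; \sum_{k \,\notin\, \text{root branches}} \bigl(r_k - r_{\mathrm{anc}(k)}\bigr)^{2}
\;+\; \operatorname{Var}\bigl(r_k : k \in \text{root branches}\bigr)
```

With $\lambda = 0$ the penalty vanishes and each branch is free to take its own rate; as $\lambda$ grows, differences between ancestral and descendant rates are penalized ever more heavily, so the rates converge on a single value, as in a molecular clock.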
For this analysis, we generated the ML tree of Rosaceae from the combined matK and trnL-trnF sequence data in Potter et al. (2002) to estimate the age of the most recent common ancestor (MRCA) of Neillieae. The ML tree was generated in PAUP* through heuristic searches with 100 replicates of random taxon addition, TBR branch swapping, and MulTrees options. The GTR + Γ model (Swofford et al., 1996) with six rate parameters and the gamma shape parameter (Γ = 0.6716) was used, as determined by the hierarchical likelihood ratio test using Modeltest version 3.06 (Posada and Crandall, 1998). The ML tree with estimated branch lengths was included as the source tree in the analysis of divergence times, and the outgroups (Rhamnus, Morus, and Ulmus; Potter et al., 2002) were excluded before the analysis. The age of the Rosaceae was fixed at 76 million years before the present (mya) based on the estimate of Wikström et al. (2001). We used the age of fossilized Prunus endocarps with enclosed seeds (Middle Eocene; Cevallos-Ferriz and Stockey, 1991) to calibrate the rescaled molecular tree. It is, however, uncertain whether the fossilized fruits belong to the crown group of Prunus or represent stem lineages leading to the crown group. In either case, the age of the stem group (Magallón and Sanderson, 2001), i.e., the age of divergence of Prunus from its sister clade, Maloideae s.l., should be older than the fossil Prunus age. We constrained the minimum age of the MRCA of Prunus and its sister clade to be 44.3 mya. The optimal smoothing parameter, determined by the cross-validation procedure using the truncated Newton (TN) algorithm (Sanderson, 2002), was set to 10.
Confidence intervals of the divergence times, derived from sampling of a limited number of nucleotide characters, were estimated by the nonparametric bootstrap procedure (Baldwin and Sanderson, 1998; Sanderson and Doyle, 2001). Two hundred bootstrap trees were generated using PAUP*, enforcing the original ML topology as a constraint at each bootstrapping step. These bootstrap trees have identical topologies, but their branch lengths vary across trees because data matrices used to estimate branch lengths were bootstrapped. Branch lengths were estimated under the ML criterion using the same model described earlier. These 200 phylograms were used as source trees to estimate divergence times in r8s.
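Once a node age has been estimated on each of the 200 bootstrap phylograms, the resulting distribution can be summarized as an interval. The sketch below uses a percentile interval with linear interpolation, which is one common choice and an assumption here; the text does not state how the bootstrap ages were summarized. The function name and the input ages are hypothetical.

```python
def percentile_interval(values, lower=2.5, upper=97.5):
    """Percentile interval of a sample, using linear interpolation between ranks."""
    ordered = sorted(values)

    def percentile(p):
        rank = (len(ordered) - 1) * p / 100
        low = int(rank)
        high = min(low + 1, len(ordered) - 1)
        return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)

    return percentile(lower), percentile(upper)


if __name__ == "__main__":
    # Placeholder ages (assumption): 200 evenly spaced values standing in for
    # the node ages estimated from the 200 bootstrap trees.
    ages = [float(i) for i in range(1, 201)]
    print(percentile_interval(ages))
```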
| RESULTS |
|---|
The amplification of the second intron of LEAFY produced a single band in agarose-gel electrophoresis except for P. alternans 253, from which an additional weak-intensity band was generated. All PCR products of LEAFY from species of Neillieae contained sequences of both exon 2 and exon 3 and intron/exon boundary sequences. The multiple alignment of the LEAFY sequences indicated that the longer sequence from the additional faint band found in P. alternans 253 resulted from a unique insertion (257 bp) in the intron. With the exception of the longer sequence, the unaligned length of the second intron of LEAFY in Physocarpus ranged from 843 to 860 bp. The length of the intron in Neillia and Stephanandra ranged from 581 to 622 bp except for the sequences from N. thibetica. The intron sequences from that species were about 1370 bp in length, and a 757-bp insertion was assumed in order to align the sequences with others. All of the LEAFY sequences from Neillieae were reliably aligned when several blocks of gaps were introduced, and the final alignment of the LEAFY data set consisted of 2038 sites, 57 of which were from exons.
Sequences of the second intron of LEAFY from outgroup species were, however, highly divergent from those of Neillieae, resulting in alignment problems (Oh and Potter, 2003). The outgroups were therefore excluded from analyses of LEAFY data and the trees based on LEAFY sequences were rooted between Physocarpus and Neillia-Stephanandra, based on the results of Potter et al. (2002) and of our analyses of the rDNA and cpDNA data in this study. The LEAFY data are the most variable among the three data sets (Table 2). The average pairwise sequence divergence of the LEAFY data among species in Neillieae was 1.7 and 9 times higher than that of the rDNA and cpDNA data, respectively.
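Average pairwise divergence can be computed from an alignment as sketched below. The measure used here, the uncorrected proportion of differing sites (p-distance) with gapped or ambiguous sites skipped, is an assumption, because the text does not specify the distance measure. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguity symbol are skipped.
    """
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    return differences / compared if compared else float("nan")


def mean_pairwise_divergence(sequences):
    """Mean p-distance over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT"]
    print(mean_pairwise_divergence(toy_alignment))
```

Ratios such as those reported above would then follow from dividing the mean divergence of one data set by that of another.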
Separate phylogenetic analyses
Phylogenetic analysis of the cpDNA data set produced 30 MP trees (length = 295 steps, consistency index (CI), excluding uninformative characters = 0.92, retention index (RI) = 0.98), while 480 MP trees were found in the phylogenetic analysis of the rDNA data set (length = 397 steps, CI, excluding uninformative characters = 0.75, RI = 0.95). Both analyses revealed two strongly supported clades (Physocarpus and Neillia-Stephanandra) in Neillieae (Fig. 2). However, the relationship of Stephanandra with respect to Neillia was inconsistent between the two types of data. Stephanandra was supported as a monophyletic group and was nested within Neillia in the analysis of the cpDNA data (Fig. 2A), but this relationship was not resolved by the rDNA. While the MP analysis of rDNA data placed S. tanakae as sister to a weakly supported clade of Neillia, S. incisa and S. chinensis (Fig. 2B), the Bayesian analysis of the rDNA data suggested that both Neillia and Stephanandra are monophyletic (trees not shown). However, these alternative hypotheses of the rDNA data were poorly supported in both cases. Within the Physocarpus clade, the two accessions of P. alternans were sister to the rest of the species in Physocarpus in both rDNA and cpDNA analyses, but one of two accessions of P. monogynus (accession 269) was also placed in this position in cpDNA trees (Fig. 2A). The relationship of the eastern Asian species, P. amurensis, with respect to other species of Physocarpus was not well resolved in either analysis, but it certainly was not the first diverging lineage of Physocarpus.
Because the longer sequences of P. alternans were found only in accession 253 and because one of them, clone E, was sister to other shorter sequences (Fig. 3), we decided to include only the shorter sequence in the combined data set. However, we included two representative cloned LEAFY sequences per accession in the combined data for those accessions of P. malvaceus and P. monogynus that contained two distinct sequence types.
Combined phylogenetic analyses
The parsimony analysis of the combined data produced two MP trees (length = 994 steps, CI, excluding uninformative characters = 0.77, RI = 0.96). The two trees differed only in the placement of Stephanandra, which was sister to either the N. affinis-N. thyrsiflora clade or the N. sinensis-N. uekii clade. One of the two MP trees (Fig. 5) was selected as the best tree (-ln L = 14 444.892) in the ML analysis.
As in the separate analyses of cpDNA and rDNA data, the combined analyses suggested that the two accessions of P. alternans are the two basal-most lineages in the genus Physocarpus and that, among the remaining species, P. amurensis is sister to the rest. The four accessions of P. opulifolius, as in LEAFY data, formed two distinct clades, which are not closely related to each other, the result of divergent LEAFY sequences in different accessions of that species. For P. malvaceus 266 and both accessions of P. monogynus, in which each accession had two distinct LEAFY sequences and was therefore represented twice in the combined analysis, the results were slightly different from those of the separate analysis of the LEAFY data (Fig. 3). In the combined analysis, as in the separate analysis, the two sequence combinations representing P. monogynus accession 183 were again separated from one another as were those representing P. malvaceus 266; in contrast, however, the two sequence combinations representing P. monogynus 269 formed a clade (Fig. 5).
Biogeographic analysis
An estimated species phylogeny drawn from the combined analysis (Fig. 5) was used in the dispersal-vicariance analysis, which requires fully bifurcating trees (Ronquist, 1997). Only one terminal node per species was included in the tree. The exception was P. opulifolius, which was excluded from the biogeographic analysis because we suggest that it may be of hybrid origin, which violates the assumptions of DIVA (Ronquist, 1997; but see Discussion for our interpretation of the origin of P. opulifolius).
An optimal DIVA reconstruction of the biogeographic history of Neillieae suggested ancestral distributions for the MRCA of Neillieae and that of Physocarpus were equivocal depending on outgroup distributions, but other internal nodes of Neillieae were constant (Fig. 6; Table 4). We explored possible areas of the MRCAs of Neillieae and Physocarpus for several combinations of outgroup distribution (Table 4). The results of the simulation indicated that there were two sets of optimal distributions for the MRCAs of Neillieae and Physocarpus: (1) the MRCA of Neillieae was distributed in eastern Asia and western North America and the MRCA of Physocarpus was in western North America; and (2) the MRCA of Neillieae occurred in eastern Asia and the MRCA of Physocarpus was distributed in eastern Asia and western North America (Table 4). In some combinations of outgroup distributions, both sets were reconstructed, but others generated only the second set of distributions for the MRCAs of Neillieae and Physocarpus. Five dispersals were required in the reconstruction if both outgroups were distributed in all areas, and two dispersals were necessary in other reconstructions in which each outgroup was assumed to occupy only one area (Table 4). No extinction was required in any of the reconstructions.
| DISCUSSION |
|---|
We do not present separate phylogenetic analyses of ITS and ETS regions in this paper because both regions are part of the rDNA repeat (Soltis and Soltis, 1998), and separate analyses of the two regions generated trees with topologies similar to those based on combined rDNA data (Fig. 2B). Previous studies using the ETS region, especially of Asteraceae (Baldwin and Markos, 1998; Linder et al., 2000; Markos and Baldwin, 2001), have shown that ETS has a higher proportion of phylogenetically informative characters than ITS does. Unlike the previous reports, our study indicates that the ETS region provides a lower percentage of parsimony-informative characters than ITS (Table 3). Combining the two regions, however, improves resolution and increases bootstrap support for clades, which agrees with previous reports (Baldwin and Markos, 1998; Bena et al., 1998; Markos and Baldwin, 2001).
As judged by clade supports in our separate analyses of three data sets, there are some strong topological conflicts among data partitions. For example, our cpDNA data strongly conflict with the LEAFY data in terms of the placement of Stephanandra, and the rDNA data are incongruent with the cpDNA and LEAFY data with respect to the relationship of N. affinis. Potential causes of the conflicting relationships among gene trees are discussed in the next two sections, and should better be explained when more data, especially from additional nuclear genes, are collected.
Phylogeny of Physocarpus
Phylogenetic analyses based on rDNA and combined data suggest that P. alternans is sister to the rest of the species of Physocarpus (Figs. 2B, 5). This species, which occurs in desert mountains of western North America, is morphologically distinct in the genus (Howell, 1931; Rosatti, 1993). Unlike other Physocarpus species, which have two or three to five carpels, P. alternans usually has only one carpel, which is a common characteristic in Neillia and Stephanandra. The carpel number character may support the placement of P. alternans as the basal lineage in the genus if the single carpel is a synapomorphy for Neillieae and the 2–5-carpel condition evolved in Physocarpus. It is possible, however, that the unicarpellate conditions in P. alternans and the Neillia-Stephanandra clade evolved independently. Because of lack of resolution regarding outgroup relationships of Neillieae in the Rosaceae (Potter et al., 2002), it is difficult to establish the polarity of this character in the tribe.
The LEAFY data, on the other hand, place P. amurensis as sister to the other Physocarpus species and P. alternans as sister to P. capitatus and P. opulifolius (Fig. 3). These relationships, however, are not supported in the bootstrap analysis or in the Bayesian analysis. Physocarpus amurensis, an eastern Asian species, was previously considered to be closely related to P. opulifolius and P. capitatus (Rehder, 1940; Robertson, 1974) in having three to five carpels that are united at the base. Our molecular data, however, do not support a close relationship between P. amurensis and either P. opulifolius or P. capitatus (Figs. 2, 3, 5). Morphologically, other characteristics in the fruits of P. amurensis differ from those of P. opulifolius and P. capitatus. The follicles of P. amurensis are not highly inflated at maturity and are slightly longer than or as long as the hypanthium and the sepals (Maximowicz, 1859; Poyarkova, 1939), whereas those of P. opulifolius and P. capitatus are highly inflated and are more than twice as long as the hypanthium and the sepals. In addition, our close examination of herbarium specimens of P. amurensis, including the possible isotype of the species, indicates that P. amurensis has two, rarely three, carpels, not three to five carpels. We therefore favor the hypothesis of relationships depicted in Fig. 5, in which P. alternans, with one carpel, and P. amurensis, with two to three carpels, are successive sisters to the remaining (2- or 3–5-carpellate) species of Physocarpus.
Our molecular data show that P. opulifolius, P. capitatus, and P. malvaceus are closely related, but they are separable. In the cpDNA tree, accessions of the three species form a clade with 63% bootstrap support and Bayesian posterior probability of 100 (Fig. 2A), but there is no resolution among the species. All four accessions of P. opulifolius, however, share a unique 2-bp indel in the psbA-trnK region of cpDNA. This indel character, not scored as a separate character, is the only difference in the cpDNA data, but it suggests that P. opulifolius is distinct from P. capitatus and P. malvaceus. Accessions of P. opulifolius and P. capitatus form separate clades in rDNA trees (Fig. 2B), while relationships among the accessions of P. malvaceus are unresolved. Morphologically, P. malvaceus can be easily distinguished from the other two species by having two (vs. 3–5) carpels, which develop into flattened follicles at maturity, while P. capitatus can be distinguished from P. opulifolius in having leaves of the flowering branches that are ovate with truncate to cordate bases and marginal teeth that are acute or acuminate.
The relationships among the LEAFY sequences in P. monogynus and P. malvaceus are complex. Two distinct LEAFY sequence types are found in both accessions of P. monogynus and one accession of P. malvaceus (Fig. 3). The influence of these different LEAFY sequence types was evident in the combined analysis (Fig. 5), in which the different sequence combinations for an accession were separated from one another in P. monogynus 183 and P. malvaceus 266.
Due to lack of phylogenetic resolution, it is unclear what causes the complex pattern of phylogenetic relationships of these LEAFY sequences. It is possible that gene duplication occurred on the stem lineage of the MRCA of P. monogynus and P. malvaceus, if the two species are sister taxa, as is suggested by the placement of some of the sequence combinations in Fig. 5. Gene flow or allelic variation or a combination of both may also result in the relationships observed in the LEAFY data. Doyle (1995) argued that gene tree topologies might not agree with genealogical relationships among individuals or species, in part because some alleles found in a species are more closely related to alleles in other species than to alleles in the same species.
The situation is even more complex if one considers the cpDNA data. Our results indicate that two distinct cpDNA types occur within P. monogynus because the two accessions of that species did not form a monophyletic group in the phylogenetic analysis of the cpDNA data (Fig. 2A). Accession 269 collected in the Sandia Mountains in New Mexico had the same substitution pattern as P. alternans and shared a 22-bp insertion in the trnL-trnF region with P. alternans 253. Accession 183 from Colorado collapsed in an unresolved trichotomy with P. amurensis and the P. capitatus-P. malvaceus-P. opulifolius clade (Fig. 2A), but it had a different cpDNA profile from P. amurensis. We determined the cpDNA regions from two additional plants from the Sandia Mountain area, and they were identical to those from accession 269. Neither of the two nuclear DNA markers (rDNA and LEAFY) supports a close relationship between P. monogynus 269 and P. alternans (Figs. 2A, 3), and examination of the voucher specimens of the two P. monogynus accessions did not yield any evidence that P. monogynus 269 is more closely related to P. alternans than to P. monogynus 183. The presence of two cpDNA haplotypes in P. monogynus (Fig. 2A), along with lack of differentiation in nuclear markers and morphology, suggests either that the population from the Sandia Mountains has maintained an ancestral cpDNA haplotype shared with some P. alternans populations, a pattern known as lineage sorting (Doyle, 1992) or deep coalescence (Maddison, 1997), or that it was derived from introgressive hybridization between P. alternans and P. monogynus, as has been shown in other plant groups (e.g., Soltis et al., 1991; Wolfe and Elisens, 1995). More extensive sampling of individuals and populations of those species with multiple data sets may help to resolve this issue.
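The statement that the two P. monogynus accessions do not form a monophyletic group in the cpDNA tree can be illustrated with a minimal sketch. The topology below is a simplified toy tree written from the verbal description above, not the actual tree of Fig. 2A; the rooting, the tip labels, and the function names are assumptions made for illustration only.

```python
# A tip is a string; a clade is a tuple of children.
# Toy topology based on the verbal description only (assumed rooting):
# accession 269 groups with P. alternans 253, whereas accession 183 sits in
# a trichotomy with P. amurensis and the capitatus-malvaceus-opulifolius clade.
toy_cpdna_tree = (
    ("monogynus_269", "alternans_253"),
    ("monogynus_183", "amurensis", ("capitatus", "malvaceus", "opulifolius")),
)


def tips(node):
    """Return the set of tip labels found under a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def clades(node):
    """Yield the tip set of every internal node (clade) in the tree."""
    if isinstance(node, str):
        return
    yield tips(node)
    for child in node:
        yield from clades(child)


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the given tips."""
    target = set(group)
    return any(clade == target for clade in clades(tree))


monogynus = {"monogynus_269", "monogynus_183"}
print(is_monophyletic(toy_cpdna_tree, monogynus))
print(is_monophyletic(toy_cpdna_tree, {"monogynus_269", "alternans_253"}))
```

Under this toy topology the two P. monogynus accessions fail the test, while accession 269 and P. alternans 253 pass it, which mirrors the pattern described in the text. The check says nothing about whether lineage sorting or introgression produced that pattern.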
The LEAFY data provide interesting implications for the origin of P. opulifolius. Phylogenetic analysis of the LEAFY data shows that two divergent types of sequences, placed in distantly related clades, exist in P. opulifolius (Fig. 3). These two groupings are not associated with geographic distribution. For example, both accessions 142 and 156 were collected from North Carolina (Appendix 1); however, sequences from these accessions were placed in distantly related clades.
The presence of more than one type of sequence could result from allelic variation, gene duplications/losses, lineage sorting of ancestral polymorphisms, and/or hybridization/introgression (Doyle, 1992, 1995; Maddison, 1997). We hypothesize that the two distinct types of LEAFY sequences in P. opulifolius represent homeologous genes that were derived by hybridization between P. capitatus and P. monogynus, with type I derived from a gene contributed by P. capitatus or its ancestor and type II derived from a gene contributed by P. monogynus or its ancestor; the latter would also have given rise to the type V sequences of P. monogynus in our analysis. If the two types of sequence were paralogues resulting from gene duplication, both types of sequences (types I and II) should be present in each individual of the species. If the two types of sequences were homeologous alleles, as we propose, they would show a segregation pattern depending on the heterozygosity of a particular individual. Our LEAFY data show that only one type of sequence is present in each of the four accessions examined (Figs. 3, 4). This does not necessarily mean that the putative homeologous LEAFY alleles are fixed in all accessions of P. opulifolius; we examined two cultivated individuals of P. opulifolius that each contain both types of sequences (types I and II), as verified by a phylogenetic analysis and XbaI digestion (data not shown). We, however, excluded these two accessions because the geographic origin of the samples was unknown.
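The contrast between the two expectations (paralogues present in every individual vs. alleles that segregate among individuals) can be written as a simple decision rule. The sketch below is illustrative only: the function name, the individual labels, and the assignment of a particular type to a particular wild accession are placeholders, because the text states only that each wild accession carried a single type and that the two cultivated plants carried both.

```python
def interpret_types(types_per_individual, expected=("I", "II")):
    """Apply the expectation stated in the text to a sample of individuals.

    types_per_individual maps an individual label to the set of LEAFY
    sequence types recovered from it. If every individual carries all
    expected types, the pattern matches paralogues from gene duplication;
    otherwise it matches alleles segregating among individuals.
    """
    required = set(expected)
    if all(required <= set(found) for found in types_per_individual.values()):
        return "paralogues"
    return "segregating alleles"


# Placeholder labels and type assignments (hypothetical, for illustration).
wild = {
    "wild_1": {"I"},
    "wild_2": {"II"},
    "wild_3": {"I"},
    "wild_4": {"II"},
}
cultivated = {
    "cultivated_1": {"I", "II"},
    "cultivated_2": {"I", "II"},
}

print(interpret_types(wild))
print(interpret_types(cultivated))
print(interpret_types({**wild, **cultivated}))
```

Taken alone, the two cultivated plants would not distinguish the hypotheses, since heterozygous individuals and individuals with duplicated genes both carry two types. It is the single-type wild accessions that make the pooled sample look like segregating alleles under this rule.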
In summary, lineage sorting and interspecific gene flow may both have been important in the evolution of North American Physocarpus. Accessions of P. monogynus and P. malvaceus share LEAFY alleles that are more closely related between species than within species, while the Sandia Mountain population of P. monogynus harbors a cpDNA type identical to that of some P. alternans. Finally, it appears that P. opulifolius was derived by hybridization between P. monogynus and P. capitatus, or their respective ancestors.
Phylogeny of the Neillia and Stephanandra clade
Phylogenetic relationships in the Neillia-Stephanandra clade are relatively well resolved compared to those in Physocarpus (Figs. 2, 3, 5). Our molecular data suggest that N. sinensis and N. thibetica are closely associated with N. uekii. A possible morphological synapomorphy supporting a close relationship of the three species is the short petiole (less than 1 cm long); other species of Neillia, S. tanakae, and Physocarpus have petioles longer than 1 cm.
Neillia sinensis and N. thibetica together form a clade with strong support in separate analyses of the cpDNA and LEAFY data and the combined analyses (Figs. 2A, 3, 5), whereas the rDNA data suggest that N. sinensis is sister to N. uekii (Fig. 2B). Neillia sinensis, widely distributed in China, is morphologically very similar to N. thibetica; both species have racemes of pink flowers with long cylindrical hypanthia (Cullen, 1971; Yu and Ku, 1974). Neillia uekii, endemic to Korea and northeastern China, also has racemose inflorescences, but the flowers are cream-colored with campanulate hypanthia. The inflorescence rachis of N. uekii is pubescent with stellate trichomes, whereas those of N. sinensis and N. thibetica are glabrous or pubescent with simple unicellular trichomes.
Our separate analyses of cpDNA and LEAFY data suggest that N. affinis, N. gracilis, and N. sparsiflora (the latter was not represented in the LEAFY data because we were unable to amplify the region from the DNA isolated from an herbarium specimen), all of which are distributed in western China, form a well-supported clade (Figs. 2A, 3). This relationship is also strongly supported in our combined analyses (Fig. 5). One potential synapomorphy of the three species is the aggregation of flowers at the apex of the inflorescence, although some variation can be found in N. affinis. Neillia sparsiflora and N. gracilis are morphologically very distinct in the genus (Cullen, 1971). Neillia sparsiflora is characterized by having capitate-glandular trichomes on the flowering branches, veins of the lower leaf surfaces, petioles, stipule margins, bracts, and inflorescence, while plants of N. gracilis are suffrutescent and rhizomatous, reaching only 0.5 m in height. Interestingly, N. affinis is placed in a well-supported clade with N. sinensis, N. thibetica, and N. uekii by our rDNA data (Fig. 2B). More data are necessary to reconcile these discordant results.
Both cpDNA and LEAFY data, separately and in combination, support N. thyrsiflora as sister to the N. affinis-N. gracilis clade, although the rDNA data alone do not resolve the relationship (Figs. 2, 3, 5). There are a few morphological characters that can define the clade. The paniculate inflorescence is a possible synapomorphy, but it would have to have been lost in N. sparsiflora.
The genus Stephanandra consists of only three eastern Asian species. Species of Stephanandra form a monophyletic group, being nested within Neillia in both cpDNA and LEAFY trees, while the rDNA data do not provide any consistent relationship with confidence. The topologies of the cpDNA and LEAFY trees are nearly identical when Stephanandra is excluded; only the phylogenetic placement of Stephanandra is different between the two trees (Figs. 2A, 3). Oh and Potter (2003) have postulated that Stephanandra may have originated via hybridization between two lineages of Neillia (N. uekii and the N. affinis-N. thyrsiflora clade), although other processes, such as gene duplication/loss and lineage sorting, cannot be ruled out as causes of the incongruent positions of Stephanandra supported by the two data sets. Under the hybridization model, the ovulate progenitor of Stephanandra may have been N. uekii or its recent ancestor because the cpDNA tree, which reflects the maternal phylogeny of the two genera, places Stephanandra as sister to N. uekii (Fig. 2A). On the other hand, LEAFY trees, representing nuclear gene trees, do not support the sister relationship between Stephanandra and N. uekii, but place Stephanandra as sister to the N. affinis-N. thyrsiflora clade (Fig. 3), which suggests that the stem lineage leading to this clade might have been the pollen progenitor of Stephanandra. Oh and Potter (2003) have further suggested that the paternal homeologous allele of LEAFY in Stephanandra was fixed, resulting in the disparity between cpDNA and LEAFY trees. It is difficult to draw conclusions regarding this hybrid hypothesis from the rDNA data because the phylogenetic relationship between Neillia and Stephanandra is poorly resolved (Fig. 2B).
We propose that Neillia and Stephanandra should be merged into one genus, in which case the name Neillia should be used because it has priority over Stephanandra. Our analyses of cpDNA and LEAFY data (Figs. 2A, 3) and the combined analysis (Fig. 5) suggest that Stephanandra is nested within Neillia, while rDNA data alone did not resolve the relationship. The two genera, however, form a strongly supported monophyletic group in all analyses. The close relationship between Neillia and Stephanandra is also supported by morphological characteristics, such as the acuminate to caudate leaf apex, racemose or paniculate inflorescence, and possession of a single carpel.
Biogeographic history of Neillieae
The DIVA reconstructions indicate that the ancestral distributions for the MRCA of Neillieae (node 2; Fig. 6) and the MRCA of Physocarpus (node 1; Fig. 6) are ambiguous depending on outgroup distribution (Table 4). In all cases, however, the MRCAs of Neillieae and Physocarpus are not distributed in eastern North America, even if one or both outgroups are assumed to have occupied the area (Fig. 6; Table 4). The biogeographic analysis of Neillieae suggests that species of Neillia and Stephanandra evolved in eastern Asia and diversified in the same area (Fig. 6; Table 4).
There are two optimal biogeographic scenarios for Physocarpus suggested by DIVA depending on outgroup distribution (Fig. 7; Table 4). If the MRCA of Neillieae was widely distributed in both western North America and eastern Asia, the MRCA of Physocarpus originated in western North America (Fig. 7A). In this case, vicariance resulted in the split of Physocarpus and the Neillia-Stephanandra clade and one dispersal event to eastern Asia must be assumed at an early stage during the evolution of Physocarpus. The other optimization, in which the MRCA of Neillieae was distributed only in eastern Asia, requires two independent dispersal events to western North America from eastern Asia during the evolution of Physocarpus (Fig. 7B). The former biogeographic scenario is more likely because it requires fewer dispersal events in the Physocarpus lineage than the latter hypothesis and is consistent with other reconstructions with certain combinations of outgroup distribution (e.g., A, W; E, E; or W, W; Table 4).