Invited Special Papers
2Section of Evolution and Ecology, University of California, Davis, California 95616 USA; 3Bioinformatics Research Center, North Carolina State University, Campus Box 7566, Raleigh, North Carolina 27695 USA; 4Department of Systematic Botany, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
Received for publication December 30, 2003. Accepted for publication June 24, 2004.
| ABSTRACT |
|---|
Key Words: adaptive radiation; biogeography; divergence time; molecular clock; phylogeny; rates
| INTRODUCTION |
|---|
However, renewed enthusiasm has been tempered somewhat by a realization that inference methods are still far from perfect. Methods clearly need improvement when they produce conflicting answers to the same question. Although dramatic conflicts between ages based on sequence divergences vs. fossils may have raised the most concerns about the use of molecular data to estimate ages, other lines of evidence have also raised questions. These include conflicting estimates derived from different data sets, such as different taxon samples (Sanderson and Doyle, 2001), sequence samples (Heckman et al., 2001), and sequence partitions (Sanderson and Doyle, 2001; Yang and Yoder, 2003); conflicts between dates estimated using different calibration points (Soltis et al., 2002); and conflicts between different inference methods themselves applied to precisely the same data. The rapidly increasing number of case studies in plants promises to provide useful insights to sort through these conflicts and help improve methodology. In this paper, we examine the interplay between methodology and data and review how new methods have shed light on the timing of important events in plant evolution.
| METHODOLOGICAL ISSUES |
|---|
Methods assuming a clock
Inferring divergence times with a known topology and a molecular clock is relatively straightforward. Calibration points based on fossil evidence can dictate the time duration of a branch, and molecular sequence data can provide information about its length. A branch length divided by its time duration yields the average evolutionary rate on the branch. Under the assumption of a clock, rates in one part of the tree, derived from solid paleontological or biogeographical evidence, permit dating of nodes in other parts of the tree that lack paleontological evidence.
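The arithmetic in this paragraph can be sketched in a few lines of Python. This is only an illustration under a strict clock; the function names and all numbers are hypothetical and are not taken from any study cited here.

```python
def branch_rate(branch_length, duration_my):
    """Average rate on a branch: substitutions per site per million years."""
    if duration_my <= 0:
        raise ValueError("branch duration must be positive")
    return branch_length / duration_my


def age_under_clock(node_to_tip_length, rate):
    """Age of a node under a strict clock, given its path length to the tips."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return node_to_tip_length / rate


# Hypothetical calibrated branch: 0.12 substitutions per site over 100 my.
calibrated_rate = branch_rate(0.12, 100.0)

# Hypothetical uncalibrated node, 0.06 substitutions per site above the tips.
estimated_age = age_under_clock(0.06, calibrated_rate)
print(f"rate = {calibrated_rate:.4f} subst/site/my, age = {estimated_age:.1f} my")
```

The second function is only valid if the rate measured on the calibrated branch really applies elsewhere in the tree, which is exactly the clock assumption.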
Because the number of interior nodes on a rooted tree is guaranteed to be less than the number of branches, the clock assumption together with calibration points allows branch length information to be converted into estimates of interior node times. In fact, when rates are constant, divergence times can be estimated "consistently" using maximum likelihood, meaning the error decreases as more data are included. This follows as a special case of known consistency results for maximum likelihood (Chang, 1996). Importantly, however, T. Britton (personal communication, Stockholm University) has shown that maximum likelihood estimation is not necessarily consistent when rates vary across the tree, making it imperative to understand the extent and impact of such rate variation.
Most available evidence suggests that variation in the rate of molecular evolution across lineages is commonplace in plants, in addition to wide variation in rate across genes in the same lineage (Clegg et al., 1994; Doyle and Gaut, 2000). The textbook example is faster rates in grasses than palms (Wilson et al., 1990; Eyre-Walker and Gaut, 1997), but many other cases have been uncovered recently, including dramatic differences in rate among lineages of seed plants, especially Gnetales (Chaw et al., 2000), accelerations in parasitic angiosperm plastid genes (Nickrent et al., 1998; or achlorophyllous ones: Caddick et al., 2002), and phenomenal 10–100 fold accelerations in angiosperm mitochondrial genes in a few lineages (Palmer et al., 2000). Usually the application of relative rates or likelihood ratio tests to sequences that span more than a handful of taxa results in rejection of a clocklike model of evolution (Barrier et al., 2001; Zhang et al., 2001; Sanderson and Doyle, 2001; Soltis et al., 2002; Sanderson, 2003; Bremer et al., in press; Hamilton et al., 2003). Exceptions generally occur when sequence divergence is low and the power of the tests is correspondingly low (Sorhannus and Bell, 1999; Bromham et al., 2000; Baldwin and Sanderson, 1998; Richardson et al., 2001a; Xiang et al., 2000). In these instances, the variance of age estimates may be relatively large anyway, irrespective of potential biases from rate heterogeneity, making it essential to estimate error rates.
If biases from rate variation across lineages are random in their direction and magnitude, concatenation of a large number of genes or combination of a large number of independent estimates obtained assuming a clock might lead to reasonable estimates of divergence times (e.g., Heckman et al., 2001). Another possibility is to assume that all genes share a common set of divergence times while allowing information about node ages to be contributed by multiple genes even when the pattern of rate variation over time differs among genes (Thorne and Kishino, 2002; Yang and Yoder, 2003). However, rate variation often affects sequences in different genes/genomes in the same direction: for example, grasses have higher rates in nuclear, plastid, and mitochondrial genes than do palms (Eyre-Walker and Gaut, 1997).
Methods that do not assume a clock
Newer methods of inference that account for rate variation among lineages face a common basic problem: to reduce the dimensionality of the most general, suitable model, which postulates one distinct rate of evolution for each branch on the tree. Each of the three methods described next accomplishes this reduction in a different way.
Local clock methods
These methods postulate some small number k > 1 of fixed but different rates and place each branch into one of these k rate categories (Hasegawa et al., 1989; Uyenoyama, 1995; Yoder and Yang, 2000; Yang and Yoder, 2003). One problem with this approach is that some models are not "identifiable" (Yoder and Yang, 2000), meaning they do not permit unambiguous estimation of times and rates. However, Aris-Brosou and Yang (2002) analyzed one data set with several different models of rate change (as well as with a molecular clock) and found that divergence time estimates were not very sensitive to which model of rate change was employed, as long as a strict clock was not used (see also Aris-Brosou and Yang, 2003).
The other problem with this approach has to do with model selection. The placement of a small number of rate categories on the branches of a large tree can be done in a combinatorially huge number of ways. One way to refine model selection is to restrict the problem to quartet trees and examine only a two-rate model corresponding to each sister group descended from the root (Rambaut and Bromham, 1998). These methods have not yet been applied to plants. Another approach is to use tests for rate differences at each node in the tree to identify subtrees that can be pooled to have different rates (Britton et al., 2002). Vinnersten and Bremer (2001) used this approach to estimate the age of nodes in the monocot clade Liliales.
Bayesian methods
These methods effectively reduce dimensionality by imposing structure on the parameters through assumptions about their prior probabilities. Preassignment of branches into rate categories is avoided by adopting an explicit probabilistic model of how evolutionary rates change over time.
Because evolutionary rates are largely determined by the biological systems that they affect and because these biological systems themselves diverge during evolution, closely related lineages should have more similar rates of evolution than more distantly related lineages: evolutionary rates should be autocorrelated over time (Gillespie, 1991). There are diverse modeling strategies that can achieve autocorrelation of rates over time, but the range of possible modeling strategies has not yet been carefully explored. For example, one model has rates changing in discrete jumps (Huelsenbeck et al., 2000), whereas another uses a continuous process of rate change (e.g., Thorne et al., 1998; Kishino et al., 2001).
Bayesian statistical inference is performed by examining the posterior distribution (i.e., the probability density of the parameters given the data). The posterior distribution depends on the prior distribution (i.e., the probability density of the parameters before seeing the data) and the likelihood (i.e., the probability density of the data given the parameters). To estimate divergence time from molecular sequence data, the key parameters are the rates and the interior node times. Given the branch lengths, all the information for separating rates and times comes from their prior distribution.
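In symbols, and only as a sketch of the relationship just described, the posterior for times and rates might be written as follows. The notation is an assumption introduced here for illustration: t denotes the node times, r the branch rates, X the sequence data, and b_i the length of branch i with time duration Δt_i.

```latex
p(\mathbf{t}, \mathbf{r} \mid X) \;\propto\;
  p\bigl(X \mid \mathbf{b}(\mathbf{t}, \mathbf{r})\bigr)\,
  p(\mathbf{r} \mid \mathbf{t})\, p(\mathbf{t}),
\qquad b_i = r_i \,\Delta t_i
```

The likelihood depends on rates and times only through the products b_i, so any pair (r_i, Δt_i) with the same product fits the sequence data equally well; the priors p(r | t) and p(t) are what distinguish among such pairs.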
Because both paleontological and molecular evidence are combined to date interior nodes, the likelihood would ideally be based on probabilistic models that explain how fossil data are generated and collected as well as how sequence change occurs. Existing techniques for divergence time estimation with sequence data do not effectively exploit fossil evidence. They incorporate this evidence in the form of calibration points (Aris-Brosou and Yang, 2002, 2003) or follow Sanderson (1997) by placing constraints on node times. More effective use of stratigraphic information is possible (e.g., Huelsenbeck and Rannala, 1997; Tavaré et al., 2002).
One way to formulate a prior distribution for divergence times is to explicitly describe the processes of speciation, extinction, and taxon sampling (e.g., Yang and Rannala, 1997; Aris-Brosou and Yang, 2002). An advantage of such a formulation is a direct connection between the parameter and its biological meaning. This sort of "biologically based" description would be especially desirable if the processes of speciation, extinction, and taxon sampling were well enough understood to be accurately modeled. However, when systems are poorly understood, biologically based treatments run the risk of simultaneously being parameter rich and unrealistic.
Another possibility is to specify the divergence time prior with a "statistically based" treatment (e.g., Kishino et al., 2001). The downside of such a treatment is that biological interpretations of parameters in the divergence time prior are unavailable. An advantage is that the prior is potentially adequate for a wide range of biologically plausible situations while not being parameter rich.
Strengths and weaknesses of Bayesian approaches for evolutionary inference have been enumerated elsewhere (e.g., Lewis, 2001), and we do not attempt to give an exhaustive overview here. One general weakness of the Bayesian paradigm is that it can be difficult to satisfactorily specify prior distributions. Because of the dimensionality problem in converting rates and times to branch lengths, this weakness is especially pertinent to divergence time estimation. A prior distribution should summarize knowledge about a system before data are observed. In the case of molecular evolution, patterns of rate change are relatively poorly characterized, and the little knowledge that exists is difficult to quantify. As a result, it is challenging to construct satisfactory models of rate change. Similarly, prior distributions for divergence times potentially have a large impact on the posterior distributions, but the appropriate prior distributions for divergence times are typically not obvious.
Another concern with Bayesian inference of divergence times is that heavy computational demands can compromise the quality of the posterior distribution approximations. Bayesian divergence time estimations are based on Markov chain Monte Carlo techniques (MCMC: e.g., Metropolis et al., 1953; Hastings, 1970). With MCMC techniques, the quality of the approximation of a posterior distribution improves as the amount of computation devoted to the approximation increases. However, it is hard to know whether the invested amount of computation is sufficient to get a good approximation. Diagnostic techniques are generally employed to determine whether the amount of computation has been sufficient. Unfortunately, these diagnostic techniques can sometimes fail.
MCMC techniques require the likelihood to be evaluated for a potentially large number of sets of branch lengths, adding considerably to their computational overhead. One shortcut is to approximate the likelihood with a function that is quicker to evaluate. However, divergence time estimates can suffer if the likelihood approximation is poor. In his penalized likelihood procedure, Sanderson (2002) treats the estimated number of sequence changes on a branch as if it were an observation from a Poisson distribution. The mean of the Poisson distribution for the branch is set equal to the time duration of the branch multiplied by its average rate of change. Although it has apparently not yet been employed with Bayesian divergence time estimation, this Poisson approximation strategy could work well when the actual number of changes on a branch can be accurately determined. For data sets in which the error in estimating the actual number of changes cannot be neglected, a multivariate normal approximation that is proportional to the likelihood is attractive (Thorne et al., 1998). Unfortunately, the multivariate normal approximation is apt to be particularly poor when branch lengths are extremely short. Korber et al. (2000) have improved upon the multivariate normal approximation by careful consideration of branches where the maximum likelihood estimate of length is zero.
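The Poisson approximation described above can be sketched as follows. This is a minimal illustration assuming independent branches; the function name and the numbers are hypothetical, and the code is not taken from any published software.

```python
import math


def poisson_log_likelihood(changes, rates, durations):
    """Sum of Poisson log probabilities, one term per branch.

    changes   : estimated number of substitutions on each branch
    rates     : average rate on each branch
    durations : time duration of each branch
    """
    total = 0.0
    for x, r, t in zip(changes, rates, durations):
        mean = r * t  # expected number of changes on the branch
        total += x * math.log(mean) - mean - math.lgamma(x + 1)
    return total


# Hypothetical branch with 12 inferred changes over 30 my.
good_fit = poisson_log_likelihood([12], [0.4], [30.0])  # mean close to 12
poor_fit = poisson_log_likelihood([12], [0.1], [30.0])  # mean close to 3
```

A rate and duration whose product matches the inferred number of changes scores higher than one whose product is far from it, but the function cannot distinguish a fast rate over a short time from a slow rate over a long time.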
Strengths of the Bayesian dating implementations also should not be neglected. Bayesian methods are based on explicit assumptions. The explicit nature of the Bayesian methods makes their flaws obvious and makes it relatively easy to determine which assumptions should be examined in more detail. Biologically implausible or statistically unsupported assumptions can gradually be replaced by better ones. Posterior distributions generated directly by Bayesian analysis provide natural and relatively intuitive summaries of uncertainty that do not require additional procedures such as nonparametric bootstrapping. The posterior probability distributions of rates and times yielded by Bayesian procedures can be summarized in any desired way. Smoothing methods and local clock-based maximum likelihood procedures report the combination of rates and times that jointly optimize a function. For a Bayesian analysis, the posterior density could serve as the function being optimized. However, rather than finding a single combination of rates and times and other parameters that maximize the posterior density, a node time can be estimated in a Bayesian framework with the mean of its posterior distribution. Because posterior means are averages over the posterior distribution, a posterior mean may be more representative than the maximum a posteriori point when the posterior distribution is asymmetric.
Smoothing methods
Smoothing methods (Sanderson, 1997, 2002) reduce the dimensionality by imposing an autocorrelation among the parameters. They allow one rate for every branch but then restrict how much difference in rate is allowed between branches. Nonparametric rate smoothing (NPRS; Sanderson, 1997) and penalized likelihood (Sanderson, 2002) both penalize the parameter estimates of rates by comparing them to their phylogenetically immediate neighbors. NPRS uses a naïve estimator of local rate as the inferred number of substitutions on a branch divided by the inferred time duration. Combination of these local rates across the tree yields an overall optimality function which, when minimized, provides estimates of divergence times, because these are part of each individual expression. Penalized likelihood (PL) is a more sophisticated approach, which combines nearly the same penalty term as in NPRS with a likelihood expression based on the joint probability of the substitutions per branch conditional on some model of substitution. One of the significant advantages of PL over NPRS is that it can be "tuned" to allow greater or lesser rate variation, so that at one extreme, constant rates are inferred, whereas at the opposite extreme, a high level of rate variation is allowed. A useful data-driven statistical procedure called cross-validation can be used to infer the optimal level of rate smoothing in this approach. Cross-validation checks how well a model fitted to part of the data predicts the data that were removed. In PL, each terminal branch is removed in turn, all parameters are estimated, and the model's ability to predict the observed number of substitutions along the pruned branch is checked. Generally, intermediate, non-clocklike levels of smoothing have the best cross-validation scores (Sanderson, 2002).
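The trade-off between fit and smoothness can be illustrated with a simplified Python sketch of a penalized objective. It is not the published procedure: it only evaluates the objective for fixed rates and durations, it uses a Poisson term for the likelihood, it ignores any special treatment of the branches at the root, and it does not optimize node times. The names `smoothing` and `parent_of` and all numbers are hypothetical.

```python
import math


def roughness_penalty(rates, parent_of):
    """Sum of squared rate differences between each branch and its parent branch."""
    return sum(
        (rates[branch] - rates[parent]) ** 2
        for branch, parent in parent_of.items()
        if parent is not None
    )


def penalized_objective(changes, rates, durations, parent_of, smoothing):
    """Poisson log likelihood minus smoothing * roughness (to be maximized)."""
    log_lik = 0.0
    for branch, x in changes.items():
        mean = rates[branch] * durations[branch]
        log_lik += x * math.log(mean) - mean - math.lgamma(x + 1)
    return log_lik - smoothing * roughness_penalty(rates, parent_of)


# Hypothetical three-branch example: branch "a" subtends branches "b" and "c".
parent_of = {"a": None, "b": "a", "c": "a"}
changes = {"a": 10, "b": 5, "c": 20}
durations = {"a": 10.0, "b": 10.0, "c": 10.0}

clock_rates = {"a": 1.0, "b": 1.0, "c": 1.0}
free_rates = {"a": 1.0, "b": 0.5, "c": 2.0}

for smoothing in (0.0, 100.0):
    clock = penalized_objective(changes, clock_rates, durations, parent_of, smoothing)
    free = penalized_objective(changes, free_rates, durations, parent_of, smoothing)
    print(smoothing, "prefers", "variable rates" if free > clock else "clocklike rates")
```

With no smoothing the rates that fit each branch individually score best, and with heavy smoothing the clocklike rates win; cross-validation is a way of choosing a value between these extremes from the data.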
Smoothing methods have strengths and weaknesses. Non-clock methods can be too "relaxed." In NPRS and PL, for example, when only shallow, recent, nested nodes are specified as calibrations, solutions can sometimes degenerate so that it is equally optimal for the root of the tree to move off to infinity as it is to retain a more realistic value. This can be solved either by scaling rates differently (e.g., the log scale used in Kishino and Thorne's Bayesian approach) or by adding maximum age constraints deeper in the tree (approximately the same as a prior on the root time). There is also a limit to how much rate variation can be inferred. When branch lengths of the input tree are highly chaotic, no method is likely to be able to infer the true pattern of rate variation or the true divergence times. This is reflected in the simulation findings in Sanderson (1997), in which simply assuming a clock led to better age estimates than more elaborate methods when the time scale for rate variation was short. On the other hand, cross-validation appears to be a robust procedure for assaying the level of rate variation in clades. It is not necessarily limited to penalized likelihood methods; it can be used to evaluate the predictive ability of any inference procedure and therefore represents a possible tool for evaluating the various procedures outlined for handling rate variation.
Smoothing methods effectively impose an autocorrelation in neighboring rates across the tree. As currently implemented, Bayesian methods assume that such an autocorrelation exists. Both methods estimate the level of this autocorrelation, and therefore neither unusually high nor low autocorrelations are excluded. Biological mechanisms can be posited for either scenario. High autocorrelation is expected when molecular rates depend on features of the molecular structure or function or whole organism biology that evolve slowly over time. Low autocorrelation is expected when there are frequent shifts in life history (e.g., generation time), DNA repair mechanisms, or selective regimes. One of the primary motivations for the development of non-clock divergence time methods is to provide tools to reconstruct the timescale for variation in molecular rates.
Sources of error
As with any inductive inference in statistics, divergence time estimation entails error. Sources of error have been reviewed at length elsewhere (Waddell and Penny, 1996; Sanderson and Doyle, 2001, and the general reviews cited earlier). Many have been appreciated for some time, such as the error introduced by finite sequence lengths or model misspecification (such as neglect of multiple hits, rate-across-sites variation, and within codon variation; see e.g., Yang and Yoder, 2003; violation of Poisson process assumptions: Gillespie, 1991; Cutler, 2000). Other sources, such as the impact of mistaken inferences about tree topology (Sanderson and Doyle, 2001; Smith and Peterson, 2002), have been less widely considered, because most studies up to a few years ago relied on distance-based rather than tree-based divergence time methods in the context of an assumed clock. Smith and Peterson (2002) noted that divergence time analyses are especially sensitive to rooting because rerooting can easily change the perceived unevenness of the distribution of rates among lineages. To account for tree uncertainty, one can use trees (and associated branch lengths) obtained in a bootstrap analysis (Sanderson and Doyle, 2001) or use Bayesian posteriors on these same quantities as priors for divergence time procedures.
Calibration
Several other sources of error have received critical attention in recent years and are highly relevant to divergence time studies in plants. Foremost among these is calibration. One issue is the proper integration of fossil information into phylogenetic analyses of molecular data. This involves decisions about the node to which a fossil refers. Usually fossils only impose minimum age constraints, rather than fixed points in time, or maximum age constraints. Moreover, sensitivity analyses have shown that basing divergence time scenarios on a single calibration is often problematic (Kress et al., 2001; Renner and Meyer, 2001; Soltis et al., 2002), and recent studies are moving in the direction of multiple calibrations and assessments of the sensitivity of results to inclusion of different fossils (Springer et al., 2003; Bremer et al., in press).
Another issue is the uncertainty associated with the fossil dates themselves. Geological dating can be problematic, and preservation biases can inject significant uncertainty into the true minimum ages of clades (see e.g., Morely and Dick, 2003, on Melastomataceae). When fossils are used as internal calibration points, their age uncertainty is magnified as progressively deeper node ages are estimated. In extreme cases, multiple "secondary calibrations" are sometimes constructed and then used in subsequent analyses of other taxa, a procedure that has been called into question recently (see Shaul and Graur, 2002, who developed a useful set of consistency checks on such calibrations; Graur and Martin, 2004). On the whole, few phylogenetically oriented studies of plant divergence times have used such secondary calibrations (although see Heckman et al., 2001).
Orthology/paralogy
With the increasing availability of nuclear markers, which are often found in multigene families, the problem of correctly identifying orthologous loci for comparison is becoming increasingly important. To date, the only major study of plant divergence times using many nuclear loci is that of Heckman et al. (2001), which used such loci because of its reliance on calibrations outside of plants (plastid genes would thus not be useful). Mistaken labeling of paralogs as orthologs can be problematic for divergence time studies because gene duplications often (though not always) long predate speciation events that separate the taxa of interest (Martin and Burg, 2002), leading to an upward bias in age estimates. Careful orthology assessment can avoid this problem, but the sparse and sporadic sample of gene families in large sequence databases (Sanderson and Driskell, 2003) makes such assessments risky in the absence of the kind of detailed genome-scale information available for a few taxa (see e.g., Rokas et al., 2003, for an example in yeast).
| DIVERGENCE TIMES FOR MAJOR CLADES OF PLANTS |
|---|
Plastids
A mosaic history of primary, secondary, and even more complex endosymbioses in photosynthetic eukaryotes clouds divergence time estimation at this level. However, extensive phylogenetic reconstruction using plastid genes provides some clues. In the most taxonomically diverse analysis to date, Yoon et al. (2004) used five- and six-gene plastid data sets combined with a set of six fossil calibrations to estimate divergence times of major groups of plastid-containing eukaryotes. They inferred that the primary endosymbiosis event occurred prior to 1558 million years ago (mya), which is the estimated crown group age of all plastids. The split between red and green algal plastids was estimated to occur slightly later at 1474 mya.
Green plants
Yoon et al. (2004) also estimated the age of the split between Charophytes and land plants within green plants at 646–792 mya, depending on data partition. The only explicit attempt at dating the crown group age of all green plants (i.e., the split between land plants and chlorophyte green algae) was by Heckman et al. (2001). They used 41 protein sequences to estimate the pairwise distance between "chlorophytan green algae" and "higher plants." Three distant, external calibration points were used, the closest being the crown group node of metazoans, fungi, and plants dated at 1600 mya. These three calibrations were "secondary," having been derived from a previous analysis of 75 nuclear-protein-coding genes (Wang et al., 1999) calibrated in turn using the vertebrate fossil record (Wang et al., 1999; Graur and Martin, 2004). Employing a multigene approach, in which times are averaged across genes, and an average-distance approach, concatenating distances among genes prior to the age estimation, they estimated the green plant crown group age at 1111 and 1010 mya, respectively. Standard errors for the multigene approach were estimated at ±109 my.
These estimates are, from a paleobotanical point of view, largely uncontested. The fossil record of groups traditionally included in the green algae is simply too sparse to challenge these estimates (Kenrick and Crane, 1997).
Land plants
A greater challenge to the Heckman et al. (2001) results, both from fossil evidence and other estimates based on sequence divergence, concerns their age estimate for the crown group of land plants. Using 50 protein sequences, Heckman et al. (2001) estimated a Precambrian age for land plants of 703 ± 45 mya. This result contrasts sharply with both palaeobotanical estimates (Kenrick and Crane, 1997; Wellman et al., 2003) and a more recent divergence-based estimate (Sanderson, 2003). Sanderson (2003) used 27 plastid-protein-coding genes (all different from the Heckman et al. data set) and a sample of 10 land plants and one green algal outgroup in an independent test of the Heckman et al. (2001) results. Unlike Heckman et al. (2001), Sanderson (2003) used two alternative and internal calibration points, either fixing the crown group seed plant clade at 320 mya or the crown group eudicot clade at 125 mya. The calibration points were based on the first occurrence of stem group conifers in the fossil record and the first fossil evidence for tricolpate pollen, a distinct pollen type characteristic of eudicots (Sanderson, 2003).
Using a penalized likelihood approach, Sanderson (2003) estimated the land plant crown group to be of Ordovician age (483 or 490 mya depending on calibration point used). These ages are considerably more recent than those reported by Heckman et al. (2001) and much more in line with age estimates based on the fossil record (Kenrick and Crane, 1997; Wellman et al., 2003). To assess the impact of using an analytical approach that allows for rate changes between lineages, Sanderson (2003) also conducted analyses invoking a molecular clock assumption. This increased the incongruence with the Heckman et al. (2001) results, pushing the land plant origin forward into the Early Silurian (435 and 425 mya, depending on calibration point).
Unequivocal fossil evidence for the occurrence of taxa of the land plant crown group is based on stem group vascular plant taxa from the Middle Silurian, about 420–430 mya (Kenrick and Crane, 1997). However, microscopic spores from the Ordovician (Wellman and Gray, 2000) were recently associated with plant fragments that indicate a liverwort relationship of these spores (Wellman et al., 2003), and if a liverwort affinity is accepted, the fossil-based age estimate is pushed back in time about 50 my. This age estimate is congruent with that obtained by Sanderson (2003).
Seed plants
Two early analyses that specifically addressed seed plant age by estimating pairwise distances and assuming a molecular clock were Savard et al. (1994) and Goremykin et al. (1997). Savard et al. (1994) used divergences in plastid rbcL and nuclear 18S rDNA and four different landmark events to calibrate their results, and estimated the age of seed plants to be Late Carboniferous (275–290 mya). Goremykin et al. (1997) used a more comprehensive data set including 58 plastid proteins from the six completely sequenced plastid genomes available at the time to calculate pairwise distances. They used a single calibration point (Marchantia–vascular plant divergence at 450 mya) and dated seed plants as Early Carboniferous (350 ± 35 mya).
Soltis et al. (2002) adopted a different approach. They used nonparametric rate smoothing for their analyses and a four-gene (plastid rbcL, atpB, rps4 and nuclear 18S rDNA) data set covering a sample of 35 land plant taxa. Their prime objective was not to obtain absolute age estimates, but to explore various aspects of available methods and their effects on the resulting age estimates. The range of results they obtained, by varying parameters such as calibration point, amount and type of data used, and method for inferring branch lengths, was compared and discussed in relation to available fossil evidence over a broad range of land plant groups included in their analyses. With respect to the age of extant seed plants, their analyses produced three distinct patterns depending on choice of calibration point and choice of method (maximum parsimony, MP, or maximum likelihood, ML) for inferring branch lengths. Using MP and calibrating their results with available fossil evidence on the occurrence of tree ferns or Marattiaceae produced seed plant estimates above 900 mya, whereas using the other four calibration points, they resolved seed plants around the Devonian–Carboniferous boundary (340–367 mya). The third distinct pattern resulted from using ML instead of MP to infer branch lengths. Compared to the corresponding MP-based analysis, ML pushed the seed plant age from 344 ± 25 mya back to 465 ± 45 mya.
The available fossil evidence for seed plants was discussed by Soltis et al. (2002) in their supplementary material. The first unequivocal seeds are from the uppermost Devonian, about 365 mya (Rothwell and Scheckler, 1988). However, these seeds may represent stem lineage seed plants, whereas the molecular-based estimates all concern the crown group of extant lineages. To establish a fossil-based minimum age estimate for extant seed plants, we must turn to the first appearance of an extant seed plant lineage, which would push our estimate at least into the Late Carboniferous, about 305 mya (Soltis et al., 2002).
Angiosperms
Hindered by limited data and uncertain calibrations, early estimates for the age of angiosperms, or what we now know to be a poor proxy for this, the split between monocots and eudicots, ranged from around 200 mya to greater than 300 mya (Wolfe et al., 1989
; Brandl et al., 1992
; Martin et al., 1993
; and Martin et al., 1989
). Goremykin et al. (1997
; see earlier) estimated the monocoteudicot divergence to 160 mya. Sanderson (1997)
used rbcL only, but he had a larger sample of 22 angiosperms. With the same calibration point of 450 mya and NPRS, he estimated the angiosperm crown age to be 165 mya. Later, Sanderson and Doyle (2001)
extended this analysis to include 18S rDNA data and explored various error sources in the dating. Age estimates differed considerably depending on underlying assumptions regarding site-to-site rate variation and usage of 1st and 2nd vs. 3rd codon positions only. Most estimates were 140190 mya for the crown group of angiosperms. Wikström et al. (2001
, 2003
) dated a three-gene (rbcL, atpB, 18S rDNA) phylogeny of 560 angiosperms and seven outgroup taxa, also using NPRS, and calibrated it by setting the split between Fagales and Cucurbitales to 84 mya, the age of fossil Fagales cupules. Depending on the method of branch length calculation (maximum likelihood or parsimony variants), they estimated the crown age of angiosperms at 158–179 mya.
The different attempts at molecular dating of the age of angiosperms all have their strengths and weaknesses, and they differ considerably with respect to the underlying molecular data, the taxon sampling, the calibration points, and the methods used. Nevertheless, all more recent age estimates have indicated that the crown node of the angiosperms is from the Jurassic (145–208 mya). Considerable uncertainty remains, and sampling and calibration effects in particular deserve further exploration.
Monocots
Bremer (2000)
estimated divergence times with 91 rbcL sequences representing most monocot families. Times were estimated using mean path lengths (Britton et al., 2002
) and calibration with eight Cretaceous monocot fossils. The crown age of monocots was estimated at 134 mya, in the range of the 127–141 mya found for the monocot crown group by Wikström et al. (2001
, 2003
). This correspondence between the two studies disappears, however, for subgroups within monocots. The crown age for the major subgroup of commelinids was estimated at 116 my by Bremer (2000)
and 91–99 my by Wikström et al. (2001
, 2003
). Apparently, the absence of monocot fossil calibration points in the latter study led to underestimated divergence times within monocots. The importance of fossil calibration points is further illustrated by Kress et al. (2001)
, who pointed out an alleged >80 mya Zingiberaceae fossil not included in Bremer's (2000)
analysis. Its inclusion would push back the crown age of the order Zingiberales to the Mid-Cretaceous, around 100 mya, whereas the Zingiberales crown age is Tertiary (<65 mya) according to the dating by Bremer (2000)
as well as by Wikström et al. (2001
, 2003
). Within monocots, two further analyses of divergence times for two orders, Liliales (Vinnersten and Bremer, 2001
) and Poales (Bremer, 2002
), provide information on divergence times for families within these orders. The two analyses were based on Bremer's (2000)
dating by setting the crown ages for Liliales and Poales from that study to 82 mya and 115 mya, respectively.
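The mean path length idea mentioned above can be illustrated with a small sketch. It is not the published procedure of Britton et al. (2002), only a minimal strict-clock reading of it: the depth of a node is taken as the average branch-length distance to its descendant tips, a single calibration node fixes the rate, and other node ages follow by proportion. The toy tree, its branch lengths, the function names, and the 100 mya calibration are all hypothetical.

```python
def mean_path_length(node):
    """Return (mean path length from node to its descendant tips, tip count)."""
    children = node.get("children", [])
    if not children:
        return 0.0, 1
    total, tips = 0.0, 0
    for child in children:
        child_mpl, child_tips = mean_path_length(child)
        # Every tip below the child is reached through the child's own branch.
        total += (child_mpl + child["length"]) * child_tips
        tips += child_tips
    return total / tips, tips


def date_node(target, calibration, calibration_age):
    """Scale the target's mean path length by the rate fixed at the calibration node."""
    rate = mean_path_length(calibration)[0] / calibration_age
    return mean_path_length(target)[0] / rate


# Hypothetical toy tree; branch lengths are substitutions per site.
clade_a = {"length": 0.10, "children": [{"length": 0.05}, {"length": 0.07}]}
root = {"length": 0.0, "children": [clade_a, {"length": 0.18}]}

# Assumption for illustration only: a fossil fixes the root at 100 mya.
age_a = date_node(clade_a, root, 100.0)
print(f"Estimated crown age of clade A: {age_a:.1f} mya")
```

Fixing a crown age taken from an earlier study, as in the Liliales and Poales analyses, corresponds to passing that age as `calibration_age`; any error in the borrowed age is carried through proportionally to every derived node.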
Eudicots
The crown age of eudicots was estimated at 131–147 mya by Wikström et al. (2001
, 2003
), somewhat older than the appearance of the tricolpate type of eudicot pollen in the fossil record 125 mya (Crane et al., 1995
). The two major eudicot groups of rosids and asterids are of about the same age in the dating by Wikström et al. (2001
, 2003
), 108–117 mya and 107–117 mya, respectively. The calibration point in their tree, at the stem node of Fagales, is within the rosids and fairly close to the crown node of the rosids. For asterids, there is an independent dating by Bremer et al. (in press)
based on a 111-taxon tree representing all major groups and orders and 83 of the 102 families of asterids, with an underlying data set comprising six plastid DNA markers and dating by penalized likelihood. Six reference fossils were used for calibration. The latter yielded a mean crown age of 128 mya for asterids. Within the asterids, there is also an independent dating of the order Asterales, based on 21 rbcL sequences, mean path lengths, and calibration by setting the crown age of the family Asteraceae to 38 mya. This study gave a crown age of 96 mya for Asterales, close to the 93 mya found by Bremer et al. (in press)
, but older than the 82–90 mya found by Wikström et al. (2003)
. As pointed out by the latter authors, their results tend to underestimate ages for nodes toward the terminals of the tree, perhaps because of sparse taxonomic sampling that would tend to underestimate branch lengths. Inclusion of more reference fossils will certainly push divergence times further back in time. It appears that major groups of angiosperms such as monocots, eudicots, rosids, and asterids had already diverged during the Early Cretaceous.
Recent radiations
Understanding the underlying causes for the perceived uneven taxonomic and geographic distribution of species richness has been a driving force in botanical research ever since the Origin of Species (Darwin, 1859
). However, before inferring correlates and potential causes behind heterogeneity, it is important to show that perceived patterns reflect real differences in evolutionary processes or persisting differences in speciation and extinction rates.
Application of molecular dating techniques has helped identify shifts in speciation rates and address if, and in what way, ecological, geographical, and temporal factors have influenced the rates of speciation through time. A comprehensive review of the theoretical developments relating to this new species-level approach was given by Barraclough and Nee (2001)
. Here we describe a few examples indicating how the establishment of an absolute time scale has been used to (1) infer absolute diversification rates, (2) establish temporal changes in rates of speciation within individual groups, and (3) infer temporal correlations between shifts in speciation rates and a range of external factors.
Baldwin and Sanderson (1998)
examined diversification in the Hawaiian silversword alliance (Asteraceae), traditionally considered a "textbook" example of insular adaptive radiation in plants. In the absence of an absolute time scale, however, there had been no way to assess how gradual or rapid this radiation was. Using ITS rDNA sequences, Baldwin and Sanderson (1998)
were able to establish a maximum age for the origin of the alliance on the Hawaiian islands at around 5 mya. They estimated that the average diversification rate over the 5 million year radiation was high compared to many other taxonomic groups (Baldwin and Sanderson, 1998
).
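To make the link between an age estimate and a diversification rate concrete, the sketch below uses the simplest pure-birth estimator, r = (ln N − ln N0) / t. This is an assumed textbook estimator, not necessarily the calculation used by Baldwin and Sanderson (1998), and the species count is a placeholder; only the roughly 5 my maximum age comes from the text above. Because that age is a maximum, the rate computed from it is a minimum.

```python
import math


def net_diversification_rate(n_species, age_my, n_start=1):
    """Pure-birth estimate r = (ln N - ln N0) / t, per lineage per million years."""
    if age_my <= 0 or n_start < 1 or n_species < n_start:
        raise ValueError("need age_my > 0 and n_species >= n_start >= 1")
    return (math.log(n_species) - math.log(n_start)) / age_my


# Hypothetical species count; only the ~5 my maximum age comes from the text.
N_SPECIES = 30
rate_at_max_age = net_diversification_rate(N_SPECIES, 5.0)
rate_if_younger = net_diversification_rate(N_SPECIES, 2.5)
print(f"{rate_at_max_age:.2f} vs {rate_if_younger:.2f} per lineage per my")
```

Halving the assumed age doubles the estimated rate, which is why an upper bound on the age of a radiation translates directly into a lower bound on its average rate.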
The Cape flora, with its more than 9000 indigenous flowering plant species, has been the focus of much research seeking to explain the underlying reasons behind this exceptional diversity (Linder, 1991
; Cowling et al., 1992
; Cowling and Hilton-Taylor, 1992
). A range of factors, including ecological gradients, geographical isolation, pollinator shifts, and factors associated with the onset of a Mediterranean climate in the region, have been hypothesized to explain the observed pattern. However, the time scales involved have been unclear, and in the absence of an absolute time scale, temporal correlations between species divergences and external factors have been difficult to establish. Recent work has tried to address this through species- or near species-level phylogenetic analyses by inferring an absolute time scale using rate smoothing methods (Sanderson, 1997
). Richardson et al. (2001b)
, for example, adopted this approach in a study of Phylica (Rhamnaceae). They demonstrated a continental speciation rate comparable to those previously associated with oceanic island radiations and also demonstrated a temporal correlation between the onset of species diversification in Phylica (around 7–8 mya) and aridification considered to have occurred at the same time following the development of a Mediterranean climate in the region.
A similar, and perhaps even more rigorous, attempt was undertaken by Barraclough and Reeves (in press)
on Protea (Proteaceae). Two stages have been considered important with respect to the diversification of Protea in the region: the shift to a cooler and drier climate triggered by the formation of the circum-Antarctic current around 15–10 mya, and the final onset of a "true" Mediterranean climate within the last 5 my. Using a near species-level phylogeny of Protea species and by estimating their divergence times, Barraclough and Reeves (in press)
produced lineage-through-time plots searching for increased rates of speciation around the times predicted from the Mediterranean climate hypothesis. To their surprise, they failed to corroborate the Mediterranean climate hypothesis, and instead their lineage-through-time plots showed a significant slowdown in speciation rate towards the present (Barraclough and Reeves, in press
). Additional examples from the Cape floristic region include work on Ehrharta (Poaceae) by Verboom et al. (2003)
, Moraea (Iridaceae) by Goldblatt et al. (2002)
, Pelargonium by Bakker et al. (in press)
, and Aizoaceae by Klak et al. (2004)
.
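A lineage-through-time plot of the kind used in the Protea study can be built directly from the node ages of a dated tree. The sketch below is a minimal illustration with invented node ages: it tabulates the number of lineages after each split and adds a deliberately crude comparison of how many splits fall in the older versus the younger half of the crown age. The helper names are hypothetical, and the half-by-half count is no substitute for a formal test of rate change.

```python
def lineages_through_time(branching_times):
    """Return (age, lineage count) pairs, oldest split first.

    branching_times holds the ages (my before present) of every internal
    node of a fully bifurcating dated tree, the crown node included.
    """
    ordered = sorted(branching_times, reverse=True)
    # The crown split leaves 2 lineages; each later split adds one more.
    return [(age, index + 2) for index, age in enumerate(ordered)]


def splits_by_half(branching_times):
    """Count splits in the older and in the younger half of the crown age."""
    midpoint = max(branching_times) / 2.0
    older = sum(1 for age in branching_times if age > midpoint)
    return older, len(branching_times) - older


# Hypothetical node ages in my; not taken from any published Protea tree.
node_ages = [10.0, 9.0, 8.0, 7.0, 2.0]
ltt = lineages_through_time(node_ages)
older, younger = splits_by_half(node_ages)
print(ltt, older, younger)
```

Under a constant speciation rate without extinction, the number of lineages grows exponentially, so splits should become more frequent toward the present; an excess of splits in the older half, as in this toy example, is the kind of pattern read as a slowdown.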
Highly diverse tropical regions have also received attention. Richardson et al. (2001a)
, for example, evaluated two alternative hypotheses concerning the origin of tropical species diversity: the "museum model," entailing a stable tropical climate that allowed for species to accumulate over time, and the "cyclical glacial model," entailing recent instabilities in climate leading to expansions and contractions of tropical forests and allopatric speciation by differentiation of populations in separate refugia. The two models predict different temporal, phylogenetic, and branch length patterns. Richardson et al. (2001a)
examined Inga (Fabaceae) from South and Central America. They used nuclear ITS rDNA and plastid trnL-trnF spacer/intron sequences to infer phylogenetic relationships among Inga species, and, not being able to refute a molecular clock, they discussed a range of alternative calibrations and their implications. They concluded that the data indicated a recent diversification of the genus (concentrated in the last 10 my) that likely coincided with the uplift of the Andes, the bridging of the Panama Isthmus, and Quaternary glacial cycles (Richardson et al., 2001a
).
Divergence times and biogeography
Historical biogeography has for a period of about 25 years been dominated by a search for general area patterns common to several taxon area cladograms within the framework of cladistic biogeography or vicariance biogeography (Nelson and Platnick, 1981
; Nelson and Rosen, 1981
; Humphries and Parenti, 1999
). The absolute time axis has been mostly ignored in this research program, except insofar as it could be roughly correlated with events in Earth history. With more efficient methods for estimating divergence times and hence timing vicariance events in individual groups, historical biogeography is moving into a new phase. Donoghue and Moore (2003)
discussed the integration of divergence times in cladistic biogeography, and Lavin et al. (2001)
argued that assessment of divergence times together with phylogeny provides an alternative picture to traditional cladistic vicariance analysis, which emphasizes area relationships alone. Many studies of historical biogeography of individual groups are now being published, which combine estimation of divergence times by molecular data with biogeographical analysis, especially dispersal-vicariance analysis (Ronquist, 1997
). A few examples will be given here.
Biogeography of the northern hemisphere has recently received increased attention (e.g., Sanmartin et al., 2001, and references therein). Trans-Beringian dispersals have long been a focus of attention regarding biogeographical connections between Eurasia and North America. Increased interest is now being directed towards northern trans-Atlantic dispersals during the Early Tertiary when Europe and North America were closer and the climate was much warmer. Trans-Beringian and trans-Atlantic or "boreotropical" distributions (Wolfe, 1975; Tiffney, 1985; Sanmartin et al., 2001) have been investigated via molecular dating and biogeographical analyses in several groups (Wen et al., 1996, 1998; Xiang et al., 1998, 2000; Donoghue et al., 2001; Lavin et al., 2001, 2003; Renner et al., 2001; Renner and Meyer, 2001; Davis et al., 2002a, b).
The historical biogeography of several tropical angiosperm families has been investigated in light of their divergence times. Examples include Gentianaceae (Yuan et al., 2003
), Malpighiaceae (Davis et al., 2002a
, b
), Melastomataceae (Renner et al., 2001
; Renner and Meyer, 2001
), and Rapateaceae (Givnish et al., 2000
). In all cases, the families are too young to have attained their trans-oceanic distributions from a Gondwanan ancestry. Davis et al. (2002a
, b)
and Renner et al. (2001
; also Renner and Meyer, 2001
) postulated that the pantropical distributions of Malpighiaceae and Melastomataceae, respectively, were attained via trans-Atlantic dispersal routes during the Eocene when conditions were suitable for tropical groups also in the northern hemisphere. Conti et al. (2002)
investigated the sister group of Melastomataceae. It consists of five small families, Crypteroniaceae in tropical Asia (Sri Lanka and Southeast Asia), Alzateaceae in South America, and Oliniaceae, Penaeaceae, and Rhynchocalycaceae in South Africa. Crypteroniaceae are sister to the other four families and Alzateaceae to the three African families. By using a time-calibrated analysis, they showed that the divergence time for the split between Crypteroniaceae and the other families is old enough to be related to the Gondwanan breakup. Furthermore, they argued that Crypteroniaceae are indeed of Gondwanan origin and that they reached their current tropical Asian distribution by transportation from Gondwana on the Indian "raft."
Groups with southern hemisphere distributions have long attracted attention due to their presumed Gondwanan origin (e.g., Raven and Axelrod, 1974
). With a more precise timing of Gondwanan breakup (McLoughlin, 2001
) and new data on the divergence times of tropical and southern hemisphere angiosperm groups, the alleged Gondwanan origins and relations to Gondwanan breakup (in particular, the breakup of Africa and South America dated to c. 100 my [McLoughlin, 2001
]) have become suspect, as described earlier. Further examples include Adansonia in Bombacaceae (Baum et al., 1998
), Tetrachondra in Tetrachondraceae (Wagstaff et al., 2000
), and genera of Atherospermataceae (Renner et al., 2000
). For the old family Lauraceae, Chanderbali et al. (2001)
hypothesized Gondwanan vicariance for old subgroups as well as later long-distance dispersals for younger subgroups.
Several monocot groups are now associated with a South Gondwanan origin. South Gondwana comprises South America (at least southern South America), Antarctica, and Australasia. These areas were connected well into the Tertiary and their breakup is estimated to have occurred 50–35 mya (McLoughlin, 2001
). Vicariance events involving the breakup of the trans-Antarctic connection between Australia and South America are postulated for sister groups at the family level in Liliales (Vinnersten and Bremer, 2001
) and Poales (Bremer, 2002
). Bremer (2002)
also proposed a South Gondwanan origin for the entire order Poales, based on a biogeographical analysis of a dated tree of the order. Within eudicots, there are probably several larger groups above the family level with a South Gondwanan history. One example is the order Asterales. Bremer and Gustafsson (1997)
postulated an East Gondwanan (Australasian) ancestry followed by trans-Antarctic expansion to South America based on a time-calibrated analysis of family interrelationships within the order. Similar examples are likely to be revealed with future analyses of various eudicot orders.
McDaniel and Shaw (2003)
obtained a time-calibrated tree for populations of the trans-Antarctic moss species Pyrrhobryum mnioides, which occurs in Australia, New Zealand, and South America. They dated the split between the Australian–New Zealand and the South American populations at 80 mya, consistent with a Gondwanan vicariance. Contrary to the most obvious assumption, viz. that a moss species with populations in both Australia and South America attained this distribution by long-distance dispersal, the dating shows that the species is old enough to display a true relict distribution from South Gondwana (Australia and South America). The Australian and New Zealand populations, however, were shown to share a much more recent history; the Australian–New Zealand distribution of Pyrrhobryum mnioides is thus certainly the result of long-distance dispersal.
| CONCLUSIONS AND PROSPECTS |
|---|
|
|
|---|
Clearly, more work and data are needed to resolve some of these cases of incongruence, but the important question is where such efforts should be concentrated. With sequencing efficiency increasing and cost decreasing, it is tempting to simply apply ever increasing quantities of data to any given divergence time problem. This philosophy is exemplified by large multigene divergence time studies, such as Heckman et al. (2001)
or Sanderson (2003)
. If departures from a constant rate of evolution are more or less random across the tree, this strategy makes good sense and ultimately will yield more accurate estimates of divergence times. However, the fact that the two studies just cited, using entirely independent but large sets of genes, obtained estimates for the age of land plants that differed by 50% suggests that one or both data sets have consistent biases that are as yet unexplained.
Ferreting out such biases will probably require advances in some of the following directions. First, more powerful tests for departures from rate-constant evolution are needed, especially tests that are sensitive to concerted increases or decreases across a clade, which seem to be plausible possibilities during major evolutionary radiations (Bromham and Hendy, 2000
). Second, the fossil record must be incorporated into the problem in a more constructive fashion: not merely as a source of calibration, or perhaps as an object of derision when conflicts emerge, but rather as an equal partner in the estimation procedure. Perhaps the best way to do this is to develop "model systems" in parts of plant phylogeny with a relatively rich record, and then use these as testbeds for emerging procedures to estimate divergence time. Cross-validation using multiple fossils (rather than sequences) may pave the way for evaluating progress in methodology within these more empirically rich taxa. Although divergence time methods are getting better at incorporating fossil evidence, it may also be useful to leave out most of this information at some stages of analyses to test the level of agreement between molecular data alone and the fossil record. This is ultimately the best way to validate divergence time methods. Contrasting results from unconstrained and fossil-constrained analyses are always illuminating, in both directions. Finally, work evaluating the correlates of rate variation across lineages (e.g., Barraclough and Savolainen, 2001
; Whittle and Johnston, 2003
) must be pursued vigorously. If it eventually becomes feasible to place prior probabilities on rate variation based on the known biology or life history of extant species, this will only improve inferences about deep time.
| FOOTNOTES |
|---|
| LITERATURE CITED |
|---|
|
|
|---|
Aris-Brosou S. Z. Yang 2002 Effects of models of rate evolution on estimation of divergence dates with special reference to the metazoan 18S ribosomal RNA phylogeny. Systematic Biology 51: 703-714
Aris-Brosou S. Z. Yang 2003 Bayesian models of episodic evolution support a late Precambrian explosive diversification of the Metazoa. Molecular Biology and Evolution 20: 1947-1954
Ayala F. J. A. Rzhetsky F. J. Ayala 1998 Origin of the metazoan phyla: molecular clocks confirm paleontological estimates. Proceedings of the National Academy of Sciences, USA 95: 606-611
Bakker F. T. E. M. Marais M. Gibby In press Nested radiation in Cape Pelargonium. In F. T. Bakker, L. W. Chatrou, B. Gravendeel, and P. B. Pelser [eds.], Plant species-level systematics: new perspectives on pattern and process, Regnum Vegetabile 142. Koeltz, Königstein, Germany
Baldwin B. G. M. J. Sanderson 1998 Age and rate of diversification of the Hawaiian silversword alliance (Compositae). Proceedings of the National Academy of Sciences, USA 95: 9402-9406
Barraclough T. G. S. Nee 2001 Phylogenetics and speciation. Trends in Ecology and Evolution 16: 391-399
Barraclough T. G. G. Reeves In press The causes of speciation in flowering plant lineages: species-level DNA trees in the African genus Protea. In F. T. Bakker, L. W. Chatrou, B. Gravendeel, and P. B. Pelser [eds.], Plant species-level systematics: new perspectives on pattern and process. Regnum Vegetabile 142. Koeltz, Königstein, Germany
Barraclough T. G. V. Savolainen 2001 Evolutionary rates and species diversity in flowering plants. Evolution 55: 677-683
Barrier M. R. Robichaux M. Purugganan 2001 Accelerated regulatory gene evolution in an adaptive radiation. Proceedings of the National Academy of Sciences, USA 98: 10208-10213
Baum D. A. R. L. Small J. F. Wendel 1998 Biogeography and floral evolution of baobabs (Adansonia, Bombacaceae) as inferred from multiple data sets. Systematic Biology 47: 181-207
Benton M. J. F. J. Ayala 2003 Dating the tree of life. Science 300: 1698-1700
Brandl R. W. Mann M. Sprintzl 1992 Estimation of the monocot–dicot age through tRNA sequences from the chloroplast. Proceedings of the Royal Society of London, series B 24: 13-17
Bremer K. 2000 Early Cretaceous lineages of monocot flowering plants. Proceedings of the National Academy of Sciences, USA 97: 4707-4711
Bremer K. 2002 Gondwanan evolution of the grass alliance of families. Evolution 56: 1374-1387