Brief Communications
Section of Evolution and Ecology, One Shields Avenue, University of California, Davis, California 95616 USA
Received for publication October 17, 2002. Accepted for publication January 10, 2003.
| ABSTRACT |
|---|
Key Words: divergence times; land plants; molecular clock
Heckman et al. (2001) used sequence data obtained from GenBank to infer divergence times in fungi and green plants and estimated that the crown group age of land plants is Precambrian, at 703 ± 45 million years ago (mya). Other molecular divergence time studies of plants have assumed the much more recent age of about 450 mya (Martin et al., 1993; Goremykin et al., 1997; Sanderson and Doyle, 2001), which is just slightly prior to the first appearance of embryophyte spores, but after putative stem group relatives (Kenrick and Crane, 1997). Heckman et al. (2001) suggested that their molecular analysis might force a reconsideration of other ages within plants as well. For example, if land plants date to 700 mya, molecular divergences of angiosperms relative to other land plants might well imply an angiosperm origin in the Carboniferous or earlier. These conclusions are so clearly at odds with the plant fossil record that they raise concerns about the analysis, the fossil record, or both. However, rather than focusing on possible explanations for the discrepancy between "clocks and rocks" (Rodriguez-Trelles et al., 2002), this paper presents a new analysis of data from organellar genes not included in Heckman et al.'s study, samples a broader set of taxa, uses a calibration point phylogenetically closer to the origin of land plants, and uses inference methods that do not assume a molecular clock. This new analysis estimates the age of land plants to be close to what the fossil evidence implies.
A data set of 100 000 green plant protein sequences extracted from GenBank release 127.0 (Sanderson et al., in press) was sampled to construct a concatenated data set of 27 single-copy plastid genes for 10 land plants and one outgroup, the single-celled green alga, Mesostigma. The data set contained no missing sequences and had a total sequence length of 6266 amino acids. Amino acid sequences were aligned with default options in ClustalW (Thompson et al., 1994). The data set is available as Supplementary Data accompanying the online version of this paper. A phylogenetic tree was constructed using "protein parsimony" (Swofford et al., 1996) in PAUP* 4.0 (Swofford, 2002). Most clades were strongly supported by high bootstrap values and agreed with well-known relationships of major clades of land plants (Fig. 1). Estimates of numbers of amino acid replacements along branches were obtained on this tree using maximum likelihood in PAML 3.12 (Yang, 1997), assuming a Poisson substitution model with gamma-distributed site-to-site rate variation. A likelihood ratio test indicates a significant departure from a molecular clock (2ln(LR) = 572; df = 9; P << 0.001), with the herbaceous angiosperm lineages having a higher rate than other land plants or Mesostigma.
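The reported P value can be checked against the chi-square distribution. The following is a minimal sketch, not the analysis code used for the paper: it assumes the usual asymptotic chi-square approximation for the likelihood ratio statistic, and it assumes that the 9 degrees of freedom correspond to n − 2 for the n = 11 taxa in the tree. The function name `chi2_sf` is hypothetical, and the closed-form series it uses applies only to integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with
    integer degrees of freedom, via the closed-form finite series."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2))
    #         + sqrt(2/pi) * exp(-x/2) * sum_{r=1}^{(df-1)/2} x^(r-1/2) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(chi / math.sqrt(2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

n_taxa = 11            # 10 land plants plus Mesostigma
df = n_taxa - 2        # assumed source of the reported df = 9
statistic = 572.0      # reported value of 2ln(LR)
p_value = chi2_sf(statistic, df)
print(f"df = {df}, P = {p_value:.3g}")
```

Under these assumptions the tail probability is many orders of magnitude below 0.001, consistent with the "P << 0.001" reported above.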
Calibration is necessary to convert the results from these analyses to an absolute time scale. The crown group node of seed plants (Fig. 1) can be dated with more confidence than many other nodes in land plant phylogeny (not to mention deeper nodes outside of land plants) because of the abundant record of stem group and crown group seed plants in the Carboniferous. Crown group seed plants (probably stem group conifers) first appear at about 310–320 mya (Doyle, 1998), so a conservative calibration (favoring an older age of land plants) is 330 mya. A secondary, if somewhat more distant, calibration is provided by crown group eudicot angiosperms, whose distinctive tricolpate pollen enters the record at about 125 mya and soon becomes ubiquitous (Doyle, 1992; Magallon et al., 1999).
Assuming a molecular clock and using just the primary (seed plant) calibration, land plants are inferred to have a crown group age of 435 mya (Early Silurian), close to the first appearance of embryophyte megafossils and consistent with a conservative interpretation of the spore record. Relaxing the clock assumption with penalized likelihood and using the cross-validated optimal level of smoothing (smoothing parameter estimated to be approximately 100) leads to an inferred age of 483 mya (Fig. 1). Adding the secondary calibration changes the dates only slightly, to an age of 425 mya with a clock assumption or 490 mya with penalized likelihood.
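The clock-based estimate rests on a simple proportionality, which can be sketched as follows. This is an illustration only: the function `clock_age` is hypothetical, the relative depth used here is back-calculated from the reported ages rather than taken from the branch-length estimates, and the penalized likelihood estimates cannot be reproduced this way because they allow rates to vary among branches.

```python
def clock_age(node_depth, calib_depth, calib_age_mya):
    """Age of a node under a strict molecular clock.

    node_depth and calib_depth are distances from each node to the
    present (e.g., replacements per site) on a clocklike tree; the
    calibration fixes the rate, and the node age follows by proportion.
    """
    rate = calib_depth / calib_age_mya   # replacements per site per million years
    return node_depth / rate

SEED_PLANT_CALIBRATION_MYA = 330.0   # primary calibration from the text
REPORTED_CLOCK_AGE_MYA = 435.0       # reported clock-based age of land plants

# Relative depth of the land plant node implied by the two reported ages
# (an inference from the text, not a measured quantity).
implied_ratio = REPORTED_CLOCK_AGE_MYA / SEED_PLANT_CALIBRATION_MYA
print(f"implied land plant / seed plant depth ratio: {implied_ratio:.3f}")

# With the seed plant node at unit depth, the proportion recovers 435 mya.
print(clock_age(implied_ratio, 1.0, SEED_PLANT_CALIBRATION_MYA))
```

In other words, under a clock the land plant crown node needs to be only about one-third deeper than the seed plant crown node to yield an Early Silurian age.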
The nearness of these molecular age estimates to the first fossil evidence for land plants contrasts sharply with the results of Heckman et al. (2001). Because the agreement with the fossil record is largely independent of whether or not a clock was assumed, the differences between these results and those of Heckman et al. may be due to the calibration or the data rather than to the estimation procedure. Heckman et al. relied on distant external calibrations, the closest being the crown group node of animals, plants, and fungi, estimated to be 1600 mya, itself determined by extrapolation from a molecular divergence time study of vertebrates. The data sets are also quite different. Heckman et al. sampled taxa heterogeneously between genes (although each gene spanned the relevant land plant node of the tree). Moreover, the number of taxa represented in these samples was quite small, ranging from two to six land plant taxa per gene in their land plant data set (mean of 3.4). Since the concatenated lengths were about the same in the two studies (6266 in the present data set; 5131 in theirs), the present data set, representing 10 land plants for every gene, is about three times larger. Increased taxon sampling can improve assessments of the level of rate heterogeneity in deeply diverged clades (Sanderson and Doyle, 2001). Their study included more genes, but the mean sequence length for their study was about 100 residues, vs. 230 for the data set reported here. Relative rate tests lack power with short sequences (Sorhannus and Van Bell, 1999; Bromham et al., 2000), which may have led to Heckman et al.'s conclusion that 50 of 54 genes in their study were clocklike. However, mistakenly assuming a clock when it is absent might not necessarily inflate the ages. It had the opposite effect in the present data set.
A systematic bias in divergence times might arise in gene family data if paralogs are mistaken for orthologs (Martin and Burg, 2002). A divergence time based on paralogs corresponds to the age of a gene duplication rather than species lineage split and is therefore usually older than one based on orthologs for gene families with significant diversity. Heckman et al.'s data were all based on nuclear genes, many of which belong to large and complex gene families. The sporadic and usually sparse sampling of genes from gene families in sequence databases can make identification of orthologs problematic (Page and Charleston, 1997; Remm et al., 2001). The present data set consisted of single-copy plastid genes, for which sampling was complete.
However, any data set can be criticized post hoc. The point is that two large and very different sequence data sets conflict dramatically with respect to the inferred age of origin of land plants. It seems premature therefore to add land plants to the growing list of anomalous case studies in which molecular divergence time estimates strikingly predate fossil evidence, as has been found for dates for mammals, birds, and metazoans (Rodriguez-Trelles et al., 2002). The present compilation of sequence data leads to an estimate that agrees closely with the fossil record.
| LITERATURE CITED |
|---|
Doyle J. A. 1992 Revised palynological correlations of the lower Potomac Group (USA) and the Cocobeach sequence of Gabon (Barremian-Aptian). Cretaceous Research 13: 337-349
Doyle J. A. 1998 Phylogeny of vascular plants. In D. G. Fautin [ed.], Annual review of ecology and systematics, 567-599. Annual Reviews, Palo Alto, California, USA
Goremykin V. V. S. Hansmann W. F. Martin 1997 Evolutionary analysis of 58 proteins encoded in six completely sequenced chloroplast genomes: revised molecular estimates of two seed plant divergence times. Plant Systematics and Evolution 206: 337-351
Hastie T. R. Tibshirani J. H. Friedman 2001 The elements of statistical learning: data mining, inference, and prediction. Springer, New York, New York, USA
Heckman D. S. D. M. Geiser B. R. Eidell R. L. Stauffer N. L. Kardos S. B. Hedges 2001 Molecular evidence for the early colonization of land by fungi and plants. Science 293: 1129-1133
Kenrick P. P. R. Crane 1997 The origin and early diversification of land plants: a cladistic study. Smithsonian Institution Press, Washington, D.C., USA
Magallon S. P. R. Crane P. S. Herendeen 1999 Phylogenetic pattern, diversity, and diversification of eudicots. Annals of the Missouri Botanical Garden 86: 297-372
Martin A. T. Burg 2002 Perils of paralogy: using HSP70 genes for inferring organismal phylogenies. Systematic Biology 51: 570-587
Martin W. D. Lydiate H. Brinkmann G. Forkmann H. Saedler R. Cerff 1993 Molecular phylogenies in angiosperm evolution. Molecular Biology and Evolution 10: 140-162
Page R. D. M. M. A. Charleston 1997 From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution 7: 231-240
Remm M. C. E. V. Storm E. L. L. Sonnhammer 2001 Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. Journal of Molecular Biology 314: 1041-1052
Rodriguez-Trelles F. R. Tarrio F. J. Ayala 2002 A methodological bias toward overestimation of molecular evolutionary time scales. Proceedings of the National Academy of Sciences, USA 99: 8112-8115
Sanderson M. J. 1997 A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution 14: 1218-1231
Sanderson M. J. 2002 Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution 19: 101-109
Sanderson M. J. 2003 R8s: inferring absolute rates of evolution and divergence times in the absence of a molecular clock. Bioinformatics (Oxford) 19: 301-302
Sanderson M. J. J. A. Doyle 2001 Sources of error and confidence intervals in estimating the age of angiosperms from rbcL and 18S rDNA data. American Journal of Botany 88: 1499-1516
Sanderson M. J. A. C. Driskell R. H. Ree O. Eulenstein S. Langley 2003 Obtaining maximal concatenated data sets from large sequence databases. Molecular Biology and Evolution, in press.
Sorhannus U. C. Van Bell 1999 Testing for equality of molecular evolutionary rates: a comparison between a relative-rate test and a likelihood ratio test. Molecular Biology and Evolution 16: 849-855
Swofford D. L. 2002 PAUP*: phylogenetic analysis using parsimony (*and other methods). Sinauer, Sunderland, Massachusetts, USA
Swofford D. L. G. J. Olsen P. J. Waddell D. M. Hillis 1996 Phylogenetic inference. In B. K. Mable [ed.], Molecular systematics, 407-514. Sinauer, Sunderland, Massachusetts, USA
Thompson J. D. D. G. Higgins T. J. Gibson 1994 Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680
Yang Z. 1997 PAML: a program package for phylogenetic analysis by maximum likelihood. Cabios 13: 555-556