(American Journal of Botany. 2003;90:954-956.)
© 2003 Botanical Society of America, Inc.


Brief Communications

Molecular data from 27 proteins do not support a Precambrian origin of land plants1

Michael J. Sanderson

Section of Evolution and Ecology, One Shields Avenue, University of California, Davis, California 95616 USA

Received for publication October 17, 2002. Accepted for publication January 10, 2003.


    ABSTRACT
 
Heckman et al. (Science 293: 1129–1133) used sequences obtained from GenBank to infer divergence times in fungi and green plants. They estimated that the crown group of land plants originated in the Precambrian, at 703 ± 45 mya, a date much older than the dates implied by the fossils, which are no older than about 450 mya. This paper presents an analysis of an entirely different set of sequence data from 27 plastid protein-coding genes in 10 land plants and a green algal outgroup. It uses a calibration point closer to the origin of land plants and inference methods that do not assume a molecular clock. The resulting estimates range from 425 to 490 mya, a range that brackets the age suggested by the fossil record. Possible explanations for the differing conclusions of the two studies include differences in calibration points and the use of single-copy plastid genes rather than nuclear gene families.

Key Words: divergence times • land plants • molecular clock

Heckman et al. (2001) used sequence data obtained from GenBank to infer divergence times in fungi and green plants and estimated that the crown group age of land plants is Precambrian, at 703 ± 45 million years ago (mya). Other molecular divergence time studies of plants have assumed the much more recent age of about 450 mya (Martin et al., 1993; Goremykin et al., 1997; Sanderson and Doyle, 2001), which is just slightly prior to the first appearance of embryophyte spores, but after putative stem group relatives (Kenrick and Crane, 1997). Heckman et al. (2001) suggested that their molecular analysis might force a reconsideration of other ages within plants as well. For example, if land plants date to 700 mya, molecular divergences of angiosperms relative to other land plants might well imply an angiosperm origin in the Carboniferous or earlier. These conclusions are so clearly at odds with the plant fossil record that they raise concerns about the analysis, the fossil record, or both. However, rather than focusing on possible explanations for the discrepancy between "clocks and rocks" (Rodriguez-Trelles et al., 2002), this paper presents a new analysis of data from organellar genes not included in Heckman et al.'s study, samples a broader set of taxa, uses a calibration point phylogenetically closer to the origin of land plants, and uses inference methods that do not assume a molecular clock. This new analysis estimates the age of land plants to be close to what the fossil evidence implies.

A data set of 100 000 green plant protein sequences extracted from GenBank release 127.0 (Sanderson et al., in press) was sampled to construct a concatenated data set of 27 single-copy plastid genes for 10 land plants and one outgroup, the single-celled green alga Mesostigma. The data set contained no missing sequences and had a total sequence length of 6266 amino acids. Amino acid sequences were aligned with default options in ClustalW (Thompson et al., 1994). The data set is available as Supplementary Data accompanying the online version of this paper. A phylogenetic tree was constructed using "protein parsimony" (Swofford et al., 1996) in PAUP* 4.0 (Swofford, 2002). Most clades were strongly supported by high bootstrap values and agreed with well-known relationships of major clades of land plants (Fig. 1). Estimates of numbers of amino acid replacements along branches were obtained on this tree by maximum likelihood in PAML 3.12 (Yang, 1997), assuming a Poisson substitution model with gamma-distributed site-to-site rate variation. A likelihood ratio test indicated a significant departure from a molecular clock (–2ln(LR) = 572; df = 9; P << 0.001), with the herbaceous angiosperm lineages having a higher rate than other land plants or Mesostigma.
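The likelihood ratio test reported above can be checked with a few lines of arithmetic. The sketch below is not the PAML computation itself; it assumes the usual parameter count for a clock test (an unconstrained tree of n taxa has 2n − 3 branch lengths, a clock tree has n − 1 node heights, so df = n − 2, which gives 9 for the 11 taxa used here) and compares the reported statistic to a chi-square distribution. The function names `clock_test_df` and `chi2_sf` are illustrative, not part of any package named in the text.

```python
import math

def clock_test_df(n_taxa):
    """Degrees of freedom for a clock vs. non-clock likelihood ratio test.

    Assumes 2n - 3 free branch lengths without a clock and n - 1 node
    heights with one, so the difference is n - 2.
    """
    return (2 * n_taxa - 3) - (n_taxa - 1)

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^j / j!.
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail (erfc) plus a finite correction series
    # with terms x^j / (1 * 3 * ... * (2j + 1)).
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2))
    return tail + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total

df = clock_test_df(11)        # 10 land plants + Mesostigma
p_value = chi2_sf(572.0, df)  # statistic reported in the text
print(df, p_value < 0.001)
```

With a statistic of 572 on 9 degrees of freedom the tail probability is many orders of magnitude below 0.001, consistent with the reported rejection of the clock.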



 
Fig. 1. Maximum parsimony phylogenetic analysis of 27 plastid genes (psaA, psaB, psaC, psaJ, psbA, psbB, psbC, psbD, psbE, psbF, psbL, psbN, atpI, atpH, petA, petB, petG, rpl2, rpl14, rpl16, rpl20, rbcL, rps7, rps11, rps14, rps19, ycf9) for 10 land plants and Mesostigma. Taxon names are prefixed with GenBank taxon ID numbers. Bootstrap values (500 replicates) are indicated next to nodes. Times of nodes on tree were estimated at the optimal level of rate smoothing in a penalized likelihood analysis. Arrow points to crown group node of seed plants (330 mya), used as calibration for the times on the rest of the tree. Note that Mesostigma's role in the divergence time analysis is to permit accurate estimation of the sequence divergence along the two branches descended from the land plant root node. It is pruned from the analysis prior to age estimation, and therefore it is not possible to estimate the age of the next deeper node below land plants (additional outgroups would be necessary)

 
Ages of nodes in the tree were estimated (1) assuming a molecular clock and (2) using penalized likelihood rate smoothing (Sanderson, 2002), which permits deviations from a molecular clock. Penalized likelihood is a semi-parametric inference procedure that strikes a balance between fitting the data to a model (using maximum likelihood) and preventing an unacceptably high level of variation among the parameters of the model (using a nonparametric penalty on this parameter variation). In the present application, the model consists of a separate rate of substitution on each branch, and the penalty consists of a least squares term that increases as differences in rate on neighboring branches increase (Sanderson, 1997). The trade-off between model fit and parameter smoothness can be tuned by the user, but a more objective criterion is to use a cross-validation procedure driven by the data. Cross validation uses subsets of the data to predict the remainder; as such it determines the best level of smoothing with respect to the predictive ability of the model (Hastie et al., 2001). A model that is either too smooth (clocklike in this case) or too rough (every branch having a very different rate) will have difficulty explaining the distribution of substitutions observed along branches when rates vary moderately across the tree. The author's program r8s version 1.5 (Sanderson, 2003; available at http://ginger.ucdavis.edu/r8s) was used for divergence time analyses.
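The trade-off described above can be made concrete with a minimal sketch of a penalized objective: a Poisson log-likelihood for the substitutions on each branch, minus a smoothing parameter times the sum of squared rate differences between each branch and its parent branch. This is a simplified illustration, not the r8s implementation: branch durations are held fixed here (r8s estimates node times jointly with rates), the treatment of branches at the root is ignored, and the three-branch tree and its counts are hypothetical.

```python
import math

def poisson_loglik(subs, rate, duration):
    """Log-probability of `subs` substitutions on a branch whose expected
    number of substitutions is rate * duration."""
    mean = rate * duration
    return subs * math.log(mean) - mean - math.lgamma(subs + 1)

def roughness_penalty(rates, parent):
    """Least squares penalty: squared rate difference between each branch
    and its parent branch (branches without a parent contribute nothing)."""
    return sum((rates[b] - rates[p]) ** 2
               for b, p in parent.items() if p is not None)

def penalized_loglik(subs, rates, durations, parent, smoothing):
    """Model fit minus smoothing * roughness."""
    fit = sum(poisson_loglik(subs[b], rates[b], durations[b]) for b in subs)
    return fit - smoothing * roughness_penalty(rates, parent)

# Hypothetical toy tree: one stem branch with two descendant branches.
parent = {"stem": None, "left": "stem", "right": "stem"}
subs = {"stem": 20, "left": 30, "right": 60}             # observed substitutions
durations = {"stem": 100.0, "left": 150.0, "right": 150.0}

# Clocklike model: one shared rate (total substitutions / total time).
shared = sum(subs.values()) / sum(durations.values())
clocklike = {b: shared for b in subs}
# Rough model: every branch gets its own best-fitting rate.
rough = {b: subs[b] / durations[b] for b in subs}

for smoothing in (0.0, 100.0, 10000.0):
    score_clock = penalized_loglik(subs, clocklike, durations, parent, smoothing)
    score_rough = penalized_loglik(subs, rough, durations, parent, smoothing)
    print(smoothing, "clocklike" if score_clock > score_rough else "rough")
```

With no smoothing the branch-specific rates always fit best; as the smoothing parameter grows, the penalty dominates and the clocklike solution is preferred. Cross validation is the step that picks a value between these extremes from the data.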

Calibration is necessary to convert the results from these analyses to an absolute time scale. The crown group node of seed plants (Fig. 1) can be dated with more confidence than many other nodes in land plant phylogeny (not to mention deeper nodes outside of land plants) because of the abundant record of stem group and crown group seed plants in the Carboniferous. Crown group seed plants (probably stem group conifers) first appear at about 310–320 mya (Doyle, 1998), so a conservative calibration (favoring an older age of land plants) is 330 mya. A secondary, if somewhat more distant, calibration is provided by crown group eudicot angiosperms, whose distinctive tricolpate pollen enters the record at about 125 mya and soon becomes ubiquitous (Doyle, 1992; Magallon et al., 1999).

Assuming a molecular clock and using just the primary (seed plant) calibration, land plants are inferred to have a crown group age of 435 mya (Early Silurian)—close to the first appearance of embryophyte megafossils and consistent with a conservative interpretation of the spore record. Relaxing the clock assumption with penalized likelihood and using the cross-validated optimal level of smoothing (smoothing parameter estimated to be approximately 100) leads to an inferred age of 483 mya (Fig. 1). Adding the secondary calibration changes the dates only slightly, to an age of 425 mya with a clock assumption or 490 mya with penalized likelihood.
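Under a strict clock, calibration amounts to a single rescaling: every node's depth in substitution units is multiplied by the same factor, fixed by the calibrated node. The sketch below illustrates this and shows why the 330 mya calibration is conservative relative to the 310–320 mya first appearance. The relative depths are hypothetical values chosen only for illustration, and the function name `calibrate` is an assumption; the penalized likelihood estimates do not follow this simple proportionality because rates differ among branches.

```python
def calibrate(relative_depths, calibration_node, calibration_age_mya):
    """Convert clocklike node depths (substitutions per site) to ages in mya
    by fixing one node at a known age."""
    scale = calibration_age_mya / relative_depths[calibration_node]
    return {node: depth * scale for node, depth in relative_depths.items()}

# Hypothetical node depths under a clock (substitutions per site).
relative_depths = {"seed_plants": 0.25, "land_plants": 0.33}

conservative = calibrate(relative_depths, "seed_plants", 330.0)
younger = calibrate(relative_depths, "seed_plants", 310.0)

# An older calibration age pushes every other node older by the same factor.
print(round(conservative["land_plants"], 1), round(younger["land_plants"], 1))
```

Because the scaling is linear, any uncertainty in the calibration age carries over proportionally to the inferred age of land plants.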

The nearness of these molecular age estimates to the first fossil evidence for land plants contrasts sharply with the results of Heckman et al. (2001). Because the agreement with the fossil record is largely independent of whether or not a clock was assumed, the differences between these results and those of Heckman et al. may be due to the calibration or the data rather than to the estimation procedure. Heckman et al. relied on distant external calibrations, the closest being the crown group node of animals, plants, and fungi, estimated to be 1600 mya, itself determined by extrapolation from a molecular divergence time study of vertebrates. The data sets are also quite different. Heckman et al. sampled taxa heterogeneously between genes (although each gene spanned the relevant land plant node of the tree). Moreover, the number of taxa represented in these samples was quite small, ranging from two to six land plant taxa per gene in their land plant data set (mean of 3.4). Since the concatenated lengths were about the same in the two studies (6266 in the present data set; 5131 in theirs), the present data set, representing 10 land plants for every gene, is about three times larger. Increased taxon sampling can improve assessments of the level of rate heterogeneity in deeply diverged clades (Sanderson and Doyle, 2001). Their study included more genes, but the mean sequence length for their study was about 100 residues, vs. 230 for the data set reported here. Relative rate tests lack power with short sequences (Sorhannus and Van Bell, 1999; Bromham et al., 2000), which may have led to Heckman et al.'s conclusion that 50 of 54 genes in their study were clocklike. However, mistakenly assuming a clock when it is absent might not necessarily inflate the ages. It had the opposite effect in the present data set.

A systematic bias in divergence times might arise in gene family data if paralogs are mistaken for orthologs (Martin and Burg, 2002). A divergence time based on paralogs corresponds to the age of a gene duplication rather than species lineage split and is therefore usually older than one based on orthologs for gene families with significant diversity. Heckman et al.'s data were all based on nuclear genes, many of which belong to large and complex gene families. The sporadic and usually sparse sampling of genes from gene families in sequence databases can make identification of orthologs problematic (Page and Charleston, 1997; Remm et al., 2001). The present data set consisted of single copy plastid genes, for which sampling was complete.

However, any data set can be criticized post hoc. The point is that two large and very different sequence data sets conflict dramatically with respect to the inferred age of origin of land plants. It seems premature therefore to add land plants to the growing list of anomalous case studies in which molecular divergence time estimates strikingly predate fossil evidence, as has been found for dates for mammals, birds, and metazoans (Rodriguez-Trelles et al., 2002). The present compilation of sequence data leads to an estimate that agrees closely with the fossil record.


    FOOTNOTES
 
1 The author thanks J. A. Doyle for comments on the manuscript.


    LITERATURE CITED
 
Bromham, L., D. Penny, A. Rambaut, and M. D. Hendy. 2000. The power of relative rates tests depends on the data. Journal of Molecular Evolution 50: 296-301.

Doyle, J. A. 1992. Revised palynological correlations of the lower Potomac Group (USA) and the Cocobeach sequence of Gabon (Barremian-Aptian). Cretaceous Research 13: 337-349.

Doyle, J. A. 1998. Phylogeny of vascular plants. In D. G. Fautin [ed.], Annual review of ecology and systematics, 567–599. Annual Reviews, Palo Alto, California, USA.

Goremykin, V. V., S. Hansmann, and W. F. Martin. 1997. Evolutionary analysis of 58 proteins encoded in six completely sequenced chloroplast genomes: revised molecular estimates of two seed plant divergence times. Plant Systematics and Evolution 206: 337-351.

Hastie, T., R. Tibshirani, and J. H. Friedman. 2001. The elements of statistical learning: data mining, inference, and prediction. Springer, New York, New York, USA.

Heckman, D. S., D. M. Geiser, B. R. Eidell, R. L. Stauffer, N. L. Kardos, and S. B. Hedges. 2001. Molecular evidence for the early colonization of land by fungi and plants. Science 293: 1129-1133.

Kenrick, P., and P. R. Crane. 1997. The origin and early diversification of land plants: a cladistic study. Smithsonian Institution Press, Washington, D.C., USA.

Magallon, S., P. R. Crane, and P. S. Herendeen. 1999. Phylogenetic pattern, diversity, and diversification of eudicots. Annals of the Missouri Botanical Garden 86: 297-372.

Martin, A., and T. Burg. 2002. Perils of paralogy: using HSP70 genes for inferring organismal phylogenies. Systematic Biology 51: 570-587.

Martin, W., D. Lydiate, H. Brinkmann, G. Forkmann, H. Saedler, and R. Cerff. 1993. Molecular phylogenies in angiosperm evolution. Molecular Biology and Evolution 10: 140-162.

Page, R. D. M., and M. A. Charleston. 1997. From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution 7: 231-240.

Remm, M., C. E. V. Storm, and E. L. L. Sonnhammer. 2001. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. Journal of Molecular Biology 314: 1041-1052.

Rodriguez-Trelles, F., R. Tarrio, and F. J. Ayala. 2002. A methodological bias toward overestimation of molecular evolutionary time scales. Proceedings of the National Academy of Sciences, USA 99: 8112-8115.

Sanderson, M. J. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution 14: 1218-1231.

Sanderson, M. J. 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution 19: 101-109.

Sanderson, M. J. 2003. R8s: inferring absolute rates of evolution and divergence times in the absence of a molecular clock. Bioinformatics (Oxford) 19: 301-302.

Sanderson, M. J., and J. A. Doyle. 2001. Sources of error and confidence intervals in estimating the age of angiosperms from rbcL and 18S rDNA data. American Journal of Botany 88: 1499-1516.

Sanderson, M. J., A. C. Driskell, R. H. Ree, O. Eulenstein, and S. Langley. 2003. Obtaining maximal concatenated data sets from large sequence databases. Molecular Biology and Evolution, in press.

Sorhannus, U., and C. Van Bell. 1999. Testing for equality of molecular evolutionary rates: a comparison between a relative-rate test and a likelihood ratio test. Molecular Biology and Evolution 16: 849-855.

Swofford, D. L. 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods). Sinauer, Sunderland, Massachusetts, USA.

Swofford, D. L., G. J. Olsen, P. J. Waddell, and D. M. Hillis. 1996. Phylogenetic inference. In B. K. Mable [ed.], Molecular systematics, 407–514. Sinauer, Sunderland, Massachusetts, USA.

Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680.

Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. CABIOS 13: 555-556.




