Invited Special Papers
2Department of Biology, Duke University, Durham, North Carolina 27708-0338 USA; 3Department of Plant Biology, University of Minnesota, St. Paul, Minnesota 55108 USA; 4Department of Biology, Clark University, Worcester, Massachusetts 01610 USA; 5Institute of Botany, Karl-Franzens-University Graz, A-8010 Graz, Austria; 6Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331-2902 USA; 7Plant Taxonomy and Nature Conservation, Gdansk University, Al. Legionow 9, 80-441 Gdansk, Poland; 8Biodiversity (Mycology and Botany), Agriculture and Agri-Food Canada, Ottawa, Ontario K1A 0C6 Canada; 9Department of Botany, The Field Museum, Chicago, Illinois 60605-2496 USA; 10Microbial Genomics Research Unit, National Center for Agricultural Utilization Research, U.S. Department of Agriculture, Agricultural Research Service, Peoria, Illinois 61604-3999 USA; 11National Natural History Museum, 25 rue Munster, L-2160 Luxembourg, Luxembourg; 12Department of Bryophytes-Thallophytes, National Botanic Garden of Belgium, B-1860 Meise, Belgium; 13Harvard University Herbaria, Cambridge, Massachusetts 02138 USA; 14Institute of Systematic Botany, New York Botanical Garden, New York 10458-5126 USA; 15Current address: Department of Wood Science, University of British Columbia, Vancouver, British Columbia V6T 1Z4 Canada; 16Genomic Sciences Center, The Institute of Physical and Chemical Research (RIKEN), Yokohama 230-0045 Japan; 17Department of Plant Pathology, Washington State University, Pullman, Washington 99164-6430 USA; 18Systematic Botany and Mycology Laboratory, U.S. Department of Agriculture, Agricultural Research Service, Beltsville, Maryland 20705 USA; 19Botanischer Garten und Botanisches Museum Berlin-Dahlem, Freie Universität Berlin, Berlin D-14191 Germany; 20The University of Tokyo, Tokyo 101-0041 Japan
Received for publication February 27, 2004. Accepted for publication July 1, 2004.
| ABSTRACT |
|---|
Key Words: fungal classification; fungal morphology and ultrastructure; fungal phylogenetics; fungal systematics; mitochondrial small subunit ribosomal DNA (mitSSU rDNA); nuclear small and large subunit ribosomal DNA (nucSSU and nucLSU rDNA); RNA polymerase subunit (RPB2)
| INTRODUCTION |
|---|
Mycology has traditionally been a subdiscipline of botany, but phylogenetic analyses of both ribosomal DNA and protein-coding genes suggest that fungi are actually more closely related to animals than to plants (Wainright et al., 1993; Baldauf and Palmer, 1993; Berbee and Taylor, 2001; Lang et al., 2002). Molecular analyses have also demonstrated that some heterotrophic eukaryotes that have been classified as Fungi, such as the plasmodial and cellular slime molds and the water molds (Myxomycota, Dictyosteliomycota, and Oomycota, respectively), are outside of the group. At the same time, some unicellular eukaryotes previously classified among the "protists" have been shown to be Fungi, including Pneumocystis carinii, which is a serious pathogen of immunocompromised humans, and the Microsporidia, which are amitochondriate intracellular parasites of animals (Edman et al., 1988; Keeling, 2003). The exact phylogenetic placements of several fungal lineages, such as Microsporidia and Asellariales, are uncertain, though they are included in the Fungi in a recent classification by Cavalier-Smith (2001). Throughout this manuscript, the term "Fungi" refers to the monophyletic "true fungi" (also considered as a Kingdom of Eukaryota). In contrast, we use the more general term "fungi" to encompass all organisms traditionally studied by mycologists (i.e., true fungi, slime molds, water molds).
The major groups (phyla) that have traditionally been recognized within the true Fungi are the Chytridiomycota, Zygomycota, Ascomycota, and Basidiomycota. Molecular evidence suggests that the Chytridiomycota and Zygomycota are not monophyletic. Collectively, the Zygomycota and Chytridiomycota form a paraphyletic assemblage representing the earliest diverging lineages of Fungi. Chytridiomycota include unicellular or filamentous forms that produce flagellated cells at some point in the life cycle and which occur in aquatic and terrestrial habitats. It is plausible that the unicellular, flagellated, aquatic form is plesiomorphic in the Fungi as a whole, although the lack of resolution at the base of the fungal phylogeny makes it difficult to resolve this point. Traditionally, the Zygomycota comprise a diverse assemblage of taxa that include soil saprobes (Mucorales), symbionts of arthropods (Trichomycetes), and the widespread arbuscular mycorrhizae of plants (Glomerales; now recognized as a separate phylum Glomeromycota; Schüßler et al., 2001). They are primarily filamentous and lack flagella; the latter condition is also true for all Ascomycota and Basidiomycota. Therefore, understanding the pattern of relationships between Zygomycota and Chytridiomycota is important to resolving the number of losses of flagella and transitions to land in the evolution of Fungi.
The Ascomycota and Basidiomycota are generally resolved as monophyletic and are sister taxa (Bruns et al., 1992). Both feature the production of a dikaryotic (binucleate, functionally diploid) stage in the life cycle, albeit expressed to significantly different extents. The clade that contains these groups has been called the Dicaryomycota (Schaffer, 1975). Ascomycota and Basidiomycota display remarkable diversity in morphology and life cycles, ranging from single-celled yeast to extensive mycelial forms. The latter include the "humongous fungus" Armillaria gallica, which is a basidiomycete forest pathogen whose mycelial networks may occupy areas as great as 15 hectares, and which may live for 1000 years or more (Smith et al., 1992). The most complex life cycles in Fungi are those of the plant pathogenic rusts (Uredinales), which are basidiomycetes that may have two separate hosts and produce as many as five different kinds of sporulating structures during their life cycle. Many Ascomycota and Basidiomycota produce complex macroscopic fruiting bodies, such as gilled mushrooms, cup fungi, coral fungi, and other forms. Thus, Fungi represent an independent origin of true multicellularity in the eukaryotes.
Fungi play pivotal ecological roles in virtually all ecosystems. Saprotrophic Fungi are important in the cycling of nutrients, especially the carbon that is sequestered in wood and other plant tissues. Pathogenic and parasitic Fungi attack virtually all groups of organisms, including bacteria, plants, other Fungi, and animals, including humans. The economic impact of such Fungi is massive. Other Fungi function as mutualistic symbionts, including mycangial associates of insects, mycorrhizae, lichens, and endophytes. Through these symbioses, Fungi have enabled a diversity of other organisms to exploit novel habitats and resources. Indeed, the establishment of mycorrhizal associations may be a key factor that enabled plants to make the transition from aquatic to terrestrial habitats (Pirozynski and Malloch, 1975). Interest in the evolution of ecosystems (as well as historical biogeography) has fueled attempts to estimate the timing of appearance of the major fungal groups. Minimum age estimates are provided by a limited number of fossils, including spores of Glomerales (Glomeromycota) from the Ordovician (460 million years ago [mya]; Redecker et al., 2000), Chytridiomycota and Ascomycota (including lichens) from the Devonian (400 mya; Taylor et al., 1992, 1995, 1999), hyphae with clamp connections (which are diagnostic for Basidiomycota) from the Pennsylvanian (290 mya; Dennis, 1970), and fruiting bodies of Basidiomycota from the Cretaceous (Hibbett et al., 1995; Smith et al., 2004).
Fossils and other lines of evidence have been used for calibration purposes in molecular clock analyses aimed at providing absolute age estimates for the major fungal groups. Using genes for nuclear small subunit ribosomal RNA, Berbee and Taylor (2001) suggested that the earliest divergences in the Fungi occurred about 800 mya and the Ascomycota-Basidiomycota divergence occurred about 600 mya. In contrast, an analysis using multiple protein-coding genes in both Fungi and plants by Heckman et al. (2001) suggested that the Fungi originated as long as 1.5 billion years ago, and the Ascomycota-Basidiomycota divergence occurred about 1.2 billion years ago. Sanderson (2003; Sanderson et al., 2004, in this issue) performed an analysis of multiple plastid-encoded genes that suggested that the dates proposed by Heckman et al. (2001) for plant divergences may be too early. By extrapolation, this would also be true for the Fungi, but there has not been a corresponding reanalysis of the fungal age estimates.
One goal of the study presented here is to synthesize progress since 1990 in our continuing endeavor to reconstruct the fungal tree of life, and to analyze all available data for four of the five most commonly sequenced loci for the Fungi (nuclear small and large subunit ribosomal DNA [nucSSU rDNA, nucLSU rDNA], mitochondrial small subunit ribosomal DNA [mitSSU rDNA] and the second largest subunit of RNA polymerase II [RPB2]). A related objective of this study is to summarize and integrate current knowledge regarding fungal subcellular features within this new phylogenetic framework.
Molecular phylogenetic studies of the Fungi
Examination of fungal sequence data in GenBank for the five most commonly sequenced loci revealed that 21 075 ITS, 7990 nucSSU, 5373 nucLSU, 1991 mitSSU, and 349 RPB2 sequences were available as of early January 2004. As impressive as these numbers are in terms of our collective effort to generate DNA sequence data for the Fungi, none of these loci alone can resolve the fungal tree of life with a satisfactory level of phylogenetic confidence (Kurtzman and Robnett, 1998; Tehler et al., 2000; Berbee, 2001; Binder and Hibbett, 2002; Moncalvo et al., 2002; Tehler et al., 2003). Combining sequence data from multiple loci is an integral part of large-scale phylogenetic inference and is central to assembling the fungal tree of life. Therefore, the utility of existing data can be better described by assessing the taxonomic overlap among single-locus data sets. Among the 8025 sequences of nucSSU and 5442 sequences of nucLSU available for this project, 3279 and 2781, respectively, were from taxa for which only that locus had been sequenced. Of the remaining sequences, only 1010 represented taxa for which both nucSSU and nucLSU data were available. Of these species, 573 had sequence lengths, or overlap, >600 bp for both loci and were identified at the species level. Of these 573 taxa, mitSSU sequences were also available for 253 taxa, and RPB2 sequences were available for 161 taxa. NucSSU, nucLSU, mitSSU, and RPB2 sequences were available for 107 taxa. Despite the very large number of ITS sequences available in GenBank, the low degree of overlap with taxa sequenced for other loci is even more pronounced: only 145 taxa also were available for both nucSSU and nucLSU. In part, the lack of overlap between taxa sequenced for ITS and those sequenced for other loci reflects the generation of many ITS sequences from environmental PCR studies, where it is not possible with most of the current methods to obtain a second amplicon from the same individual or species, and from survey data in which species names are not assigned. The disparity between taxa sequenced for ITS vs. other loci also reflects the popularity of this locus for population-level and single locus, species-level studies.
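The overlap counts reported above amount to intersecting the sets of taxa available for each locus. The following is a minimal sketch of that bookkeeping, not the procedure actually used for this study; the function names and the toy taxon labels are hypothetical, and real input would come from parsed GenBank records.

```python
from itertools import combinations


def shared_taxa(taxa_by_locus, loci):
    """Return the taxa that have a sequence for every locus in `loci`."""
    sets = [taxa_by_locus[locus] for locus in loci]
    return set.intersection(*sets)


def overlap_table(taxa_by_locus):
    """Count shared taxa for every combination of two or more loci."""
    loci = sorted(taxa_by_locus)
    table = {}
    for size in range(2, len(loci) + 1):
        for combo in combinations(loci, size):
            table[combo] = len(shared_taxa(taxa_by_locus, combo))
    return table


# Toy data (hypothetical taxon labels, not the data sets of this study).
toy = {
    "nucSSU": {"A", "B", "C", "D"},
    "nucLSU": {"B", "C", "D", "E"},
    "mitSSU": {"C", "D", "F"},
    "RPB2": {"D", "G"},
}

two_locus = shared_taxa(toy, ["nucSSU", "nucLSU"])
four_locus = shared_taxa(toy, ["nucSSU", "nucLSU", "mitSSU", "RPB2"])
table = overlap_table(toy)
```

With real data, the two-locus intersection would correspond to the nucSSU + nucLSU taxon set, and the four-locus intersection to the taxa shared by all four data partitions.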
Together, these data suggest that most phylogenetic studies published to date have sought to maximize the number of fungal taxa by restricting their analyses to one locus. To quantify this observation, we surveyed 560 publications reporting fungal phylogenetic trees published from 1990 through 2003 (Fig. 1). Of the 595 trees considered in these studies, 489 (82.2%) were based on a single locus (Fig. 1A; see also Appendix 1, in Supplemental Data accompanying the online version of this article, for the complete list of papers used in this survey and the data extracted from each). Only 77 trees were based on two combined loci, 19 on three combined loci, and 10 on four or more combined loci (Appendix 1). Seven of the latter 10 studies were restricted to closely related species or strains within a species. Exceptions include Binder and Hibbett (2002), with 93 species representing most major clades of Homobasidiomycetes; Binder et al. (2001), with 15 species representing 10 orders; and Hibbett and Binder (2001), with 45 species representing nine orders.
To our knowledge, phylogenetic studies including members from all four traditionally recognized phyla of Fungi (Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota) and the Glomeromycota, based on at least two combined loci and explicitly directed toward resolving the fungal tree of life, have not yet been published (but see Keeling et al., 2000). Although much effort has been invested in defining orders (compared to families, for example), few studies have focused on resolving relationships among orders of Fungi: 354 of 595 trees examined (59.5%) conveyed relationships within single orders (Fig. 1C, bottom panel). The largest number of orders considered in a single study (N = 62) resulted in a tree based only on nucSSU data (Tehler et al., 2000; Fig. 1C, top panel). The fungal trees based on combined data from multiple loci and encompassing the largest number of orders included 38 species representing 25 orders (Bhattacharya et al., 2000), 52 species representing 20 orders (Lutzoni et al., 2001), and 108 species representing 19 orders (Miadlikowska and Lutzoni, 2004). All of these studies focused on ascomycetes and were based on nucSSU and nucLSU rDNA. A study by Keeling (2003) is exceptional, covering 16 orders of fungi (34 species) using a combined analysis of two protein-coding genes (α- and β-tubulin) to infer the phylogenetic placement of Microsporidia.
In part due to the recent proliferation of studies restricted to taxa within single orders, the mean number of orders per tree was significantly lower in studies published in 2001–2003 compared to those published in 1993–1995. Accordingly, there does not seem to be a correlation between improvements in technologies and progress toward resolving the deepest nodes in the fungal tree of life, reflecting the slow accumulation of studies combining multiple data partitions, multiple orders, and large numbers of species. This points to a lack of coordination in the past among mycology laboratories when sequencing different loci and various groups of fungi. As demonstrated by the results presented here, the recently funded (NSF) "Deep Hypha" coordination network and Assembling the Fungal Tree of Life (AFTOL) project have already contributed toward a more united effort in the choice of loci and taxa that are appropriate for small- and large-scale phylogenetic studies. However, the lack of overlap among existing data partitions just described also results from the fact that most phylogenetic studies have focused on closely related species. Many loci have been used by mycologists for evolutionary studies at that level, but few of these loci are appropriate to resolve relationships among the main lineages of the Fungi.
Even when trees are inferred using multiple loci, the phylogenetic signal may be limited strongly by the loci selected. Our survey data indicate that more than 83.9% of fungal phylogenies are based exclusively on sequences from the ribosomal RNA tandem repeats. The few protein-coding genes that have been sequenced for phylogenetic studies of fungi (e.g., RPB2; Liu et al., 1999) have demonstrated clearly that such genes can contribute greatly to resolving deep phylogenetic relationships with high support and/or increase support for topologies inferred using ribosomal RNA genes. To our knowledge, Matheny (2004), Reeb et al. (2004), and Wang et al. (2004) are the only studies to combine RPB2 with other loci for inferring fungal relationships. In general, the use of protein-coding genes remains rare in fungal studies (but see Nam et al., 1997; Geiser et al., 1998; Kretzer and Bruns, 1999; Thon and Royse, 1999; Yun et al., 1999; Craven et al., 2001; Landvik et al., 2001; O'Donnell et al., 2001; Matheny et al., 2002; Myllys et al., 2002; Thell et al., 2002; Keeling, 2003; Liu and Hall, 2004; Tanabe et al., 2004). There is therefore a great need for housekeeping protein-coding genes to be sequenced and combined with other loci to assemble the fungal tree of life.
Fungal subcellular characters
Phylogenetic application of subcellular data in the Fungi became important in the early 1960s (Bracker, 1967), and improved chemical fixation techniques led to a subsequent outpouring of data (Beckett et al., 1974; Fuller, 1976). Since that time, continued improvements in cell preservation, especially freeze substitution (Hoch, 1986) and cytochemical analyses (Beckett, 1981; Read and Beckett, 1996; Müller et al., 1998), have made assessments of structural characters, such as membrane changes during nuclear division, reliable as phylogenetic markers. Nevertheless, structural aspects of fungal cells remain very incompletely known, as indicated by recent discoveries of new types of septa (Adams et al., 1995; Bauer et al., 1995), haustoria (Bauer et al., 1997), and nuclear division (Swann et al., 1999). Molecular sequence data are providing a clearer understanding of the diversity of the Fungi and of the many gaps in our knowledge of subcellular structure in unstudied and understudied groups. The phylogenetic significance of subcellular structure can be difficult to determine in the absence of an independent data set (Berbee and Taylor, 1995; McLaughlin et al., 1995a); however, guidance for their phylogenetic interpretation can be obtained from sequence data.
In conjunction with biochemical data (Bartnicki-Garcia, 1970, 1987), subcellular characters have provided insight into the phylum-level relationships of the Fungi and were used to distinguish Fungi from other organisms with fungal lifestyles before molecular sequence data were available. Biosynthetic pathways and cell wall composition not only separated Oomycota, Hyphochytriomycota, and Plasmodiophoromycota from the Chytridiomycota, but also supported modern phylum-level subdivision of the Fungi (Bartnicki-Garcia, 1970, 1987). Similarly, organization of the transition zone of the flagellar apparatus (i.e., the region lying between the flagellum proper and the kinetosome; Barr, 1992) and of the flagella rootlets (i.e., the microtubules and microfibrils associated with the kinetosome; Barr, 1981) clearly separates Chytridiomycota from other fungal groups with motile cells (Oomycota, Hyphochytriomycota, and Plasmodiophoromycota) that are more closely related to heterokont algae or other protists (Braselton, 2001; Cavalier-Smith, 2001; Dick, 2001; Fuller, 2001). Within the Chytridiomycota, the great diversity in flagella rootlet organization may indicate that this is a fungal group that diverged early during fungal evolution (Barr, 1981, 2001). These characters, combined with the arrangement of other cellular components of motile cells, such as the microbody–lipid-globule complex (Powell, 1978), identify clades and orders within the phylum (Barr, 2001) and agree with subsequent molecular phylogenetic analysis (James et al., 2000).
Spindle pole body (SPB, an organelle that organizes microtubules during nuclear division; Alexopoulos et al., 1996) and nuclear division characters are diverse within the Fungi (Heath, 1980, 1986; McLaughlin et al., 1995b). In Chytridiomycota, centrioles are associated with SPBs. Except in Basidiobolus, which has a centriole-like structure (McKerracher and Heath, 1985), centrioles are absent from fungi that lack flagella. In the latter, SPB forms and behaviors typically become more elaborate. Nuclear division characters, including nuclear envelope changes, SPB–nuclear-envelope interactions, and chromatin and nucleolus behavior, along with SPB characters, have been used in phylogenetic analyses (Heath, 1986; Tehler, 1988; McLaughlin et al., 1995a; Swann et al., 1999), but the incompleteness of the data and problems with some earlier phylogenetic analyses (McLaughlin et al., 1995a) indicate the need for better and more complete data sets.
With the loss of motile cells, alternative methods of spore release evolved in Fungi (Alexopoulos et al., 1996; Cavalier-Smith, 2001). Sporangiospores and zygospores, both of which are internally formed, were retained in most Zygomycota (Alexopoulos et al., 1996; Benny et al., 2001). New mechanisms for conidium and meiospore formation and ballistosporic discharge have evolved in the Ascomycota and Basidiomycota. The substructure of the ascus wall, especially the ascus apex, has systematic value at higher taxonomic levels; however, dehiscence mechanisms are ecologically adaptive and probably of more restricted taxonomic significance (Bellemère, 1994). In the Basidiomycota, considerable progress has been made in understanding the ballistosporic discharge mechanism with its characteristic droplet (Money, 1998), but structural variations in basidiospore development and the hilar appendix (a small projection at the basidiospore base associated with droplet formation; McLaughlin et al., 1985; Yoon and McLaughlin, 1986; Miller, 1988) are still too incompletely studied to assess their potential for phylogenetic analysis. The diversity of meiospore and meiosporangium characters and specialized cell types (e.g., sterile cells such as paraphyses and cystidia) are likely to be of systematic utility at lower taxonomic levels within these phyla (McLaughlin, 1982; Bellemère, 1994; Clémençon, 1997; Pfister and Kimbrough, 2001).
Yeasts are derived from filamentous taxa in three phyla (Benny et al., 2001; Fell et al., 2001; Kurtzman and Sugiyama, 2001). Ascomycetous and basidiomycetous yeasts may be differentiated using a number of phenotypic and molecular traits (Fell et al., 2001). In terms of cell division, these two phyla have been separated based on whether mitosis is initiated in the bud or parent, but both types of mitosis occur in basidiomycetous yeasts. However, other mitotic characters also separate these phyla (Frieders and McLaughlin, 1996; McLaughlin et al., 2004).
The subcellular structure of the septal pores has developmental and systematic significance but varies within major groups (Bracker, 1967; Beckett et al., 1974; McLaughlin et al., 2001). At the phylum level, Ascomycota generally have been thought to be separable from Basidiomycota based on differences in the uniperforate septal pore apparatus, but the possibility that a septal type may be plesiomorphic for these phyla has not been resolved.
Objectives
Despite the numerous technological advancements available to fungal systematists, progress in understanding the deepest nodes in the fungal tree of life will be limited without a new approach to conducting large-scale multilocus phylogenetic studies and phenotype-based comparative studies on Fungi. This novel approach will require concerted data acquisition by focusing sequencing efforts on specific loci and fungal taxa, by conducting phenotypic studies on specific fungal traits, by improving interaction among fungal systematists, and by the automation of data acquisition and analysis coupled with databases accessible through the World Wide Web. These goals form the framework of AFTOL, which seeks to infer the phylogenetic relationships among 1500 species representing all fungal phyla based on eight loci (~10 kb). Here, we report phylogenetic studies for the maximal number of species across all known fungal phyla for which DNA sequence data from two, three, and four loci are available. The resulting phylogenetic trees are based on sequences available in GenBank and unpublished sequences generated by various laboratories or by the AFTOL project. We then assess current knowledge regarding the evolution and potential phylogenetic signal of septal characters in Fungi.
| MATERIALS AND METHODS |
|---|
nucSSU + nucLSU + mitSSU
MitSSU sequences for 105 taxa were obtained from the AFTOL project. For each of the remaining taxa not available directly from AFTOL but present in the combined nucSSU + nucLSU data set, we queried GenBank for mitSSU using the EPU. One hundred forty-eight taxa were retrieved, such that the final nucSSU + nucLSU + mitSSU data set consisted of 253 unique taxa. In contrast to the nucSSU + nucLSU data set, sequences from these three loci were not available for any Chytridiomycota, Zygomycota, or Glomeromycota.
nucSSU + nucLSU + RPB2
RPB2 sequences for 19 taxa were obtained from the AFTOL project and laboratories associated with this study. We queried GenBank using the EPU for RPB2 data for each of the remaining taxa present in the combined nucSSU + nucLSU data set, but not available from AFTOL. One hundred forty-two taxa were retrieved from GenBank, such that the nucSSU + nucLSU + RPB2 data set consisted of 161 taxa. Because sequences from these three loci were not available for taxa outside the Ascomycota and Basidiomycota, analyses were restricted to members of these two phyla.
nucSSU + nucLSU + mitSSU + RPB2
Taxa common to the three preceding data sets were combined, resulting in 107 unique taxa representing only the Ascomycota and Basidiomycota.
Sources of sequences
Voucher information and GenBank accession numbers for the new sequences deposited in GenBank as part of this study have been archived in Supplemental Data (Appendix 2) accompanying the online version of this article. Appendix 2 also contains GenBank identification numbers for all sequences used in our analyses, as well as accession numbers and general information for sequences obtained from genome centers (Duke Center for Genome Technology, Stanford Genome Technology Center, and The Institute for Genomic Research).
Molecular data
From a total of 1533 sequences included in this study, 283 (18%) are published here for the first time. Laboratory protocols used to generate these new sequences can be found in Hopple and Vilgalys (1999), Reeb et al. (2004), Schmitt et al. (2003), Sung et al. (2001), and Hofstetter et al. (2002). The four regions targeted for this study were ~1.0 kb at the 5' end of the nucSSU (NS17-nssu1088), ~1.4 kb at the 5' end of the nucLSU (LROR-LR7), ~0.8 kb from universally conserved regions U2–U6 that form the minimal core secondary structure of mitSSU (Cummings et al., 1989; Zoller et al., 1999), and ~2.1 kb from conserved regions 5–11 of RPB2 (Liu et al., 1999; Reeb et al., 2004). Most primers used in this study can be found at these websites: http://www.biology.duke.edu/fungi/mycolab/primers.htm, http://www.lutzonilab.net/pages/primer.shtml, http://faculty.washington.edu/benhall/, http://plantbio.berkeley.edu/~bruns/primers.html, and http://ocid.nacse.org/research/aftol. Most sequences were subjected to BLAST searches for a first verification of their identities. They were assembled using Sequencher 4.1 (Gene Codes Corporation, Ann Arbor, Michigan, USA) and aligned manually with MacClade 4.06 (Maddison and Maddison, 2001) and SeaView (Galtier et al., 1996). Alignments of nucSSU, nucLSU, and mitSSU rDNA sequences and delimitation of ambiguously aligned regions were done according to Lutzoni et al. (2000) and Reeb et al. (2004) using the secondary structure model (Kjer, 1995) of Saccharomyces cerevisiae (U53879, V00704, X07799, X07800, X14966) provided by Cannone et al. (2002) on the Comparative RNA Web Site (http://www.rna.icmb.utexas.edu/). The protein-coding gene RPB2 was aligned with MacClade using the option "nucleotides with amino acid colors" to facilitate manual alignment. Ambiguously aligned regions were delimited manually (Lutzoni et al., 2000), taking into account the exchangeability of protein residues according to their chemical properties (Grantham, 1974). Sequences obtained from GenBank that could not be successfully aligned (i.e., those of doubtful homology or sequences that have diverged so much that they were virtually not alignable) were removed from the alignment (Appendix 3; see Supplemental Data accompanying the online version of this article).
Phylogenetic analyses
Bayesian Metropolis coupled Markov chain Monte Carlo (B-MCMCMC) analyses were conducted with MrBayes v3.0b4 (Huelsenbeck and Ronquist, 2001). All B-MCMCMC analyses were conducted using four chains, and a gamma distribution, if applicable, was approximated with four categories. In addition to posterior probabilities (PP), phylogenetic confidence was estimated with weighted maximum parsimony bootstrap proportions (MPBP), neighbor joining bootstrap proportions (NJBP) with maximum likelihood (ML) distance implemented using PAUP* 4.0b10 (Swofford, 2002), and by analyzing bootstrapped data sets with B-MCMCMC (i.e., Bayesian bootstrap proportions, BBP; Douady et al., 2003). Step matrices for weighted parsimony analyses were generated using stepmatrix.py (written by F. Kauff and available upon request from FK or FL) as outlined in Gaya et al. (2003). Uninformative characters were excluded from all bootstrapped data sets analyzed with MP. Parsimony ratchet search strategies (PAUPRat; Nixon, 1999; Sikes and Lewis, 2001, http://www.ucalgary.ca/~dsikes/software2.htm) were implemented in PAUP*. Bootstrapped data sets subjected to B-MCMCMC analyses were generated with P4 0.78 (Foster, 2003). For each data partition and for the combined data set, a hierarchical likelihood ratio test (Modeltest 3.06; Posada and Crandall, 1998) was used to determine the appropriate model (nucleotide substitution and rate heterogeneity parameters). For each NJ analysis, parameter values were fixed to the optimal values calculated for the optimal model. For the RPB2 data set, each codon position was subjected to a separate model in the B-MCMCMC analysis.
Following the recommendation in Reeb et al. (2004), we used NJBP (500 replicates) to detect topological conflicts among data partitions. A conflict was assumed to be significant if two different relationships (one monophyletic, the other nonmonophyletic) for the same set of taxa were both supported with bootstrap values ≥70% (Mason-Gamer and Kellogg, 1996). The program compat.py (written by F. Kauff and available upon request from FK or FL) was used to detect such topological incongruences. Taxa causing conflicts were removed (Appendix 3), and the test was reimplemented until no conflicts were detected. Each locus in the combined data sets was subjected to this incongruence test for all possible pairwise comparisons prior to inclusion.
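The 70% criterion can be illustrated with a small sketch that compares the well-supported clades of two single-locus bootstrap trees. This is not compat.py, whose implementation is not described here; it assumes rooted trees summarized as mappings from clades (sets of taxon labels) to bootstrap percentages, and the function names and toy values are hypothetical.

```python
def incompatible(clade_a, clade_b):
    """Two clades conflict if they overlap without one containing the other."""
    shared = clade_a & clade_b
    return bool(shared) and shared != clade_a and shared != clade_b


def find_conflicts(support_1, support_2, threshold=70.0):
    """List pairs of clades, one per tree, that conflict at >= threshold."""
    strong_1 = [c for c, s in support_1.items() if s >= threshold]
    strong_2 = [c for c, s in support_2.items() if s >= threshold]
    return [(a, b) for a in strong_1 for b in strong_2 if incompatible(a, b)]


# Toy bootstrap support (hypothetical) for two data partitions.
nucssu = {frozenset("AB"): 95.0, frozenset("ABC"): 80.0, frozenset("DE"): 60.0}
nuclsu = {frozenset("BC"): 88.0, frozenset("DE"): 75.0, frozenset("AD"): 50.0}

conflicts = find_conflicts(nucssu, nuclsu)
```

In this toy case the clade (A, B) from the first partition and the clade (B, C) from the second are both supported at or above 70% and cannot coexist in one tree, so the pair is reported. Deciding which taxon to remove is left to the analyst.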
Due to the poor level of resolution and support, single-gene trees are not presented here. The gene combinations (nucSSU + nucLSU, nucSSU + nucLSU + mitSSU, nucSSU + nucLSU + RPB2, and nucSSU + nucLSU + mitSSU + RPB2) were chosen to maximize the number of species, coverage of fungal diversity, as well as phylogenetic resolution and confidence. Because of the large size of the trees presented here and the amount of information associated with each tree, phylograms are only presented as archived supplementary material accompanying the online version of this article (see Appendices 4–6). For these three phylograms, lengths for each branch were averaged over all trees in the Bayesian posterior probability distribution after removal of the "burn-in phase" (sumt option in MrBayes v3.0b4).
nucSSU + nucLSU
Of 573 taxa, 15 had conflicting phylogenetic placements when the nucSSU and nucLSU NJ bootstrap trees were compared. Consequently, these species were excluded from further analyses (Appendix 3). The combined data set for the remaining 558 species was subjected to B-MCMCMC and NJ bootstrap analyses. For the B-MCMCMC analysis, we started six independent runs for 10 000 000 generations, sampling every 500th generation with starting trees obtained by randomly resolving dichotomies in the six best trees found by a weighted MP ratchet analysis with 200 iterations using PAUPRat. For both data partitions (nucLSU and nucSSU), we used a six-parameter model for the nucleotide substitution (GTR; Rodríguez et al., 1990) with a gamma shape distribution. A proportion of sites was assumed to be invariable. In the nucLSU partition, nucleotide frequencies were set to be equal. After verifying that all runs had converged on the same average likelihood level, the last 4000 trees (2 000 000 generations) of each run were used to calculate a 50% majority-rule consensus tree using PAUP* (Fig. 2). The NJ bootstrap was performed with 1000 replicates using ML distances, implementing a six-parameter model for the nucleotide substitution (GTR) with equal base frequencies, gamma shape distribution, and a proportion of sites assumed to be invariable.
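The sampling arithmetic and the 50% majority-rule summary can be sketched in Python as follows. This is an illustration only, not the PAUP* or MrBayes procedure: it assumes that each sampled tree is represented as a set of rooted clades, it ignores the sample of the starting tree, and the pooled count of 24 000 trees is implied by the text (six runs of 4000 retained trees each) rather than stated. The function names and toy taxa are hypothetical.

```python
from collections import Counter


def trees_sampled(generations, sample_every):
    """Number of trees saved when one tree is kept every `sample_every`
    generations (the starting-tree sample is ignored for simplicity)."""
    return generations // sample_every


def majority_rule_clades(trees, cutoff=0.5):
    """Return {clade: frequency} for clades present in more than `cutoff`
    of the trees. Each tree is a set of clades (frozensets of taxa)."""
    counts = Counter(clade for tree in trees for clade in tree)
    n_trees = len(trees)
    return {c: k / n_trees for c, k in counts.items() if k / n_trees > cutoff}


# Values taken from the nucSSU + nucLSU analysis described above.
per_run = trees_sampled(10_000_000, 500)      # trees sampled in each run
kept_per_run = 4000                           # last trees retained per run
burn_in_per_run = per_run - kept_per_run      # trees discarded per run
generations_kept = kept_per_run * 500         # generations covered by retained trees
pooled = 6 * kept_per_run                     # retained trees over six runs (implied)

# Toy posterior sample of five trees on taxa A, B, C (hypothetical).
AB, BC, AC, ABC = (frozenset(s) for s in ("AB", "BC", "AC", "ABC"))
toy_sample = [{AB, ABC}] * 3 + [{BC, ABC}, {AC, ABC}]
consensus = majority_rule_clades(toy_sample)
```

The frequency attached to each retained clade is the proportion of sampled trees containing it, which is how a posterior probability is read from a majority-rule consensus of the post-burn-in sample.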
nucSSU + nucLSU + RPB2
Phylogenetic positions were incongruent among data partitions for four of the 161 taxa for which these sequence data were available (Appendix 3). This three-locus data set for the remaining 157 species was subjected to B-MCMCMC, NJ bootstrap, and MP bootstrap analyses. For the B-MCMCMC analysis, we ran six independent analyses of 5 000 000 generations, sampling every 500th generation, with random starting trees. For each of the five data partitions (nucLSU, nucSSU, RPB2 1st, 2nd, 3rd position), we applied a six-parameter model for the nucleotide substitution (GTR) with a gamma shape distribution and a proportion of sites assumed to be invariable. Among the nucLSU and nucSSU partitions, the nucleotide frequencies for the nucSSU were assumed to be equal. Five of the six initial runs converged at the same average likelihood level, and after discarding the specific burn-in for each of these five runs, we used a total of 20 000 trees to calculate a 50% majority-rule consensus tree using PAUP* (Fig. 4). The NJ bootstrap was performed with 1000 replicates using ML distances with a six-parameter model (GTR) for the nucleotide substitution, with unequal base frequencies, a gamma shape distribution, and a proportion of sites assumed to be invariable. For weighted MP bootstrap analyses, we analyzed 115 bootstrap replicates with 500 random addition sequences (RAS) per bootstrap replicate. This estimate of 500 RAS was based on the minimum number of RAS (of 1000) needed to find the most parsimonious tree(s) in the weighted MP search on the original data set. To this number, we added more RAS (up to 500) to maximize the probability of finding the most parsimonious tree(s) when analyzing bootstrapped data sets.
| RESULTS |
|---|
nucSSU + nucLSU
Of 1838 characters included in the phylogenetic analyses of this combined data set, 442 were constant (180 nucSSU sites and 262 nucLSU) and 1396 were variable (742 nucSSU sites and 654 nucLSU). A total of 1073 were potentially parsimony informative (561 nucSSU and 512 nucLSU characters).
nucSSU + nucLSU + mitSSU
Of 2173 characters included in phylogenetic analyses of this combined data set, 968 were constant (450 nucSSU, 448 nucLSU, and 70 mitSSU sites) and 1205 were variable (472 nucSSU, 468 nucLSU, and 265 mitSSU sites). A total of 830 sites were potentially parsimony informative (298 nucSSU characters, 329 nucLSU characters, and 203 mitSSU characters).
nucSSU + nucLSU + RPB2
Of 3632 characters included in phylogenetic analyses of this data set, 1459 were constant (469 nucSSU, 486 nucLSU and 504 RPB2 sites) and 2173 were variable (453 nucSSU, 430 nucLSU, and 1290 RPB2). A total of 1748 characters were potentially parsimony informative (296 nucSSU, 322 nucLSU and 1130 RPB2).
nucSSU + nucLSU + mitSSU + RPB2
Of 3967 characters included in phylogenetic analyses of this combined data set, 1756 were constant (555 nucSSU, 529 nucLSU, 103 mitSSU, and 569 RPB2 sites) and 2211 were variable (367 nucSSU, 387 nucLSU, 232 mitSSU, and 1225 RPB2 sites). A total of 1574 sites were potentially parsimony informative (196 nucSSU, 260 nucLSU, 183 mitSSU, and 935 RPB2 characters).
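The character counts reported for the four combined data sets can be cross-checked with a short Python sketch. The numbers are transcribed from the text above; the data layout and the function name are illustrative assumptions. Per-partition values are listed in the order nucSSU, nucLSU, then mitSSU and RPB2 where present.

```python
# Each entry: stated total, then (stated subtotal, per-partition counts).
DATA_SETS = {
    "nucSSU + nucLSU": {
        "total": 1838,
        "constant": (442, [180, 262]),
        "variable": (1396, [742, 654]),
        "informative": (1073, [561, 512]),
    },
    "nucSSU + nucLSU + mitSSU": {
        "total": 2173,
        "constant": (968, [450, 448, 70]),
        "variable": (1205, [472, 468, 265]),
        "informative": (830, [298, 329, 203]),
    },
    "nucSSU + nucLSU + RPB2": {
        "total": 3632,
        "constant": (1459, [469, 486, 504]),
        "variable": (2173, [453, 430, 1290]),
        "informative": (1748, [296, 322, 1130]),
    },
    "nucSSU + nucLSU + mitSSU + RPB2": {
        "total": 3967,
        "constant": (1756, [555, 529, 103, 569]),
        "variable": (2211, [367, 387, 232, 1225]),
        "informative": (1574, [196, 260, 183, 935]),
    },
}


def inconsistencies(data_sets):
    """Return a list of messages for every tally that does not add up."""
    problems = []
    for name, d in data_sets.items():
        for kind in ("constant", "variable", "informative"):
            stated, parts = d[kind]
            if sum(parts) != stated:
                problems.append(f"{name}: {kind} parts sum to {sum(parts)}, not {stated}")
        if d["constant"][0] + d["variable"][0] != d["total"]:
            problems.append(f"{name}: constant + variable != total")
        if d["informative"][0] > d["variable"][0]:
            problems.append(f"{name}: more informative than variable characters")
    return problems


problems = inconsistencies(DATA_SETS)
```

An empty `problems` list indicates that, for each data set, the per-locus counts sum to the stated subtotals and that constant plus variable characters equal the total number of characters.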
Interpretation of support values
Posterior probabilities provide complementary information to bootstrap proportions (Alfaro et al., 2003; Douady et al., 2003; Reeb et al., 2004). Bayesian MCMC methods are more efficient in recovering accurate support values (i.e., require fewer data to converge on the correct answer relative to parsimony and NJ nonparametric bootstrap [Alfaro et al., 2003; Wilcox et al., 2002; Hillis et al., 1994]), but high posterior probabilities can be obtained for wrong topological bipartitions with current programs implementing Bayesian MCMC, especially when internodes are very short (Alfaro et al., 2003; Buckley et al., 2002; Douady et al., 2003