(American Journal of Botany. 2004;91:1846-1862.)
© 2004 Botanical Society of America, Inc.


Systematics

A phylogeny of legumes (Leguminosae) based on analysis of the plastid matK gene resolves many well-supported subclades within the family1

Martin F. Wojciechowski2,5, Matt Lavin3 and Michael J. Sanderson4

2School of Life Sciences, Arizona State University, Tempe, Arizona 85287-4501 USA; 3Department of Plant Sciences, Montana State University, Bozeman, Montana 59717 USA; 4Section of Evolution and Ecology, University of California, Davis, California 95616 USA

Received for publication December 18, 2003. Accepted for publication June 10, 2004.


    ABSTRACT
 
Phylogenetic analysis of 330 plastid matK gene sequences, representing 235 genera from 37 of 39 tribes, and four outgroup taxa from eurosids I supports many well-resolved subclades within the Leguminosae. These results are generally consistent with those derived from other plastid sequence data (rbcL and trnL), but show greater resolution and clade support overall. In particular, the monophyly of subfamily Papilionoideae and at least seven major subclades are well-supported by bootstrap and Bayesian credibility values. These subclades are informally recognized as the Cladrastis clade, genistoid sensu lato, dalbergioid sensu lato, mirbelioid, millettioid, and robinioid clades, and the inverted-repeat-lacking clade (IRLC). The genistoid clade is expanded to include genera such as Poecilanthe, Cyclolobium, Bowdichia, and Diplotropis and thus contains the vast majority of papilionoids known to produce quinolizidine alkaloids. The dalbergioid clade is expanded to include the tribe Amorpheae. The mirbelioids include the tribes Bossiaeeae and Mirbelieae, with Hypocalypteae as its sister group. The millettioids comprise two major subclades that roughly correspond to the tribes Millettieae and Phaseoleae and represent the only major papilionoid clade marked by a macromorphological apomorphy, pseudoracemose inflorescences. The robinioids are expanded to include Sesbania and members of the tribe Loteae. The IRLC, the most species-rich subclade, is sister to the robinioids. Analysis of the matK data consistently resolves but modestly supports a clade comprising papilionoid taxa that accumulate canavanine in the seeds. This suggests a single origin for the biosynthesis of this most commonly produced of the nonprotein amino acids in legumes.

Key Words: caesalpinioid legumes • Leguminosae • matK • mimosoid legumes • papilionoid legumes • phylogeny


    INTRODUCTION
 
The legume family is the third largest family of angiosperms (Mabberley, 1997) with approximately 730 genera and over 19 400 species worldwide (Lewis et al., in press). Legumes are second only to Poaceae (the grasses) in agricultural and economic importance. The family includes horticultural varieties and many species harvested as crops and for oils, fiber, fuel, timber, medicines, and chemicals. Ranging in habit from large trees to annual herbs, the family is well represented throughout temperate and tropical regions of the world (Rundel, 1989). The Leguminosae is particularly diverse, however, in tropical forests with a seasonally dry aspect and temperate shrublands tailored by xeric climates. Legumes are noticeably absent to poorly represented in mesic temperate habitats, including many arctic and alpine regions and the understory of cool temperate forests. The predilection of legumes for semi-arid to arid habitats is related to a nitrogen-demanding metabolism, which is thought to be an adaptation to climatically variable or unpredictable habitats whereby leaves can be produced economically and opportunistically (McKey, 1994). Indeed, nitrogen fixation via root-nodulating symbiotic bacteria is just one of several ways (in addition to associations with arbuscular mycorrhizae, ectomycorrhizae, and uptake of inorganic nitrogen compounds) in which legumes obtain high levels of nitrogen to meet the demands of their metabolism (Sprent, 1994, 2001). All legumes play an important role in the terrestrial nitrogen cycle regardless of whether they form root nodules (Sprent, 2001). Considered to be a tropical family with perhaps a late Cretaceous origin (65–70 Mya), the Leguminosae has an abundant and continuous fossil record since the Tertiary (Crepet and Taylor, 1985, 1986; Crepet and Herendeen, 1992; Herendeen et al., 1992). The occurrence of diverse assemblages of taxa representing all three subfamilies at multiple localities dating from the middle to upper Eocene, especially the Mississippi Embayment of southeastern North America, suggests that most major lineages of woody legumes (except for the tribe Cercideae) were present and had diversified extensively by this time (Herendeen et al., 1992).

Reconstructing the phylogenetic relationships of the Leguminosae is essential for understanding the origin and diversification of this ecologically and economically important family of angiosperms. Comprehensive phylogenetic analyses of Leguminosae began with the plastid gene rbcL (Doyle, 1995; Käss and Wink, 1995, 1996; Doyle et al., 1997) following the early, widespread use of this gene for phylogenetic studies of land plant relationships (e.g., Chase et al., 1993). Several conclusions emerged. First, the monophyly of the Fabales (sensu Angiosperm Phylogeny Group, 2003) and the sister relationship of legumes to Polygalaceae, Surianaceae, and the rosaceous genus Quillaja Molina were very strongly supported (Doyle et al., 2000). Second, the monophyly of Leguminosae is consistently resolved although not as strongly as for the Fabales (Doyle et al., 2000; Kajita et al., 2001). Third, while monophyly of mimosoid legumes (subfamily Mimosoideae) is well supported by the rbcL data (Doyle et al., 2000), a more extensive sampling of the subfamily suggested certain mimosoid genera, Dinizia Ducke and Piptadeniastrum Brenan, were unresolved with respect to related caesalpinioid outgroups (Luckow et al., 2000). Fourth, the subfamily Caesalpinioideae (caesalpinioids) is consistently resolved as paraphyletic with respect to mimosoids and papilionoids (e.g., Polhill et al., 1981; Doyle et al., 2000; Kajita et al., 2001), although several well-supported subclades have been detected in recent studies of this subfamily; for example, the tribe Cercideae, resolved as the sister clade to the rest of the family (Doyle et al., 2000), the tribe Detarieae sensu lato (s.l.), distributed principally in tropical Africa and including approximately half of the genera in the Caesalpinioideae (Bruneau et al., 2001; Herendeen et al., 2003a), and the "Umtiza" clade (Herendeen et al., 2003b). Lastly, the traditionally recognized subfamily Papilionoideae (sensu Polhill, 1981a, 1994) is consistently resolved as monophyletic, albeit with only modest support (e.g., Doyle et al., 1997; Kajita et al., 2001).

The Papilionoideae has received the most attention, if only because it is the largest and most widespread of the three legume subfamilies with an estimated 476 genera and 13 860 species (Lewis et al., in press). Papilionoids traditionally have been diagnosed by traits that now are considered synapomorphies of the subfamily. These include wood with predominantly paratracheal axial parenchyma that is usually storied; vessels with alternate vestured pits and simple perforation plates; absence of bipinnate leaves; unidirectional initiation of sepals, petals, and stamens; clawed petals; and a seed testa with a hilar valve and no pleurogram (Polhill, 1981a; Tucker, 1987a, 2002; Tucker and Douglas, 1994; Chappill, 1995; Gasson, 2000). These many distinctions have sometimes resulted in papilionoids being ranked at the familial level (e.g., Hutchinson, 1964; Takhtajan, 1969). Moreover, support for the monophyly of Papilionoideae has not changed with family-wide molecular phylogenetic analyses involving the plastid rbcL locus (Käss and Wink, 1995, 1996, 1997; Doyle et al., 1997, 2000; Kajita et al., 2001) or trnL intron (Pennington et al., 2001).

Despite insights gained into the higher-level relationships of the family from studies of the rbcL locus, and to a lesser extent the trnL-F region, many issues in legume phylogeny remain unresolved (reviewed in Wojciechowski, 2003). This is particularly true for the relationships within the caesalpinioid and mimosoid subgroups and among some of the major papilionoid clades, the genistoids, dalbergioids, millettioids-phaseoloids, and Hologalegina (e.g., Crisp et al., 2000; Hu et al., 2000; Wojciechowski et al., 2000; Lavin et al., 2001). More variable nucleotide sequences are needed to improve the resolution of and support for the major clades within legumes. The most promising is the plastid gene matK, which has been shown by several recent studies on different papilionoid subgroups to provide excellent resolution among closely related genera (Hu et al., 2000; Lavin et al., 2001, 2003; Steele and Wojciechowski, 2003). Here we draw on these recent studies and a large number of new matK sequences as part of a more extensive phylogenetic analysis of the family, with particular emphasis on the major clades of the Papilionoideae.


    MATERIALS AND METHODS
 
Taxon sampling
Complete matK gene sequences from 330 taxa were included in this study, representing 235 genera of legumes as recognized by Polhill (1994) and four outgroup taxa from Fabales (Polygala, Suriana, Quillaja) and Rosales (Vauquelinia). Sampling was most extensive in papilionoids (Papilionoideae), including representatives from 29 of the 30 tribes and 179 of the 451 genera. In contrast, 28 of 151 genera (four of four tribes) of caesalpinioids (Caesalpinioideae) and 28 of 76 genera (four of five tribes) of mimosoids (Mimosoideae) were sampled. Representatives of segregate genera Cercidium Tul. (Parkinsonia), Brachypterum Benth. and Paraderris (Miq.) R. Geesink (Derris), Poissonia Baill. (Coursetia), Philenoptera Benth. (Lonchocarpus), Calia Teran & Berland (Sophora), and the newly described Maraniona (Hughes et al., 2004) were included. This study samples extensively in traditionally circumscribed tribes Aeschynomeneae (17/26 genera), Dalbergieae (15/17), all genera of Amorpheae (8), Robinieae (12), most genera of Trifolieae (6/7), and Vicieae (4/5). Representatives of only two monogeneric tribes, Mimozygantheae Burkart (Mimosoideae) and Euchresteae (Nakai) Ohashi (Papilionoideae), were not sampled for this analysis. Appropriate outgroup taxa from Polygalaceae, Surianaceae, Quillajaceae, and Rosaceae were chosen based on results of recent molecular phylogenetic studies of eurosids using rbcL-atpB-18S nuclear ribosomal DNA (Soltis et al., 2000), rbcL alone (Soltis et al., 1995; Kajita et al., 2001), matK (Steele et al., 2000), and the trnL-F region (Persson, 2001).

Sequences from 140 taxa are formally reported here for the first time, complete with voucher specimen and database accession information, although a few of them have been used in part for phylogenetic analyses presented previously (Wojciechowski et al., 2000). Papers by Hu et al. (2000), Lavin et al. (2001, 2003), Luckow et al. (2003), Miller et al. (2003), McMahon and Hufford (2004), Steele and Wojciechowski (2003), Riley-Hulting et al. (2004), and Thulin et al. (in press) provide sampling information for approximately 190 matK sequences from subgroups of the taxa included here and should be consulted for more details. The sources of plant material and GenBank accession numbers for matK sequences from all taxa included in this paper are provided in the Appendix (see Supplemental Data accompanying the online version of this article).

DNA sequence data
The data presented here were gathered in our laboratories using similar methods. Genomic DNAs were isolated from field-collected, greenhouse-grown plants, silica-dried and herbarium material using the procedure of Doyle and Doyle (1987) or using DNeasy Plant Minikits (Qiagen, Valencia, California, USA). Polymerase chain reaction (PCR) amplifications were performed using Taq and Platinum Taq DNA polymerases (Life Technologies, Gaithersburg, Maryland, USA) as described previously (Wojciechowski et al., 1999; Lavin et al., 2000). For most of the newly sequenced taxa, double-stranded copies of the matK gene and the flanking 3' trnK intron region were amplified using primers trnK685F and trnK2R*; typical reaction conditions were 2 min at 95°C for denaturation, followed by 35 cycles of 30 s at 95°C, 30–60 s at 55–57°C for annealing, 2 min 30 s at 72°C for primer extension, then followed by a final 7 min incubation at 72°C. Amplification products were purified and then sequenced using these same primers and others listed in Table 1. DNA sequencing was performed on Applied Biosystems 377 and 3100 sequencers (Applied Biosystems, Foster City, California, USA) at the University of California (DBS Sequencing Facility, Davis, California, USA), Iowa State University (DNA Sequencing Facility, Ames, Iowa, USA), Northwoods DNA (Becida, Minnesota, USA), and Arizona State University (DNA Laboratory, Tempe, Arizona, USA). Sequencer output files were assembled into contigs and edited using the program Sequencher 4.1 (GeneCodes, Ann Arbor, Michigan, USA) before alignment.


Table 1. Sequences of oligonucleotide primers used for PCR amplification and sequencing of the plastid matK gene in legumes. Sequences given are all 5' to 3'; forward and reverse refer to direction with respect to matK coding sequence.

 
Primers for the PCR amplification and sequencing of the trnK/matK region from legumes (Table 1) were originally designed by one of us (M. F. Wojciechowski) using published primer sequences (Steele and Vilgalys, 1994; Johnson and Soltis, 1995), which were modified based on the sole legume matK sequence available at the time (Pisum sativum; Boyer and Mullett, 1988); various primers have been subsequently modified further to work more specifically with certain groups of taxa (e.g., Lavin et al., 2000; Riley-Hulting et al., 2004). Use of these primers generally resulted in 100% overlap in bidirectional sequencing of the entire matK gene, and most of the 3' flanking trnK intron sequence, from these taxa.

The matK sequences were initially assembled into a data matrix by first translating a small set of representative DNA sequences (25 taxa) to their corresponding amino acid sequences, which were then aligned in ClustalX (Thompson et al., 1997) with standard pairwise and multiple alignment parameter settings (default gap penalty parameters and Gonnet weight matrix). The amino acid alignment was then used as a template to align the corresponding DNA sequences with insertions and deletions (indels) at equivalent positions. The remaining DNA sequences were added to this primary alignment as they became available, and the data matrix was manually readjusted as necessary to allow for additional indels. Indels were numerous but none were ambiguous with respect to their placement in the final aligned data matrix (see below). All sequences were essentially complete except for 20 taxa that had 50 or more nucleotides missing at one of the ends. All new sequences have been deposited in GenBank, and the final data matrix has been deposited in TreeBASE, study accession S1968 (http://www.treebase.org/).
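
The step of using an amino acid alignment as a template for the DNA alignment can be illustrated with a minimal sketch. This is not the procedure or software used in the study; the function name `thread_codons` and the toy sequences are hypothetical, and the sketch assumes each DNA sequence is an in-frame coding sequence whose length is a multiple of three.

```python
def thread_codons(protein_aln: str, dna: str, gap: str = "-") -> str:
    """Project gaps from an aligned protein sequence onto its coding DNA.

    Each aligned residue is replaced by its codon and each protein gap by
    three nucleotide gaps, so indels fall at equivalent positions.
    """
    if len(dna) % 3 != 0:
        raise ValueError("DNA length must be a multiple of three")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    n_residues = sum(1 for aa in protein_aln if aa != gap)
    if n_residues != len(codons):
        raise ValueError("protein and DNA lengths disagree")
    remaining = iter(codons)
    return "".join(gap * 3 if aa == gap else next(remaining)
                   for aa in protein_aln)


# Hypothetical toy data, not taken from the matK matrix.
aligned_proteins = {"taxonA": "MK-L", "taxonB": "MKTL"}
coding_dna = {"taxonA": "ATGAAACTT", "taxonB": "ATGAAAACTCTT"}

aligned_dna = {
    name: thread_codons(aligned_proteins[name], coding_dna[name])
    for name in aligned_proteins
}
for name, seq in aligned_dna.items():
    print(name, seq)
```

In this toy case the single-residue gap in taxonA becomes a three-nucleotide gap, which keeps the reading frame intact across the aligned matrix.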

Phylogenetic analyses
Phylogenetic analyses utilized maximum parsimony (MP) and Bayesian approaches. All parsimony analyses were performed using PAUP* (version 4.0b10; Swofford, 2002). Multiple tree searches were conducted using heuristic search options that included SIMPLE, CLOSEST, or RANDOM addition sequences (1000 replicates) holding five trees per replicate, and tree bisection-reconnection (TBR) branch swapping, with retention of multiple parsimonious trees (MAXTREES = 5000–10 000). A nucleotide substitution model was selected using the Akaike information criterion (AIC) as implemented in Modeltest (version 3.06; Posada and Crandall, 1998). Bayesian analyses were performed using MrBayes version 3.0 (Huelsenbeck and Ronquist, 2001). Multiple Metropolis-coupled Markov chain Monte Carlo analyses were run with random or user-defined starting points for each run. Parameters for the AIC-selected GTR + Γ + I model were estimated using the default value of four Markov chains and the "temperature" parameter set to 0.2. Markov chains of 2 000 000 generations each were sampled every 100–10 000 generations, which was sufficient to distinguish the burn-in from the stationarity phase. Log likelihood values for sampled trees stabilized after approximately 200 000 generations. Clade support was assessed using both nonparametric bootstrap resampling (Felsenstein, 1985) and Bayesian posterior probabilities (Huelsenbeck et al., 2002). Nonparametric bootstrap proportions were estimated from 100 to 500 bootstrap replicates incorporating heuristic parsimony searches using addition sequence and branch-swapping options as in our standard parsimony analyses. Bayesian posterior probabilities were estimated as the proportion of trees sampled after "burn-in" that contained each of the observed bipartitions. Although posterior probabilities may be over-credible as measures of clade support (Suzuki et al., 2002) and currently are a subject of debate (Wilcox et al., 2002; Alfaro et al., 2003), we have observed that high Bayesian posterior probabilities often support nodes that are otherwise also detected by parsimony strict consensus.
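
The way a posterior probability is obtained from the sampled trees can be sketched as follows. This is an illustration under simplifying assumptions, not the MrBayes implementation: each sampled tree is reduced to a set of rooted clades (rather than unrooted bipartitions), and the function name `clade_posteriors`, the taxon labels, and the sample are all hypothetical.

```python
from collections import Counter


def clade_posteriors(sampled_trees, burn_in):
    """Fraction of post-burn-in trees that contain each clade.

    sampled_trees: list of sets of clades, one set per sampled tree,
                   where a clade is a frozenset of taxon names.
    burn_in:       number of initial samples to discard.
    """
    kept = sampled_trees[burn_in:]
    if not kept:
        raise ValueError("no trees left after burn-in")
    counts = Counter()
    for clades in kept:
        counts.update(set(clades))  # count each clade once per tree
    return {clade: n / len(kept) for clade, n in counts.items()}


# Hypothetical four-taxon example.
AB = frozenset({"A", "B"})
CD = frozenset({"C", "D"})
AC = frozenset({"A", "C"})

samples = [
    {AC},        # treated as burn-in and discarded
    {AB, CD},
    {AB, CD},
    {AB},
    {AB, CD},
    {AC},
]

post = clade_posteriors(samples, burn_in=1)
for clade, p in sorted(post.items(), key=lambda kv: -kv[1]):
    print(sorted(clade), round(p, 2))
```

With five trees retained, the clade A + B appears in four of them and so receives a support value of 0.8 in this toy sample.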

Thirty-seven indels of 1–4 amino acids each were identified in the complete data set, corresponding to 219 nucleotide positions. These sites, and a 25-nucleotide region surrounding a primer-binding site (positions 541–565) that was missing in about 80 taxa, were excluded from all analyses (244 total characters). Each of the 37 indels was treated as a separate character, and states were scored according to the presence or absence (1 or 0) of a sequence within an indel region (1711 total characters). Sensitivity analyses (cf. Whiting et al., 1997) involving the inclusion or exclusion of these recoded indel characters were performed to determine the effect on tree topology and clade support.
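
The presence/absence recoding of indels described above can be sketched in a few lines. The sketch assumes that indel regions are supplied as 0-based, half-open column ranges and that a taxon is scored 1 when any nucleotide is present in the region; the function names, the toy alignment, and the handling of missing data (ignored here) are assumptions for illustration rather than the authors' scripts.

```python
def code_indels(alignment, regions, gap="-"):
    """Score each indel region as '1' (sequence present) or '0' (absent)."""
    coded = {}
    for name, seq in alignment.items():
        states = []
        for start, end in regions:
            present = any(ch != gap for ch in seq[start:end])
            states.append("1" if present else "0")
        coded[name] = "".join(states)
    return coded


def mask_regions(alignment, regions):
    """Drop the nucleotide columns that fall inside the given regions."""
    excluded = set()
    for start, end in regions:
        excluded.update(range(start, end))
    return {
        name: "".join(ch for i, ch in enumerate(seq) if i not in excluded)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment with one three-nucleotide indel (columns 3-5).
toy_alignment = {
    "t1": "ATG---CCA",
    "t2": "ATGAAACCA",
    "t3": "ATGAAGCCA",
}
indel_regions = [(3, 6)]

indel_chars = code_indels(toy_alignment, indel_regions)
nucleotides = mask_regions(toy_alignment, indel_regions)
print(indel_chars)
print(nucleotides)
```

The indel columns are removed from the nucleotide partition and reappear as one binary character per indel, so they can be included or excluded as a block in sensitivity analyses.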


    RESULTS
 
Characteristics of the matK sequences
The matK gene in legumes ranges from 1476 bases (492 amino acids) in Erythrostemon gilliesii to 1545 bases (515 amino acids) in length in several dalbergioid taxa (e.g., Adesmia and Pictetia). The final matK data set includes 1674 aligned positions with 1042 potentially parsimony-informative characters (73%, which excludes the 244 indels and missing characters) among the 330 taxa analyzed. Of the total 552 420 characters in the data set, missing data accounted for 1.3% while indels and other excluded positions accounted for 14.6% (80 500 bases). By comparison, the complete legume rbcL data set (Kajita et al., 2001) is 1404 aligned bases in length (with no indels) but contains fewer potentially informative characters (530 among the 319 sequences). That study, with a total of 242 sequences sampled from 194 legume genera and a similar emphasis on Papilionoideae but with fewer representatives from the other two subfamilies (24 Caesalpinioideae, 6 Mimosoideae) than the matK data set, contains sequences from a large number of taxa that are identical or very closely related to those in the matK data set. Comparative analyses of these genes (M. Lavin, P. S. Herendeen, and M. F. Wojciechowski, unpublished manuscript) show levels of sequence divergence up to ninefold lower for rbcL than for matK, and substitutions are distributed less uniformly among the three codon positions in rbcL. For example, the substitution rate at the third codon position in rbcL is 10 times that of the second position, whereas the third position in matK shows twice the rate of the second position. Previous comparisons of rates of substitution in matK vs. rbcL have yielded similar patterns of variation in other angiosperm groups (e.g., Steele and Vilgalys, 1994; Manos and Steele, 1997).
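
The data-set totals quoted above can be cross-checked with simple arithmetic. The short sketch below uses only figures given in the text; the reading that the 73% is taken relative to the 1430 positions retained after exclusion is an assumption, although it reproduces the reported value.

```python
n_taxa = 330
aligned_positions = 1674
excluded_positions = 244      # 219 indel positions + 25 near the primer site
informative = 1042

# Total cells in the taxon-by-position matrix.
total_cells = n_taxa * aligned_positions

# Positions retained for analysis and the informative share of them.
analyzed_positions = aligned_positions - excluded_positions
informative_fraction = informative / analyzed_positions

# Cells removed by excluding the 244 positions in every taxon.
excluded_cells = n_taxa * excluded_positions
excluded_fraction = excluded_cells / total_cells

print(total_cells)                          # 552420
print(analyzed_positions)                   # 1430
print(round(100 * informative_fraction))    # 73
print(excluded_cells)                       # 80520
print(round(100 * excluded_fraction, 1))    # 14.6
```

The product 330 × 244 = 80 520 agrees with the rounded figure of 80 500 bases given in the text.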

Within Fabales, pairwise distances (calculated across all sites as uncorrected p values in PAUP*) in matK sequences ranged from a maximum of nearly 17% among outgroups and caesalpinioids, to nearly 7% among mimosoids, 12% among caesalpinioids, and just over 19% among papilionoids. Within monophyletic genera for which we have more than three species sampled, pairwise distances varied between 0.8–1.9% in Astragalus and 0.9–4.3% in Sesbania. Within major papilionoid clades, pairwise distances varied from 0.1% to almost 11% within the genistoid clade, 0.3–12% in Loteae + Robinieae; 0.0% (in Lens) to nearly 11% in the IRLC; 0.2–13% in dalbergioids, and 1.4–17.9% in the millettioids.
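
An uncorrected p distance is the proportion of compared sites at which two aligned sequences differ. The sketch below is a generic illustration, not PAUP* itself; skipping any site where either sequence has a gap or missing-data symbol is an assumption about how such sites are treated, and the sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, skip: str = "-?N") -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence carries a symbol in `skip` are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical ten-site example: two differences in ten sites.
d_full = p_distance("ATGCATGCAT", "ATGAATGCTT")
# Same pair with one site gapped: one difference in nine sites.
d_gap = p_distance("ATG-ATGCAT", "ATGAATGCTT")
print(f"{100 * d_full:.1f}%  {100 * d_gap:.1f}%")
```

Expressed as percentages, values computed this way are directly comparable to the ranges reported above.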

Phylogenetic reconstruction
Multiple heuristic searches of 1042 parsimony informative nucleotide characters, excluding indels, consistently converged on a large number of equally most parsimonious trees (maximum set saved = 10 000) with a minimum length of 8397 steps, a consistency index (CI) of 0.288 excluding uninformative characters, and a retention index (RI) of 0.791. The strict consensus tree of 5000 representative equally most parsimonious (MP) trees is highly resolved and presented in Figs. 1–5. The semi-strict consensus of the same set of most parsimonious trees resolves only three nodes that are not present in the strict consensus, while in the 50% majority-rule tree a total of only seven nodes were not fully resolved. Branching order and support values for the major clades of legumes resolved by these matK data were very similar in the maximum parsimony and Bayesian analyses (Figs. 1–6). To illustrate the heterogeneity in estimated branch lengths, a phylogram representation of a typical Bayesian tree (sampled post burn-in) is shown in Fig. 6.



Fig. 1. Phylogeny of Leguminosae based on parsimony analyses of plastid matK gene sequences. Phylogenetic relationships among the three subfamilies of Leguminosae (crown clade, "L"), and within subfamilies Caesalpinioideae and Mimosoideae. Outgroup lineages are indicated by bold lines. Members of Mimosoideae are indicated by gray boxes. Tree shown is strict consensus of 5000 equally most parsimonious trees (length = 8397 steps, consistency index = 0.288, retention index = 0.791) derived from heuristic search analyses of 330 matK sequences. Nodes designated by a diamond were not resolved in a 50% majority-rule consensus of the same set of 5000 equally most parsimonious trees. Nonparametric bootstrap proportions and Bayesian posterior probabilities from separate analyses (individual or range) are indicated above and below branches or immediately to left of appropriate node, respectively. Values are given for most nodes for which support values from both analyses were greater than 50%. Vauquelinia (Rosaceae) was designated as the outgroup for all analyses. Major papilionoid clades informally named here are indicated by a filled circle

 


Fig. 5. Phylogenetic relationships in Hologalegina clade: the Robinioid clade and the IRLC. See Fig. 1 for details.

 


Fig. 6. Representative Bayesian tree sampled according to posterior probabilities from an analysis of 330 legume matK sequences. Estimated branch lengths (under the GTR + Γ + I model) are shown; scale is indicated at bottom. Major subclades of Leguminosae, and representative taxa, are indicated by filled circles.

 
Analysis of the matK data confirms results from earlier studies: the family is a monophyletic group, and papilionoids and mimosoids (the latter excluding Dinizia, tribe Mimoseae) are each monophyletic and nested within a paraphyletic Caesalpinioideae. All mimosoids and the majority of the caesalpinioid tribes Caesalpinieae and Cassieae comprise a strongly supported clade (Fig. 1) that is the sister group to papilionoids. Seven major clades and a number of minor clades within papilionoids are also highly supported (Figs. 2–5). In spite of this, relationships among certain of the clades, especially the genistoid s.l. and the dalbergioid s.l. clades, remain unresolved. Parsimony analyses suggest the dalbergioid s.l. clade branches before the genistoid s.l. clade, whereas Bayesian analyses suggest the genistoid s.l. clade is the sister group to the dalbergioid s.l. clade plus the remaining papilionoids (i.e., Baphia clade, mirbelioids, millettioids, and Hologalegina).



Fig. 2. Phylogenetic relationships of the Genistoid sensu lato clade, as well as other papilionoids including many of those once placed in the tribes Swartzieae and Sophoreae. The "core genistoids" (sensu Crisp et al., 2000) clade is indicated by bold lines. Nodes consistent with the presence of a 50-kb inversion in the plastid DNA genome are indicated by arrows. See Fig. 1 for details.

 
Of the 37 indel characters, 12 were synapomorphic for clades identified in the maximum parsimony and Bayesian analyses. For example, two single-amino-acid insertions, one at positions 421–423 and a second at positions 1498–1500, were synapomorphies for the papilionoid clade. Similarly, one-, two-, or three-amino-acid insertions/deletions uniquely mark each of the Sweetia-Vatairea clade (Fig. 2), New World Lupinus (Fig. 2), dalbergioid s.l. clade (Fig. 3), the genus Sesbania (Fig. 5), and the Caragana plus Hedysarum clade (Fig. 5). Inclusion of the indels as additional characters had little effect on phylogenetic relationships, based on comparison of the strict consensus topology derived from analysis of the data set containing the indel characters (data not shown) to that presented in Figs. 1–5. The exception involved Platycyamus regnellii, which was resolved as sister to the clade defined by the MRCA of Apios americana and Phaseolus vulgaris (Fig. 4) in analyses that included the indel characters. Likewise, addition of the indel characters had little effect on bootstrap proportions for nodes receiving support in the 50% bootstrap consensus tree (<5% difference; data not shown).



Fig. 3. Phylogenetic relationships in Dalbergieae sensu lato clade: Amorpheae and Dalbergioid subclades. See Fig. 1 for details.

 


Fig. 4. Phylogenetic relationships in Millettioid clade, Indigofereae, Hypocalypteae, Mirbelioid, and Baphioid clades; memberships are indicated by boxes. The "core Millettieae" clade (sensu Hu et al., 2000) is indicated by bold lines. Arrows are used to specify nodes with indicated support values. See Fig. 1 for details.

 
Phylogenetic criteria for papilionoid clade nomenclature
We have used four criteria for recognizing and informally naming major clades within the Leguminosae, which are consistent with formal node-based definitions under a system of phylogenetic nomenclature (de Queiroz and Gauthier, 1994). First, groups are resolved as monophyletic in strict consensus analyses. Second, bootstrap proportions greater than 70% support the clade of interest. Third, taxonomic sampling within the clade is diverse and/or extensive. Lastly, results are at least approximately congruent with those obtained by other studies (i.e., in showing support for clades that correspond to those informally recognized here). The following clade names are used throughout the discussion. The "Caesalpinioid crown" clade includes all the Mimosoideae and members of tribes Caesalpinieae and Cassieae of subfamily Caesalpinioideae that form the sister group to them and is defined as the least inclusive clade that contains Ceratonia siliqua and Albizia julibrissin. The "papilionoid" clade is equivalent to the subfamily Papilionoideae and is delimited by the most recent common ancestor (MRCA) of Swartzia simplex and Vicia faba. Within papilionoids, the "Cladrastis" clade is delimited by the MRCA of Cladrastis platycarpa and Cladrastis lutea, which renders the genus Cladrastis paraphyletic with respect to Pickeringia and Styphnolobium (Fig. 2). The "genistoid s.l." clade is delimited by the MRCA of Poecilanthe parviflora and Lupinus argenteus (Fig. 2). The "dalbergioid s.l." clade comprises all descendants of the MRCA of Amorpha fruticosa and Pterocarpus indicus (Fig. 3). The "mirbelioid" clade is delimited by the MRCA of Daviesia latifolia and Gompholobium minus (Fig. 4). The "millettioid" clade is delimited by the MRCA of Xeroderris stuhlmannii and Phaseolus vulgaris (Fig. 4). The "robinioid" clade comprises all descendants of the MRCA of Robinia pseudoacacia and Lotus japonicus (Fig. 5). The "inverted repeat-lacking" clade (IRLC) is delimited by the MRCA of Glycyrrhiza lepidota and Vicia faba (Fig. 5). The robinioids and IRLC are sister groups and comprise "Hologalegina."
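
A node-based definition of this kind can be made concrete with a small sketch that finds the MRCA of two specifier taxa and lists all of its descendants. The tree below is a deliberately simplified toy: it keeps only the sister relationship of the robinioids and the IRLC and the specifiers named above, leaves relationships within each clade unresolved, and is not the topology of Fig. 5. The function names and the parent-map representation are hypothetical.

```python
# Child -> parent map for a toy tree (internal structure left unresolved).
PARENT = {
    "Robinia pseudoacacia": "robinioid",
    "Lotus japonicus": "robinioid",
    "Sesbania": "robinioid",
    "Glycyrrhiza lepidota": "IRLC",
    "Vicia faba": "IRLC",
    "robinioid": "Hologalegina",
    "IRLC": "Hologalegina",
}


def ancestors(node):
    """Path from a node up to the root, starting with the node itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(a, b):
    """Most recent common ancestor of two nodes."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes share no ancestor")


def clade_tips(node):
    """All terminal taxa descended from a node."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return {node}
    tips = set()
    for child in children:
        tips |= clade_tips(child)
    return tips


robinioid_node = mrca("Robinia pseudoacacia", "Lotus japonicus")
hologalegina_node = mrca("Lotus japonicus", "Vicia faba")
print(robinioid_node, sorted(clade_tips(robinioid_node)))
print(hologalegina_node, sorted(clade_tips(hologalegina_node)))
```

Because membership follows from the MRCA rather than from a list of genera, any taxon that falls below that node on the tree, such as Sesbania in this toy example, is included in the named clade automatically.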


    DISCUSSION
 
The phylogenetic analyses of the legume matK sequences presented here achieve our main goal of reconstructing a robust molecular phylogeny for the Leguminosae, with the particular finding that the identity and inter-relationships of many clades within papilionoids are for the first time well resolved. The monophyly of the family and details of relationships among the outgroups, as well as the identity and relationships among the various caesalpinioid and mimosoid subgroups are only briefly addressed in this study as they are the subject of intensive, ongoing investigation by others (e.g., Luckow et al., 2000, 2003; Bruneau et al., 2001; Herendeen et al., 2003a, b). Regardless, when combined with results from these recent studies, this study reveals much promise for matK sequences in resolving all of the major clades within Fabales.

Consistent with recent results using the rbcL gene (Kajita et al., 2001) and trnL intron (Bruneau et al., 2001; Herendeen et al., 2003a) sequences, analyses of matK sequences support the monophyly of the family Leguminosae (Fig. 1) and the paraphyly of subfamily Caesalpinioideae. Even with our limited sampling of caesalpinioid genera, at least four well-supported clades emerge from this analysis that correspond to major caesalpinioid clades detected by analyses of trnL intron sequences alone (Bruneau et al., 2001), trnL sequences combined with morphological characters (Herendeen et al., 2003a, b), or rbcL sequences (Kajita et al., 2001). The well-supported clade represented by Colophospermum, Hymenaea, and Berlinia (Fig. 1) corresponds to the "Detarieae s.l." clade of Herendeen et al. (2003a; see their fig. 3). Our finding of a sister relationship of this clade to that comprising Bauhinia and Cercis (the tribe Cercideae clade) is unexpected yet interesting; it may be an artifact of poor sampling among these caesalpinioid taxa but to our knowledge has not been observed in any previous study. The genera Petalostylis and Poeppigia form a well-supported clade that corresponds to the "Dialiinae s.l." clade of Herendeen et al. (2003a). The sister relationship of this latter clade to all remaining Leguminosae (i.e., papilionoids, mimosoids, and closely related caesalpinioids) is also detected by these earlier analyses. A well-supported clade including the remaining caesalpinioids, all mimosoids and papilionoids corresponds to "clade A" of Bruneau et al. (2001; see their fig. 6), while the Umtiza + Caesalpinia + Mimosoideae clade of Herendeen et al. (2003a; see their fig. 2), is well supported with matK sequences (Fig. 1). We refer to this clade as the Caesalpinioid crown. The Umtiza subclade (Herendeen et al., 2003b) is sister to the rest of the Caesalpinioid crown clade and is here represented by Arcoa, Ceratonia, Gymnocladus, and Gleditsia. This subclade is not supported by parsimony bootstrap (<50%) in our analyses, however. Regardless, the constituents and relationships of the Umtiza subclade within the Caesalpinioid crown clade detected in this study are in agreement with and further substantiate the findings of Herendeen et al. (2003a, b).

Although sampling of taxa from Mimosoideae was limited, our results generally agree well with those of Luckow et al. (2003), especially with respect to the monophyly of at least the vast majority of the genera traditionally assigned to the Mimosoideae and paraphyly of constituent tribes. With the exception of Dinizia, which appears more closely related to certain caesalpinioids than to mimosoids on the basis of morphological and molecular evidence, as concluded by Luckow et al. (2003), the rest of the taxa sampled from this subfamily are resolved as monophyletic with high support in our analyses (Fig. 1). Furthermore, our results clearly show Piptadeniastrum nested within the mimosoid clade, confirming recent results by Luckow et al. (2003). The mimosoid clade generally shows poorly resolved relationships or at least short internal branch lengths compared to other clades of Leguminosae (Fig. 6). This suggests either a slowdown in the rate of substitution or a relatively recent diversification of most of the extant members of mimosoids. These alternative hypotheses are being addressed elsewhere (Lavin et al., in press).

In contrast to the caesalpinioids and mimosoids, our results have significant implications with regard to papilionoid phylogenetics. In all our analyses, papilionoids are strongly supported as monophyletic (Fig. 1) compared to previous rbcL studies where papilionoids were resolved as monophyletic but with relatively low statistical support (e.g., Kajita et al., 2001; 57% bootstrap and 62% parsimony jackknife). Similar to the findings of other studies involving broad sampling of caesalpinioid legumes (e.g., Herendeen et al., 2003a), papilionoids are not resolved as sister to an isolated caesalpinioid lineage, as are the mimosoids, but rather are nested among the major caesalpinioid clades as an early branch in the legume phylogeny (Fig. 1). In marked contrast to the most recent rbcL analysis in which most major clades within papilionoids were weakly resolved (fig. 3 of Kajita et al., 2001), the matK strict consensus is very highly resolved (Figs. 2–5). Furthermore, both bootstrap proportions and Bayesian posterior probabilities for the major subclades often exceed 95%. The results presented here provide some of the best evidence to date in support of relationships among the major papilionoid subclades, which heretofore have been largely unresolved by cladistic analyses of DNA sequence data.

Consistent with the results of Doyle et al. (1997) and Pennington et al. (2001), the matK phylogeny resolves certain representatives of Swartzieae and Sophoreae as the sister group to the rest of the subfamily. The clade that forms the sister group to all remaining papilionoids, here delimited by the MRCA of Swartzia simplex and Myrospermum sousanum (Fig. 2), is unexpected in that it now includes representatives of a number of disparate lineages such as Angylocalyx and Dipterygeae (Dipteryx and Pterodon) that had been poorly resolved or supported in previous studies (e.g., Pennington et al., 2001). One of two subclades of this clade includes Swartzia and recent segregate Bobgunnia, Ateleia, and Cyathostegia, and corresponds to the "swartzioid" clade of Ireland et al. (2000) and Pennington et al. (2001). The other contains Amburana, Angylocalyx, Dipterygeae, Dussia, Myrocarpus, and Myrospermum. The resolution of this larger clade of morphologically eclectic genera as sister to the rest of Papilionoideae suggests that the swartzioid clade of Pennington et al. (2001) could be expanded to encompass the majority of papilionoid genera that lack the 50-kb inversion in the plastid DNA genome.

With respect to the rest of the papilionoid subgroups, our sampling is much more extensive. The following seven well-supported clades resolved in this study are thus validated not only by extensive sampling, but also by the resolution of these subclades in other recent studies. These seven are the Cladrastis clade, the genistoid s.l., the dalbergioid s.l., the mirbelioids, the millettioids, the robinioids, and the inverted-repeat-lacking clades, the last two of which comprise Hologalegina. Even if resolved by previous studies, relationships among these major papilionoid subclades have been heretofore resolved at best with only weak support (e.g., Hu et al., 2000 ; Kajita et al., 2001 ; Lavin et al., 2001 ; Pennington et al., 2001 ).

The Cladrastis clade
The genera Cladrastis and Styphnolobium traditionally have been classified in Sophoreae s.s. (Polhill, 1981b) whereas Pickeringia Nuttall has been classified in tribe Thermopsideae (Turner, 1981). These three genera form a well-supported clade in all our analyses. While a sister group relationship of Cladrastis and Styphnolobium has been observed in previous molecular studies (e.g., Doyle et al., 1997; Pennington et al., 2001) and is notable biogeographically because both genera exhibit East Asian–North American disjunctions, this study is the first to suggest a close relationship of these two genera with Pickeringia. Pickeringia is a monotypic genus restricted to the sclerophyllous chaparral vegetation of the California Floristic Province of western North America (Raven and Axelrod, 1995). The strongly supported position of this genus in the Cladrastis clade confirms Polhill's (1981b) initial prediction and Sousa and Rudd's (1993) subsequent conclusion of a close relationship between these three genera based on floral (bracts at base of inflorescence) and chromosomal similarities (n = 14; Goldblatt, 1981; Palomino et al., 1993). This clade is also supported by results from cladistic analyses of nuclear ribosomal DNA internal transcribed spacers (nrDNA ITS) sequence data (M. F. Wojciechowski, unpublished data). This placement of Pickeringia reveals that Thermopsideae sensu Yakovlev (Turner, 1981) is not monophyletic, contrary to molecular evidence presented previously (Crisp et al., 2000). Furthermore, the absence of quinolizidine alkaloids in Pickeringia of the type that is characteristic of other Thermopsideae (Turner, 1981) is consistent with the matK results. The presence of quinolizidine alkaloids, a prominent group of secondary metabolites once considered to be widely distributed in papilionoid legumes (Kinghorn and Balandrin, 1984), now appears to be of systematic significance only for the "genistoids" (see next paragraph). In a recent analysis (Kite and Pennington, 2003), the failure to detect similar alkaloids in extracts of Cladrastis and Styphnolobium is also in accordance with the phylogenetic position of these taxa based on trnL sequence data (Pennington et al., 2001) and the result presented here. The disjunct distribution of the Cladrastis clade in warm temperate to tropical regions of the Northern Hemisphere is common to many other legume groups (e.g., Gleditsia, Gymnocladus, Desmodium, Lespedeza, etc.; Schrire et al., in press).

The genistoid s.l. clade
The genistoids include the many genera traditionally classified in the tribes Genisteae, Thermopsideae, Euchresteae, Crotalarieae, Liparieae, Podalyrieae, and Sophoreae s.s. (Käss and Wink, 1997; Crisp et al., 2000). The concept of a "genistoid alliance" was circumscribed by Polhill (1981a, 1994) who brought together for the first time this group of putatively related, predominantly Southern Hemisphere tribes that have been considered relatively isolated among quinolizidine-alkaloid-accumulating papilionoid legumes. The alliance as recognized by Polhill comprises four separate lineages. One includes the predominantly Northern Hemisphere Genisteae sensu stricto (s.s.), Euchresteae, and Thermopsideae together with certain Sophoreae (Sophora group). A second involves the mainly southern African Crotalarieae, Liparieae, and Podalyrieae, and now segregate tribe Hypocalypteae. A third comprises the endemic Australasian Bossiaeeae and Mirbelieae. The fourth includes the Neotropical–Australian Brongniartieae (including the Templetonia group). Early molecular phylogenetic analyses by Käss and Wink (1996, 1997) suggested that most species of the alliance formed a monophyletic group with certain members of Sophoreae (i.e., some but not all species of Maackia Rupr. & Maxim. and Sophora L.) near the base of papilionoids. The analysis of Doyle et al. (1997) suggested the genistoids were polyphyletic and formed three clades, the largest of which approximates the genistoid clade of Käss and Wink. The monogeneric Euchresteae (Euchresta) has been shown by subsequent analyses (Kajita et al., 2001) to be nested within a Sophora s.s. clade, while Liparieae has been formally included within Podalyrieae (Schutte and van Wyk, 1998), a placement verified by analyses of rbcL and nrDNA ITS sequences (Käss and Wink, 1997; Kajita et al., 2001; van der Bank et al., 2002).

Crisp et al. (2000) confirmed the polyphyly of the genistoids sensu Polhill (1981a), but suggested that this name be restricted to a well-supported "core genistoids" group, from Africa and Eurasia, that comprises the majority of the tribes that made up Polhill's genistoid alliance. This clade is strongly supported by the matK data (Fig. 2; corresponds to the clade delimited by the MRCA of Bolusanthus speciosus and Spartium junceum). In addition, the matK phylogeny corroborates results of other studies in resolving a core genistoid clade nested within a larger genistoid clade. A more inclusive group, referred to here as the genistoid s.l. clade and well supported by matK sequence analysis (Fig. 2), includes the Brongniartieae (sensu Crisp and Weston, 1987; Thompson et al., 2001), Poecilanthe and Cyclolobium of Millettieae (Hu et al., 2000, 2002), and a number of largely woody Neotropical genera of Sophoreae such as Acosmium Schott, Bolusanthus Harms, Bowdichia Kunth, Cadia Forssk., Diplotropis Bentham, and most likely Ormosia Jackson (Kajita et al., 2001), Dicraeopetalum Harms, Clathrotropis Harms, and Platycelyphium Harms (Pennington et al., 2001), several of which have not been sampled for matK sequences. The monophyly of the genistoid s.l. clade as defined here is also supported by the taxonomic distribution of quinolizidine alkaloids (e.g., Kinghorn and Balandrin, 1984; van Wyk, 2003). All taxa known to accumulate these alkaloids, with the exception of Calia (Kite and Pennington, 2003) and Ormosia (Kinghorn and Balandrin, 1984), are members of the genistoid s.l. clade as defined here. While the relationship of these particular taxa to this clade is not definitively resolved by our analyses (Figs. 2, 6), Pennington et al. (2001) did find weak support at least for the inclusion of Ormosia within an "expanded" genistoids. Further resolution and sampling of these taxa as well as Holocalyx, Uribea, and the vataireoids, may yet show quinolizidine alkaloids to be a non-molecular synapomorphy for an expanded genistoid clade.

The dalbergioid s.l. clade
The dalbergioid legumes, a mostly pantropical group of papilionoids, were originally circumscribed by a combined data approach to include 44 genera and ca. 1100 species from the tribes Aeschynomeneae, Adesmieae, subtribe Bryinae of Desmodieae, and tribe Dalbergieae except Andira, Hymenolobium, Vatairea, and Vataireopsis (Lavin et al., 2001). In addition, this clade is diagnosed apomorphically by the presence of the aeschynomenoid root nodule (Sprent, 2001). Although the position of the dalbergioid clade within the Papilionoideae was not well resolved or supported in previous studies using rbcL and trnL (Kajita et al., 2001; Pennington et al., 2001), there was preliminary evidence that its sister group included the predominantly North American temperate tribe Amorpheae (Lavin et al., 2001). Our results, like those of McMahon and Hufford (2004), consistently show Amorpheae as the sole sister clade to the dalbergioid clade even if parsimony bootstrap support for this relationship is moderate (Fig. 3). Thus, the dalbergioid clade (sensu Lavin et al., 2001) is now expanded to encompass this tribe and is herein referred to collectively as the dalbergioid s.l. clade. The similarity in base chromosome number among dalbergioids and genera of Amorpheae, where x = 10 is apparently ancestral with derived cases of aneuploidy (e.g., x = 9 and x = 8; Goldblatt, 1981), supports this decision. Furthermore, the glandular punctate leaves and indehiscent, single-seeded pods (derived from a two-plus-ovulate ovary) of Amorpheae, once thought to indicate a relationship with the genera of Psoraleeae (e.g., Stirton, 1981), are found variously within dalbergioids. The sister relationship of the primarily tropical American Andira plus Hymenolobium to the dalbergioid s.l. clade is very weakly supported but resolved in both the parsimony strict consensus (Fig. 3) and Bayesian analyses (Fig. 6). This relationship, though consistent with results of Lavin et al. (2001; see their fig. 5), needs to be further investigated with additional sampling of Andira and Hymenolobium species.

The mirbelioid clade
The endemic Australasian tribes Bossiaeeae and Mirbelieae comprise a fifth clade of c. 31 genera and 750 species within the papilionoids, although matK sampling is still quite limited from this group. The analyses of Doyle et al. (1997) and Crisp et al. (2000) provided the first molecular evidence, albeit not well supported by bootstrap analysis, that validated Crisp and Weston's (1987) hypothesis of a monophyletic Mirbelieae-Bossiaeeae group. Recent trnL intron and nrDNA ITS sequence analyses (Crisp and Cook, 2003 ) suggest Bossiaeeae is nested within a paraphyletic Mirbelieae. Although the monophyly of Mirbelieae–Bossiaeeae is well-supported by both bootstrap and Bayesian analyses of matK sequences (Fig. 4), neither of these tribes is resolved as monophyletic, consistent with the results of Crisp et al. (2000) and Crisp and Cook (2003) .

The matK analyses provide the first unequivocal evidence for a sister group relationship between Mirbelieae-Bossiaeeae and the tribe Hypocalypteae (Schutte and van Wyk, 1998). Although this relationship receives only modest support in bootstrap and Bayesian analyses, it is consistently resolved (Fig. 4). A sister group position for Hypocalypteae, rather than one nested within the Australasian Mirbelieae or Bossiaeeae, is also more consistent with their respective geographic distributions. The taxonomic position of Hypocalyptus Thunberg, a genus of three species geographically confined to the Cape region of South Africa, has been uncertain since Bentham (1837), but is historically considered linked to various tribes of the genistoid alliance. A cladistic analysis of morphological and biochemical characters (Schutte and van Wyk, 1998) resulted in the tribal ranking of Hypocalypteae and the determination that it was more closely related phylogenetically to Millettieae. Crisp et al. (2000), however, resolved a sister relationship of Hypocalypteae to Indigofereae using combined rbcL and nrDNA ITS sequences, although bootstrap support was lacking. Neither of these alternative positions fundamentally conflicts with the position detected using matK because neither finds any support from parsimony bootstrap analyses (Bayesian analyses were not performed in these two studies). Regardless, Hypocalypteae and Mirbelieae-Bossiaeeae collectively, or perhaps individually, occupy a sister position to the rest of the papilionoids that accumulate nonprotein amino acids in seed (i.e., Indigofereae plus the millettioids and the robinioids plus the IRLC; Fig. 4).

The millettioid clade
This clade includes all genera of the tribes Millettieae, Abreae, Phaseoleae, and Psoraleeae, plus Desmodieae subtribes Desmodiinae and Lespedezinae (Lavin et al., 1998; Hu et al., 2000, 2002; Kajita et al., 2001), with Indigofereae as its moderately supported sister group (Fig. 4). The predominantly tropical, woody Old World tribe Millettieae has been considered to be a transitional link from the "less advanced" elements of Dalbergieae and Sophoreae to putatively "more advanced" Old World tribes like Phaseoleae and Galegeae (Geesink, 1981, 1984; Polhill, 1981a). These authors used the term "advanced" to indicate a high degree of fusion of stamens and keel petals, as well as the accumulation of nonprotein amino acids in seeds rather than alkaloids. Geesink (1984) went so far as to consider Millettieae the paraphyletic "stem" group from which all other "advanced" papilionoids branched. Early rbcL analyses (Doyle et al., 1997) suggested the polyphyly of both Millettieae and Phaseoleae, and this was subsequently confirmed using nuclear phytochrome gene sequences (Lavin et al., 1998), trnK/matK sequences (Hu et al., 2000), and a combined analysis of rbcL, matK, and nrDNA ITS data (Hu, 2000). Regardless, the emerging pattern of relationships derived from these studies and from this matK analysis is that most of the constituent genera of Millettieae and Phaseoleae fall out in two very well-supported subclades (Fig. 4). The first, previously referred to as "core Millettieae" (Hu et al., 2000), comprises the majority of Millettieae and includes the large genera Millettia Wight & Arn. (c. 150 spp.), Lonchocarpus Kunth (c. 130 spp.), Derris Lour. (50–60 spp.) and Tephrosia Pers. (c. 350 spp.), while the majority of Phaseoleae dominates the second. Although shown here (Fig. 4) as sister lineages to core Millettieae, relationships of the monogeneric Abreae and certain Phaseoleae such as Galactia and presumably the other genera of subtribes Galactinae and Diocleinae (see Lewis et al., in press), are not yet resolved with certainty with respect to the core Millettieae clade. Additionally, certain genera classified in Millettieae, including Xeroderris and Platycyamus, have unresolved or weakly supported relationships with respect to the two main millettioid clades.

The large Phaseoleae clade resolved by matK includes tribes Desmodieae (except subtribe Bryinae) and Psoraleeae, in agreement with other studies (e.g., Kajita et al., 2001 ; Hu et al., 2002 ). Lackey's (1981) subtribal classification of Phaseoleae, the largest tribe of legumes in number of genera, is not congruent with the monophyletic subclades detected in this and other analyses (e.g., Kajita et al., 2001 ). The delimitation of the subtribe Phaseolinae is restricted to the descendants of the MRCA of Wajira and Phaseolus (Fig. 4), and excludes other genera once included, such as Psophocarpus and Otoptera, which are now known to be more closely related to genera in Glycininae (Thulin et al., in press ; A. Delgado-Salinas, M. Lavin, M. Thulin, and N. Weeden, unpublished data). In spite of the findings of Lee and Hymowitz (2001) , Glycininae is also not monophyletic and includes certain genera of subtribe Phaseolinae and probably all genera of tribe Psoraleeae (Kajita et al., 2001 ; A. Delgado-Salinas, M. Lavin, M. Thulin, and N. Weeden, unpublished data).

In this and most previous analyses (e.g., Kajita et al., 2001), tribe Indigofereae emerges as the moderately supported sister group to the millettioid clade (Fig. 4). The genera of Indigofereae have an inflorescence of the simple racemose type (each node bearing a single flower), whereas the tribes Abreae, Desmodieae, Millettieae, Phaseoleae, and Psoraleeae, which form the millettioid clade, share an unusual type of inflorescence, the pseudoraceme (see Tucker, 1987a, b). Indeed, the pseudoracemose inflorescence is found in all members of the millettioid clade and nowhere else, rendering it the most readily morphologically distinguished of the newly circumscribed large subclades of Papilionoideae. In addition, typical chromosome numbers in Indigofereae (n = 8, 7, 6; Goldblatt, 1981) differ from those that predominate in the tribes comprising the millettioids (e.g., Millettieae, Phaseoleae; n = 11, 12; Goldblatt, 1981). Whether the millettioid clade should be expanded to include the Indigofereae is as yet undecided, especially given that its sister relationship is only moderately supported, but the lack of any nonmolecular evidence to support such a relationship argues against this.

The robinioid clade
Tribes Loteae (including Coronilleae; sensu Polhill, 1994 ) and Robinieae (sensu Lavin and Sousa, 1995 ) comprise an expanded robinioid clade, which is distributed primarily in the Northern Hemisphere of the New World, Europe, and Africa. Within this clade, a monophyletic Sesbania L. (Robinieae) is weakly supported as sister to Loteae and these collectively form the sister group to the remaining members of Robinieae (Fig. 5). This position of Sesbania thus renders Robinieae paraphyletic, a finding first suggested by a preliminary phylogenetic analysis of matK sequences (Wojciechowski et al., 2000 ) and confirmed with a recent study utilizing exhaustive sampling for matK, trnL intron, and nrDNA ITS sequences (Lavin et al., 2003 ). Allan et al. (2003) have provided molecular evidence for the monophyly of the largely Eurasian–North American Loteae and the paraphyly of the large genus Lotus. The robinioid clade is here expanded from that described by Lavin et al. (2003) to encompass Sesbania and Loteae. Multiple lines of molecular evidence strongly support the monophyly of this collective group. For example, surveys for the presence of the inverted repeat in plastid DNA genomes among papilionoid legumes first suggested Loteae was distinct from the other temperate herbaceous tribes (Lavin et al., 1990 ) and its sister group relationship to Sesbania was only more recently determined (Wojciechowski et al., 2000 ).
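
The statement that the position of Sesbania renders Robinieae paraphyletic can be illustrated with a simple monophyly test on a rooted tree: a group of taxa is monophyletic only if some node of the tree has exactly that group as its descendant tips. The sketch below is illustrative only. The tree topology, the tip labels (r1, r2, s1, l1, l2, out), and the function names are hypothetical placeholders and are not taken from the matK analyses.

```python
# Toy rooted tree written as nested tuples. Tip labels are placeholders,
# NOT taxa or relationships from the study.
toy_tree = ((("r1", "r2"), ("s1", ("l1", "l2"))), "out")


def tips(node):
    """Return the set of tip labels found below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def is_monophyletic(tree, group):
    """True if some node in the tree has exactly `group` as its tips."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# A hypothetical "tribe" {r1, r2, s1} is not monophyletic here, because
# s1 is sister to (l1, l2) rather than to (r1, r2).
print(is_monophyletic(toy_tree, {"r1", "r2", "s1"}))              # False
# Expanding the group to include l1 and l2 yields a monophyletic group.
print(is_monophyletic(toy_tree, {"r1", "r2", "s1", "l1", "l2"}))  # True
```

In this toy case the narrower group fails the test and the expanded group passes, which mirrors the logic of expanding a clade name to cover all descendants of the relevant ancestor.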

The IRLC
The inverted-repeat-lacking clade or IRLC (Wojciechowski et al., 1999, 2000) includes most members of Polhill's (1981a) temperate herbaceous group. This group comprises all members of tribes Carmichaelieae, Cicereae, Hedysareae, Trifolieae, Vicieae, and Galegeae, but not Loteae (and Coronilleae), as well as at least three genera of Millettieae, Afgekia Craib, Callerya Endlicher, and Wisteria Nutt. The IRLC was essentially the first clade of legumes to be distinguished on the basis of a molecular synapomorphy, loss of one copy of the 25-kilobase inverted repeat in the plastid genome (Lavin et al., 1990; Liston, 1995), in addition to a number of morphological features shared by members of this group including a predominantly herbaceous habit, epulvinate compound leaves, and base chromosome numbers of n = 7 or n = 8 (Polhill, 1981a). The monophyly of the IRLC has been consistently detected in all subsequent cladistic analyses of molecular sequence data (Sanderson and Wojciechowski, 1996; Doyle et al., 1997; Käss and Wink, 1997; Hu et al., 2000, 2002; Wojciechowski et al., 2000; Kajita et al., 2001). Within the IRLC, the genera Afgekia (not included in this analysis), Callerya, and Wisteria, formerly classified in the tribe Millettieae, along with Glycyrrhiza L. of Galegeae, form a paraphyletic grade with respect to the remaining IRLC (Fig. 5). The "vicioid" subclade of the IRLC includes many of the particularly important agricultural genera such as Cicer L., Lathyrus L., Lens Mill., Medicago L., Melilotus Mill., Pisum L., Trifolium L., and Vicia L. (Wojciechowski et al., 2000; Steele and Wojciechowski, 2003). The vicioids are morphologically the most distinctive subclade of the IRLC, apomorphically characterized by craspedidromous leaflets and consistently well supported as monophyletic with Parochetus (Trifolieae) as the sister to all other vicioid taxa (Fig. 5). Other major clades within the IRLC include the Astragalean clade (Wojciechowski et al., 1999, 2000), defined as all descendants of the MRCA of Astragalus americanus and Clianthus puniceus (Fig. 5), and the hedysarioid clade (Sanderson and Wojciechowski, 1996; Wojciechowski et al., 2000), defined as all descendants of the MRCA of Hedysarum boreale and Caragana arborescens (Fig. 5).
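
Node-based definitions of this kind ("all descendants of the MRCA of two specifier taxa") translate directly into a two-step procedure: locate the most recent common ancestor of the two specifiers, then collect every tip that descends from it. The following minimal sketch assumes a toy tree stored as a child-to-parent mapping. The node and tip labels (root, n1 to n3, t1 to t4, out) and the function names are hypothetical and do not represent the matK topology.

```python
# Hypothetical toy tree stored as child -> parent. Labels are placeholders,
# NOT nodes or taxa from the study.
PARENT = {
    "n1": "root", "out": "root",
    "n2": "n1", "t4": "n1",
    "t1": "n2", "n3": "n2",
    "t2": "n3", "t3": "n3",
}


def ancestors(node):
    """Path from a node up to the root, starting with the node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def mrca(a, b):
    """Most recent common ancestor of two nodes."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes do not share an ancestor")


def clade_tips(node):
    """All tips descending from a node (the node-based clade)."""
    internal = set(PARENT.values())
    every_node = set(PARENT) | internal
    return {n for n in every_node
            if n not in internal and node in ancestors(n)}


# Clade defined by the two specifiers t1 and t3 in the toy tree.
anchor = mrca("t1", "t3")
print(anchor)                      # n2
print(sorted(clade_tips(anchor)))  # ['t1', 't2', 't3']
```

Note that t2 is included in the clade although it is not a specifier, while t4 and out fall outside it; membership is determined entirely by descent from the anchoring node.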

Hologalegina
The robinioids and the IRLC comprise the largest of the well-marked papilionoid subclades, Hologalegina (Wojciechowski et al., 2000 ). This clade includes over 4800 species that make up the vast majority of legumes presently distributed in temperate regions of the world. The robinioids-IRLC dichotomy had been detected in studies with rbcL (Doyle et al., 1997 ; Kajita et al., 2001 ) but resolution was weakly supported. Furthermore, the placement of Bolusanthus (Sophoreae) as the sister group to Robinieae in those studies stands in marked disagreement with the results presented here and in most other studies that consistently place Bolusanthus within the genistoids (Lavin et al., 1998 ; Hu et al., 2000 , 2002