Genetics
Section of Integrative Biology and Institute of Cellular and Molecular Biology, The University of Texas at Austin, 1 University Station C0930, Austin, Texas 78712 USA; DOE Joint Genome Institute and Lawrence Berkeley National Laboratory, 2800 Mitchell Drive, Walnut Creek, California, 94598 USA; Department of Integrative Biology, University of California, 3060 Valley Life Sciences Building #3140, Berkeley, California 94720 USA; Genome Project Solutions, 1024 Promenade Street, Hercules, California 94547 USA
Received for publication July 5, 2006. Accepted for publication January 5, 2007.
ABSTRACT
We have sequenced two complete chloroplast genomes in the Asteraceae, Helianthus annuus (sunflower), and Lactuca sativa (lettuce), which belong to the distantly related subfamilies, Asteroideae and Cichorioideae, respectively. The Helianthus chloroplast genome is 151 104 bp and the Lactuca genome is 152 772 bp long, which is within the usual size range for chloroplast genomes in flowering plants. When compared to tobacco, both genomes have two inversions: a large 22.8-kb inversion and a smaller 3.3-kb inversion nested within it. Pairwise sequence divergence across all genes, introns, and spacers in Helianthus and Lactuca has resulted in the discovery of new, fast-evolving DNA sequences for use in species-level phylogenetics, such as the trnY-rpoB, trnL-rpl32, and ndhC-trnV spacers. Analysis and categorization of shared repeats resulted in seven classes useful for future repeat studies: double tandem repeats, three or more tandem repeats, direct repeats dispersed in the genome, repeats found in reverse complement orientation, hairpin loops, runs of A's or T's in excess of 12 bp, and gene or tRNA similarity. Results from BLAST searches of our genomic sequence against expressed sequence tag (EST) databases for both genomes produced eight likely RNA edited sites (C→U changes). These detailed analyses in Asteraceae contribute to a broader understanding of plastid evolution across flowering plants.
Key Words: Asteraceae chloroplast DNA comparative genomics divergent sequence genomic repeats Helianthus annuus Lactuca sativa RNA editing
Asteraceae is the second largest family of plants, with over 20 000 species (Bremer, 1994). For the past two decades, numerous phylogenetic studies using chloroplast DNA sequence data have contributed to our understanding of the evolutionary relationships within this family. These include comparisons of the chloroplast genes rbcL (Kim et al., 1992) and ndhF (Kim and Jansen, 1995), as well as noncoding DNA from the trnL intron plus the trnL-trnF intergenic spacer (Jansen and Kim, 1994; Bayer and Starr, 1998), matK (Denda et al., 1999), and with lesser resolution, psbA-trnH (Kim et al., 1999). This research culminated in a study by Panero and Funk (2002) that used over 13 000 bp per taxon for the largest family-wide classification revision of Asteraceae in over a hundred years. Still, many uncertainties remain with regard to species, generic, and tribal level relationships. It would be very useful to have more information on the relative rates of sequence evolution among the Asteraceae plastid genes and on genome organization as a potential set of characters to help guide future phylogenetic studies.
To contribute to this area of research, we report two complete chloroplast genome sequences from members of the Asteraceae, Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively (Panero and Funk, 2002). In addition to these chloroplast genomes, there are only two other published chloroplast genome sequences for any plant within the large group euasterids II: Panax ginseng (Araliaceae) (Kim and Lee, 2004) and Daucus carota (Apiaceae) (Ruhlman et al., 2006).
Early chloroplast genome mapping studies demonstrated that Helianthus annuus and Lactuca sativa share a 22.8-kb inversion relative to members of the subfamily Barnadesioideae (Heyraud et al., 1987; Jansen and Palmer, 1987a, b). By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to Barnadesioideae. A later mapping study (Knox et al., 1993) and subsequent sequencing study (Kim et al., 2005) found that taxa that share this 22.8-kb inversion also contain within this region a second, smaller, 3.3-kb inversion.
The complete chloroplast genome sequences of Helianthus and Lactuca enable analysis of repeat patterns in the genomes and of RNA editing by comparison to available expressed sequence tag (EST) sequences. In addition, because both of these genomes are from crop plants, their sequences will facilitate development of chloroplast genetic engineering technology as demonstrated in recent studies by Daniell and colleagues (Daniell et al., 2004, 2005; Ruiz and Daniell, 2005; Saski et al., 2005; Lee et al., 2006). Knowing the exact sequence of spacer regions is crucial for introducing transgenes into the chloroplast genome (Daniell et al., 2005). From a broader perspective, these two genomes will enable Asteraceae, the second largest plant family, to be included in larger analyses of chloroplast genome rearrangement and rates of gene evolution across flowering plants. This is important because plastid genomes are uniform enough to permit comparative studies across flowering plants, but divergent enough to capture interesting evolutionary genomic events. To understand these larger processes, it is first necessary to perform smaller, more detailed comparative studies, which will then inform our understanding of genome evolution on a broader scale.
MATERIALS AND METHODS
Chloroplast isolation, amplification, and sequencing
Fresh leaf material from Lactuca (Lactuca sativa strain Salinas) and Helianthus (Helianthus annuus line HA383) was used for the chloroplast isolation. These strains are the same ones used in the EST and nuclear genome sequencing efforts of the Compositae Genome Project (Michelmore et al., 2006). Chloroplasts were isolated from the fresh leaves by the sucrose-gradient method (Palmer, 1986). They were then lysed and amplified using the REPLI-g whole genome amplification kit (Molecular Staging, New Haven, Connecticut, USA). The product was then digested with EcoRI and BstBI, and the clear banding pattern ensured that the amplification product was indeed chloroplast and not nuclear DNA. A detailed description of these steps is outlined in Jansen et al. (2005). Purified cpDNA was sheared by serial passage through a narrow aperture using a Hydroshear device (Gene Machines, Genomic Solutions, Ann Arbor, Michigan, USA). These fragments were enzymatically treated to repair blunt ends, were gel purified, and then ligated into pUC18 plasmids. These clones were introduced into E. coli by electroporation, plated onto nutrient agar with antibiotic selection, and grown overnight. Colonies were randomly selected and robotically processed through rolling circle amplification of plasmid clones, sequencing reactions using BigDye chemistry (Applied Biosystems, Foster City, California, USA), reaction cleanup using solid-phase reversible immobilization, and sequencing determination using an ABI 3730 XL automated DNA sequencer (Applied Biosystems). Detailed protocols are available at http://www.jgi.doe.gov/sequencing/protocols/protsproduction.html.
Genome assembly and annotation
Sequences from randomly chosen clones were processed using the computer program phred and assembled based on overlapping sequences into a draft genome sequence using the program phrap (Ewing and Green, 1998). Quality of sequence determination and assembly was verified by eye using the program Consed (Gordon et al., 1998). PCR and sequencing at The University of Texas at Austin were used to bridge gaps and mend low-quality areas of the genome. Additional sequences were added until a completely contiguous consensus was created representing the entire cpDNA. Throughout the entire consensus, we verified that all regions had a quality of Q40 or greater and included at least two overlapping reads. For both Lactuca and Helianthus, most of the genome far exceeds these minimum requirements. The beginning of each genome was standardized for gene annotation to be the first base pair after the IRa (in this case both started right before trnH). The program DOGMA [Dual Organellar GenoMe Annotator (Wyman et al., 2004)] was used to assist in fully annotating all genes and to identify coding sequence, rRNAs, and tRNAs using the plastid/bacterial genetic code.
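As a rough illustration of the finishing thresholds above, the sketch below converts a phred quality score to its implied base-call error probability (Q = -10 log10 P, so Q40 corresponds to about one error in 10 000 bases) and checks a consensus position against the Q40 and two-read minimums. The function names are hypothetical and are not part of phred, phrap, or Consed.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a phred quality score Q into the estimated error probability P,
    using the standard definition Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def passes_finishing_criteria(quality: int, read_depth: int,
                              min_quality: int = 40, min_depth: int = 2) -> bool:
    """Return True when a consensus position meets both thresholds described
    in the text: quality of at least Q40 and at least two overlapping reads.
    (Hypothetical helper; the thresholds are the only values taken from the text.)"""
    return quality >= min_quality and read_depth >= min_depth


if __name__ == "__main__":
    print(phred_to_error_probability(40))
    print(passes_finishing_criteria(quality=52, read_depth=6))
```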
Calculating sequence divergence
The whole genome sequence and annotation of Lactuca and Helianthus were compared to the reference genome, Nicotiana tabacum, by a percent identity plot produced by the program MultiPipMaker (Schwartz et al., 2000). The individual genes, rRNAs, tRNAs, introns, and intergenic spacers were also exported from both genomes in DOGMA and aligned by hand in MacClade (Maddison and Maddison, 2002) for a more detailed quantification of sequence divergence. Because we only compared two genomes, we quantified sequence divergence as the proportion (p) of aligned nucleotide sites within a specified region that are different (p-distance). A Perl script was written to call PAUP* (Swofford, 2003) on each nexus file, calculate the p-distance for each region, and write the results to a tab-delimited file. Indels were counted by hand-aligning each pair of genes and then counting the number of gaps in the alignment.
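The p-distance and gap count described above can be sketched in a few lines. This is an illustrative Python version, not the Perl/PAUP* workflow used in the study; it assumes two pre-aligned sequences of equal length, ignores columns containing a gap when computing p, and counts each contiguous run of gaps in one sequence as a single indel. Both choices are assumptions.

```python
def p_distance(seq_a: str, seq_b: str, gap_chars: str = "-?") -> float:
    """Proportion of compared (ungapped) aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return different / compared


def count_indels(seq_a: str, seq_b: str, gap_char: str = "-") -> int:
    """Count indel events, treating each contiguous run of gap columns
    in one sequence as a single event."""
    events = 0
    previous = None  # which sequence carried the gap in the previous column
    for a, b in zip(seq_a, seq_b):
        if a == gap_char and b != gap_char:
            current = "a"
        elif b == gap_char and a != gap_char:
            current = "b"
        else:
            current = None
        if current is not None and current != previous:
            events += 1
        previous = current
    return events


if __name__ == "__main__":
    print(p_distance("ACGTACGT", "ACGTACGA"))
    print(count_indels("AC--TACG-T", "ACGGTA-GTT"))
```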
Examination of repeat structure
REPuter (Kurtz et al., 2001) is a widely used program that identifies repeated sequences in genomes; however, there are two issues that skew repeat results when using the program. One is the use of Hamming distance (HD) as the measure of similarity between repeated sequences. This is a fixed parameter that only allows one user-defined number of differences per repeat, which is the same regardless of length. In effect, this inflates the number of smaller repeats found in the genome because a greater percentage of differences is allowed for smaller repeats. The second issue is that REPuter finds overlapping repeats, which overestimates the number of actual repeats present. We solved these problems using the program Comparative Repeat Analysis (CRA) (N. Holtshulte and S. K. Wyman at Williams College, http://bugmaster.jgi-psf.org/repeats/, unpublished program) that runs and filters REPuter output, identifying the shared and unique repeats among the input genomes. We used CRA for both Lactuca and Helianthus genomes and compared them to the reference genome, Nicotiana tabacum. The following constraints were set in CRA to solve the first issue of HD as a measure of similarity: (1) minimum repeat size of 21 bp, and (2) 90% or greater sequence identity for each 10-bp bin (i.e., HD was set to 2 for 21–30 bp, HD = 3 for 31–40 bp, HD = 4 for 41–50 bp, etc., until no further repeats were found). The second issue of reporting overlapping repeats is solved by CRA sifting through the REPuter output and excluding repeats contained within others. For time reasons, only repeats above 22 bp were examined by eye and placed into author-defined repeat categories.
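The two filtering rules just described can be illustrated with a small sketch. CRA itself is unpublished, so the code below is not its implementation; it only encodes the stated constraints (HD = 2 for 21–30 bp, 3 for 31–40 bp, and so on) and one plausible reading of "excluding repeats contained within others". The tuple layout `(start_of_copy_1, start_of_copy_2, length)` and all function names are assumptions made for the example.

```python
def allowed_hamming_distance(repeat_length: int, min_length: int = 21) -> int:
    """Maximum number of mismatches permitted for a repeat of this length,
    following the 10-bp bins in the text: 21-30 bp -> 2, 31-40 bp -> 3, ..."""
    if repeat_length < min_length:
        raise ValueError(f"repeats shorter than {min_length} bp are not considered")
    return (repeat_length - 1) // 10


def is_contained(inner, outer) -> bool:
    """True if both copies of `inner` lie entirely within the matching copies of `outer`."""
    s1, s2, n = inner
    t1, t2, m = outer
    return (inner != outer
            and t1 <= s1 and s1 + n <= t1 + m
            and t2 <= s2 and s2 + n <= t2 + m)


def drop_nested_repeats(repeats):
    """Keep only repeats that are not contained within another reported repeat."""
    return [r for r in repeats if not any(is_contained(r, o) for o in repeats)]


if __name__ == "__main__":
    hits = [(100, 500, 40), (105, 505, 25), (900, 1200, 23)]  # made-up coordinates
    print([allowed_hamming_distance(n) for _, _, n in hits])
    print(drop_nested_repeats(hits))
```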
Variation between coding sequences and cDNAs
Expressed sequence tags (ESTs) for both Lactuca and Helianthus were downloaded from two different databases: the Compositae Genome Project Database (CGPDB) (Michelmore et al., 2006) and the TIGR Gene Index Database (TIGR, 2005). The complete set of coding sequences from our direct sequencing of Lactuca and Helianthus was searched for similarity by BLAST against their respective EST databases. Significant hits with an E-value of 1e-10 or below were examined by eye for base-pair differences and summarized in a table as possible RNA edited sites.
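The by-eye comparison of genomic coding sequence against EST hits amounts to sorting each mismatch into "possible C→U edit" or "other difference". A minimal sketch of that sorting step follows; it is not the authors' procedure in code form. It assumes the two sequences are already aligned on the coding strand, that U appears as T in the cDNA-derived EST, and that gapped or ambiguous (N) columns are simply skipped. The function name is hypothetical.

```python
def classify_differences(genomic: str, est: str):
    """Split mismatches between an aligned genomic CDS and an EST into
    candidate C->U edits (genomic C, EST T/U) and all other substitutions.
    Positions are 1-based alignment columns."""
    if len(genomic) != len(est):
        raise ValueError("aligned sequences must have equal length")
    candidates, others = [], []
    for pos, (g, e) in enumerate(zip(genomic.upper(), est.upper()), start=1):
        if g == e or "-" in (g, e) or "N" in (g, e):
            continue  # identical, gapped, or ambiguous column
        record = (pos, g, e)
        if g == "C" and e in ("T", "U"):
            candidates.append(record)
        else:
            others.append(record)
    return candidates, others


if __name__ == "__main__":
    edits, other = classify_differences("ATGCCGTCA", "ATGCTGTAA")
    print(edits)
    print(other)
```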
RESULTS
Size, gene content, order, and organization
The Lactuca chloroplast genome (DQ383816) was 152 772 bp in length (Fig. 1) and contained a pair of inverted repeats (IRs) of 25 034 bp each, separated by large and small single-copy (LSC and SSC) regions of 84 105 bp and 18 599 bp, respectively. The Helianthus chloroplast genome (NC_007977) was 151 104 bp in length, with IRs of 24 633 bp each, separated by an LSC of 83 530 bp and an SSC of 18 308 bp. The G+C content of both Helianthus and Lactuca was 38% across the whole cp genome. Gene content and arrangement were identical in both cpDNAs. They also shared one large (Inv 1) and one small inversion (Inv 2) with respect to Nicotiana tabacum. There were 81 unique protein-coding genes in both genomes, six of which were duplicated in the IR. The four rRNA genes were contained completely within the IR, so they were doubled in the genome. There were 29 unique tRNA genes, of which seven were in the IR, which brought the total number to 36 in the genome. There were 18 unique intron-containing genes, five of which were duplicated in the IR; 16 genes had a single intron, and two genes had two introns.
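The reported region sizes can be checked against the total genome lengths, since a quadripartite chloroplast genome is the LSC plus the SSC plus two copies of the IR. The short sketch below does that arithmetic with the figures given in the text; the dictionary layout and function name are only illustrative.

```python
# Region sizes (bp) as reported in the text.
GENOMES = {
    "Lactuca sativa": {"total": 152_772, "lsc": 84_105, "ssc": 18_599, "ir": 25_034},
    "Helianthus annuus": {"total": 151_104, "lsc": 83_530, "ssc": 18_308, "ir": 24_633},
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length: one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


def total_trna_genes(unique: int, in_ir: int) -> int:
    """Genes located in the IR are present twice in the genome."""
    return unique + in_ir


if __name__ == "__main__":
    for name, g in GENOMES.items():
        computed = quadripartite_length(g["lsc"], g["ssc"], g["ir"])
        print(name, computed, computed == g["total"])
    print("tRNA genes in genome:", total_trna_genes(unique=29, in_ir=7))
```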
Repeat analysis
Because the raw REPuter (Kurtz et al., 2001) output contains many redundant repeats, we used the filtering program Comparative Repeat Analysis (CRA) (N. Holtshulte and S. K. Wyman, Williams College, unpublished data), which identifies and excludes repeats that are contained entirely within other repeats. CRA also identifies shared repeats by using BLAST similarity searches to locate each repeat in the other input genomes. The direct output of the CRA analysis is found in Fig. 4A. Most of the repeats were less than 40 bp, with only two larger than 90 bp. Only repeats that are 23 bp or larger were examined by eye for both Helianthus and Lactuca. Because we were interested in the role of repeats in genome organization, we attempted to categorize these repeats and arrived at seven classes (Fig. 4B): (1) three or more tandem repeats, (2) direct repeats dispersed in the genome, (3) repeats found in reverse complement orientation dispersed in the genome, (4) hairpin loops with a predicted secondary structure based on mfold (Zuker, 2003) (palindromic repeats), (5) double tandem repeats, (6) runs of A's or T's in excess of 12 bp (no runs of G or C of those lengths were found), and (7) repeats of tRNAs (e.g., similarity between trnS-GCU and trnS-UGA) or portions of protein-encoding genes. Figure 4C shows the updated, more accurate histogram of repeat frequency after recategorizing the repeats by length. For example, the four largest repeats from Fig. 4A were actually composed of smaller tandem repeats, so these were reclassified with their shorter length. In comparison to Fig. 4A, there are far fewer large repeats. This number decreased further when we recognized that two of our categories were not "real" repeats for our purposes: gene similarity and tRNA repeats provide evidence of gene duplication, which is shared among most land plants, and poly-A and poly-T runs are actually simple sequence repeats (SSRs). Figure 4D omits these categories of repeats and identifies which of the remaining ones were shared and unique among the genomes. Only two were shared by Nicotiana plus both Asteraceae genomes, four repeats were shared only among Asteraceae genomes, and the rest were unique to Helianthus or Lactuca. The two repeats that were shared among Helianthus, Lactuca, and Nicotiana were as follows: a 32-bp tandem repeat in the rrn4.5-rrn5 spacer and a 42-bp repeat that occurred in the second intron of ycf3, the ndhA intron, and the rps12-ycf15 intergenic spacer. Most repeats were found in noncoding DNA (Fig. 4E). When corrected for the number of regions, the frequency of repeats in introns and in spacers was almost identical: 3 repeats/18 introns vs. 17 repeats/112 spacers = 0.167 vs. 0.152, respectively. A table with more specific repeat information is located in Appendix S2 (see Supplemental Data accompanying online version of this article).
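The intron versus spacer comparison at the end of this paragraph is a simple per-region rate. As a quick check of that arithmetic, the sketch below divides the repeat counts by the number of introns and spacers given in the text (roughly 0.17 vs. 0.15 repeats per region); the function name is hypothetical.

```python
def repeats_per_region(repeat_count: int, region_count: int) -> float:
    """Average number of repeats per region of a given type."""
    if region_count <= 0:
        raise ValueError("region_count must be positive")
    return repeat_count / region_count


if __name__ == "__main__":
    intron_rate = repeats_per_region(3, 18)    # 3 repeats across 18 introns
    spacer_rate = repeats_per_region(17, 112)  # 17 repeats across 112 spacers
    print(f"introns: {intron_rate:.3f}  spacers: {spacer_rate:.3f}")
```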
C→U changes, which are thought to be conventional angiosperm RNA editing changes (Hirose et al., 1999).
Genome organization
Although the Helianthus and Lactuca chloroplast genomes are identical in gene content and arrangement, they differ in length. Some of this length difference is due to the length difference in the IRs: the Lactuca IR is 401 bp longer than the Helianthus IR. Even though the Lactuca genome IR is longer, the Helianthus IR extends further into the genes at both its margins relative to Lactuca by 146 bp total. The Helianthus IR extends an additional 105 bp into the coding region of ycf1 compared with Lactuca and an additional 41 bp into rps19. The general boundaries of the Asteraceae IRs (i.e., within ycf1 and rps19) are similar to others reported, although the exact extent into the single-copy genes varies among other published genomes, such as Glycine max, Nicotiana tabacum, Gossypium hirsutum, Eucalyptus globulus, and Panax ginseng (Wakasugi et al., 1998; Kim and Lee, 2004; Saski et al., 2005; Steane, 2005; Lee et al., 2006).
There is a significant length difference between the IR regions in the Helianthus and Lactuca chloroplast genomes due to a large gene deletion. The genic IR boundaries are expanded in Helianthus, but the overall length of its IR is shorter than Lactuca's due to a deletion of 456 bp in ycf2. This 152 amino-acid (aa) deletion is relative to Lactuca and Nicotiana. The gene ycf2 is absent from the chloroplast genomes of some species (Millen et al., 2001), notably the monocot grasses maize, rice, and sugarcane (Maier et al., 1995; Matsuoka et al., 2002; Asano et al., 2004). However, knockout studies of ycf2 have confirmed it as an essential chloroplast gene for survival in Nicotiana tabacum (Drescher et al., 2000). If loss of ycf2 function is similarly deleterious in all dicots, then the Helianthus copy must still be functional, because Helianthus continues to exist with this deletion. No studies on Helianthus have looked at the possible transfer of this gene to the nucleus. Other supporting evidence that the ycf2 gene in Helianthus is functional is that the rest of the gene is highly conserved relative to the Lactuca copy, with only 1.31% sequence divergence. If the large deletion in the Helianthus copy rendered it a pseudogene, we would expect there to be higher sequence divergence and/or internal stop codons unless the deletion were very recent. Early RFLP studies identified a deletion of similar size and location (Schilling and Jansen, 1989, 1997), which was shown to be derived within subtribe Helianthinae. Once a well-resolved subtribe phylogeny is available, the exact timing of this deletion can be better determined.
Other differences between the chloroplast genomes occur with respect to gene length. In Helianthus, the start codon in the accD gene occurs 15 aa further into the gene than it does in Lactuca, a position that matches the annotation in Lotus and Arabidopsis. Lactuca also has a 25-aa insertion in the middle of the accD gene. As with ycf2, we assume the gene is still functional because sequence divergence is otherwise low across the rest of the gene. There are a few other instances where the lengths of genes differ (matK, rbcL, rpl22, rpl33, rpoC2, ycf1, ycf15) by a few amino acids, but the majority of genes between Helianthus and Lactuca have no indels. Indels are even rarer in the tRNAs: one involves a 5-bp deletion in trnS-UGA that is shared between Helianthus and Lactuca, and two others are a 1-bp indel in both trnV-UAC and trnI-GAU. We assume these events do not affect tRNA function for similar reasons to those stated earlier.
Our exploration of using previously published EST databases as a comparative tool against our direct genomic sequence gave us some unexpected results. From our experience, we recommend that users of online EST databases be wary of basing detailed conclusions on sequences without accompanying quality scores. None of the possible C→U editing sites are shared between Helianthus and Lactuca (Table 3), nor are they shared with other published angiosperm RNA edited sites (Tsudzuki et al., 2001). Although edited sites can be shared among distantly related taxa (Hirose et al., 1999), they might be more recently derived as seen in other studies (Tsudzuki et al., 2001). The other bp differences are not considered editing sites because only C→U changes have been reported in angiosperms (Tsudzuki et al., 2001). This finding is interesting because, at least for the CGPDB database, the ESTs were made from the exact same strain of plant as was used in the chloroplast genome sequencing. These differences could be due either to intraspecific polymorphisms or to low-quality sequence in the ESTs (our stringent phred/phrap requirement across the genomic sequence makes it very unlikely that low-quality sequence could be present in the genomic sequence). Both Daniell et al. (2006) and Lee et al. (2006) showed a similar pattern of intraspecific polymorphism between DNA and EST sequences, and in Lee et al.'s case only two of 11 polymorphisms were C→U edits. Because the CGPDB posts the raw data along with the EST contigs, we checked the chromatograms for the Helianthus gene, psbC, which had two indels present in the EST sequence. Both indels in this gene were miscalled peaks resulting from low-quality sequence data. This also calls into question any base-pair difference, including the C→U changes, between the ESTs and our genomic sequence. The TIGR database does not post the raw data so we could not determine the authenticity of polymorphisms from this database. For this reason, we estimated the expected number of C→U changes given the number of polymorphic sites we collected. If incorrect base calls occur at random, we would expect only 1/12 of them to be C→U changes. To get an unbiased estimate of base-pair differences, we added up the non-C→U changes between genomic and EST sequences, which totaled 80. Therefore the expected number of C→U changes, if they occurred by chance, is 7.3 (80 non-C→U changes / 11 possible changes). We had 16 C→U changes, so we can estimate that on average seven to eight of our possible C→U editing sites are probably due to low-quality reads and eight to nine are possible RNA edited sites. From this experience, we recommend that users of online EST databases exercise caution in using these types of sequence databases without quality scores.
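The error estimate above can be reproduced directly. With four nucleotides there are 12 possible directed substitutions, of which C→U is one; the 80 observed non-C→U differences are spread over the other 11 types, giving an expected background of 80/11 per type. The sketch below works through those numbers. The counts come from the text, while the assumption that miscalls are equally likely across all substitution types is the same simplification the text makes; the function names are hypothetical.

```python
NUCLEOTIDES = "ACGT"

# All directed substitution types (A->C, A->G, ...); there are 4 * 3 = 12.
SUBSTITUTION_TYPES = [(a, b) for a in NUCLEOTIDES for b in NUCLEOTIDES if a != b]


def expected_background_c_to_u(non_c_to_u_changes: int) -> float:
    """Expected number of C->U differences caused by random miscalls,
    estimated from the rate observed in the other 11 substitution types."""
    other_types = len(SUBSTITUTION_TYPES) - 1
    return non_c_to_u_changes / other_types


def likely_edited_sites(observed_c_to_u: int, non_c_to_u_changes: int) -> float:
    """Observed C->U changes minus the expected random background."""
    return observed_c_to_u - expected_background_c_to_u(non_c_to_u_changes)


if __name__ == "__main__":
    background = expected_background_c_to_u(80)
    print(f"expected by chance: {background:.1f}")
    print(f"possible true edits: {likely_edited_sites(16, 80):.1f}")
```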
Evolutionary implications
Past analyses of repeated sequences in chloroplast genomes have focused primarily on simple sequence repeats (SSRs) (Powell et al., 1995; Marshall et al., 2001; Provan et al., 2001), which are useful for population-level studies. However, tools for identifying and summarizing larger and more complex repeats have only recently emerged, as recent studies showed that such repeats are associated with rearranged genomes (Hupfer et al., 2000; Kim et al., 2005; Saski et al., 2005). We have attempted to place these larger repeats into classes instead of lumping them all together. This will make future comparative repeat studies much more direct and informative. We showed that REPuter vastly overestimates the number of repeats, and even with helpful filters like CRA, the number of larger repeats is still inflated (see Fig. 4A vs. 4C).
Because repeats have been implicated in the rearrangement of chloroplast genomes, we looked for them at our three rearrangement endpoints (Table 4). The 31-bp repeat at positions 12 333 and 31 010 in the Helianthus genome is close to the second and third rearrangement endpoints, respectively, although the copy at coordinate 31 010 is 173 bp away from the third endpoint, which is somewhat farther than the other copy is from the second endpoint. None of the other repeats stand out as being correlated with the rearrangement. Our analysis only looked at repeats of 23 bp and larger, so further examination of smaller repeats might reveal a higher density of repeats in this area. Another possible explanation for the lack of repeats associated with rearrangement endpoints may relate to the presence of tRNAs flanking all three of our rearrangement endpoints. Other researchers have noticed this association (Hiratsuka et al., 1989) and have hypothesized that tRNA-associated recombination, rather than repeats, may facilitate large inversions.
C→U changes that are possible RNA edited sites. From our calculations of error, due to the database's poor sequence quality, we estimate that about half of these are possible edited sites. This finding is important and alerts the community to the potential problems of using EST databases for fine-scale analyses. Finally, our analyses of sequence divergence between these two Asteraceae genomes identify the fastest evolving genomic regions, both coding and noncoding. This provides the plant systematic community working in this family with regions to target for phylogenetic analyses. Also, from a broader perspective, these two genomes will enable Asteraceae, the second largest plant family, to be included in broader analyses of plastid evolution, from genome rearrangement to rates of gene evolution across all plastid-containing organisms. Ultimately, a comprehensive understanding of the genomic contribution that a plastid provides its plant host will be of great value to the plant research community.
FOOTNOTES
1 The authors thank R. Linder, B. Simpson, and the reviewers for providing valuable comments on this manuscript. Leaf material was provided by S. Knapp (Helianthus annuus) and R. Michelmore (Lactuca sativa). Z. Cai assisted with the perl scripts. This research was supported in part by a grant from the National Science Foundation (DEB 0120709) and the Sidney F. and Doris Blake Centennial Professorship in Systematic Botany to R.K.J. and a National Science Foundation IGERT grant (0114387) to R.E.T. Part of this work was performed under the auspices of the U.S. Department of Energy, Office of Biological and Environmental Research, by the University of California, Lawrence Berkeley National Laboratory, under contract no. DE-AC02-05CH11231.
2 Author for correspondence (e-mail: retimme{at}mail.utexas.edu)
LITERATURE CITED
Asano T. Tsudzuki T. Takahashi S. Shimada H. Kadowaki K.. 2004. Complete nucleotide sequence of the sugarcane (Saccharum officinarum) chloroplast genome: a comparative analysis of four monocot chloroplast genomes. DNA Research 11: 93-99.[Abstract]
Bayer R. J. Greber D. G. Bagnall N. H.. 2002. Phylogeny of Australian Gnaphalieae (Asteraceae) based on chloroplast and nuclear sequences, the trnL intron, trnL/trnF intergenic spacer, matK, and ETS. Systematic Botany 27: 801-814.
Bayer R. J. Starr J. R.. 1998. Tribal phylogeny of the Asteraceae based on two non-coding chloroplast sequences, the trnL intron and trnL/trnF intergenic spacer. Annals of the Missouri Botanical Garden 85: 242-256.
Bremer K.. 1994. Asteraceae: cladistics and classification Timber Press, Portland, Oregon, USA.
Daniell H. Dhimgra A. Ruiz O.. 2004. Chloroplast genetic engineering to confer desired plant traits. Methods in Molecular Biology 286: 111-137.
Daniell H. Kumar S. Ruiz O.. 2005. Breakthrough in chloroplast genetic engineering of agronomically important crops. Trends in Biotechnology 23: 238-245.[CrossRef][ISI][Medline]
Daniell H. Lee S.-B. Grevich J. Saski C. Quesada-Vargas T. Guda C. Tomkins J. Jansen R. K.. 2006. Complete chloroplast genome sequences of Solanum bulbocastanum, Solanum lycopersicum and comparative analyses with other Solanaceae genomes. Theoretical and Applied Genetics 112: 1503-1518.[CrossRef][ISI][Medline]
Denda T. Watanabe K. Kosuge K. Yahara T. Ito M.. 1999. Molecular phylogeny of Brachycome (Asteraceae). Plant Systematics and Evolution 217: 299-311.
Drescher A. Ruf S. Calsa T. Carrer H. Bock R.. 2000. The two largest chloroplast genome-encoded open reading frames of higher plants are essential genes. Plant Journal 22: 97-104.[CrossRef][ISI][Medline]
Ewing B. Green P.. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research 8: 186-194.
Funk V. A. Chan R. Keeley S. C.. 2004. Insights into the evolution of the tribe Arctoteae (Compositae: subfamily Cichorioideae s.s.) using trnL-F, ndhF, and ITS. Taxon 53: 637-655.
Gordon D. Abajian C. Green P.. 1998. Consed: a graphical tool for sequence finishing. Genome Research 8: 195-202.
Heyraud F. Serror P. Kuntz M. Steinmetz A. Heizmann P.. 1987. Physical map and gene localization on sunflower (Helianthus annuus) chloroplast DNA: evidence for an inversion of a 32.5-kbp segment in the large single copy region. Plant Molecular Biology 9: 485-496.
Hiratsuka J. Shimada H. Whittier R. Ishibashi T. Sakamoto M. Mori M. Kondo C. Honji Y. Sun C. R. Meng B. Y. Li Y. Q. Kanno A. Nishizawa Y. Hirai A. Shinozaki K. Sugiura M.. 1989. The complete sequence of the rice (Oryza sativa) chloroplast genomeintermolecular recombination between distinct transfer-RNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Molecular & General Genetics 217: 185-194.
Hirose T. Kusumegi T. Tsudzuki T. Sugiura M.. 1999. RNA editing sites in tobacco chloroplast transcripts: editing as a possible regulator of chloroplast RNA polymerase activity. Molecular Biology and Evolution 262: 462-467.
Hupfer H. Swaitek M. Hornung S. Herrmann R. G. Maier R. M. Chiu W. L. Sears B.. 2000. Complete nucleotide sequence of the Oenothera elata plastid chromosome, representing plastome I of the five distinguishable Euoenthera plastomes. Molecular Genetics and Genomics 263: 581-585.
Jansen R. K. Kim K. J.. 1994. Implications of chloroplast DNA data for the classifications and phylogeny of the Asteraceae. Compositae: Systematics, Proceedings of the International Compositae Conference, Kew 1: 317-339.
Jansen R. K. Palmer J. D.. 1987a. A chloroplast DNA inversion marks an ancient evolutionary split in the sunflower family (Asteraceae). Proceedings of the National Academy of Sciences, USA 84: 5818-5822.
Jansen R. K. Palmer J. D.. 1987b. Chloroplast DNA from lettuce and Barnadesia (Asteraceae): structure, gene localization, and characterization of a large inversion. Current Genetics 11: 553-564.
Jansen R. K. Raubeson L. A. Boore J. L. de Pamphilis C. W. Chumley T. W. Haberle R. C. Wyman S. K. Alverson A. J. Peery R. Herman S. J. Fourcade H. M. Kuehl J. McNeal J. R. Leebens-Mack J. Cui L.. 2005. Methods for obtaining and analyzing whole chloroplast genome sequences. Methods in Enzymology 348-384.
Kim H.-G. Choi K.-S. Jansen R. K.. 2005. Two chloroplast DNA inversions originated simultaneously during the early evolution of the sunflower family (Asteraceae). Molecular Biology and Evolution 22: 1-10.
Kim K.-J. Jansen R. K.. 1995. ndhF sequence evolution and the major clades in the sunflower family. Proceedings of the National Academy of Sciences, USA 92: 10379-10383.
Kim K. J. Jansen R. K. Wallace R. S. Michaels H. H. Palmer J. D.. 1992. Phylogenetic implications of rbcL sequence variation in the Asteraceae. Annals of the Missouri Botanical Garden 79: 428-445.
Kim K. J. Lee H. L.. 2004. Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 vascular plants. DNA Research 11: 247-261.[Abstract]
Kim S. C. Crawford D. J. Jansen R. K. Santos-Guerra A.. 1999. The use of a non-coding region of chloroplast DNA in phylogenetic studies of the subtribe Sonchinae (Asteraceae: Lactuceae). Plant Systematics and Evolution 215: 85-99.
Knox E. B. Downie S. R. Palmer J. D.. 1993. Chloroplast genome rearrangements and the evolution of giant Lobelias from herbaceous ancestors. Molecular Biology and Evolution 10: 414-430.[ISI]
Kurtz S. Choudhuri J. V. Ohlebusch E. Schleiermacher C. Stoye J. Giegerich R.. 2001. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Research 29: 4633-4642.
Lee S.-B. Kaittanis C. Jansen R. K. Hostetler J. B. Tallon L. J. Town C. D. Daniell H.. 2006. The complete chloroplast genome sequence of Gossypium hirsutum: organization and phylogenetic relationships to other angiosperms. BMC Genomics 7: 61.[CrossRef][Medline]
Maddison D. R. Maddison W. P.. 2002. MacClade: analysis of phylogeny and character evolution, 4.05 Sinauer, Sunderland, Massachusetts, USA.
Maier R. M. Neckermann K. Igloi G. L. Kossel H.. 1995. Complete sequence of the maize chloroplast genome: gene content, hotspots of divergence and fine tuning of genetic information by transcript editing. Journal of Molecular Biology 251: 614-628.
Marshall H. D. Newton C. Ritland K.. 2001. Sequence-repeat polymorphisms exhibit the signature of recombination in lodgepole pine chloroplast DNA. Molecular Biology and Evolution 18: 2136-2138.
Matsuoka Y. Yamazaki Y. Ogihara Y. Tsunewaki K.. 2002. Whole chloroplast genome comparison of rice, maize, and wheat: implications for chloroplast gene diversification and phylogeny of cereals. Molecular Biology and Evolution 19: 2084-2091.
Michelmore R. Knapp S. J. Bradford K. J. Rieseberg L. H. Jackson L. E. Kesseli R. V. Compositae Genome Project Database Website http://cgpdb.ucdavis.edu/sitemap.html [accessed November 2005].
Millen R. S. Olmstead R. G. Adams K. L. Palmer J. D. Lao N. T. Heggie L. Kavanagh T. A. Hibberd J. M. Gray J. C. Morden C. W. Calie P. J. Jermiin L. S. Wolfe K. H.. 2001. Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus. Plant Cell 13: 645-658.
Palmer J. D.. 1986. Isolation and structural analysis of chloroplast DNA. Methods in Enzymology 118: 167-186.
Panero J. L. Crozier B. S.. 2003. Primers for PCR amplification of Asteraceae chloroplast DNA. Lundellia 6: 1-9.
Panero J. L. Funk V. A.. 2002. Toward a phylogenetic subfamilial classification for the Compositae (Asteraceae). Proceedings of the Biological Society of Washington 115: 909-922.
Powell W. Morgante M. McDevitt R. Vendramin G. G. Rafalski J. A.. 1995. Polymorphic simple sequence repeat regions in chloroplast genomes: applications to the population genetics of pines. Proceedings of the National Academy of Sciences, USA 92: 7759-7763.
Provan J. Powell W. Hollingsworth P. M.. 2001. Chloroplast microsatellites: new tools for studies in plant ecology and evolution. Trends in Ecology and Evolution 16: 142-148.
Ruhlman T. Lee S.-B. Jansen R. Hostetler J. Tallon L. Town C. Daniell H.. 2006. Complete plastid genome sequence of Daucus carota: implications for biotechnology and phylogeny of angiosperms. BMC Genomics 7: 222.
Ruiz O. Daniell H.. 2005. Engineering cytoplasmic male sterility via the chloroplast genome. Plant Physiology 138: 1232-1246.
Saski C. Lee S. Daniell H. Wood T. Tomkins J. Kim H.-G. Jansen R. K.. 2005. Complete chloroplast genome sequence of Glycine max and comparative analyses with other legume genomes. Plant Molecular Biology 59: 309-322.
Schilling E. E.. 1997. Phylogenetic analysis of Helianthus (Asteraceae) based on chloroplast DNA restriction site data. Theoretical and Applied Genetics 94: 925-933.
Schilling E. E. Jansen R. K.. 1989. Restriction fragment analysis of chloroplast DNA and systematics of Viguiera and related genera (Asteraceae: Heliantheae). American Journal of Botany 76: 1769-1778.
Schwartz S. Zhang Z. Frazer K. A. Smit A. Riemer C. Bouck J. Gibbs R. Hardison R. Miller W.. 2000. PipMaker: a web server for aligning two genomic DNA sequences. Genome Research 10: 577-586.
Shaw J. Lickey E. B. Beck J. T. Farmer S. B. Liu W. Miller J. Siripun K. C. Winder C. T. Schilling E. E. Small R. L.. 2005. The tortoise and the hare II: relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. American Journal of Botany 92: 142-166.
Steane D. A.. 2005. Complete nucleotide sequence of the chloroplast genome from Tasmanian blue gum, Eucalyptus globulus (Myrtaceae). DNA Research 12: 215-220.
Swofford D. L.. 2003. PAUP*: phylogenetic analysis using parsimony (*and other methods) Sinauer, Sunderland, Massachusetts, USA.
[TIGR] The Institute for Genomic Research.. 2005. TIGR Gene Index Database Website http://compbio.dfci.harvard.edu/tgi/plant.html [accessed November 2005].
Tsudzuki T. Wakasugi T. Sugiura M.. 2001. Comparative analysis of RNA editing sites in higher plant chloroplasts. Journal of Molecular Evolution 53: 327-332.
Wakasugi T. Sugita M. Tsudzuki T. Sugiura M.. 1998. Updated gene map of tobacco chloroplast DNA. Plant Molecular Biology Reporter 16: 231-241.
Wyman S. K. Boore J. L. Jansen R. K.. 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20: 3252-3255.
Zuker M.. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research 31: 3406-3415.