Specific Aims for Medicinal Plant Genomics Resource
High-throughput sequencing of genomes and transcriptomes has revolutionized
research across the life sciences and accelerated its pace.
In plants, the application of these approaches to model organisms and
major agricultural crops (e.g., Arabidopsis, rice, sorghum, maize and
poplar) has provided tremendous insight into plant metabolic processes.
However, while primary and intermediary metabolism is conserved across
the plant kingdom, the specialized secondary metabolic pathways leading
to medicinal compounds are not well conserved. Indeed, medicinal
compounds are often produced by a handful of plant genera or species.
As a result, progress in understanding and manipulating these
taxonomically restricted metabolic pathways, many of which produce
compounds of pharmaceutical importance, has not benefited to the same
extent from the genomics revolution. The proposed research will address
this gap in our species-specific knowledge of plant metabolism by
sequencing the transcriptomes of 14 key medicinal plant species and
profiling gene expression and the associated metabolomes, thereby
allowing genome-enabled identification of candidate pathway genes in
these organisms through correlation of gene expression with the
production of specific pharmaceutically relevant metabolites.
The resulting datasets will provide an unparalleled resource for the research
community working at the interface of plant metabolism and human health.
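The correlation step described above can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure (the project text does not name one), and the transcript names and values are hypothetical placeholders rather than project data. Each transcript's expression profile across tissues is compared with the level of one metabolite in the same tissues, and transcripts are ranked by the resulting coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_candidates(expression, metabolite):
    """Rank transcripts by correlation of expression with metabolite level."""
    scores = {name: pearson(profile, metabolite)
              for name, profile in expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values for five tissues (illustrative only, not project data).
metabolite = [0.1, 0.5, 2.0, 4.0, 8.0]
expression = {
    "transcript_A": [1.0, 5.0, 20.0, 40.0, 80.0],    # tracks the metabolite
    "transcript_B": [50.0, 48.0, 52.0, 49.0, 51.0],  # roughly constant
    "transcript_C": [80.0, 40.0, 20.0, 5.0, 1.0],    # falls as the metabolite rises
}

ranked = rank_candidates(expression, metabolite)
for name, r in ranked:
    print(f"{name}\t{r:.2f}")
```

Transcripts at the top of such a ranking would be candidates for follow-up, not confirmed pathway genes. With up to 20 tissues per species, a rank-based measure such as Spearman correlation may be a reasonable alternative when metabolite levels span several orders of magnitude.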
List of Objectives for Medicinal Plant Genomics Resource
- Obtain well-characterized and reproducible samples of plant materials for the
14 taxonomically diverse medicinal plant species. For each
species, up to 20 tissue samples will be selected that are anticipated
to have significant variation in concentrations of medicinal compounds:
10 core tissue samples and up to 10 additional samples selected by the
relevant species experts. Each sample will be extracted for RNA and
metabolites and aliquots provided to the metabolomics and
transcriptomics units for analyses.
- Perform quality control on samples using LC-MS to assess the levels of 5-10 known,
well-characterized medicinal compounds in each plant species. These validated,
chemically diverse samples will then be used for whole transcriptome sequencing,
gene expression profiling and quantitative metabolite profiling analyses.
- Obtain 600-800 Mb of transcriptome sequence per species using next-generation
sequencing of a normalized library made from pooled mRNA from 5 diverse tissues
(e.g., roots, stems, flowers, young leaf buds and callus tissue). Generate a
virtual transcriptome for each species. Annotate the assembled transcriptome
for putative gene function using bioinformatic approaches including sequence
similarity, motif/domain searches, and subcellular localization predictions.
- Employ Illumina RNA-Seq whole
transcriptome sequencing to generate deep expression profiles in up to 20 chemically
diverse tissues for each plant species. Map the expression data to the
assembled virtual transcriptome of each species. Use these data to
further improve the transcriptome assembly.
- Perform quantitative metabolite
profiling of the samples to quantify the relative levels of
known medicinal compounds and potential metabolic intermediates in each
species.
- Deposit all datasets into the relevant publicly accessible databases and
make raw and processed datasets available through the project website.
This site will provide a user-friendly data
interface and will incorporate custom tools for the
community to download, access, and compare the sequences, annotations,
transcript expression and metabolite datasets. The data and website
will provide a key, previously inaccessible link between the genome and
metabolome of medicinal plants, and provide novel information to the
community about the genes and markers of medicinal compound synthesis.
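The expression-profiling objective above maps RNA-Seq reads to each assembled transcriptome, and the mapped read counts must be normalized before tissues can be compared. The sketch below assumes transcripts per million (TPM) as the normalization. The objectives do not specify a method, so this is one common choice, not necessarily the project's, and the transcript names, counts, and lengths are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Convert raw mapped-read counts to transcripts per million (TPM).

    counts     -- mapping of transcript name to mapped read count
    lengths_bp -- mapping of transcript name to transcript length in bases
    """
    # Reads per kilobase: longer transcripts attract more reads,
    # so divide each count by the transcript length in kilobases.
    rpk = {name: counts[name] / (lengths_bp[name] / 1000.0) for name in counts}
    total = sum(rpk.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    # Scale so that the values within one sample sum to one million.
    return {name: value / total * 1_000_000 for name, value in rpk.items()}


# Hypothetical counts for one tissue sample (illustrative only).
counts = {"transcript_A": 100, "transcript_B": 100}
lengths_bp = {"transcript_A": 1000, "transcript_B": 2000}

profile = tpm(counts, lengths_bp)
for name, value in profile.items():
    print(f"{name}\t{value:.1f}")
```

Because TPM values sum to the same total in every sample, profiles from different tissues of one species can be compared on a common scale before any correlation with metabolite levels.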