2012 Johnson et al PLoS ONE 1kp methods

Transcrição

2012 Johnson et al PLoS ONE 1kp methods
Evaluating Methods for Isolating Total RNA and
Predicting the Success of Sequencing Phylogenetically
Diverse Plant Transcriptomes
Marc T. J. Johnson1*., Eric J. Carpenter2., Zhijian Tian3., Richard Bruskiewich4¤, Jason N. Burris5,
Charlotte T. Carrigan6, Mark W. Chase7, Neil D. Clarke8, Sarah Covshoff9, Claude W. dePamphilis10,
Patrick P. Edger11, Falicia Goh8, Sean Graham12, Stephan Greiner13, Julian M. Hibberd9, Ingrid JordonThaden14,19, Toni M. Kutchan15, James Leebens-Mack6, Michael Melkonian16, Nicholas Miles14,19,
Henrietta Myburg17, Jordan Patterson2, J. Chris Pires11, Paula Ralph10, Megan Rolf15, Rowan F. Sage18,
Douglas Soltis14, Pamela Soltis19, Dennis Stevenson20, C. Neal Stewart Jr.5, Barbara Surek16,
Christina J. M. Thomsen1, Juan Carlos Villarreal21, Xiaolei Wu3, Yong Zhang3, Michael K. Deyholos2,
Gane Ka-Shu Wong2,3,22*
1 Department of Biology, University of Toronto at Mississauga, Mississauga, Ontario, Canada, 2 Department of Biological Sciences, University of Alberta, Edmonton,
Alberta, Canada, 3 BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen, China, 4 International Rice Research Institute, Metro Manila, Philippines,
5 Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, United States of America, 6 Department of Plant Biology, University of Georgia, Athens,
Georgia, United States of America, 7 Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey, United Kingdom, 8 Genome Institute of Singapore, Singapore,
Singapore, 9 Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom, 10 Department of Biology and Intercollege Graduate Program in Plant
Biology, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America, 11 Division of Biological Sciences,
University of Missouri, Columbia, Missouri, United States of America, 12 Department of Botany and UBC Botanical Garden, University of British Columbia, Vancouver,
British Columbia, Canada, 13 Max Planck Institute for Molecular Plant Physiology, Wissenschaftspark Golm, Am Mühlenberg 1, Potsdam-Golm, Germany, 14 Department of
Biology, University of Florida, Gainesville, Florida, United States of America, 15 Donald Danforth Plant Science Center, St. Louis, Missouri, United States of America,
16 Department of Botany, Cologne Biocenter, University of Cologne, Cologne, Germany, 17 Department of Plant Biology, North Carolina State University, Raleigh, North
Carolina, United States of America, 18 Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada, 19 Florida Museum of Natural
History, University of Florida, Gainesville, Florida, United States of America, 20 New York Botanical Garden, Bronx, New York, United States of America, 21 Department of
Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, United States of America, 22 Department of Medicine, University of Alberta, Edmonton,
Alberta, Canada
Abstract
Next-generation sequencing plays a central role in the characterization and quantification of transcriptomes. Although
numerous metrics are purported to quantify the quality of RNA, there have been no large-scale empirical evaluations of the
major determinants of sequencing success. We used a combination of existing and newly developed methods to isolate
total RNA from 1115 samples from 695 plant species in 324 families, which represents .900 million years of phylogenetic
diversity from green algae through flowering plants, including many plants of economic importance. We then sequenced
629 of these samples on Illumina GAIIx and HiSeq platforms and performed a large comparative analysis to identify
predictors of RNA quality and the diversity of putative genes (scaffolds) expressed within samples. Tissue types (e.g., leaf vs.
flower) varied in RNA quality, sequencing depth and the number of scaffolds. Tissue age also influenced RNA quality but not
the number of scaffolds $1000 bp. Overall, 36% of the variation in the number of scaffolds was explained by metrics of RNA
integrity (RIN score), RNA purity (OD 260/230), sequencing platform (GAIIx vs HiSeq) and the amount of total RNA used for
sequencing. However, our results show that the most commonly used measures of RNA quality (e.g., RIN) are weak
predictors of the number of scaffolds because Illumina sequencing is robust to variation in RNA quality. These results
provide novel insight into the methods that are most important in isolating high quality RNA for sequencing and
assembling plant transcriptomes. The methods and recommendations provided here could increase the efficiency and
decrease the cost of RNA sequencing for individual labs and genome centers.
Citation: Johnson MTJ, Carpenter EJ, Tian Z, Bruskiewich R, Burris JN, et al. (2012) Evaluating Methods for Isolating Total RNA and Predicting the Success of
Sequencing Phylogenetically Diverse Plant Transcriptomes. PLoS ONE 7(11): e50226. doi:10.1371/journal.pone.0050226
Editor: Christopher Quince, University of Glasgow, United Kingdom
Received June 27, 2012; Accepted October 22, 2012; Published November 21, 2012
Copyright: ß 2012 Johnson et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The 1KP Consortium is supported by the Government of Alberta (Ministry of Advanced Education and Technology), Musea Ventures, BGI-Shenzhen,
Alberta’s Informatics Circle of Research Excellence (iCORE), and iPlant (NSF(National Science Foundation) grant DBI-0735191). NSERC Canada supported M.
Deyholos, S. Graham, M. Johnson and R. Sage. NSF supported M. Johnson (DEB-0919869), J. Leebens-Mack and C. dePamphilis (DEB-0830009, DEB-0829868), J.
Pires and P. Edger (DEB-1110443, DEB-1146603), D. and P. Soltis (PGR-0638595, IOS-0922742, DEB-0919254), D. Stevenson (DEB-082762, EF-0629817, IOS-0421604)
and J. Villarreal (DDIG-0910258). National Institutes of Health (NIH) supported T. Kutchan, C. dePamphilis and J. Leebens-Mack (1R01DA025197-01). USDA National
Institute of Food and Agriculture and the University of Tennessee Agricultural Experiment Station supported J. Burris and C. Stewart. The Agency for Science,
Technology and Research (Singapore) supported F. Goh and N. Clarke. The Max Planck Society supported S. Greiner and the University of Cologne supported M.
Melkonian and B. Surek. The Royal Botanic Gardens, Kew, UK, supported M. Chase. S. Covshoff, J. Hibberd, R. Sage and R. Bruskiewich were funded by the
International Rice Research Institute C4 project through the Bill & Melinda Gates Foundation. The funds from Musea Ventures came in the form of a charitable
donation to the University of Alberta. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
PLOS ONE | www.plosone.org
1
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: [email protected] (MJ); [email protected] (GKSW)
. These authors contributed equally to this work.
¤ Current address: Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
data will constitute a cornucopia of genetic information to be used
for the biotechnological engineering of crops, development of
biofuels, genetic and biochemical characterization of medicinally
important plants, and a comprehensive phylogenetic understanding of how plant life has evolved during the past 1 billion years (see
www.onekp.com). To achieve 1KP’s goals we have overcome
biological, logistical and technological impediments that had
hitherto not been attempted on such a scale.
In this paper, we develop and evaluate methods for isolating
high-quality total RNA from non-model plants, which can then be
used to sequence the diversity of genes expressed within any plant
tissue using next-generation sequencing methods. It was necessary
for the 1KP project to develop methods that could be employed
for the wide phylogenetic diversity of plant species found in nature,
from green algae to angiosperms, which diverged .900 mya [23].
To achieve this objective we developed and/or implemented 18
distinct protocols to isolate total RNA from samples; our results
and experience showed that no one protocol was suitable for all
tissues and species. Using these protocols we sought to answer
three questions: 1) what is the success rate of a commonly
employed commercial kit for the extraction of total RNA from
plants, and which alternative extraction protocols increase the
success of isolating RNA; 2) are different plant tissues associated
with differences in the quality of total RNA and the size of
assembled transcriptomes; and 3) what measures of RNA quality
provide the best predictor of the size of assembled transcriptomes
into scaffolds?
Introduction
Next-generation sequencing (NGS) has rapidly transformed the
life sciences as it is now possible to sequence entire genomes and
virtually all expressed genes in a fast and cost-effective manner [1–
6]. In the case of plant biology, this has made it possible to
accelerate fundamental and applied research on how genes
interact to form genetic networks [7], the identity of enzymes
involved in the biosynthesis of medicinally important primary and
secondary metabolites [8], and responses of plants to biotic and
abiotic environmental stress [9]. Moreover, NGS is producing
massive data sets to answer questions relating to phylogenomic
analyses of plant evolution and diversification [10], which is
important for extending fundamental results from model organisms to other plant groups, including many that are economically
important. Although these exciting developments are quickly
changing research across the life sciences, they also present
biologists with numerous logistical challenges [6,11]. Here we
describe the development and evaluation of some of the methods
and initial results from the One Thousand Plants Consortium
(1KP; www.onekp.com), which seeks to provide state-of-the-art
molecular tools for hundreds of non-model plant species by
sequencing transcriptomes across the diversity of green plants and
use these transcriptomes to answer some of the most pressing
problems in plant biology [12].
Sequencing plant transcriptomes by NGS is complicated by
many factors. The first challenge is to isolate RNA of sufficient
quality (i.e., non-degraded RNA, free of impurities) and yield.
RNases are particularly problematic in this regard because they
rapidly degrade RNA and are widespread in nature and
laboratories [13]. A further complication is the myriad primary
and secondary plant metabolites (e.g., phenolics, polysaccharides)
that vary dramatically within and between species [14–17], and
can interfere with RNA isolation [18]. A second challenge relates
to identifying metrics of RNA quality that predict sequencing
success [19], quantified here as the number of long scaffolds
($1000 bp) assembled from Illumina sequence reads, which
provides an estimate of the number of full-length transcripts in a
sample. Most genomic sequencing facilities attempt to simultaneously maximize multiple metrics of RNA quality [20], but we
are not aware of any large-scale tests that empirically evaluate how
these measures affect RNA sequencing (RNA-seq) and downstream assembly of short sequence reads into putatively expressed
genes. Finally, the rapidly changing methods associated with
sequencing platforms and chemistry will likely have large, but as of
yet, unquantified effects on sequencing success [21,22]. Successful
application of NGS technologies to biological problems requires
that we understand the impacts of these issues [6].
1KP is seeking to address these problems by sequencing
transcriptomes from more than 1000 plant species. The project
involves an unprecedented interdisciplinary collaboration of over
one hundred researchers from around the world. The overall goal
of the project is to create genomic resources for one thousand
plant species sampled from across the plant kingdom, including
plant species of importance in medicine, agriculture, forestry and
conservation. The vast majority of these species are non-model
organisms with little to no existing molecular tools. Thus, these
PLOS ONE | www.plosone.org
Materials and Methods
Plant Samples and Tissues
We isolated total RNA from 1115 samples from 695 plant
species representing 324 families collected from the field, botanical
gardens, greenhouses, growth chambers and axenic cultures. No
specific permits were required for the collection of samples as the
minority of samples collected directly from the field were taken
from public land. None of these samples represent endangered or
protected species. Samples included non-vascular plants such as
algae, hornworts and mosses, and vascular plants including
lycopods, ferns, gymnosperms and angiosperms. We isolated total
RNA from tissues categorized into one of eight tissue types,
including: i) leaf (489 samples), ii) flower (4), iii) fruit (10), iv) buds
(leaf or flower) (15), v) shoot/stem (7), vi) below-ground (12; 10
roots, 2 bulbs), vii) mixed tissues (two or more of tissues i–vi) (276)
and viii) algal cells (274). Care was taken to properly differentiate
and categorize tissue types, but some samples inevitably had
overlapping cell types with other tissue types and we therefore view
differences among tissues as conservative patterns. For a subset of
71 species, we also tested the effects of tissue age on RNA quality
and sequencing success by comparing ‘‘young’’ freshly expanding
leaves and ‘‘mature’’ fully expanded but non-senescing leaves
collected from tissue that was pooled from at least two healthy
plants grown together in the greenhouse. For these samples we
used approximately 0.1–0.5 g of tissue from young leaves and
roughly 26 as much (up to 1 g) for old leaves; more tissue was
required from mature leaves to achieve equivalent concentrations
2
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
as young leaves (see Results). The complete data set, including the
list of all samples, tissues and data, is provided in Table S1.
chip assay in accordance with the manufacturer’s instructions
(Agilent Technologies, Santa Clara, CA, USA). The Bioanalyzer
uses electrophoretic technology on a chip to separate RNA
fragments by size, which are read by laser induced fluorescence
and translated into gel-like bands and peaks (electropherograms)
showing the distribution and relative amounts of RNA of
different sizes (Figure 1). Concentration (ng/ml) of RNA was
determined by comparing the sample with a standard. We
measured RNA integrity using two metrics: the ratio of the
large (26S) to small (18S) ribosomal RNA subunits (26S/18S)
and the RNA integrity number (RIN) [20]. In non-degraded
RNA, the quantity of 26S RNA should be 1.6–26 that of 18S,
whereas degradation of 26S occurs faster than 18S and thus
deviations below the expected ratio are used as a measure of
RNA degradation [20]. The use of r26S/18S has been criticized
as it might not reflect degradation of other types of RNA such
as mRNA [20,26]–the focus of transcriptome sequencing. RIN
has been proposed as an alternative where its calculation is
based on a regression model that estimates the integrity of the
entire RNA profile, including r26S/18S and the presence and
absence of other electropherogram peaks, some of which
correspond to mRNA [20]. RIN was originally developed to
evaluate the integrity of RNA extracted from mammalian
tissues, but plants exhibit greater heterogeneity in ribosomal
subunits than animal tissues due to the presence of plastid
ribosomes. We therefore selected the plant RNA test program
option available in version B.02.07 of the Bioanalyzer software
[27], which accounts for the greater heterogeneity of RNA
expected from plants.
Protocols for Isolating Total RNA
No single protocol was optimal for isolating total RNA from all
tissues and species. We therefore used 18 distinct protocols to
isolate total RNA from samples, where the method used for
specific species and tissues was shaped by the experience of
individual researchers. Protocols involved commercially available
kits (e.g., Qiagen’s RNeasy Plant Minikit), non-commercial RNA
extraction lab protocols (e.g., CTAB, acid phenol) and hybrid
methods that combined components of both commercial kits and
lab protocols. The detailed protocols used for all isolations are
available online in Appendix S1.
For all isolations, tissues were collected and flash frozen in liquid
nitrogen as soon after collection as possible. In most cases, flash
freezing was performed immediately, although in some field
collections there was a short delay before freezing, which might
account for cases of severe degradation. Slightly less than half of
the samples (493) were sent as frozen tissue on dry ice to BGIShenzhen (hereafter ‘BGI’), China, for RNA isolation. The
remaining samples were isolated by individual labs and sent to
BGI as either frozen RNA extracts on dry ice or as dehydrated
total RNAs shipped using Genvault’s GenTegra RNA kit
(IntegenX, Pleasanton, CA, USA). A comparison of a subset of
our samples (N = 168) allowed us to compare effects of shipping
liquid RNA versus Genvault dehydrated RNA (see Results).
In answering our first research question relating to the success
rates of Qiagen’s RNeasy Plant Minikit and alternative
protocols, we deemed samples acceptable for sequencing if they
met the following criteria: total RNA concentration $150 ng/ml
(eluted in 60 ml of RNase free water; note this elution volume
only applies to samples shown in Table S2), total RNA mass
$20 mg, r26S/18S $1, RIN $8, OD 260/280$1.9 and OD
260/230$1.5. We did relax these rigid criteria for samples not
shown in Table S2 by sequencing samples of lower quality. By
including lower quality samples that did not meet the criteria
identified above, this also allowed us to provide a stronger and
more comprehensive test of how RNA quality correlated with
sequencing success.
Library Construction and Sequencing Methodology
Our study sought to isolate and sequence mRNA from all cells
using Illumina’s GAIIx and HiSeq sequencing platforms. Our
standard procedure was to isolate polyA RNA from 20 mg of total
RNA treated by DNaseI (NEB, Ipswich, MA, USA) using
Dynabeads mRNA purification kit (Life Technologies, Grand
Island, NY, USA). We used up to 50 mg when the use of a lower
mass (typically 20 mg) was insufficient for successful library
construction, which we assessed by running final PCR products
on an agarose gel; the library construction was considered to have
failed when there was no visible band. We used less than 20 mg of
total RNA when isolation of an important sample yielded low
RNA mass but library construction was successful. Purified polyA
RNA was fragmented in a fragmentation buffer (Life Technologies, Grand Island, NY) at 70uC for 90 seconds to 200–300 nt
fragment sizes. The first cDNA strand was synthesized with
random hexamer primers using the SuperScript II reverse
transcription kit (Life Technologies, Grand Island, NY). The
second-strand synthesis was performed by incubation with RNase
H (Life Technologies, Grand Island, NY) and DNA polymerase
(Enzymatics, Beverly, MA, USA). Short double-stranded cDNA
fragments were purified using one of two methods. Our standard
procedure was to use the QIAquick PCR purification kit (Qiagen,
Valencia, CA, USA), whereas for samples with low RNA mass we
used Agencourt AMPure beads (Beckman Coulter, Beverly, MA).
Both methods were followed by end-repair with Klenow
polymerase, T4 DNA polymerase and T4 polynucleotide kinase
(Enzymatics, Beverly, MA, USA). A single 39 adenosine (A base)
was added to the double-stranded cDNA using Klenow (39 to 59
exo-) (Enzymatics, Beverly, MA, USA) and dATP (GE Healthcare,
Buckinghamshire, UK). The Illumina PE Adapter oligo mix was
ligated onto the A base on repaired double-stranded cDNA ends
and DNA fragments of a selected size were gel-purified to make
sure the insert size was 200 bp (610% deviation). Thereafter,
Measures of RNA Quality and Yield
To ensure consistency in measurement of RNA quality, all
measurements were performed at BGI. This was especially
important because measures of RNA quality outside BGI often
deviated substantially from measurements taken at BGI on the
same samples, suggesting that methodological differences and
degradation during shipping can be important sources of variation
in measures of RNA quality. RNA purity was determined by
assaying 1 ml of total RNA extract on a NanoDrop 1000 or
NanoDrop 8000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA). We measured the optical density (OD) ratio
between 260 nm and 280 nm from 526 samples, where pure RNA
eluted in H20 (pH 7.0–8.5) or TE (pH 8.0) is expected to exhibit a
ratio of 2.0–2.1; deviations below 2 indicate acidic pH or
contamination by proteins, phenol and other impurities [13Appendix 8,24]. We also measured OD 260 nm/230 nm from
406 samples, where pure RNA gives a ratio close to 2. Deviations
of OD 260/230 below 2 indicate contamination by polysaccharides, phenol, TRIzol or low pH, which absorb light around
230 nm [18,24,25].
The concentration, integrity and yield of total RNA were
determined by assaying 1 ml of diluted total RNA using
Agilent’s 2100 Bioanalyzer with the Plant RNA Nano or Pico
PLOS ONE | www.plosone.org
3
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
Figure 1. Example of the qualitative and quantitative results from Agilent’s Bioanalyzer 2100. In panels (A) and (B) we show peaks
produced from electropherograms (top left) that depict the size distribution of RNA fragments, the corresponding gel-like image of RNA fragments
(top right), and metrics of RNA concentration and integrity (r26S/18S and RIN). We show two representative samples with (A) high-quality RNA in
terms of high yield and minimal degradation (r26S/18S $1, RIN $8), and (B) low-quality RNA in terms of modest yield and high degradation (r26S/18S
,1, RIN ,5). Absence of clear peaks at 26S and 18S and an abundance of short fragments clustering on the left of the electropherogram in panel (B)
are the hallmarks of severe RNA degradation. In electropherograms, ribosomal 26S (large) and 18S (small) subunits are shown in green and pink,
respectively. Concentrations of 26S and 18S were calculated by taking the area above the green and pink straight lines at the base of the 26S and 18S
peaks, respectively.
doi:10.1371/journal.pone.0050226.g001
libraries were amplified by 15 cycles of PCR with Phusion DNA
polymerase (NEB, Ipswich, MA, USA) and ‘‘indexed’’ paired-end
PCR primers; the prepared libraries were 322 bp long. The
amplified libraries were denatured with sodium hydroxide and
diluted to 2.5 pM in hybridization buffer for loading onto a lane of
an Illumina GAIIx or HiSeq flowcell. Read length on the GAIIx
platform was typically adjusted to 73–75 bp (mean = 74 bp), but
four samples were read at 100 bp. Read length on the HiSeq
PLOS ONE | www.plosone.org
platform was predominately 90 bp with a small number of
sequences with 84–87 bp. All samples were sequenced as pairedend reads, and up to eleven samples were multiplexed into a single
lane of the Illumina flow cell.
Prior to assembly of sequences into scaffolds we filtered
sequence reads based on four criteria. First, all samples were
indexed with unique 6–7 bp sequences and we retained reads with
0 or 1 bp sequence mismatch (.99.5% of reads). Second, we
4
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
plants and algae showed that our assemblies reliably assembled
known proteins; on average 88% 61% (95% CI; median = 95%)
of scaffolds matched a known protein with high confidence
(E,10210).
Our assembly methods followed Li et al. [21]. In brief,
SOAPdenovo uses the de Bruijn graph method to represent all
reads, with a K-mer designated as a node and (K21) base overlaps
between two K-mers as an edge. We used a single K-mer length of
29 for all assemblies because preliminary analyses and other
published studies have shown that a single K-mer of length 25–31
can recover a large number of transcripts with variable expression
levels [11]. Although adjusting K-mer length or using multiple Kmers can optimize the assembly of individual transcriptomes
[28,29], we used a single K-mer in this study to facilitate
comparisons of transcriptome results across all samples based on a
standard and consistent method of assembly. The alternative of
optimizing samples by K-mer length could increase total number
of transcripts identified, but it would have the undesirable effect of
confounding our results with differences in methods of assembly;
this ad hoc approach was impractical given the number of
samples. Some tips and low-coverage K-mers in the graph were
removed to eliminate branches and reduce problems induced by
sequencing errors. The de Bruijn graph was then converted into a
contig graph by turning the series of linearly connected K-mers
into precontig nodes using the merging option for the similarity of
sequences (–M) equal to 2. Dijkstra’s algorithm detected bubbles,
which were then merged into a single path if sequences of
branches were sufficiently similar. By this approach, nearly
identical sequences could be assembled into consensus ‘‘contig’’
sequences, where every base was defined in an uninterrupted
linear sequence. Contigs were then connected by paired-end reads
joined by a known distance to form a scaffolding graph (Figure 2).
Edges in this graph were connections between contigs, and the
edge length was estimated from the insert size (200 bp) of the
paired reads. We retained all scaffolds $100 bp where $95% of
bases were defined (Figure 2).
From each assembled transcriptome the response variable was
the number of scaffolds $1000 bp, which ideally represents an
estimate of the number of full length transcripts in a sample. There
are potential pitfalls with defining genetic diversity of transcriptomes in this way. For example, if we focus on the number of
scaffolds of all lengths as our response variable, then some genes
will be represented by many small scaffolds. We clearly see this in
our samples because the number of scaffolds $100 bp frequently
checked for the adapter sequences in the paired sequence reads
and rejected pairs with more than 15 bp of continuous adapter
sequence, allowing a maximum of three mismatches; this resulted
in rejection of ca. 3% of reads. Third, we removed all sequence
read pairs where either read had .5% uncalled bases (N’s)
(,0.5% of reads). Similarly, we removed sequence read pairs
when either read had more than 20 bp with Phred-type Qscore,10 (3.5% of reads). Together these four criteria ensured
that contaminant sequences were eliminated from our data and
only the highest quality reads were used in assemblies.
We extracted several response variables for analyses from the
sequencing results to depict sequencing depth. Specifically, we
used total number of bases sequenced, which is the number of
bases that exceeded the Phred quality 20 threshold (hereafter
‘‘Q20 bases’’, a measure of the quality/reliability of sequenced
base calls) and the total number of sequence reads.
Transcriptome Assembly
Transcriptomes were assembled using SOAPdenovo [20,25,26],
which is designed to rapidly assemble large genomes sequenced on
short-read sequencing platforms such as Illumina (Figure 2). When
1KP began in early 2009, this was the best option available. Over
the last year, other assemblers (e.g., Trinity, Oases, TransABySS)
[6,27,28] have been introduced that are specifically designed for
assembling transcriptomes, where coverage varies from gene to
gene because of differences in expression and alternative splicing.
We have repeated the assemblies using a new software
SOAPdenovo-trans (http://soap.genomics.org.cn/SOAPdenovoTrans.html) that is faster than Trinity and recovers more fulllength transcripts from the same data set, based on a comparison
to the annotated genomes for rice and mouse (unpublished results).
We find that the number of large ($1000 bp) scaffolds from our
original assemblies and the newer assemblies are highly correlated
(r Pearson = 0.77, P,0.001, N = 626), indicating that the original
assemblies capture a large amount of the variation explained by
higher yielding newer assemblies. We therefore decided to base
our analyses on the older assemblies, since they more honestly
reflect the basis on which project decisions were made and
SOAPdenovo-trans has not yet been formally published, although
it is available to the public (see above). Moreover, a comparison of
large scaffolds ($1000 bp) from our SOAPdenovo assemblies to
GenBank’s non-redundant protein sequence database (nr) using
the BLASTx sequence translation tool containing all vascular
Figure 2. Schematic representation of the method used to assemble Illumina reads into contigs, and contigs into scaffolds. All reads
were initially assembled into contigs using the de Bruijn graph method without using information about paired-end reads (shown by blue dashed
lines). A contig’s sequence was resolved at every base. Contigs were then assembled into longer scaffolds by connecting contigs that contained
paired-end reads assembled into separate contigs. Assembling scaffolds in this way allowed us to create longer sequences of known length, but
sometimes there were gaps of unknown sequence. These gaps were constrained to represent ,5% of total sequence length.
doi:10.1371/journal.pone.0050226.g002
PLOS ONE | www.plosone.org
5
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
platform (Illumina GAIIx vs. HiSeq), which was also correlated
with the length of sequence reads (GAIIx: 74 bp/100 bp, HiSeq:
90 bp).
We achieved on average 2 Gb of data across all samples (with
an enforced minimum of 1 Gb and a practical maximum of 4 Gb).
Nevertheless, there was variation in the number of bases
sequenced (i.e., sequencing depth) and we included this variable
as a covariate in analyses.
We performed two sets of analyses because OD ratios were
measured for a smaller number of samples than the other
variables. In the first analysis, we included all nine variables
described above, and we used maximum likelihood to compare the
explanatory power of all possible linear combination of factors
(511 possible models). The best model was identified according to
the lowest Akaike information criterion (AIC) value; AIC measures
the explanatory power of the model weighted by the number of
parameters included, where lower AIC corresponds to greater
explanatory power. As per convention, nested models with AIC
values that differed by .2 (DAIC) were viewed as significantly
different, whereas nested models with DAIC ,2 were seen as
statistically equivalent [36]. Because AIC values are sensitive to
variation in sample size, only samples for which we had data on all
variables were included in the analysis; the total size of this
‘‘balanced’’ data set was 184 samples. However, all data were used
in the calculation of the variation explained by each factor (partial
r2) because this statistic is less sensitive to variation in sample size.
The second analysis used the larger balanced data set of 512
samples, which lacked the OD ratios but contained all other
variables. This data set considered all possible combinations of
seven explanatory variables (127 models in total) and selected the
best model according to the methods described above.
We focused our results on analyses that used scaffolds
$1000 bp but we also performed analyses on smaller threshold
cutoffs. Comparing results of the best-fitting models produced
qualitatively identical results for all data sets with respect to the
direction and statistical significance of explanatory variables. The
only exception to this was that tissue type became a significant
predictor of the number of scaffolds when the minimum threshold
scaffold size was set to 600 bp, but this result was only found for
the large data set without OD ratios.
exceeds 100,000, which is much larger than conventional
estimates of the number of genes expressed in plant tissues [30–
34]. By contrast, if we only count scaffolds that meet a large
threshold cutoff, then we increase the probability of each scaffold
representing a unique full-length transcript, but potentially miss
legitimate smaller genes.
We determined the optimal threshold size of scaffolds by
evaluating a range of cutoffs. We performed separate analyses on
scaffold assemblies with minimum cutoffs set to 500 bp (i.e., all
scaffolds were $500 bp), 600 bp, 700 bp, 800 bp, 900 bp and
1000 bp. The strength of Pearson product moment correlations (r)
between data sets that varied in the threshold size of scaffolds
ranged between 1.00 (1000 bp vs. 900 bp) and 0.78 (1000 bp vs.
700 bp), with P,0.001 for all correlations (Table S3). In other
words, a transcriptome data set with a minimum cutoff of 500 bp
or 1000 bp contains largely overlapping information about the
estimated number of genes expressed in a sample’s transcriptome.
We focus here on the number of ‘‘large’’ scaffolds ($1000 bp, with
at least 950 resolved bp) in our results for three reasons: 1) there is
a high correlation between the number of large scaffolds
($1000 bp) and the number of scaffolds assembled using smaller
size cutoffs (i.e., 500 bp, 600 bp, 700 bp, 800 bp and 900 bp)
(Table S3); 2) large scaffolds explained the greatest variation in
most statistical models presented in the results; and 3) large
scaffolds exhibited a higher proportion of high confidence matches
to known proteins in BLASTx than smaller thresholds (data not
shown).
Statistical Analyses
Effects of tissue type. We tested the effect of tissue type on
RNA quality, RNA yield, sequencing depth (i.e., number of base
pairs sequenced, number of Q20 bases and number of sequence
reads), and number of long scaffolds assembled using Proc Mixed
in SAS 9.1 (SAS Institute, Cary, NC, USA). We first used a
likelihood ratio test (LRT) to determine whether an unequal
variance model (i.e., estimating variance separately for each tissue
type) provided a better fit for our data than a standard model that
assumes homogeneity of variance. We then assessed whether our
data met assumptions of normality, and we transformed data
accordingly using either log or square root transformations.
Pairwise differences between least-squares mean values of tissue
types were compared using the Tukey-Kramer adjustment for
multiple comparisons implemented using the ADJ option in SAS.
Mean values were subsequently back-transformed to their original
units for illustrative purposes.
It is widely believed that harvesting the youngest tissue results in
higher RNA quality and yield [35, p. 116]. To test this
conventional wisdom, we systematically harvested young freshly
expanding tissue and fully expanded mature tissue (but not ‘‘old’’
or senescing) from a phylogenetically diverse subset of 71 species as
described above (Table S1). The statistical model included tissue
age as the main effect with species included as a blocking factor.
Predicting the size of assembled transcriptomes. We
attempted to understand the factors that provide the best
predictors of the size of the assembled transcriptome, measured
as the number of large scaffolds ($1000 bp). We included four
types of variables in the analyses explained below: 1) tissue type, 2)
measures of RNA quality and yield, 3) factors associated with
sequencing methodology, and 4) a covariate for sequencing depth
(number of bases). Measures of RNA quality and yield included
RNA concentration (ng/ml), total RNA yield or mass (mg), r26S/
18S, RIN, OD 260/280 and OD 260/230. Sequencing methodology refers to factors that were under experimenter control; this
included the mass of total RNA sequenced (mg) and the sequencing
PLOS ONE | www.plosone.org
Results
Likelihood of Isolating High-quality RNA
When presented with a previously unstudied plant tissue, most
researchers first attempt RNA isolation using one of several
commercially available plant RNA isolation kits. To provide
researchers with a priori guidance on the likelihood of success we
asked: What is the success rate of a commonly employed
commercial kit in extracting high-quality RNA for NGS (see
Appendix S1)? For each of 81 samples that represented a wide
phylogenetic diversity of plants we isolated total RNA from fresh
leaf material using Qiagen’s RNeasy Plant Minikit (Appendix S1,
Protocol 1; Table S2); 46% of these samples met our criteria for
yield and quality of total RNA (see Methods). We attempted an
additional extraction using RNeasy’s alternative protocol (Appendix S1, Protocol 2) on 37 of the samples that failed using Protocol
1; 32% (12) of these samples met our criteria for RNA quality and
mass. Thus, using a commonly used commercial extraction kit
(Protocol 1) and its prescribed alternative modification (Protocol
2), we isolated total RNA from 66% of samples that met our
criteria for sequencing. However, certain taxa never yielded highquality RNA using commercial kits, and therefore 16 alternative
6
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
algae and plants with high confidence (,E210). All measures of
sequencing depth and transcriptome size varied significantly
among plant tissues (Table 1). Number of bases sequenced,
number of bases sequenced with high-quality (‘‘Q20 bases’’), and
number of sequence reads were consistently lowest in green algal
cells (hereafter just ‘‘algae’’ or ‘‘algal cells’’) and highest in floral,
fruit and bud tissues (Figure 3C, 3D, Table S6). There was weaker
but still significant variation among plant tissues for number of
large scaffolds ($1000 bp) (Table 1, Figure 3D). Again, algal cells
yielded the lowest number of scaffolds, whereas flowers, fruits and
buds returned the highest numbers of scaffolds. However, these
last contrasts were never significant in a posteriori pairwise
comparisons that corrected for multiple tests (Table S7).
Unexpectedly, age of tissue (i.e., young freshly expanding vs.
mature fully expanded but non-senescing tissue) had only a weak
effect on RNA quality and no clear effect on sequencing results
(Table 1). Although younger leaves gave higher yields of RNA (see
Methods), the only effect of tissue age on RNA quality was seen for
RIN (F1,41 = 48.91, P,0.001), which was 13% higher in young
tissues (mean = 6.93, SE = 0.08) than in mature tissues
(mean = 6.14, SE = 0.07). There was no clear effect of tissue age
on any metric of transcriptome size including the number of large
scaffolds (Table 1). The only exception to this result was seen when
we reduced the minimum threshold size cutoff of scaffold
assemblies to 500 bp, at which point young tissue produced
7.5% more scaffolds than older tissue (F1,31 = 7.1, P = 0.01).
protocols were developed and/or implemented for the remaining
samples (see Appendix S1).
Effects of Tissue Type on RNA Quality and Assembled
Transcriptome Size
Yield and quality of total RNA isolated from samples varied
substantially among plant tissue types. The mass of extracted RNA
varied among tissues by more than 350% (P = 0.007), with flowers,
buds and tissue mixtures yielding the most RNA and fruits
producing the least (Figure 3A, Table 1, Table S4, S5). Measures
of RNA quality, including r26S/18S, RIN (Figure 3B), and OD
260/280, also significantly varied among plant tissues (Table 1).
With the exception of a fairly strong correlation between r26S/
18S and RIN (r = 0.64, P,0.001), descriptors of RNA quality were
not highly correlated with one another. For example, although the
mass of RNA isolated from flowers was 3.56 greater than RNA
mass from fruits (P = 0.03), average RIN was 37% lower in flowers
than in fruits, although not significantly (P = 0.21). Means and
statistical comparisons among all tissues are available online
(Table S4, S5).
Across all sequenced samples, our assembly of transcriptomes
recovered a large number of putative transcripts, with a mean of
15,512 scaffolds $500 bp (6192 [stderr], N = 629 samples) in
length, 7444 scaffolds $800 bp (6124 [stderr], N = 629) and 4997
scaffolds $1000 bp (699 [stderr], N = 629). As described in the
methods, 88% (61%, 95% CI) of long scaffolds ($1000 bp)
matched a known protein in the GenBank nr protein database for
Figure 3. Variation among tissue types in RNA quality and transcriptome size. We observed differences among tissue types for (A) total
RNA mass (mg) isolated, (B) RIN, (C) sequencing depth and (D) number of scaffolds. For each tissue, we show the mean +1 SE and sample size at the
base of columns. A posteriori pairwise contrasts among means corrected for multiple comparisons are shown in Supplemental Tables 3 and 5.
doi:10.1371/journal.pone.0050226.g003
PLOS ONE | www.plosone.org
7
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
Table 1. Effects of tissue type and age on metrics of RNA quality and sequencing.
Tissue type
2
ndf , ddf
3
Tissue age
F
4
P
5
ndf, ddf
F
P
RNA quality
RNA mass1
7,24
3.77
0.007
1,40
1.36
0.25
0.33
r26S/18S
7,1071
10.06
,0.001
1,41
4.56
RIN
7,1061
5.14
,0.001
1,41
48.91
,0.0001
OD 260/280
6,503
3.16
0.005
1,40
0.58
0.45
OD 260/230
6,383
1.30
0.26
–
–
–
Sequencing
Bases
7,576
7.95
,0.001
1,31
2.63
0.12
Q20 bases
7,576
13.23
,0.001
1,31
2.66
0.11
Reads
7,576
10.98
,0.001
1,31
2.54
0.12
Scaffolds
7,574
2.33
0.024
1,31
0.87
0.36
Significant effects (P,0.05) are shown in bold.
1
Measured as mg of total RNA isolated from a given tissue.
2
Numerator degrees of freedom (ndf) of F-statistic.
3
Denominator degrees of freedom (ddf) of F-statistic. ddf are low for RNA mass because an unequal variance model was used to account for heteroscedasticity in
residuals among tissues.
4
F-statistic from analysis of variance (ANOVA).
5
P-value of F-statistic given ndf and ddf.
doi:10.1371/journal.pone.0050226.t001
Predicting Size of Assembled Transcriptomes
RNA quality and covariates associated with sequencing
methodology both predicted the number of large scaffolds
assembled from individual samples. For the data set that included
OD ratios, we compared all 511 possible linear models (Table S8)
using maximum likelihood statistics (see Methods). The best-fitting
model (AIC = 3187) contained seven explanatory variables that
accounted for 36% of the total variation in the number of scaffolds
(Table 2, Table S8). This best-fitting model included tissue type,
r26S/18S, RIN, OD 260/280, OD 260/230, as well as
sequencing platform (GAIIx vs. HiSeq) and the mass of total
RNA used in library construction; a model that also included RNA
concentration fit almost equally well. Among variables relating to
tissue type and RNA quality, only RIN and OD 260/230 were
significant predictors of the number of large scaffolds (Table 2).
Both RIN and OD 260/230 were positively correlated with the
number of large scaffolds, and together they accounted for 8.9% of
the total variation (Figure 4, Table 2). Variables associated with
sequencing methodology accounted for most of the variation in the
number of large scaffolds. Specifically, there were significantly
fewer scaffolds from transcriptomes sequenced on the HiSeq
platform compared to sequences generated from the GAIIx
platform; this one factor accounted for 22% of the total variation
in the number of large scaffolds (Figure 4, Table 2). This result was
unexpected because the HiSeq platform had a nominally longer
read length (90 bp) than the GAIIx platform (74 bp). There was
also a weak yet significant negative relationship between mass of
total RNA used to construct libraries for sequencing and number
of scaffolds (Table 2).
We also used a larger data set (see Methods) that lacked OD
ratios to predict the number of large scaffolds. Among all possible
linear models (Table S9), the model with the highest information
content included the same variables (tissue type, r26S/18S, RIN,
sequencing platform and the amount of RNA sequenced;
AIC = 9174) as the best-fitting model described above (minus the
OD ratios). This model accounted for less total variation (14.5%)
PLOS ONE | www.plosone.org
Table 2. Statistical significance of explanatory variables in the
best-fitting models for the data set with OD ratios and
without OD ratios.
df1
Variable
F
P
r2
0.002
OD ratios included
Tissue type
0.23
0.791
Sequencing platform 1,174
2,174
59.27
,0.001
0.219
r26S/18S
1,174
0.90
0.344
0.003
RIN2
1,174
9.38
0.003
0.035
RNA seq3
1,174
8.11
0.005
0.030
OD 260/280
1,174
3.46
0.065
0.013
OD 260/230
1,174
14.69
,0.001
0.054
0.003
OD ratios excluded
Tissue type
0.28
0.96
Sequencing platform 2,499
7,499
30.35
,0.001
0.104
r26S/18S
1,499
1.00
0.318
0.002
RIN
1,499
8.08
0.005
0.014
RNA seq
1,499
12.77
,0.001
0.022
The best-fitting models were determined by comparing AIC values among
models that considered all possible combinations of explanatory variables.
Statistical significance was determined using an ANOVA model with type III
sums-of-squares (SS). Variables with P,0.05 are shown in bold. Partial r2 values
(coefficient of determination) were determined by dividing SS values of each
factor by total SS.
1
Numerator (first number) and denominator (second number) degrees of
freedom (df) for F-test.
2
RNA integrity number (RIN).
3
Mass of total RNA sequenced.
Other abbreviations as per Table 1.
doi:10.1371/journal.pone.0050226.t002
8
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
plant transcriptomes. First, we identified specific tissues and
metrics of RNA quality that provide significant predictors of the
number of large scaffolds, but these variables explained relatively
little total variation in the data. Second, components of sequencing
methodology (e.g., sequencing platform) had a larger impact on
the number of scaffolds than metrics of RNA quality. Third,
Illumina sequencing and assembly of scaffolds were fairly robust to
variation in RNA quality. Based on these results, we recommend
researchers and genomic facilities optimize RNA quality and
sequencing methodology as much as possible, but still sequence
samples of low to moderate quality as these samples can yield large
transcriptomes sequenced by Illumina.
in the number of large scaffolds than the previous model (Table 1).
As with the smaller data set, RIN, sequencing platform and total
amount of RNA sequenced were significant predictors of the
number of large scaffolds, and directionality of correlations was
identical to those described above (Figure 4). In this larger data set,
number of bases sequenced was also positively correlated with
number of scaffolds (P,0.001, r2 = 0.02), but including the
number of bases in the model led to a weaker overall fit of the
data (DAIC = 15.9) compared to models that excluded this
covariate.
Finally, we evaluated the optimal method of shipping RNA
using a subset of samples (see Methods). Samples were either sent
as frozen RNA extract on dry ice (N = 100) or as dehydrated total
RNA (N = 68) using Genvault’s GenTegra RNA kit (IntegenX,
Pleasanton, CA). Genvault samples had 17% more scaffolds on
average (56756265 scaffolds) than samples sent as frozen extracts
(48426211 scaffolds) (F1,167 = 6.29, P = 0.013), and this effect
explained 3.6% of the total variation in the number of scaffolds
among samples.
Effects of Tissue Type on RNA Quality and Transcriptome
Sequencing
As expected, quality of RNA and size of transcriptomes varied
among tissue types [30], but not always in ways that were
consistent with conventional wisdom. Some tissues such as flowers
and roots resulted in a high yield and quality of total RNA,
whereas the most commonly harvested tissue (leaves) resulted in
relatively modest to low RNA yields and quality (Figure 3, Table
S4). Although many of these results show strong statistical support,
we caution that they should be viewed tentatively given the low
sample sizes and limited phylogenetic coverage of some tissues.
Discussion
Three results from our study have important implications for
the isolation, sequencing and assembly of phylogenetically diverse
Figure 4. Factors that significantly predicted the number of large scaffolds. Among our measures of RNA quality, (A) RNA integrity number
(RIN) and (B) OD 260/230 ratio were the strongest predictors of the number of scaffolds $1000 bp. (C) Sequencing platform also had a strong effect
on number of large scaffolds (P,0.001, Table 2; numbers at the base of bars show sample size), and (D) mass of RNA sequenced had a weak but
detectable effect (see Table 2). Note, for most samples we used 20, 30 or 40 mg of total RNA for sequencing, but a few samples used intermediate or
lower amounts.
doi:10.1371/journal.pone.0050226.g004
PLOS ONE | www.plosone.org
9
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
[40,41]. This is also an unlikely answer. For example, algae were
disproportionately represented on the HiSeq platform (18% of
HiSeq samples were algae vs. 7% of GAIIx samples), but numbers
of large scaffolds sequenced from algae and non-algae were
statistically equivalent (P = 0.17) when we restrict our analysis to
just HiSeq samples.
A third possible explanation relates to differences in the true
quality of sequenced bases on the two platforms. Sequence quality
is quantified as Q-scores, where higher Q indicates lower
sequencing error. Unfortunately, no direct comparison of Qscores can be made between samples sequenced on the GAIIx and
HiSeq platforms because the software used to estimate Q-scores
on the GAIIx platform systematically underestimates Q-scores and
thus over-estimates error rates [22], whereas recent versions of the
software used for the HiSeq platform quantify error rates of
sequenced base pairs more accurately [22]. Hence the number of
high quality bases that are useful for transcriptome assembly is
higher than indicated for the GAIIx platform, even when we
compare the two platforms by a simple Q20 threshold, because
they do not define Q values in an equivalent sense.
Despite the large differences in RNA quality and sequencing
depth, we observed little variation in the number of scaffolds
among tissues (Figure 3D). Leaves did contain fewer scaffolds than
other tissues, but no tissues were significantly different from one
another in pairwise comparisons that adjusted for type-I error
(Table S7). Perhaps the most surprising result was the lack of a
large difference in RNA quality between young and mature leaves
(Table 1). Many researchers report difficulties in extracting nucleic
acids from older tissues [35, p. 116], and our result showing that
young tissue has higher RNA integrity and yield compared to
mature tissue partially supports this view. Despite this fact, our
results also show that if RNA can be extracted, most measures of
RNA quality are comparable between young and mature tissues,
and the number of large scaffolds is statistically equivalent.
Predicting Size of Transcriptomes
Our results identify specific metrics of RNA quality that predict
the size of assembled transcriptomes. Specifically, measures of
RNA integrity (RIN) and purity (OD 260/230) predicted the
number of large scaffolds assembled from samples. Since RIN
quantifies RNA degradation [20], our results imply that degraded
RNA prevents assembly of large scaffolds from sequenced
transcriptomes. Similarly, our results indicate that impurities in
RNA samples interfere with cDNA synthesis from mRNA because
OD 260/230 ratios below 2 indicate the presence of contaminants
(e.g., salts and organic compounds). This result is expected given
studies that show impurities, including ubiquitous plant secondary
metabolites, can inhibit reverse transcriptase [37,38].
Although we identify significant predictors of transcriptome
sequence/assembly quality, it is noteworthy that these predictors
account for less than 10% of the total variation in the number of
large scaffolds. There are two principal explanations for this weak
predictive value. First, transcriptomes are by their very nature
complex mixtures of many types of RNA that dynamically change
in space and time [30,32], and we specifically targeted mRNA,
which comprises only 1–5% of the total RNA in a cell [13]. As
such, all metrics of total RNA quality are bound to be coarse
depictions of quality and quantity of mRNA. Second, sequencing
and assembling transcriptomes is associated with many steps in
which noise and bias can be introduced to obscure an otherwise
clear effect of RNA quality [39]. Therefore, our results suggest that
maximizing RIN and OD 260/230 will increase success of
transcriptome sequencing to a point (Figure 4A, 4B), but it is still
possible to sequence and assemble transcriptomes with suboptimal
levels of these and other measures of RNA quality.
The most counter-intuitive result was the relationship between
sequencing platform and number of large scaffolds. Contrary to
our expectation, Illumina’s HiSeq platform, which produced
predominantly 90 bp reads, resulted in 34% fewer large scaffolds
than samples sequenced on Illumina’s GAIIx platform, which
produced 74 bp reads. This result is surprising because the HiSeq
platform was designed to increase sequencing depth and produce
longer high-quality reads that should facilitate the assembly of
longer scaffolds and larger transcriptomes. Why then do we find
the opposite to be true in our large and phylogenetically diverse
data set? One possibility is that the sequencing chemistries of the
HiSeq and GAIIx platforms are biased in their ability to sequence
GC-rich templates, which could lead to systematic biases when
sequencing inherently GC-rich mRNA. This does not seem to
explain our result because a comparison of HiSeq and GAIIx
platforms finds no clear bias in %GC content of scaffolds (Figure
S1). A second possible explanation is that samples sequenced on
the HiSeq platform represent a phylogenetically biased subset of
species with a lower diversity of expressed genes, such as algae
PLOS ONE | www.plosone.org
Practical Advice for Sequencing Plant Transcriptomes
Given the results of this study and the collective experience of the
many collaborators involved in 1KP, we are able to provide plant
researchers with practical recommendations for maximizing success
of isolating high-quality total RNA from plants for NGS (also see
Appendix S1). First, flash-freezing fresh tissue immediately upon
collection ensures the highest quality RNA with the least
degradation [26]. Second, using flower, bud or young root tissues
can facilitate the isolation of high-quality total RNA [35, p. 116].
Third, smaller amounts of young tissue generally yield as much
RNA as larger amounts of mature tissue. Fourth, use of
commercially available kits and easily implemented hybrid protocols are the most efficient method for successfully isolating total
RNA for NGS when working with new types of tissue or previously
unstudied species. If these extractions do not provide the desired
result we recommend choosing the isolation protocol in Appendix
S1 used for the taxa most closely related to the focal organism of the
experimenter (see Table S1). Fifth, working in an RNase-free
environment is essential for successful isolation and preservation of
total RNA. Sambrook and Russell [13] provided excellent advice on
how to avoid RNase contamination, and we emphasize here that
various chemicals can be added to solutions (e.g., DEPC-treated
water) as well as to working surfaces and equipment (e.g., RNase
Zap, Ambion, Austin, TX) to inactivate RNase enzymes.
Our experience in sequencing and assembling over six hundred
transcriptomes also allows us to make conclusions and recommendations that promote implementation of NGS for the study of
transcriptomes. Optimizing RNA integrity (RIN) and purity (OD
260/230) will improve size of assembled transcriptomes. However,
we show that NGS is robust to variation in RNA quality, and
therefore researchers and genome facilities should consider
sequencing samples that show significant deviation from optimal
levels of RIN scores, OD 260/230 ratios and certain other metrics
of RNA quality. Implementing this recommendation could save
researchers and sequencing facilities considerable time and money
presently allocated for optimizing samples. Also, shipping dehydrated RNA to genome centers, instead of frozen extract, can
further increase sequencing success.
Conclusions
NGS is a powerful method for studying plant transcriptomes,
and it is now possible to sequence virtually all expressed genes in a
given tissue. Our study developed and evaluated many protocols
10
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
Table S5 P-values for a posteriori pairwise contrasts of
that can facilitate the isolation of total RNA from most green
plants. We have also identified several factors that affect the
sequencing and assembly of transcriptomes. As a whole, these
results provide a resource to increase success and efficiency of
NGS to tackle the next generation of questions in plant biology.
RNA quality among tissue types. P-values are adjusted for
multiple comparisons within each variable using the TukeyKramer correction method. P-values ,0.05 are bolded.
(PDF)
Table S6 Least-squares means of descriptors of se-
Supporting Information
quencing success among plant tissue types. The leastsquares mean 61 SE and sample size are provided for each tissue
type.
(PDF)
Figure S1 A comparison of the frequency of scaffolds
according to variation in % GC content between samples
sequenced on HiSeq versus GA II platforms. Distributions
are broadly overlapping. The long right-tail from the HiSeq
samples is caused by a disproportionate number of algae samples
sequenced on that platform. These algae exhibited especially rich
GC transcripts, which is consistent with the results of published
whole genome sequences of green algae [40,41].
(PDF)
Table S7 P-values for a posteriori pairwise contrasts of
various sequencing metrics among different tissue
types. P-values are adjusted for multiple comparisons within
each variable using the Tukey-Kramer correction method. Pvalues ,0.05 are bolded.
(PDF)
Table S1 Raw data used to analyze RNA quality,
sequencing depth and number of scaffolds. For each sample
we provide the species and family name, the tissue collected (Yyoung, expanding tissue; M-mature, fully expanded but nonsenescing tissue), the contributor of the sample, and the RNA
isolation method (numbers correspond to Supplementary Protocols). Metrics of RNA quality and yield include the mass of total
RNA isolated, RNA concentration, the ratio of the large RNA
ribosomal subunit to the small RNA ribosomal subunit (r28S/18S),
the RNA Integrity Number (RIN), OD 260/280, and OD 260/230.
Sequencing data includes the sequencing platform, read length
(averaged between the paired-ends), mass of total RNA sequenced,
number of reads, number of bases, number of bases that surpassed
the Q20 threshold and the number of scaffolds (c) greater than
100 bp (c100), greater than 200 bp (c200), and so on and so forth.
We also include the number of scaffolds that fall within bins (b) of
different sizes, ranging from b100 (i.e., the number of scaffolds 100–
199 bp long) to b1000 (number of scaffolds greater 1000).
(XLS)
Table S8 The statistical fit of all possible combinations
of explanatory factors in the data set that included OD
ratios. Fit of models was statistically compared using maximum
likelihood statistics according to the Akaike information criterion
(AIC). Models are arranged in the order of best-fitting models
(lowest AIC) to poorer fitting models (higher AIC). Inclusion or
absence of explanatory variables from models is shown by 1 and 0,
respectively.
(PDF)
Table S9 The statistical fit of all possible linear
combinations of factors in the large data set that
excluded OD ratios. The fit of models was statistically
compared using maximum likelihood statistics according to the
Akaike information criterion (AIC). Models are arranged in the
order of best-fitting models (lowest AIC) to poorer fitting models
(higher AIC). Inclusion or absence of explanatory variables from
models is shown by 1 and 0, respectively.
(PDF)
The success/failure of RNA isolations using
Qiagen’s RNeasy Plant Minikit (see Appendix S1,
Protocol 1) and alternative hybrid protocol (see Appendix S1, Protocol 2). Successful isolations were those that met the
stringent criteria of two isolations of the same tissue resulting in a
concentration of .150 ng/mL, a total mass of 20 mg RNA, OD
260/280.1.9, OD 260/230.1.5, r26S/18S .1 and RIN .8. A
‘1’ denotes that the sample met these criteria and ‘0’ indicates that
the sample did not meet these criteria; ‘2’ indicates the method
was not attempted for the sample. In most cases, if a sample failed
with the RNeasy kit (Appendix S1, Protocol 1) we attempted the
alternative Qiagen recommended protocol (Appendix S1, Protocol
2). Family names are according to APG III (2009). Except when
noted, we used freshly expanding leaves for all isolations.
(PDF)
Table S2
Appendix S1 Eighteen protocols used to isolate total
RNA from plant tissue included in this study.
(PDF)
Acknowledgments
We thank S. Allen, Z. Cebi, L. Chen, N. Feja, Y. Ou, Y. Peng, and X. Tan
for assisting with RNA extractions and library construction, and Z. Xu and
the BGI sequencing team for sequencing samples. L. Csiba collected and
processed samples for shipping at the Royal Botanic Gardens, Kew, UK. S.
Brady, D. Guttman, B. Joyce, P. Wang and N. Provart helped interpret
results. This research benefited from M. Johnson’s affiliation with
University of Toronto’s CAGEF and is enabled by the use of computing
resources provided by WestGrid and Compute/Calcul Canada. We thank
the many plant collectors who contributed samples to the 1KP initiative.
Table S3 Correlations in the number of scaffolds with
different minimum threshold size cutoffs used during
assemblies. For example, the column labeled 500 bp contains
all scaffolds that are 500 bp or larger. Pearson product moment
correlation coefficients (r) are shown in the upper triangular
matrix, and P-values are shown in the lower triangular matrix.
(PDF)
Author Contributions
Conceived and designed the experiments: MTJJ EJC MKD GKSW D.
Soltis S. Graham JLM SC. Performed the experiments: MTJJ EJC ZT RB
JNB CTC MWC NDC SC CWD PPE FG S. Greiner S. Graham JMH
IJT TMK JLM MM NM HM JP JCP PR MR RFS D. Stevenson PS D.
Soltis CNS BS CJMT JCV XW YZ MKD GKSW. Analyzed the data:
MTJJ EJC JP. Contributed reagents/materials/analysis tools: MTJJ EJC
ZT RB JNB CTC MWC NDC SC CWD SWD PPE FG S. Graham S.
Greiner JMH IJT TMK JLM MM NM HM JP JCP PR MR RFS D.
Stevenson PS D. Soltis CNS BS CJMT JCV XW YZ MKD GKSW.
Wrote the paper: MTJJ EJC ZT XW YZ MKD GKSW.
Table S4 Least-squares means of descriptors of RNA
quality among different plant tissues types. The leastsquares mean 61 SE and sample size are provided for each tissue type.
(PDF)
PLOS ONE | www.plosone.org
11
November 2012 | Volume 7 | Issue 11 | e50226
Plant RNA Isolation and Transcriptome Sequencing
References
1. Hudson ME (2008) Sequencing breakthroughs for genomic ecology and
evolutionary biology. Molecular Ecology Resources 8: 3–17.
2. Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: A revolutionary tool for
transcriptomics. Nature Reviews Genetics 10: 57–63.
3. Hawkins RD, Hon GC, Ren B (2010) Next-generation genomics: an integrative
approach. Nature Reviews Genetics 11: 476–486.
4. Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, et al. (2011)
Genome-wide genetic marker discovery and genotyping using next-generation
sequencing. Nature Reviews Genetics 12: 499–510.
5. Metzker ML (2010) Sequencing technologies - the next generation. Nature
Reviews Genetics 11: 31–46.
6. Martin JA, Wang Z (2011) Next-generation transcriptome assembly. Nature
Reviews Genetics 12: 671–682.
7. Kaufmann K, Muino JM, Osteras M, Farinelli L, Krajewski P, et al. (2010)
Chromatin immunoprecipitation (ChIP) of plant transcription factors followed
by sequencing (ChIP-SEQ) or hybridization to whole genome arrays (ChIPCHIP). Nature Protocols 5: 457–472.
8. Desgagne-Penix I, Khan MF, Schriemer DC, Cram D, Nowak J, et al. (2010)
Integration of deep transcriptome and proteome analyses reveals the
components of alkaloid metabolism in opium poppy cell cultures. BMC Plant
Biology 10: 252.
9. Kulcheski FR, de Oliveira LFV, Molina LG, Almerao MP, Rodrigues FA, et al.
(2011) Identification of novel soybean microRNAs involved in abiotic and biotic
stresses. BMC Genomics 12: 307.
10. Jiao YN, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, et al.
(2011) Ancestral polyploidy in seed plants and angiosperms. Nature 473: 97–
100.
11. Zhao Q-Y, Wang Y, Kong Y-M, Luo D, Li X, et al. (2011) Optimizing de novo
transcriptome assembly from short-read RNA-Seq data: a comparative study.
BMC Bioinformatics 12:(Suppl. 14)(S2).
12. Grierson CS, Barnes SR, Chase MW, Clarke M, Grierson D, et al. (2011) One
hundred important questions facing plant science research. New Phytologist 192:
6–12.
13. Sambrook J, Russell DW (2001) Molecular Cloning: A Laboratory Manual.
Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
14. Macel M, Van Dam NM, Keurentjes JJB (2010) Metabolomics: The chemistry
between ecology and genetics. Molecular Ecology Resources 10: 583–593.
15. Berenbaum MR, Zangerl AR (2008) Facing the future of plant-insect interaction
research: le retour a la ‘‘Raison d’Etre’’. Plant Physiology 146: 804–811.
16. Walter J, Hein R, Auge H, Beierkuhnlein C, Loeffler S, et al. (2012) How do
extreme drought and plant community composition affect host plant metabolites
and herbivore performance? Arthropod-Plant Interactions 6: 15–25.
17. Agrawal AA (2011) Current trends in the evolutionary ecology of plant defence.
Functional Ecology 25: 420–432.
18. Bilgin DD, DeLucia EH, Clough SJ (2009) A robust plant RNA isolation
method suitable for Affymetrix GeneChip analysis and quantitative real-time
RT-PCR. Nature Protocols 4: 333–340.
19. Gayral P, Weinert L, Chiari Y, Tsagkogeorga G, Ballenghien M, et al. (2011)
Next-generation sequencing of transcriptomes: A guide to RNA isolation in
nonmodel animals. Molecular Ecology Resources 11: 650–661.
20. Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, et al. (2006) The
RIN: An RNA integrity number for assigning integrity values to RNA
measurements. BMC Molecular Biology 7: 3.
21. Li RQ, Zhu HM, Ruan J, Qian WB, Fang XD, et al. (2010) De novo assembly
of human genomes with massively parallel short read sequencing. Genome
Research 20: 265–272.
22. Illumina (2011) Quality predictors in RTA v1.12. Pub. No. 770–2011–010. San
Diego, CA.
PLOS ONE | www.plosone.org
23. Parfrey LW, Lahr DJG, Knoll AH, Katz LA (2011) Estimating the timing of
early eukaryotic diversification with multigene clocks. Proceedings of the
National Academy of Sciences of the United States of America 108: 13624–
13629.
24. Wilfinger WW, Mackey K, Chomczynski P (1997) Effect of pH anionic strength
on the spectro-photometric assessment of nucleic acid purity. Biotechniques 22:
474–481.
25. Pico de Coaña Y, Parody N, Fernández-Caldas E, Alonso C (2010) A modified
protocol for RNA isolation from high polysaccharide containing Cupressus
arizonica pollen. Applications for RT-PCR and phage display library construction. Molecular Biotechnology 44: 127–132.
26. Box MS, Coustham V, Dean C, Mylne JS (2011) Protocol: A simple phenolbased method for 96-well extraction of high quality RNA from Arabidopsis. Plant
Methods 7: 7.
27. Babu S, Gassmann M (2011) Assessing integrity of plant RNA with the Agilent
2100 Bioanalyzer. Agilent Technologies. URL: www.agilent.com/genomics/
RNA.
28. Surget-Groba Y, Montoya-Burgos JI (2010) Optimization of de novo
transcriptome assembly from next-generation sequencing data. Genome
Research 20: 1432–1440.
29. Schulz MH, Zerbino DR, Vingron M, Birney E (2012) Oases: robust de novo
RNA-seq assembly across the dynamic range of expression levels. Bioinformatics
28: 1086–1092.
30. Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, et al. (2005) A gene
expression map of Arabidopsis thaliana development. Nature Genetics 37: 501–
506.
31. Barakat A, DiLoreto DS, Zhang Y, Smith C, Baier K, et al. (2009) Comparison
of the transcriptomes of American chestnut (Castanea dentata) and Chinese
chestnut (Castanea mollissima) in response to the chestnut blight infection. BMC
Plant Biology 9: 51–61.
32. Brady SM, Orlando DA, Lee JY, Wang JY, Koch J, et al. (2007) A highresolution root spatiotemporal map reveals dominant expression patterns.
Science 318: 801–806.
33. Wall PK, Leebens-Mack J, Chanderbali AS, Barakat A, Wolcott E, et al. (2009)
Comparison of next generation sequencing technologies for transcriptome
characterization. BMC Genomics 10: 347.
34. Ness RW, Siol M, Barrett SCH (2011) De novo sequence assembly and
characterization of the floral transcriptome in cross- and self-fertilizing plants.
BMC Genomics 12: 298.
35. Farrell RE (2009) RNA Methodologies: Laboratory Guide for Isolation and
Characterization. San Diego, CA: Academic Press.
36. Alfaro ME, Santini F, Brock C, Alamillo H, Dornburg A, et al. (2009) Nine
exceptional radiations plus high turnover explain species diversity in jawed
vertebrates. Proceedings of the National Academy of Sciences of the United
States of America 106: 13410–13414.
37. Formica JV, Regelson W (1995) Review of the biology of quercetin and related
bioflavonoids. Food and Chemical Toxicology 33: 1061–1080.
38. Wilson IG (1997) Inhibition and facilitation of nucleic acid amplification.
Applied and Environmental Microbiology 63: 3741–3751.
39. Ozsolak F, Platt AR, Jones DR, Reifenberger JG, Sass LE, et al. (2009) Direct
RNA sequencing. Nature 461: 814–818.
40. Prochnik SE, Umen J, Nedelcu AM, Hallmann A, Miller SM, et al. (2010)
Genomic analysis of organismal complexity in the multicellular green alga Volvox
carteri. Science 329: 223–226.
41. Merchant SS, Prochnik SE, Vallon O, Harris EH, Karpowicz SJ, et al. (2007)
The Chlamydomonas genome reveals the evolution of key animal and plant
functions. Science 318: 245–251.
12
November 2012 | Volume 7 | Issue 11 | e50226
Appendix S1 Eighteen protocols used to isolate total RNA from plant tissue included in this
study.
Methods for RNA isolation
We attempted to isolate RNA from 1115 plant samples, represented by 695 species of vascular
and non-vascular plants. RNA isolation was achieved using 13 distinct protocols. Many of the
protocols share elements or combine components from several methods. For each method, we
describe the reagents and procedures used, and identify the researchers or institute that
implemented the protocol.
Due to the potential for contamination and degradation by RNase enzymes, as well as
health concerns in handling some substances and chemicals, best practices in aseptic wet lab
techniques must be practiced at all times during RNA isolation. Chief among these are the
critical need to avoid contamination of samples by using extreme care when moving liquids and
opening and closing tubes to avoid aerosols. Because of the risk of degradation by RNase
enzymes, it is essential to use sterile RNase-free equipment, disposable plastics and solutions.
RNase degradation and contamination can be avoided by keeping samples constantly frozen at
low temperature (< -80°C) prior to adding buffers that denature or immobilize RNase. Treating
equipment with RNase denaturants (e.g. RNase Zap, Ambion, Austin, TX) and solutions with
diethylpyrocarbonate (DEPC) can also prevent contamination and/or degradation of samples, but
it can have some negative effects on samples1. Many additional helpful tips for successful RNA
isolation are available in Sambrook and Russell1 and in Appendix A of Qiagen’s RNeasy Mini
Handbook downloadable from www.qiagen.com.
Standard equipment used
Although the reagents vary among protocols, most methods used the same basic equipment,
which include:













Ceramic or porcelain mortar and pestle for tissue homogenization, or some other
equipment that can homogenize frozen tissue (e.g. bead mill)
Water bath or heating block, capable of holding temperatures up to 70°C
Non-refrigerated microcentrifuge (with rotor for 2 ml tubes)
Refrigerated microcentrifuge (with rotor for 2 ml tubes) – only for protocols where noted
Liquid nitrogen (and associated thermos or dewer)
Sterile, RNase-free disposable tips with filter barriers
Sterile, RNase-free disposable microcentrifuge tubes (2 ml and 15 ml were most
commonly used, but sizes vary among protocols)
Pipettors – (capable of pipetting various volumes between 0.1-1000μl)
Stainless steel spatulas
RNase-free water
Sterile razor blades
Analytical balance
Glass beakers, flasks and graduated cylinders
1 Protocol 1: Qiagen RNeasy Plant Minikit
Implemented by: Jim Leebens-Mack and Charlotte Carrigan
The first protocol used for isolation of total RNA was Qiagen’s RNeasy Plant Minikit (Qiagen,
Valencia, CA). These kits are commercially available and widely used in RNA isolation. For
detailed methods associated with this kit we refer readers to the “RNeasy Mini Handbook”
available for download at www.qiagen.com.
Protocol 2: McKenzie et al’s Qiagen hybrid method
Implemented by: Jim Leebens-Mack and Charlotte Carrigan
This protocol was developed by McKenzie et al2 to facilitate the isolation of RNA from woody
plants rich in phenolics and polysaccharides, such as grapes (Vitaceae) and fruit bearing
Rosaceae (apples, cherries and pears). The protocol is provided on Qiagen’s website as an
alternative method to be used in combination with their RNeasy Plant Minikit (see
www.qiagen.com/products/rnastabilizationpurification/rneasysystem/rneasyplantmini.aspx#Tabs
=t2, under the “Protocols” tab). We repeat the protocol here in the event that this protocol is
removed from Qiagen’s website or readers find it difficult to obtain the original publication.
Reagents
Lysis Buffer:
 4 M guanidine isothiocyanate
 0.2 M sodium acetate, pH 5.0
 25 mM EDTA
 2.5% (w/v) PVP -40 (polyvinylpyrollidone, average molecular weight, 40,000)
 1% (v/v) β-mercaptoethanol (β-ME); add immediately before use.
Other reagents:
 20% (w/v) sarkosyl
 Ethanol (96-100%)
Procedure
1. Grind sample (up to 50 mg) in liquid nitrogen to a fine powder using a mortar and pestle.
Transfer the tissue powder and liquid nitrogen to an appropriately sized tube, and allow
the liquid nitrogen to evaporate. Do not allow the sample to thaw. Continue immediately
with step 2
Note: Incomplete grinding of the starting material will lead to reduced RNA yields.
2. Add 600 μl lysis buffer to a maximum of 50 mg of tissue powder. Vortex vigorously.
3. Add 60 μl of 20% sarkosyl. Incubate at 70°C in a water bath or heating block for 10 min
with vigorous shaking or intermittent vortexing.
2 Note: For samples with a high starch content, incubation at elevated temperatures should
be omitted to prevent swelling of the starting material.
4. Pipet the lysate directly onto a QIAshredder Spin Column (lilac-colored column, supplied
in the RNeasy Plant Mini Kit) placed in a 2 ml collection tube. Centrifuge for 2 min at
maximum speed (>14,000 g [rcf]).
5. Carefully transfer the supernatant of the flow-through fraction to a new microcentrifuge
tube without disturbing the cell-debris pellet in the collection tube.
Note: It may be necessary to cut the end off the pipet tip in order to pipet the lysate onto
the QIAshredder Spin Column. Centrifugation through the QIAshredder Spin Column
removes cell debris and simultaneously homogenizes the lysate. While most of the cell
debris is retained on the QIAshredder Spin Column, a very small amount of cell debris
will pass through and form a pellet in the collection tube. Be careful not to disturb this
pellet while transferring the lysate to a new microcentrifuge tube.
6. Add 0.5 volumes (usually 300 μl) ethanol (96–100%), and mix well by pipetting. If some
lysate is lost during homogenization (step 4), reduce volume of ethanol proportionally. A
precipitate may form after the addition of ethanol, but this will not affect the RNeasy
procedure.
7. Continue with step 6 of the RNeasy Mini Protocol for Isolation of Total RNA from Plant
Cells and Tissues and Filamentous Fungi in the RNeasy Mini Handbook.
Protocol 3: CTAB-PVP Method
Implemented by: Beijing Genomics Institute
Note: a similar protocol was used by C. dePamphilis and P. Ralph
Reagents:
CTAB-PVP Buffer:
 CTAB (Hexadecyltrimethylammonium bromide; 2% w/v)
 PVP-40 (2% w/v)
 100mM Tris-HCl (pH8.0)
 25 mM EDTA
 2 M NaCl (Warmed to 65Ԩ in a water bath to suspend in solution)
 Add β-ME to final concentration of 2% before use
SSTE buffer:
 1 M NaCl
 SDS (0.5% w/v)
 10mM Tris-HCl (pH8.0)
 EDTA (1 mM)
Other reagents:
 75% ethanol (treated with 0.1% DEPC)
3 







96-100% ethanol
Acid phenol (pH4.5)
Chloroform
Isoamyl alcohol
10 M LiCl
Glycogen (5mg/ml)
3M Sodium Acetate (pH5.2)
RNase free water
Procedure
1. Grind tissue to a powder in liquid nitrogen.
2. Add 200-500 mg of ground, frozen tissue to 3.0 ml of pre-heated extraction buffer in a 5
ml tube.
3. Vortex the tube until the tissue is mixed with the buffer.
4. Incubate the tube at 65Ԩ for 30 min. (min), vortexing briefly (15 seconds) every 2-3 min
during the incubation.
5. Aliquot the mixture into four 2 ml-RNase free tubes, 1 ml each tube.
6. Spin the tube at 12,000g for 10 min in a microcentrifuge. All of the insoluble matter
should form a pellet at the bottom of the tube.
Note: When used for algae the centrifugation was performed at 3,000g
7. Pour the supernatant into a new 2 ml RNase free tube.
Note: When used for algae, 15 ml tubes were used.
8. Add equal volume of 24:1 chloroform:isoamyl alcohol to fill the tube.
9. Vortex tubes until the phases mix and appear cloudy, then incubate at 20-25°C for 5 min..
10. Spin the tubes at 12,000g for 10 min in a microcentrifuge.
Note: When used for algae the centrifugation was performed at 3,000g
11. Transfer the upper, aqueous phase to new 2 ml RNase free tubes, repeat step 7 to 9 one
more time.
Note: When used for algae, 15 ml tubes were used.
12. Transfer the upper, aqueous phase to new 2 ml RNase free tubes, add 1/3 volume of 10M
LiCl to each tube, mix and let stand at 4Ԩ for 6-8 hrs or overnight to precipitate RNA.
13. Spin the tubes at 18,000 g for 20 min in a microcentrifuge and decant the supernatant,
taking care not to lose the pellet
14. Add 1 ml 75% ethanol to the pellet.
15. Spin the tube at maximum speed (>11,270 g) for 5 min in a microcentrifuge, decant the
supernatant carefully.
16. Repeat steps 14 and 15 one more time.
17. Open cap and air-dry the pellet.
4 18. Add 30μl RNase free water to dissolve the pellet, and then add 70 μl SSTE buffer to each
tube.
19. Combine all 4 tubes into 1 tube.
20. Add equal volume of 25:24:1 acid phenol:chloroform:isoamyl alcohol to the tube.
21. Vortexes the tubes until the phases mix and appear cloudy, then incubate at 20°C for 5
min..
22. Spin the tube at 12,000g for 10 min.
23. Transfer the upper, aqueous phase to a new 2 ml RNase free tube and add equal volume
of 24:1 chloroform:isoamyl alcohol to the tube.
24. Vortex the tubes until the phases mix and appear cloudy, then incubate at 20°C for 5
min..
25. Spin the tube at 12,000g for 10min in a microcentrifuge.
26. Transfer the upper, aqueous phase to a new 2 ml RNase free tube, add 2 volumes of
cooled 100% ethanol, 1/10 volumes of NaAc (pH5.2) and 2 μl glycogen, mix and
incubate at -20Ԩ for 2 hrs.
27. Spin the tube at 18,000 g for 20 min in a microcentrifuge, and then decant the
supernatant, taking care not to lose the pellet
28. Add 1 ml 75% cooled ethanol to the pellet and leave at 20°C for 3 min..
29. Centrifuge at 4°C for 5 min at 12,000g. Decant the liquid carefully, taking care not to
lose the pellet. Briefly centrifuge to collect the residual liquid and remove it with a
pipette.
30. Wash the pellet twice with cooled 75% DEPC-ethanol, open cap and air-dry the pellet.
31. Add 30μl RNase-free water to dissolve the pellet.
32. Treat RNA with DNase I as per supplier’s protocols.
Protocol 4: CTAB-PVP-TRIzol Method
Implemented by: Beijing Genomics Institute
Reagents
CTAB-PVP buffer:
 CTAB (2% w/v)
 PVP-40 (2% w/v)
 100mM Tris-HCl (pH8.0)
 25mM EDTA
 2 M NaCl
 Spermide (0.5g/L),
(Warmed to 65Ԩ in a water bath to suspend in solution)
 Add β-ME to final concentration of 2% before use
5 SSTE buffer:
 1 M NaCl
 SDS (0.5% w/v)
 10mM Tris-HCl (pH8.0)
 1mM EDTA
Other Reagents:
 75% ethanol (DEPC treated)
 100% ethanol
 Acid phenol (pH4.5)
 Chloroform
 Isoamyl alcohol
 10M LiCl
 Glycogen (5mg/ml)
 3M NaAc pH5.2
 TRIzol reagent (Invitrogen)
 RNase free water
Procedure
1. Grind tissue to a powder in liquid nitrogen.
2. Add 200-500 mg of ground tissue to 3.0 ml of pre-heated extraction buffer in a 5 ml tube.
3. Vortex the tube until the tissue is mixed with the buffer.
4. Incubate the tube at 65Ԩ for 30 min., vortexing briefly (15 seconds) every 2-3 min
during the incubation.
5. Aliquot the mixture into four 2 ml RNase free tubes, 1 ml in each tube.
6. Spin the tube at 12,000g for 10 min in a centrifuge. All of the insoluble matter should
form a pellet at the bottom of the tube.
7. Pour the supernatant into a new 2 ml tube.
8. Add an equal volume of 24:1 chloroform:isoamyl alcohol to fill the tube.
9. Vortex tubes until the phases mix and appear cloudy; incubate at 20°C for 5 min..
10. Spin the tubes at 12,000g for 10 min in a centrifuge.
11. Transfer the upper aqueous phase to new 2 ml RNase free tubes, repeat steps 8 to 10 one
more time.
12. Transfer the upper, aqueous phase to new 2 ml RNase free tubes, add 1/3 volume of 10M
LiCl to each tube, mix and let stand at 4Ԩ for 6-8 hrs or overnight to precipitate RNA.
13. Spin tubes at 18,000g for 20 min in a centrifuge and decant the supernatant, taking care
not to lose the pellet.
14. Add 1 ml 75% cooled ethanol to the pellet.
6 15. Spin the tube at maximum speed for 5 min in a centrifuge, decant the supernatant
carefully. Repeat steps 14 and 15 one more time.
16. Open cap and air-dry the pellet.
17. Add 30µl RNase free water to dissolve the pellet, and then add 300μl TRIzol reagent and
equal volume of chloroform to TRIzol reagent (Invitrogen). Vortex vigorously and store
at 20°C for 5 min.
18. Centrifuge at >12,000g for 10 min..
19. Transfer the upper, aqueous phase to a new 2 ml RNase free tube, add 2 volumes of
cooled 100% ethanol, 1/10 volume of NaAc and 2μl of glycogen. Mix and incubate at ----20Ԩ for 2 hrs.
20. Spin tubes at >12,000g for 20 min at 4Ԩ in a centrifuge
21. Decant the supernatant taking care not to lose the pellet and add 1 ml 75% ethanol to the
pellet. Let tube stand at 20°C for 3 min..
22. Centrifuge at 4°C for 5 min at >12,000g. Decant the liquid carefully, taking care not to
lose the pellet. Briefly centrifuge to collect the residual liquid and remove it with a
pipette.
23. Repeat steps 21 and 22 one more time.
24. Open cap and air dry the pellet.
25. Add 10-30µl RNase free water to dissolve the pellet.
Protocol 5: pBIOZOL Method
Implemented by: Beijing Genomics Institute
Reagents
 5M NaCl
 Chloroform
 Isopropyl alcohol
 75% ethanol (DEPC treated)
 pBIOZOL Reagent (Beijing Bai billion New Technology Co., Beijing, China)
 RNase-free water
Procedure
1. Grind tissue to a powder in liquid nitrogen.
2. Add 1.3 ml of cold (4°C) pBIOZOL Reagent for up to 100 mg of frozen, ground tissue.
Mix by briefly vortexing or flicking the bottom of the tube until the sample is thoroughly
suspended.
3. Incubate the tube for 5 min at 20°C.
Note: Lay the tube down horizontally to maximize surface area during RNA extraction.
4. Centrifuge for 2 min. at 12,000g. Transfer the supernatant to an RNase-free tube.
7 5. Add 100μlof 5M NaCl to the extract and tap tube to mix.
6. Add 300μl of chloroform. Mix thoroughly by inversion.
7. Centrifuge the sample at 4°C for 10 min. at 12,000g to separate the phases. Transfer the
top aqueous phase to an RNase-free tube.
8. Add to the aqueous phase an equal volume of isopropyl alcohol. Mix and let stand at
20°C for 10 min.
9. Centrifuge the sample at 4°C for 10 min. at 12,000g.
10. Decant the supernatant, taking care not to lose the pellet and add 1 ml of chilled 75%
ethanol to the pellet.
Note: Pellet may be difficult to see.
11. Centrifuge at room temperature for 5 min at 12,000g. Decant the liquid carefully, taking
care not to lose the pellet. Briefly centrifuge to collect the residual liquid and remove it
with a pipette.
12. Add 10-30 μl RNase-free water to dissolve the RNA. Pipette the water up and down over
the pellet to dissolve the RNA. If any cloudiness is observed, centrifuge the solution at
room temperature for 1 min at 12,000 × g and transfer the supernatant to a fresh tube.
Protocol 6: pBIOZOL and Qiagen RNeasy Plant Mini Kit Method
Implemented by: Beijing Genomics Institute
Reagents
Acid phenol (pH 4.5)
Chloroform
Isopropyl alcohol
75% ethanol (DEPC treated)
100% ethanol
5M NaCl
pBIOZOL Reagent (Beijing Bai billion New Technology Co., Beijing, China)
RNeasy Plant Mini Kit (Qiagen)
RNase-free water
Procedure
1. Grind tissue to a powder in liquid nitrogen.
2. Add 1.3 ml of cold (4°C) pBIOZOL reagent for up to 100mg of frozen ground tissue.
Mix by briefly vortexing or flicking the bottom of the tube until the sample is thoroughly
re-suspended.
3. Incubate the tube for 5 min. at room temperature.
Note: Lay the tube down horizontally to maximize surface area during RNA extraction.
4. Centrifuge for 10 min. at 12,000g in a microcentrifuge at room temperature. Transfer the
supernatant to a new 1.5 ml RNase-free tube.
8 5. Add 100μl of 5M NaCl and 300μl chloroform, vortex vigorously.
4. Centrifuge at 12,000g for 10 min.
5. Transfer the top aqueous phase to a new 1.5 ml RNase-free tube, add an equal volume of
5:1 acid phenol:chloroform to the tube.
6. Vortex the tube until the phases mix and appear cloudy, then incubate at 20°C for 5 min..
7. Centrifuge at 12,000g for 10 min.
8. Transfer the top aqueous phase to a new 1.5 ml RNase-free tube, add to the aqueous
phase equal volume of 24:1 chloroform:isoamyl alcohol. Vortex the tube until the phases
mix and appear cloudy, then incubate at room temperature for 5 min..
9. Centrifuge at 12,000g for 10 min.
10. Transfer the top aqueous phase to a new 1.5 ml RNase-free tube, add 1/2 volume of
100% ethanol.
11. Pour the contents of the tube into a Qiagen mini RNA spin column (pink), until the
column is almost filled with liquid.
12. Cap the tube and centrifuge at 12,000g for 15 sec. The column should be empty at the
end of this spin.
13. Discard the flow-through from the collection tube.
14. Repeat the previous two steps with the same mini RNA spin column, until all of the
liquid in the tube(s) has been passed through the column. The nucleic acid is now bound
to the silica membrane in the spin column.
15. Apply 700 μl of solution RW1 to the spin column.
16. Cap the tube and centrifuge at 12,000g for 15 seconds. The column should be empty at
the end of this spin.
17. Discard the flow-through from the collection tube.
18. Apply 500 μl of solution RPE to the spin column. Cap the tube and centrifuge at 12,000g
for 15 seconds at 12, 000 g. The column should be empty at the end of this spin.
19. Discard the flow through.
20. Repeats previous two steps one time.
21. Spin at maximum speed for 2 min. to remove remaining liquid from the silica membrane.
22. Transfer the spin column to a new 1.5 ml conical bottom microcentrifuge tube
23. Add 30-50 μl of RNase-free water to the column, and then let tube incubate at 20°C for 3
min..
24. Spin at maximum speed for 1 min to collect RNA solution.
Protocol 7: pBIOZOL-LiCl Method
Implemented by: Beijing Genomics Institute
9 Reagents
 Acid phenol (pH 4.5)
 Chloroform
 Isopropyl alcohol
 75% ethanol (DEPC treated)
 100% ethanol
 2M NaAc (pH 4.2)
 3M NaAc (pH5.2)
 5M NaCl
 10M LiCl
 pBIOZOL Reagent (Beijing Bai billion New Technology Co., Beijing, China)
 RNase-free water
SSTE Buffer:
 1M NaCl
 SDS (0.5% w/v)
 10mM Tris-HCl (pH8.0)
 1 mM EDTA
Procedure
1. Grind tissue to a powder in liquid nitrogen.
2. Add 1.3 ml of cold (4°C) pBIOZOL reagent for up to 100mg gram of frozen, ground
tissue. Mix by briefly vortexing or flicking the bottom of the tube until the sample is
thoroughly suspended.
3. Incubate the tube for 5 min. at room temperature.
Note: Lay the tube down horizontally to maximize surface area during RNA extraction.
4. Centrifuge at 12,000g for 2 min..
5. Transfer the supernatant to a new 1.5 ml RNase-free tube.
6. Add 50 μl of 2M NaAc (pH4.2) to the extract and tap tube to mix, and then add 100ul 5M
NaCl and 300µl chloroform, vortex vigorously.
7. Centrifuge the mixture at 4°C for 10 min. at 12,000g to separate the phases. Transfer the
top aqueous phase (about 400-500 μl) to a new 1.5 ml RNase-free tube.
8. Add to the aqueous phase 1/3 volume of 10M LiCl. Mix and let stand at 4°C overnight.
9. Centrifuge the mixture at 4°C for 20 min. at >12,000g.
10. Decant the supernatant, taking care not to lose the pellet and add 1 ml of 75% ethanol to
the pellet and stand the tube at room temperature for 3 min..
Note: Pellet may be difficult to see.
11. Centrifuge at 4°C for 3 min at >12,000g. Decant the liquid carefully, taking care not to
lose the pellet. Briefly centrifuge to collect the residual liquid and remove it with a
pipette.
12. Repeat the previous two steps.
10 13. Add 50 μl RNase-free water to dissolve the RNA pellet. Pipette the water up and down
over the pellet to dissolve the RNA.
Note: If you extract more than 100mg plant tissues, combine different extractions to one
tube.
14. Add SSTE buffer to RNA to a total volume of 600µL, then add equal volume of 25:24:1
acid phenol:chloroform:isoamyl alcohol to the tube.
15. Vortex the tube until the phases mix and appears cloudy, then incubate at 20°C for 5
min..
16. Centrifuge at 12,000g for 10 min in a microcentrifuge.
17. Transfer the top, aqueous phase to a new 1.5 ml RNase-free tube, add equal volume of
24:1 chloroform:isoamyl alcohol to the tube.
18. Vortex the tube until the phases mix and appear cloudy, then incubate at 20°C for 5 min.
19. Centrifuge at 12,000g for 10 min.
20. Transfer the top aqueous phase to a new 1.5 ml RNase-free tube, add to the aqueous
phase 2 volumes of 100% ethanol, 1/10 volume of 3M NaAc (pH 5.2) and 2μl 5mg/ml
glycogen. Invert tube to mix and store at -20°C for 2 hours.
21. Centrifuge at 4°C for 20 min. at >12,000g, decant the supernatant carefully to avoid
losing the pellet.
22. Add 1 ml of 75% ethanol to the pellet and incubate at 20°C for 3 min..
23. Centrifuge at 4°C for 5 min at 12,000 g. Decant the liquid carefully, taking care not to
lose the pellet. Briefly centrifuge to collect the residual liquid and remove it with a
pipette.
24. Repeat step 18 and 19.
25. Open cap and air-dry the pellet no more than 5 min..
26. Add 30µl RNase-free water to dissolve the pellet.
27. Before library construction, treat RNA with DNase I.
Protocol 8: CTAB/Acid Phenol/Silica Membrane Method
Implemented by: Michael Deyholos
This protocol combines elements of standard CTAB, acid phenol, and silica-membrane
protocols. The protocol was developed to extract total RNA from a wide range of plant species
and tissues, and to do so in the smallest volume possible to yield >20µg of total RNA using as
little as 50mg of fresh tissue. Maintaining a small extraction volume also allows for many
samples to be processed in parallel in microcentrifuge tubes; at least 12 samples can be
processed in 3-4 hours. The protocol has been used successfully with dozens of species,
including tissues rich in polysaccharides and/or secondary metabolites. There are four organic
extractions in total (three chloroform and one phenol:chloroform). For many species/tissues, the
full set of chloroform extractions in the order specified is required to maximize RNA purity and
to prevent phase inversion during the acid phenol extraction.
11 Because the RNA is protected from RNase by denaturants throughout most of the
protocol, it is not necessary to use specially treated (e.g. baked or DEPC) labware or solutions.
We have also found that QIAshredder columns have little positive or negative impact on quality
and yields, although they may be useful for some tissues that are not easily disrupted by grinding
in a mortar.
Instead of using a Qiagen silica membrane spin column (also available from other
manufacturers), it is also possible to precipitate the RNA after the last organic extraction (step
22). However, the silica membrane columns provide more reliable recovery of RNA (especially
in a high-throughput, service environment), and allow for the convenient removal of residual
DNA through an on-column digestion. We have not found it beneficial to collect more than one
RNA elution from a spin column, or to pass the eluate through the same column twice.
The binding capacity of the Qiagen column (~100µg RNA) exceeds the yield of RNA
that can be extracted from tissue in a single 2 ml microcentrifuge tube. Therefore, when yields of
>20µg total RNA / ~1 g fresh weight of tissue are required, it is most efficient to aliquot the
tissue sample between two tubes, process the aliquots independently through stage 24, then pool
the samples into a single column.
Both the acid phenol extraction and the QIAgen silica membrane washes are biased in
favour of the recovery of RNA over DNA. Indeed, there appears to be little residual DNA
present even before on-column DNase I digestion. Nevertheless, it is probably worthwhile to
conduct the digestion on all samples to limit the possibility that any genomic DNA molecules
could be used as a sequencing template.
Reagents
CTAB extraction buffer (for 200 ml final volume):
 40 ml Tris-HCl pH 7.5
 10 ml 0.5M EDTA
 35.04g NaCl
 4g CTAB
 4g SDS (sodium dodecyl sulfate)
 4g PVP (polyvinylpyrrolidone)
 8 ml β-ME (2-mercaptoethanol)
Note: Heat to 65°C to dissolve components in solution. SDS may not completely dissolve)
Other reagents:
 Saturated NaOH solution
 Chloroform:Isoamyl Alcohol (24:1)
 Phenol:Chloroform (5:1, pH4.5).
Note: It is essential to use acid-equilibrated phenol, rather than Tris-buffered phenol.
 Qiagen’s RLT, RW1, RPE, DNase digestion solutions, plus Plant minikit spin columns
(pink)
12 Mortars should be rinsed with saturated NaOH to remove residual RNA, and then rinsed with
DEPC-treated water.
Procedure
1. Grind tissue to a powder in liquid nitrogen.
2. Add 400-600mg of ground, frozen tissue to 1.4 ml of pre-heated extraction buffer in a 2
ml microcentrifuge tube.
3. Vortex the tube until the tissue is mixed with the buffer. To facilitate mixing, you may
have to invert the tube on the vortex, and/or heat it briefly in a 65°C water bath.
4. Incubate the tube at 65°C for 10-15 min., vortexing briefly (15 seconds) twice during the
incubation.
5. Spin the tube at maximum speed (>11,269 g) for 3 min in a microcentrifuge. All of the
insoluble matter should form a pellet at the bottom of the tube.
6. Pour the supernatant into a new 2 ml tube.
7. Solvent Extraction #1: Add enough 24:1 chloroform:isoamyl alcohol to fill the tube.
8. Vortex the tube for 15 seconds or until the phases mix and appear cloudy.
9. Spin the tube at maximum speed (>11,269 g) for 3 min in a microcentrifuge. Most of the
chlorophyll will be dissolved in the lower, organic phase.
10. Transfer the upper, aqueous phase to a new 2 ml tube, using a disposable pipette. Avoid
transferring any of the material (usually a white precipitate) from the boundary between
the phases.
11. Solvent Extraction #2: Add 24:1 chloroform:isoamyl alcohol to the tube containing the
aqueous phase (this should be at least 900μl of 24:1 chloroform:isoamyl).
12. Vortex the tube for 15 seconds or until the phases mix and appear cloudy.
13. Spin the tube at maximum speed (>11,269 g) for 3 min in a microcentrifuge.
14. Transfer the upper, aqueous phase to a new 2 ml tube, using a disposable pipette. Avoid
transferring any of the material (usually a white precipitate) from the boundary between
the phases.
15. Solvent Extraction #3: Add 1 ml 5:1 phenol:chlorofom pH4.5 to the tube containing the
aqueous phase.
16. Vortex the tube for 15 seconds or until the phases mix and appear cloudy.
17. Spin the tube at maximum speed (>11,269 g) for 3 min in a microcentrifuge.
18. Transfer the upper, aqueous phase to a new 2 ml tube, using a disposable pipette. Avoid
transferring any of the material (usually a white precipitate) from the boundary between
the phases.
19. Solvent Extraction #4: Add 24:1 chloroform:isoamyl alcohol to the tube containing the
aqueous phase (this should be at least 900μl of 24:1 chloroform:isoamyl).
20. Vortex the tube for 15 seconds or until the phases mix and appear cloudy.
13 21. Spin the tube at maximum speed (>11,269 g) for 3 min in a microcentrifuge.
22. Transfer the upper, aqueous phase to a new 2 ml tube, using a disposable pipette. Avoid
transferring any of the material (usually a white precipitate) from the boundary between
the phases.
23. Estimate the volume of the aqueous phase based on the markings on the tube. Add at
least 0.5 volumes of solution RLT, and mix by briefly shaking.
24. Estimate the new total volume in the tube, and add 0.5 volumes of 95-100% ethanol. Mix
by briefly shaking.
25. Pour the contents of the tube into a Qiagen miniRNA spin column (pink), until the
column is almost filled with liquid.
26. Cap the tube and spin for 15 seconds at >5,000 g. The column should be empty at the end
of this spin.
27. Discard the flow-through from the collection tube.
28. Repeat the previous two steps with the same miniRNA spin column, until all of the liquid
in the tube(s) has been passed through the column. The nucleic acid is now bound to the
silica membrane in the spin column.
29. Apply 350μl of solution RW1 to the spin column.
30. Cap the tube and spin for 15 seconds at >5,000 g. The column should be empty at the end
of this spin.
31. Discard the flow-through from the collection tube.
32. Apply 80μl of DNase digestion solution to the membrane of the spin column.
33. Incubate at room temperature for 15 min.
34. Apply 350μl of solution RW1 to the spin column.
35. Cap the tube and spin for 15 seconds at >5,000 g. The column should be empty at the end
of this spin.
36. Discard the flow-through from the collection tube.
37. Apply 500μl of solution RPE to the spin column.
38. Cap the tube and spin for 15 seconds at >5,000 g. The column should be empty at the end
of this spin.
39. Discard the flow-through from the collection tube.
40. Apply 500μl of solution RPE to the spin column.
41. Cap the tube and spin for 15 seconds at >5,000 g. The column should be empty at the end
of this spin.
42. Discard the flow-through from the collection tube.
43. Transfer the spin column to a new collection tube, and spin at maximum speed for 3 min.
to remove remaining liquid from the silica membrane.
14 44. Transfer the spin column to a new 1.5 ml conical bottom microcentrifuge tube
45. Add 44μl of RNase-free water to the column.
46. Spin at maximum speed for 1 min to elute.
Protocol 9: CTAB/Acid Phenol/Silica Membrane Method
Implemented by: Henrietta Myburg and Marc Johnson
This protocol is a modification of protocol 8. It was developed after protocol 8 and several
commercially available plant RNA isolation kits failed to produce a sufficient yield and quality
of RNA from Oenothera spp. (Onagraceae) for next-generation sequencing. Oenothera are rich
in polysaccharides, oils, flavonoids and complex ellagitannins that likely interfere with isolation.
We suspect that this protocol will be most useful for species and tissues with complex secondary
chemistry and rich in oils (e.g. some Rosaceae and Pinaceae). The most important modifications
to this protocol versus Protocol 8 is the use of less plant tissue, more extraction buffer, and
repeating solvent extractions until the interphase is clean of debris. The protocol is regrettably
longer and more involved than Protocol 8. We attempted to remove or reduce the replication of
the solvent extractions steps without success (i.e. yield and quality are always decreased in
Oenothera when any steps are removed).
Reagents
The solutions are identical to Protocol 8 except we added 2µl of β-ME to 1 ml of CTAB
extraction buffer (instead of 40μl/1ml) immediately prior to the RNA extraction and deleted the
addition of SDS. We also wiped the bench, mortars, pestles with RNaseZap Solution (Ambion,
Austin, TX) immediately prior to grinding tissue and rinsed the mortar and pestles with ddH20.
Procedure
1. Chill mortar and pestle with liquid nitrogen (fill the mortar carefully with liquid nitrogen,
let it evaporate, repeat one more time. Keep the pestle in the mortar when doing this.)
2. Fill the mortar again; add 50-100mg of tissue into the mortar while the liquid nitrogen is
still “bubbling”.
3. Carefully grind tissue to a powder in liquid Nitrogen. (NOTE: Add more nitrogen if
needed, but it is important to be careful when you pour the liquid nitrogen. Especially if
you are grinding a small amount of tissue in a small sized mortar. The liquid nitrogen
tends to “blow” the tissue out of the mortar.)
4. Add 2 ml to 5 ml (depending on the amount of tissue you start with) of prepared
extraction buffer on top of the tissue powder and mix in with the pestle. The extraction
buffer will freeze because the mortar is still very cold from the liquid NITROGEN. Do
not worry about this – start grinding your next sample. The tissue/CTAB mix will slowly
thaw and at some point you will be able to mix it well using the pestle.
15 Note: Keep in mind that if you use more extraction buffer you will be working in more 2
ml tubes per sample and that the number of tubes per sample will increase once you reach
the Qiagen RNeasy part of the protocol. Thus, there will be a lot of pipetting into the
Qiagen spin column. One can do the solvent extraction steps in 15 ml Falcon tubes, but
keep in mind that you will still have to pass a significantly larger amount of liquid
through the Qiagen spin column.
5. Once thawed and well mixed, pipette or pour the slurry into a 2 ml tube. If the slurry is
too thick, cut the tip of the 1 ml pipette tip with a clean razor blade or scissors. If you
start with a lot of tissue, you might have to rinse the side of the mortar with an additional
1 ml of extraction buffer (or more if needed.)
6. Vortex each tube until the tissue is mixed with the buffer. To facilitate mixing, you may
have to invert the tube on the vortex, and/or heat it briefly in a 65°C water bath.
7. Incubate tubes at 65°C for 10-15 min., vortexing briefly (15 seconds) twice during the
incubation.
8. Spin the tubes at maximum speed (>14,000 g) for 3 min in a microcentrifuge. All of the
insoluble matter should form a pellet at the bottom of the tube.
9. Pour or pipette the supernatant into a new 2 ml tube. Make sure that the total volume of
the supernatant per tube is ≤ 1 ml. (You will be adding approximately 1:1 volume ratios
in the following steps)
10. Solvent Extraction #1: Add an equal volume (compared to sample) of 24:1
chloroform:isoamyl alcohol to each tube.
Note: We attempted to remove solvent extraction #1 and #2, but doing so lowers the
quality and quantity of total RNA in evening primrose (Oenothera).
11. Vortex the tubes for 15 seconds or until the phases mix and appear cloudy.
12. Spin the tubes at maximum speed (>14,000 g) for 3 min in a microcentrifuge. Most of the
chlorophyll will be dissolved in the lower, organic phase.
13. Pipette the upper, aqueous phase to a new 2 ml tube. Avoid transferring any of the
material (usually a white precipitate) from the boundary between the phases.
14. Solvent Extraction #2: Repeat steps 10 to 13 one more time and proceed to step 15
15. Solvent Extraction #3: Add equal volume (compared to the sample) of 5:1
phenol:chlorofom (pH4.5) to the tubes containing the aqueous phase.
16. Vortex each tube for 15 seconds or until the phases mix and appear cloudy.
17. Spin the tube at maximum speed (>14,000 g) for 3 min in a microcentrifuge.
16 18. Pipette the upper aqueous phase to a new 2 ml tube, using a disposable pipette. Avoid
transferring any of the material (usually a white precipitate) from the boundary between
the phases.
19. Repeat steps 15 to 18 until the interphase is clean (2-3x for most Oenothera) and then
proceed to step 20.
Note: This is likely the most critical step. Do not proceed to 20 until the interphase
appears clean.
20. Solvent Extraction #4: Add an equal volume (compared to sample) of 24:1
chloroform:isoamyl alcohol to each tube containing the aqueous phase.
21. Vortex each tube for 15 seconds or until the phases mix and appears cloudy.
22. Spin the tubes at maximum speed (>14,000 g) for 3 min in a microcentrifuge.
23. Pipette the upper, aqueous phase to a new 2 ml tube. Avoid transferring any of the
material (usually a white precipitate) from the boundary between the phases. At this point
the interphase should be relatively clean, especially if you repeated the
phenol/chloroform steps to the point where the interphase appeared to be clean.
24. Repeat steps 20 to 23 one more time and then proceed to step 25.
25. Estimate the volume of the aqueous phase based on the markings on the tube. Add at
least 0.5 volumes of buffer RLT and mix well by briefly shaking.
Note: We attempted excluding this step but doing so decreased the RNA yield and
quality)
26. Estimate the new total volume in the tube, and add 0.5 volumes of 95-100% ethanol. Mix
by briefly shaking. If you hold the tube against the light while mixing, you sometimes see
the RNA swirl while it precipitates out.
27. Pipette the contents of the tube into a Qiagen miniRNA spin column (pink), until the
column is almost filled with liquid (≤750µl).
28. Cap the tube and spin for 15 seconds at >8,000 g. The column should be empty at the end
of this spin.
29. Discard the flow-through from the collection tube.
30. Repeat the previous two steps with the same spin column, until all of the liquid in all the
tube(s) from the same sample have passed through the column. The nucleic acid is now
bound to the silica membrane in the spin column.
31. Apply 350μl of buffer RW1 to the spin column.
32. Cap the tube and spin for 15 seconds at >8,000 g. The column should be empty at the end
of this spin.
17 33. Discard the flow-through from the collection tube.
34. Apply 80μl of DNase digestion solution to the membrane of the spin column.
35. Incubate at room temperature for 15 min.
36. Apply 350μl of solution RW1 to the spin column.
37. Cap the tube and spin for 15 seconds at >6,300 g. The column should be empty at the end
of this spin.
38. Discard the flow-through from the collection tube.
39. Apply 500μl of solution RPE to the spin column.
40. Cap the tube and spin for 15 seconds at >6,300 g. The column should be empty at the end
of this spin.
41. Discard the flow-through from the collection tube.
42. Apply 500μl of solution RPE to the spin column.
43. Cap the tube and spin for 15 seconds at >6,300 g. The column should be empty at the end
of this spin.
44. Discard the flow-through from the collection tube.
45. Transfer the spin column to a new collection tube, and spin at maximum speed for 3 min.
to remove remaining liquid from the silica membrane.
46. Transfer the spin column to a new 1.5 ml conical bottom microcentrifuge tube
47. Add 44μl of RNase-free water onto the column. Incubate for 1 min to eluteat room
temperature and spin to to elute.
Protocol 10: TRIzol LS Reagent Method
Implemented by: Michael Melkonian and Barbara Surek (algae) and Juan Carlos Villarreal
(bryophytes)
This protocol follows the procedures provided with the TRIzol LS Reagent (Invitrogen). TRIzol
LS Reagent is a monophasic solution of phenol and guanidine isothiocyanate that can be used in
isolation of total RNA from a wide variety of tissues and organisms, in addition to plants. This
protocol was used in the isolation of total RNA from some algae samples (see Supplementary
Table 1).
Reagents
 Chloroform or BCP (1-bromo-3-chloropropane)
 Isopropanol
 75% Ethanol (in DEPC-treated water)
 Potassium acetate
18 

TRIzol LS Reagent (Invitrogen)
RNase-free water
Procedure
Note: All centrifugation steps are performed at 4°C.
1. Centrifuge at lowest speed to cause algae to form pellet and wash several times with sterile
culture medium (not DEPC-treated); after washing, the algal material is aliquoted into
portions of 250 µl (ca. 50-100 mg packed cell volume)
2. Homogenize each 250 µl portion of pellet material to a powder in liquid nitrogen using
mortar and pestle prechilled with liquid nitrogen
3. Add 750 μl TRIzol LS to each 250 μl of homogenized algal material; add more nitrogen if
needed (see also Procedure described in Protocol 9)
4. Homogenization is continued until the TRIzol is pulverized as well
5. Thaw and aliquot homogenate into several Eppendorf tubes
6. Add 50 μl potassium acetate (0.2 M final concentration) to each sample
7. Incubate for 5 min at 20°C
8. Add 200 µl chloroform (for polysaccharide-rich algae), or 100 µl BCP to each sample and
shake samples for 15 seconds
10. Incubate at 20°C for 10 min.
11. Centrifuge samples at 12,000g for 15 min.
12. RNA will remain in the upper, aqueous phase (ca. 70% of the applied TRIzol)
13. Carefully transfer each RNA phase into RNase-free 1.5 ml tubes
14. Add 500 µl isopropanol
15. Incubate for 1h at -20°C
16. Centrifuge at 12,000g for 10 min
17. Wash pellet with 75% ethanol
18. Gently suspend pellet in solution
19. Centrifuge at 7,500g for 5 min
20. Repeat ethanol wash steps
21. Dry pellet at 50°C for 5-10 min
Note: appearance of drying pellet is important: drying should be terminated when the
pellet begins to become transparent; contaminated RNA remains white)
22. Add RNAse-free water. Incubate at 55-60°C for 10 min.
23. Dissolve pellet completely by pipetting
19 Protocol 11: Tri Reagent Method
Implemented by: Michael Melkonian and Barbara Surek
The protocol for RNA isolation using Tri Reagent (Molecular Research Center, Cincinnati OH)
is analogous to the TRIzol LS reagent protocol except that 1 ml of Tri Reagent is added to each
250 µl of homogenized material (= approx. 50-100 mg packed cell volume) (step 3, Protocol 10).
For background, TRI Reagent combines phenol and guanidine thiocyanate in a monophasic
solution to inhibit RNase activity during RNA isolation.
In cases of high RNA concentration (> 1000 ng/μl), but low 260/230 ratio (less than 1), RNA can
be extracted and precipitated again. For this, steps 8-13 of the TRIzol LS Reagent Method are
repeated. The RNA phase is mixed with 500 µl 8 M LiCl by vortexing and incubating for 1 h at
4°C. Subsequently, steps 16-23 of the TRIzol LS reagent protocol are performed.
Frequently, repetition of extraction decreases previous RNA concentrations up to tenfold, but
increases the 260/230 ratio remarkably attaining values of 1.8 or more. Neither using isopropanol
for the second precipitation process nor replacing isopropanol by LiCl as a main precipitator is as
efficient as the method described above.
Protocol 12: Hot Acid Phenol Method for Angiosperms
Implemented by: Sarah Covshoff, Rowan Sage and Julian Hibberd
This RNA isolation method is a multi-component method involving an initial extraction by hot
acid phenol and then a purification and DNase treatment using the RNeasy Mini Kit by Qiagen.
The method described below is a modification of a method described by van Tunen et al.3.
Reagents
Extraction Buffer :
 100 mM Tris pH9.0
 1% SDS (v/v, starting from 10% SDS stock solution)
 100 mM LiCl
 10 mM EDTA
 RNase-free water
Note: All components were filter purified and then the final reaction buffer was also filter
purified using a Millipore Stericup.
Other reagents:
 Acid phenol (pH 4.3)
 Chloroform:isoamyl (24:1)
 70% ethanol (diluted in RNase free H20)

4M LiCl
Procedure
20 1. Add 3 ml of saturated acid phenol (pH 4.3) and 3 ml of RNA extraction buffer to a 15 ml
snap cap tube. Warm tube to 65°C in a fume hood using a heat block.
2. Homogenize tissue in liquid nitrogen using a mortar and pestle.
3. Transfer up to 1.0 g of ground tissue to a tube containing the phenol-buffer mixture and
close the snap cap tube completely (second stop). Immediately mix tube by hand, vortex
until the phases mix and appear cloudy and keep at room temperature or 65°C depending
on whether SDS is precipitating.
Note: Use a spatula and funnel chilled in liquid nitrogen to transfer the powdered tissue
to the tube.
4. Centrifuge at 2500 g for 10 min.
5. Transfer aqueous phase to a new 15 ml tube.
6. Add 3 ml of chloroform:isoamyl (24:1 ratio), close the snap cap tube completely (second
stop) and mix contents immediately by hand. Then vortex until the phases mix and
appear cloudy.
7. Centrifuge at 2500 g for 10 min.
8. Transfer aqueous phase to a new 15 ml tube.
9. Precipitate nucleic acids by gently adding 3 ml of isopropanol to the tube followed by
gentle inversion to mix the phases. Precipitate for 1 hr at 4°C.
10. Centrifuge at 2500 g for 10 min.
11. Remove the supernatant without dislodging the pellet.
12. Rinse the pellet with 1 ml of chilled 70% ethanol made with RNase-free H2O
13. Transfer pellet and ethanol to a new 1.5 ml tube and invert to wash.
14. Centrifuge at 16,000 g in a microcentrifuge for 2 min and remove the supernatant.
15. Spin dry the RNA pellet by successive removal of any remaining liquid following 1 min
centrifugations at maximum speed in a microcentrifuge. Repeat this process until no
liquid is seen when the tube is flicked. Allow the pellet to air dry for 1 to 5 min,
depending on the size of the pellet.
Note: Avoid over-drying since this will negatively affect re-suspension in water.
16. While on ice, re-suspend the pellet in 500 µl RNase-free H2O using gentle pipetting.
Note: Ensure the pellet is completely re-suspended before proceeding to step 17.
17. Add 500 µl of 4 M LiCl and precipitate the RNA overnight at 4°C.
18. Centrifuge at 16,000 g for 30 min at 4°C in a microcentrifuge.
21 19. Remove supernatant by pipette and wash the pellet three times with 200 µl of chilled
70% ethanol diluted in RNase-free H2O. Centrifuge at 4°C.
Note: Remove the pellet from the wall of the tube to wash the pellet more completely.
Pellet will now dislodge easily and care must be taken when removing the supernatant
between washes.
20. Spin the pellet, remove ethanol and dry as in step 15. Centrifuge at 4°C.
21. Dissolve pellet in RNase-free H2O on ice.
Note: The volume of the water is dependent on the size and clarity of the pellet, as well
as the viscosity of the re-suspension, but 50 µl can be used as a standard starting point.
22. Centrifuge RNA extract for 1 min at 4°C and check concentration and integrity using a
NanoDrop 2000 spectrophotometer.
23. Purify and DNase treat up to 100 µg of RNA on a RNeasy Mini Kit spin column (pink)
by Qiagen.
Note: Follow the ‘RNA Cleanup’ protocol as published in the RNeasy Handbook,
including the on-column DNase digestion with the RNase-Free DNase Set by Qiagen.
24. Elute RNA twice from the column using 40 µl of 95°C RNase-free water to a total of 80
µl. However, if the mass of RNA applied to the column is less than 50 µg, then use only
40 µl to elute.
Protocol 13: Trizol/RNaqueous Midi-Kit
Implemented by: Megan Rolf and Toni M. Kutchan
Note: (two samples from C. dePamphilis and P. Ralph only used the RNAqueous Midi-Kit)
This protocol is based on a combination of two methods: The Trizol method described by
Chomczynski and Sacchi4 and the Ambion RNAqueous Midi-Kit (Life Technologies, Carlsbad,
CA), with minor modifications.
Reagents
Extraction buffer:
 0.8 M guanidine thiocyanate,
 0.4 M ammonium thiocyanate,
 0.1 M sodium acetate pH 5.0
 5% glycerol
 38% phenol (pH 4.3 -5.0)
Other reagents and equipment:
 100 % ethanol
 70 % ethanol
22 









Chloroform
Isopropanol, 99.5% DNase, RNase, and Protease free
Nuclease free H2O
RNAqueous-Midi Kit
Turbo DNA-free kit
18 G ½ needle
14 ml polypropylene round-bottom tube
10 ml syringe
3 ml syringe
50 ml centrifuge tube
Procedure
1. Quickly weigh out 3 g of frozen tissue and place in a pre-chilled mortar containing liquid
nitrogen.
2. Grind the sample into fine powder without allowing it to thaw.
3. Add the sample into a 50 ml centrifuge tube containing 30 ml of extraction buffer (10 ml
buffer/g tissue).
4. Vortex for 1 min, then incubate at room temperature for 5 min.
5. Add 6 ml of chloroform (2 ml chloroform/g tissue) and shake vigorously for at least 20
seconds.
6. Centrifuge at 3095 g for 20 min. at 4°C.
7. Move supernatant to new 50 ml tube. Add one volume isopropanol (about 22.5 ml) to
precipitate RNA.
8. Invert gently several times and then incubate at room temperature for 10 min.
9. Centrifuge at 3095 g for 10 min. at 4°C.
10. Wash pellet with 45 ml of 70 % ethanol and centrifuge at 3095 g for 2 min. at 4°C.
Repeat once.
11. Air dry pellet for 5 min (if not completely dry, it’s still okay to move on to the next step).
Note: The following steps use the RNaqueous midi kit from Ambion (AM1911).
12. Dissolve the pellet in 5 ml of lysis/binding Solution.
Note: Heating at 37 °C and vortexing will help dissolve pellet.
13. Heat 4.5 ml of elution solution to 100°C (for later use). Use a 17 X 100 mm round
bottom sterile polypropylene tube with loose-fitting dual position cap.
14. Add 5 ml of 64% ethanol to RNA in lysis/binding solution and draw into a 10 ml syringe
through an 18 gauge needle.
15. Remove needle and attach filter unit. Slowly push the lysate/ethanol mixture through.
Note: Often times the filter gets clogged. There are tips in the kit manual for dealing
with this but we found these did not help. We tried to get the solution through one filter
23 even with some intense pressure. Sometimes two filters were required and the following
wash/elution steps were performed on both filters.
16. After filtering the solution, force air through using a clean 10 ml syringe until no more
white foam is expelled (at least 3 or 4 times).
17. Wash with 100% volume of Wash Solution #1 (using syringe). Use a clean 18 g needle
to draw up solution.
18. Force air through a few times again.
19. Wash with 70 % volume Wash Solution #2/3. Repeat once using syringe.
20. Force air through again until no more water droplets or fine spray can be seen.
21. Elute at least 2 times into a 2 ml tube (only use 500 ul at a time, so elute three times per
tube to get about 1.5 ml total) using 100°C Elution Solution and a sterile 3 ml syringe.
22. LiCl precipitate each sample using the LiCl provided in the kit.
a. Add ½ volume of LiCl.
b. Place at -20°C for at least 30 min..
c. Centrifuge at maximum speed in a microcentrifuge (about 16,000 g) for 15 min at
room temperature.
d. Wash pellet with 1 ml of 70 % ethanol.
e. Centrifuge again with the same conditons for 5 min.
f. Remove supernatant and air dry pellet.
g. Resuspend in 50 or 100 µl RNase free H2O.
Note: Heating and vortexing can help with this.
23. DNase Treatment using the TURBO DNA-free kit from ambion (AM1907)
a. Add 1.5 µl of DNase + 0.1 volumes of 10X buffer.
b. Incubate at 37°C for 25 min.
c. And another 1.5 µl of DNase.
d. Incubate at 37°C for another 25 min.
e. Add 0.1 volumes DNase Inactivation Reagent.
f. Incubate 5 min at room temperature, flicking tubes every minute.
g. Centrifuge at 10,000 g for 1.5 min.
h. Move supernatant to a new tube.
24. LiCl precipitate the supernatant again (same as above). The final resuspension volume
should be between 50-100 µl using RNase free H2O.
24 Protocol 14: Ambion Trizol RNA Extraction in Microcentrifuge Tubes with Turbo DNAfree Digestion
Implemented by: Ingrid Jordon-Thaden and Nicholas Miles (Soltis Labs)
This procedure eliminates the mortar and pestle homogenization of tissues and instead grinds
tissue in 2 ml microcentrifuge tubes. The method closely follows Ambion’s protocols and could
be used in a 96-well format. This method worked great for species that proved to be difficult to
extract with other methods (i.e. woody and aquatic plants). Listed below are two slightly
different methods for tissue collection: one directly in microcentrifuge tubes, and one in 50 ml
tubes.
Note: both the addition of β-mercaptoethanol in extraction and high salts (recommended by
Ambion: 0.8M sodium citrate and 1.2M NaCl) in precipitation were tried with this method and
yield or quality was not affected. The addition of Sarkosyl significantly improved both yield and
quality.
Reagents
 20% bleach
 95% ethanol
 RNase Zap
 liquid N2
 Trizol Reagent (Ambion, Life Technologies, Carlsbad, CA)
 20% Sarkosyl (optional for difficult species)
 100% Chloroform
 100% isopropanol
 75% Ethanol
 Turbo DNA-free Kit (Ambion, store at -20°C)
o 10X Turbo DNAase buffer
o Turbo DNAase
o DNAase Inactivation Reagent
Other materials
 Zirconia beads, prebaked at 200°C for 4 hours
 P1000 RNA-free filter tips
 P100 RNA-free filter tips
 RNAase free pipettes: P1000, P100
 Stretch winter gloves
 Nitrile gloves (to wear over winter gloves, latex will crack in liquid N2)
 Two large plastic Eppendorf tube racks (96 well)
 Black shaker block (for automatic shaker/pulverizer machine)
 One ice bucket to hold the Eppendorf tube rack on ice
 One small ice bucket with liquid N2 to hold the black shaker block
25 



One small ice bucket with liquid N2 to hold the tubes while weighing
24 – 2 ml RNA-free Eppendorf tubes (to freeze the tissue in; note, do not use cheap tubes
as they will crack)
72 – 1.7mL RNA-free tubes (3 batches of 24, pre-labeled; 96 tubes if you plan on using
the Sarkosyl step, this adds one more chloroform extraction)
3x – 50mL Falcon tubes
o to hold 100% chloroform
o to hold 100% isopropanol (also called 2-propanol)
o to hold 75% Ethanol
Equipment:
 Automatic shaker for tissue homogenization
 24-place centrifuge cooled to 4°C
 Vortexer
 Water bath at 50°C
 Incubator at 37°C with orbital shaker
Concise version of Protocol (longer version available on request from Ingrid Jordon-Thaden,
[email protected])
Leaf Collection in microcentrifuge tubes (best to use 60 to 100 mg of tissue for high
throughput)
Pre-label RNAase free 2mL tubes, place 5 zirconia beads (pre-baked at 200C for 4
hours) in each tube and place in boxes. Cut leaf tissue and put into the tube, then directly
into small cooler with liquid N2. Store the boxes with the leaves in -80°C freezer.
Leaf collection in 50mL Falcon tubes (for large collections so many extractions can be
made on the same sample if necessary)
Identify the species to collect, give it a collection number, and write the number of the 50
ml Falcon tube. Use a scissor or pin to cut a hole in the top of the tube. Fill the tube with
N2 and place in cooler with N2 and rack. Clean scissors with Ethanol and RNAzap. Cut
the youngest leaf tissue and immediately put into the Falcon tube for freezing. Take a
specimen of the plant for a voucher. Place the tubes in the -80°C.
Procedure (note all centrifuge steps at 4°C unless otherwise specified)
1. Clean all surfaces and equipment with 20% bleach, 95% ethanol, and RNase Zap, and
place supplies in hood.
2. Prepare sample tubes with labels.
3. Fill microcentrifuge tubes with 5 zirconia beads (if they have not been done before tissue
collection)
4. Prepare aliquots of the following solutions into 50mL Falcon tubes
26 i. 75% ethanol, 100% chloroform, 100% isopropanol
5. Fill dewar with liquid N2.
6. Get samples from -80°C and place them in the black shaker block that is sitting in a bath
of liquid N2 (if already in microcentrifuge tubes), and check labels.
7. Prepare frozen tissue in microcentrifuge tubes from 50 ml Falcons (if tissue is not in them
already)
a. Fill two small coolers with liquid nitrogen (one for microcentrifuge tubes, the
other for Falcon tubes)
b. Doing about 3 or 4 plants at a time, use forceps or a spatula to move 60 to 100 mg
of tissue into the microcentrifuge tubes from the 50 ml Falcon tubes.
c. Get a weight for each of the samples, add more if necessary.
i. This kit requires 60 to 100 mg of plant tissue.
ii. Note: if you are doing aquatic plants use 2x the weight since so much of
the weight is water.
d. Repeat these steps until you have finished all of the samples.
8. Before pulverizing the frozen tissue, check each tube for beads, making sure they are
easily moving within the tube.
9. Taking the black shaker block with the 24 samples, place tightly into the automatic
shaker, doing this quickly as to not allow thawing. Shake for 2 min (they will stay
frozen).
10. Place block back into the liquid nitrogen cooler if needed to shake another 2 min, then
shake a second time after the block appears to be frozen.
a. Keep on liquid nitrogen, until Trizol is added.
11. Add 1 ml of Trizol solution.
a. Optional adjustment is to add 50 to 100 μl of 20% Sarkosyl to each sample with
the Trizol.
b. Before opening the tubes, tapping the tube on the bottom on the bench will empty
most of the leaf tissue that is in the lid from the shaking process.
c. If it does not move, you can use the vortexer with the solution added to force it
out of the lid.
12. After each tube has had the Trizol solution added, vortex immediately, both the top and
bottom of the tube, until all tissue is hydrated. Vortexing for 2 min can be common.
13. Then place tube on ice, and do this sequentially until you have all 24 tubes finished.
27 14. Once the batch is ready, incubate at room temperature (RT) for 5 min.
15. Centrifuge at 12,000 g for 10 min at 4°C
16. Pipette aqueous solution to a new 1.7 ml tube (will be 900 to 800 μl)
a. If Sarkosyl is used, be aware it will be a thick, viscous layer at the interface. Try
not to pull any into the aqueous layer.
17. Add 200 μl of 100% chloroform to each tube (do not change this volume or more protein
will be forced into the aqueous layer)
18. Vortex for 10-15 sec (solution should be milky colored)
19. Incubate at RT for 10 min.
20. Centrifuge at 12,000 g for 15 min at 4°C (be careful not to disturb layers)
21. Remove upper aqueous layer (should be clear), 500 to 700 μl, and put in new 1.7 ml tube
22. If you suspect the sample does not look “clean” or if you had used the Sarkosyl addition,
repeat steps 6c to 6g (i.e. the chloroform step).
23. Precipitate and pellet the nucleic acids by adding 500 μl of isopropanol, and mix by
inverting the rack.
24. Incubate at RT for 10 min.
25. Centrifuge at 12,000 g for 15 min at 4°C
26. Pour off supernatant as waste.
27. Add 1000 ml of 75% ethanol to tube with pellet and vortex until the pellet is loose.
28. Centrifuge at 8900 g for 5 min at 4°C.
29. Pour off ethanol into beaker and tap tube on tissue to pull as much ethanol off as you can.
30. Centrifuge the “empty” tube for 2 min at 4°C
31. Using pipette, pull off excess ethanol collected at the bottom of the tube.
a. Final pellet should be clear. If white, then may still have salts and you can repeat
the ethanol wash a second time, however, the DNA removal step seems to also
remove salt contamination.
32. Let the pellet dry for 2 min at room temperature, but no more than 10 min.
33. Re-dissolve in 50 μl of RNase-free water. If the RNA is pure, it should be instant.
a. To aid in dissolution, incubate at 55°C for 10 min in water bath, vortex gently
when finished.
Note: Never incubate longer than 10 min at this temperature. Also never increase the
temperature, as this can cause RNA degradation. If pellet does not dissolve immediately,
28 store at 4°C overnight (or until dissolved). However, the best samples dissolve with no
trouble.
34. Check on nanodrop for concentration. Samples should be diluted to be less than 200
ng/μl before proceeding to the DNAase steps. However, we have done the following
steps with samples that are up to 700 ng/μl and had success. We suggest only diluting
when you have 2000 or 3000 ng/μl of RNA.
35. Removal of DNA using Turbo DNA-free kit by adding 0.1 volume of 10X Turbo DNase
buffer (usually 5 μl if no dilution of RNA was made) and add 1 μl of Turbo DNase to the
RNA (always only 1 μl) and mix gently.
36. Incubate at 37°C while shaking on the orbiter inside the incubation oven for 30 min.
37. Add resuspended DNase Inactivation Reagent (typically 0.1 volume; 5 μl if no dilution of
RNA was made) and mix well (vortex very briefly).
38. Incubate at RT for 2 min, vortexing occasionally.
39. Centrifuge at 10,000 G for 1.5 min at 4°C and transfer to a new tube.
40. Measure RNA with nanodrop again.
a. 100 ng/μl is ideal, but there will most likely be more. Expect some loss from the
Turbo kit. Also, if the first time the spectra appeared contaminated, this step may
have cleaned it some. The OD 260/280 ratio should be 1.8 to 2.2 (not less than
1.6), in order to get good transcriptome library construction.
Protocol 15: Hot Acid Phenol Method for Algae
Implemented by: Falicia Goh and Neil Clarke
This RNA isolation method is modified from that described by Köhrer and Domdey5
Reagents
Extraction Buffer:
 1% SDS (v/v, starting from 10% SDS stock solution)
 51 mM sodium acetate pH5.5
 10 mM EDTA
 DEPC treated water
Note: The final reaction buffer was filter purified using Nalgene 0.22µM filter.
Other reagents:
 Acid phenol (pH 4.3)
 Phenol:chloroform (5:1) acid equilibrated to pH4.7 from Sigma
 Isopropanol
 70% ethanol (diluted in DEPC treated water H20)
29 
3M Sodium acetate pH5.5
Procedure
1. Preheat phenol and phenol:chloroform to 65°C. Heated phenol should not be re-used.
2. Collect algae cells via centrifugation for 10 min at 16100 g at room temperature. Flash
freeze pellets with liquid nitrogen and keep at -80°C until extractions are carried out.
3. Re-suspend frozen pellet in 800μl of preheated extraction buffer.
4. Immediately add 800 μl of hot acid phenol. Vortex the tubes for 15 seconds.
5. Incubate at 65°C for 10 min. Vortex every 1 min for 10 sec.
6. Centrifuge at 16100 g at 4°C for 5 min.
7. The aqueous phase was transferred to fresh 1.5 ml micro-centrifuge tube.
8. Repeat steps 5 - 7. Repeat 3X (depending on the amount of cells used).
9. Extract with equal volume of phenol:chloroform (5:1). Vortex for 1 min at room
temperature. Spin for 5 min in a microcentrifuge at top speed. Repeat step 9 three times.
10. Transfer aqueous phase to a new 1.5 ml microfuge tube. Volume should be ~700μl.
11. Add 1/10 volume of 3 M sodium acetate, pH 5.5, and 1 volume of isopropanol. Hold at
4°C for 30 min or more.
12. Spin in micro-centrifuge in 4°C at top speed for 20 min.
13. Remove the supernatant without dislodging the pellet.
14. Wash the pellet with 70% ethanol.
15. Invert and air dry tubes at room temperature.
The pellet was re-suspended in 50μl DEPC treated H2O. The RNAs were stored at -20°C.
Protocol 16: CTAB-Hot Acid Phenol Method for Algae
Implemented by: Falicia Goh and Neil Clarke
This RNA isolation method is a combination and modification of the hot acid phenol method
(protocol 14) and that described by Asif et al6. This method was used for two taxa (P. cruentum
and B. braunii).
Reagents
Extraction Buffer:
 100 mM Tris-HCl pH8.2
 1.4 M NaCl
 2% CTAB
30  20 mM EDTA pH8.2
 1 µl of 2-mercaptoethanol per ml of buffer just before use
 DEPC treated water
Note: The final reaction buffer was filter purified using Nalgene 0.22µM filter.
Other reagents:
 Acid phenol (pH 4.3)
 Phenol:chloroform (5:1) acid equilibrated to pH 4.7 from Sigma
 Chloroform
 Isopropanol
 70% ethanol (diluted in DEPC treated water H20)
 3 M Sodium acetate pH 5.5
 3 M Lithium chloride
Procedure
1. Preheat phenol and phenol:chloroform to 65°C. Heated phenol should not be re-used.
2. Collect algae cells via centrifugation for 10 min at 16100 g at room temperature. Flash
freeze pellets with liquid nitrogen and keep at -80°C until extractions are carried out.
3. Re-suspend the frozen pellet in 800μl of preheated extraction buffer.
4. Incubate at 65°C for 1 hr. Gently vortex every 15 min.
5. Cool to room temperature and add equal volume of chloroform. Shake vigorously until 2
phases form an emulsion.
6. Collect the aqueous phase by centrifuging for 10 min in micro-centrifuge at 16100 g at
room temperature.
7. Collect aqueous phase and re-extract with an equal volume of chloroform. Centrifuge as
above
8. Collect aqueous phase and add 10M LiCl to a final concentration of 3M. Allow the RNA
to precipitate at 4°C overnight.
9. Recover the RNA by centrifugation at 16100 g at 4°C for 20 min.
10. Dissolve pellet in DEPC treated water and extract once with hot acid phenol.
11. Extract the aqueous phase with equal volume of phenol:chloroform (5:1).
12. Vortex for 1 minute at room temperature. Spin for 5 min in a micro-centrifuge at top
speed.
13. Extract the aqueous phase with equal volume of chloroform.
14. Collect aqueous phase and add 1/30 volume of 3M sodium acetate pH 5.5 and 0.1 volume
of 100% ethanol. Mix well and keep on ice for 30 min. Centrifuge in cold for 25 min. A
white jelly-like pellet consisting mostly of polysaccharides is obtained and discarded.
31 15. To the clear supernatant add 3M sodium acetate pH 5.2 to a final concentration of 0.3M
and 3 volumes of 100% ethanol. Allow the RNA to precipitate at -80°C for 3 h to
overnight.
16. Spin in micro-centrifuge at 4°C at top speed for 20 min.
17. Wash the pellet with 70% ethanol.
18. Invert tubes and air dry at room temperature.
19. Resuspend pellets in 50 μl of DEPC treated water.
Protocol 17: Invitrogen PureLink-Qiagen RNeasy Hybrid
Implemented by: Patrick Edger and J. Chris Pires
For a small number of samples the Invitrogen Purelink RNA Mini Kit (Cat #12183-018A) was
used to isolate total RNA while the Qiagen RNeasy MinElute Cleanup kit (Cat #74204) was used
to purify and concentrate total RNA. The methods followed the manufacturer’s instructions and
thus they are not repeated here.
Protocol 18: innuPREP Plant RNA Kit
Implemented by: Michael Melkonian and Barbara Surek
A small number of algae samples were extracted using the innuPREP Plant RNA Kit (Analytik
Jena, Jena Germany) with either the PL and RL lysis buffer. The method followed the
manufacturer’s protocols and so they are not repeated here.
REFERENCES
1. Sambrook, J. & Russell, D.W. Molecular Cloning: A Laboratory Manual, 3rd ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 2001). 2. McKenzie, D.J., McLean, M.A., Mukerji, S. & Green, M. Improved RNA extraction from woody plants for the detection of viral pathogens by reverse transcription‐polymerase chain reaction. Plant Disease 81, 222‐226 (1997). 3. van Tunen, A.J. et al. Cloning of the two chalcone flavanone isomerase genes from Petunia hybrida: coordinate, light‐regulated and differential expression of flavonoid genes. The EMBO Journal 7, 1257‐1263 (1988). 4. Chomczynski, P. & Sacchi, N. Single‐step method of RNA isolation by acid guanidinium thiocyanate‐
phenol‐chloroform extraction. Analytical Biochemistry 163, 156‐159 (1987). 5. Kohrer, K. & Domdey, H. Preparation of high molecular weight RNA. Methods in Enzymology 194, 398‐405 (1991). 6. Asif, M.H., Dhawan, P. & Nath, P. A simple procedure for the isolation of high quality RNA from ripening banana fruit. Plant Molecular Biology Reporter 18, 109‐115 (2000). 32 Figure S1. A comparison of the frequency of scaffolds according to variation in % GC content
between samples sequenced on HiSeq versus GA II platforms. Distributions are broadly
overlapping. The long right-tail from the HiSeq samples is caused by a disproportionate number
of algae samples sequenced on that platform. These algae exhibited especially rich GC
transcripts, which is consistent with the results of published whole genome sequences of green
algae[40,41].
Table S2 The success/failure of RNA isolations using Qiagen’s RNeasy Plant Minikit (see
Appendix S1, Protocol 1) and alternative hybrid protocol (see Appendix S1, Protocol 2).
Successful isolations were those that met the stringent criteria of two isolations of the same
tissue resulting in a concentration of > 150 ng/µL, a total mass of 20 µg RNA, OD 260/280 >
1.9, OD 260/230 > 1.5, r26S/18S > 1 and RIN > 8. A ‘1’ denotes that the sample met these
criteria and ‘0’ indicates that the sample did not meet these criteria; ‘-‘ indicates the method was
not attempted for the sample. In most cases, if a sample failed with the RNeasy kit (Appendix
S1, Protocol 1) we attempted the alternative Qiagen recommended protocol (Appendix S1,
Protocol 2). Family names are according to APG III (2009). Except when noted, we used freshly
expanding leaves for all isolations.
Family
Actinidiaceae
Adoxaceae
Asparagaceae
Asparagaceae
Amaryllidaceae
Amaryllidaceae
Amaryllidaceae
Symplocaceae
Araliaceae
Arecaceae
Asteliaceae
Bignoniaceae
Bignoniaceae
Bromeliaceae
Cactaceae
Calycanthaceae
Cannabaceae
Cannaceae
Caprifoliaceae
Caryophyllaceae
Caryophyllaceae
Celastraceae
Celastraceae
Cephalotaceae
Cephalotaxaceae
Chrysobalanaceae
Cistaceae
Cornaceae
Cyrillaceae
Droseraceae
Elaeagnaceae
Ericaceae
Species
Actinidia chinensis
Viburnum sp.
Yucca brevifolia ssp. brevifolia
Yucca brevifolia ssp. jaegeriana
Phycella cyrtanthoides
Rhodophiala pratensis
Zephyranthes treatiae
Symplocus sp.
Polyscias fruticos
Serenoa repens
Astelia (HYBRID)1
Mansoa alliacea
Tabebuia umbellate
Brocchinia reducta
Pereskia aculeate
Idiosporum australiense
Celtis occidentalis
Canna sp.
Lonicera japonica
Schiedea membranacea
Schiedea salicaria
Crossopetalum rhacoma
Hippocratia parviflora
Cephalotus follicularis
Amentotaxus formosana
Chrysobalanus icaco
Cistus influtus
Cornus floridana
Cyrilla racemiflora
Drosera capensis
Elaeagnus pungens
Cavendishia cuatrecasasii
RNeasy
1
0
1
1
1
0
1
1
0
0
0
1
1
1
0
1
1
1
1
1
0
0
0
0
0
0
0
0
1
0
0
0
Alternative
1
0
0
0
0
0
0
0
0
1
0
0
0
1
Erythroyxlaceae
Fagaceae
Fouquieriaceae
Gelsemiaceae
Geraniaceae
Geraniaceae
Heliconiaceae
Hydrangeaceae
Juglandaceae
Lamiaceae
Lauraceae
Lecythidaceae
Malvaceae
Melanthiaceae
Melanthiaceae
Melastomataceae
Primulaceace
Nyssaceae
Ochnaceae
Olacaceae
Oleaceae
Pandanaceae
Papaveraceae
Petrosaviaceae
Phytolaccaceae
Picramniaceae
Pittosporaceae
Plantaginaceae
Plumbaginaceae
Podocarpaceae
Podocarpaceae
Podocarpaceae
Polygonaceae
Rosaceae
Sapindaceae
Sapindaceae
Sapotaceae
Sapotaceae
Sarraceniaceae
Saururaceae
Saxifragaceae
Schisandraceae
Solanaceae
Erythroxylon coca
Quercus shumardii
Fouquieria macdougalii
Gelsemium semperivens
Geranium carolinianum
Geranium maculatum2
Heliconia sp.
Hydrangea quercifolia
Carya glabra
Oxera neriifolia
Sassafras albidum
Barringtonia racemosa
Hibiscus laevis
Helonias bullata
Xerophyllum asphodeloides
Medinilla magnifica
Ardisia humilis
Nyssa ogeche
Ochna serrulata
Ximenia Americana
Chionanthus retusus
Freycinetia multiflora
Sanguinaria canadensis
Japonolirion osense
Petiveria allicea
Picramnia perfaundra
Pittosporum sahnianum
Plantago virginica
Plumbago auriculata
Acmopyle pancheri
Falcatifolium taxoides
Sundacarpus amarus
Polygonella americana
Rosa palustris
Aesculus pavia
Acer negundo
Sideroxylon reclinatum
Synsepalum dulcificum
Heliamphora minor
Anemopsis californica
Oresitrophe rupifraga
Illicium verum
Brugmansia sanguinea
1
1
0
1
1
0
1
1
0
0
1
0
0
1
1
0
0
0
0
0
1
0
1
1
1
1
1
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
1
0
1
0
0
0
1
0
1
1
1
0
0
0
0
0
1
1
1
0
0
1
0
-
Staphyleaceae
Stemonaceae
Styracaceae
Thymelaeaceae
Violaceae
Zingiberaceae
1
2
Staphylea pinnata
Croomia sp.
Sinojackia xylocarpa
Edgeworthia papyrifera
Viola tricolor
Curcuma Olena
Meristematic bud tissue
Mixture of leaves and flowers
1
1
1
1
0
1
0
-
Table S3 Correlations in the number of scaffolds with different minimum threshold size cutoffs used during assemblies. For example,
the column labeled 500 bp contains all scaffolds that are 500 bp or larger. Pearson product moment correlation coefficients (r) are
shown in the upper triangular matrix, and P-values are shown in the lower triangular matrix.
1000 bp
900 bp
800 bp
700 bp
600 bp
500 bp
1000 bp
1
<.0001
<.0001
<.0001
<.0001
<.0001
900 bp
0.99686
1
<.0001
<.0001
<.0001
<.0001
800 bp
0.98475
0.99531
1
<.0001
<.0001
<.0001
700 bp
0.95667
0.97621
0.99241
1
<.0001
<.0001
600 bp
0.89861
0.92897
0.95957
0.98666
1
<.0001
500 bp
0.78364
0.82632
0.87417
0.92595
0.97468
1
Table S4 Least-squares means of descriptors of RNA quality among different plant tissues types. The least-squares mean ± 1 SE and
sample size are provided for each tissue type.
Tissue type
Belowground
Shoots/Stems
Buds
(lvsa/flwsb)
Leaf
Flower
Fruit
Mixed tissue
Algal cells
a
leaves. bflowers.
loge(RNA conc.)
Mean
N
3.67 ± 0.08
12
3.58 ± 0.09
7
26S:18S
Mean
1.43 ± 0.18
1.50 ± 0.24
3.65 ± 0.22
3.53 ± 0.05
3.96 ± 0.34
2.73 ± 0.34
3.67 ± 0.07
3.30 ± 0.06
1.26 ± 0.16
1.15 ± 0.03
1.00 ± 0.31
1.55 ± 0.20
1.07 ± 0.04
1.47 ± 0.04
15
492
4
10
276
274
N
12
7
RIN
Mean
7.49 ± 7.49
6.24 ± 6.24
N
12
7
15
489
4
10
276
168
6.75 ± 6.75
6.21 ± 6.21
4.68 ± 4.68
7.41 ± 7.41
5.99 ± 5.99
6.75 ± 6.75
15
480
4
10
273
268
OD 260/280
Mean
N
1.57 + 0.33
1
1.98 + 0.33
1
OD 260/230
Mean
N
0.91 + 0.64
1
1.93 + 0.64
1
1.73 + 0.14
1.97 + 0.02
2.07 + 0.24
1.87 + 0.03
1.03 + 0.33
1.38 + 0.26
1.69 + 0.04
1.40 + 0.46
1.55 + 0.06
0.94 + 0.64
6
378
2
0
121
1
6
258
2
0
121
1
Table S5 P-values for a posteriori pairwise contrasts of RNA quality among tissue types. Pvalues are adjusted for multiple comparisons within each variable using the Tukey-Kramer
correction method. P-values < 0.05 are bolded.
P-values
Tissue 1
Belowground
Belowground
Belowground
Belowground
Belowground
Belowground
Belowground
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Algal cells
Algal cells
Algal cells
Algal cells
Algal cells
Flower
Flower
Flower
Flower
Fruit
Fruit
Fruit
Leaf
Leaf
Mixed tissue
a
leaves. bflowers.
Tissue 2
Buds (lvsa/flwsb)
Algal cells
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Algal cells
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Fruit
Leaf
Mixed tissue
Shoot/Stem
Leaf
Mixed tissue
Shoot/Stem
Mixed tissue
Shoot/Stem
Shoot/Stem
RNA mass r26S:18S
RIN
0.9344
0.9974
0.9702
1
0.8795
0.0015
0.4724
0.9368
0.1542
0.9998
1
0.0234
0.1685
0.815
0.2662
0.9602
0.5124
0.1195
0.4701
1
0.8558
0.1538
0.9157
1
0.4849
0.9957
0.5044
0.9475
0.9887
0.0386
0.6048
0.9982
0.9561
0.9149
0.9388
0.7985
0.7720
0.9906
0.999
0.1529
0.8133
0.3536
0.1314
0.9999
0.9567
0.0038
<0.0001
0.0039
<0.0001
<0.0001 <0.0001
1
0.9969
0.0233
0.8107
0.2091
0.0326
0.3029
0.9997
0.7327
0.4781
1
0.8575
0.3578
0.9055
0.8846
0.4903
0.4718
0.0453
0.2346
0.2674
0.0225
1
0.9112
0.0369
0.0874
0.558
0.8139
0.6345
0.8298
1
0.4037
0.6051
1
OD
260/280
0.9995
0.9137
0.8845
0.8989
0.9723
0.9769
0.4615
0.8666
0.5763
0.9415
0.9921
0.1452
0.0766
0.1565
0.4066
0.9995
0.9813
1
0.0955
1
0.9999
OD
260/230
0.9942
1
0.9961
0.8912
0.9561
0.9218
0.996
1
0.9012
0.995
0.9851
0.9973
0.9083
0.9652
0.9317
0.9957
0.9999
0.994
0.4422
0.9998
0.9971
Table S6 Least-squares means of descriptors of sequencing success among plant tissue types. The least-squares mean ± 1 SE and
sample size are provided for each tissue type.
Tissue type
Belowground
Shoots/Stems
Buds (lvsa/flwsb)
Leaf
Flower
Fruit
Mixed tissue
Algal cells
a
leaves. bflowers.
Bases
Mean
2.12 E9 ± 1.25 E8
1.76 E9 ± 1.36 E8
2.15 E9 ± 1.50 E8
2.04 E9 ± 0.26 E8
2.45 E9 ± 3.19 E8
2.21 E9 ± 2.02 E8
2.02 E9 ± 0.36 E8
1.67 E9 ± 0.50 E8
N
13
11
9
304
2
5
157
83
Q20
Mean
2.05 E9 ± 1.26 E8
1.62 E9 ± 1.37 E8
2.07 E9 ± 1.52 E8
1.96 E9 ± 0.26 E8
2.37 E9 ± 3.22 E8
2.13 E9 ± 2.03 E8
1.93 E9 ± 0.36 E8
1.46 E9 ± 0.50 E8
N
13
11
9
304
2
5
157
83
Reads
Mean
1.30 E7 ± 0.73
1.19 E7 ± 0.80
1.43 E7 ± 0.88
1.18 E7 ± 0.15
1.51 E7 ± 1.87
1.40 E7 ± 1.18
1.17 E7 ± 0.21
0.95 E7 ± 0.29
N
13
11
9
304
2
5
157
83
Scaffolds
Mean
5983 ± 687
6356 ± 747
6761 ± 826
4826 ± 143
6437 ± 1751
6816 ± 1108
5124 ± 198
4709 ± 272
N
14
11
10
380
2
5
206
248
Table S7 P-values for a posteriori pairwise contrasts of various sequencing metrics among
different tissue types. P-values are adjusted for multiple comparisons within each variable using
the Tukey-Kramer correction method. P-values < 0.05 are bolded.
Tissue 1
Belowground
Belowground
Belowground
Belowground
Belowground
Belowground
Belowground
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Algal cells
Algal cells
Algal cells
Algal cells
Algal cells
Flower
Flower
Flower
Flower
Fruit
Fruit
Fruit
Leaf
Leaf
Mixed tissue
a
leaves. bflowers.
Tissue 2
Buds (lvsa/flwsb)
Algal cells
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Algal cells
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Fruit
Leaf
Mixed tissue
Shoot/Stem
Leaf
Mixed tissue
Shoot/Stem
Mixed tissue
Shoot/Stem
Shoot/Stem
Bases
1
0.0189
0.9679
0.9999
0.9987
0.9958
0.5426
0.0462
0.9846
1
0.9959
0.9906
0.5435
0.1952
0.1469
<.0001
<.0001
0.9975
0.9971
0.8753
0.8522
0.45
0.9901
0.9829
0.5912
0.9999
0.4899
0.5999
P-values
Q20
Reads Scaffolds
1 0.9529
0.9963
0.6716
0.0004 0.0002
0.9845 0.9694
1
1 0.9956
0.9983
0.9972 0.7556
0.7201
0.9832 0.6628
0.9313
0.2742 0.9683
1
0.2634
0.0039 <.0001
0.9909 0.9999
1
1
1
1
0.997 0.1059
0.2904
0.9859 0.0792
0.5321
0.3394 0.4655
1
0.1019 0.0616
0.9779
0.588
0.0308 0.0046
0.9999
<.0001 <.0001
0.9217
<.0001 <.0001
0.9671 0.0852
0.4345
0.9987 0.9998
1
0.9141 0.6604
0.9845
0.8772 0.6157
0.9956
0.383 0.7658
1
0.9909 0.5738
0.6328
0.9758 0.5064
0.8054
0.4074 0.7993
1
0.9959 0.9996
0.9252
0.2041
1
0.4744
0.3455
1
0.7534
Table S7 P-values for a posteriori pairwise contrasts of various sequencing metrics among
different tissue types. P-values are adjusted for multiple comparisons within each variable using
the Tukey-Kramer correction method. P-values < 0.05 are bolded.
Tissue 1
Belowground
Belowground
Belowground
Belowground
Belowground
Belowground
Belowground
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Buds (lvs/flws)
Algal cells
Algal cells
Algal cells
Algal cells
Algal cells
Flower
Flower
Flower
Flower
Fruit
Fruit
Fruit
Leaf
Leaf
Mixed tissue
a
leaves. bflowers.
Tissue 2
Buds (lvsa/flwsb)
Algal cells
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Algal cells
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Flower
Fruit
Leaf
Mixed tissue
Shoot/Stem
Fruit
Leaf
Mixed tissue
Shoot/Stem
Leaf
Mixed tissue
Shoot/Stem
Mixed tissue
Shoot/Stem
Shoot/Stem
Bases
1
0.0189
0.9679
0.9999
0.9987
0.9958
0.5426
0.0462
0.9846
1
0.9959
0.9906
0.5435
0.1952
0.1469
<.0001
<.0001
0.9975
0.9971
0.8753
0.8522
0.45
0.9901
0.9829
0.5912
0.9999
0.4899
0.5999
P-values
Q20
Reads Scaffolds
1 0.9529
0.9963
0.6716
0.0004 0.0002
0.9845 0.9694
1
1 0.9956
0.9983
0.9972 0.7556
0.7201
0.9832 0.6628
0.9313
0.2742 0.9683
1
0.2634
0.0039 <.0001
0.9909 0.9999
1
1
1
1
0.997 0.1059
0.2904
0.9859 0.0792
0.5321
0.3394 0.4655
1
0.1019 0.0616
0.9779
0.588
0.0308 0.0046
0.9999
<.0001 <.0001
0.9217
<.0001 <.0001
0.9671 0.0852
0.4345
0.9987 0.9998
1
0.9141 0.6604
0.9845
0.8772 0.6157
0.9956
0.383 0.7658
1
0.9909 0.5738
0.6328
0.9758 0.5064
0.8054
0.4074 0.7993
1
0.9959 0.9996
0.9252
0.2041
1
0.4744
0.3455
1
0.7534
Table S8 The statistical fit of all possible combinations of explanatory factors in the data set
that included OD ratios. Fit of models was statistically compared using maximum likelihood
statistics according to the Akaike information criterion (AIC). Models are arranged in the order
of best-fitting models (lowest AIC) to poorer fitting models (higher AIC). Inclusion or absence
of explanatory variables from models is shown by 1 and 0, respectively.
Tissue
RNA
conc.
r26S:18S
RIN
OD
260/280
OD
260/230
Platform
µg RNA
sequenced
Bases
sequenced
AIC
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
1
1
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
0
1
0
1
0
1
0
1
0
0
1
1
0
1
0
0
1
0
0
1
1
1
0
0
1
1
0
0
1
1
0
0
1
0
1
1
1
0
0
1
1
1
1
1
1
1
1
1
1
1
0
0
1
1
0
1
1
0
1
1
1
1
1
1
0
1
0
0
1
0
1
1
1
1
1
1
1
1
1
1
0
0
1
1
1
1
1
1
1
1
1
1
0
1
1
0
1
0
0
1
1
1
1
1
1
1
1
0
1
1
1
1
1
1
0
0
1
1
1
1
1
1
1
1
0
1
0
1
0
1
0
1
1
1
0
0
1
0
1
0
1
1
1
1
1
0
1
1
1
1
1
1
1
1
1
1
1
0
0
1
1
1
1
1
1
1
1
1
1
1
0
1
1
0
0
1
0
1
0
1
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
1
0
1
1
1
1
1
1
1
1
0
0
1
1
0
0
0
0
0
1
1
0
1
1
1
1
1
0
1
1
1
0
0
0
0
0
0
0
0
0
0
1
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
1
0
0
1
3187
3187.8
3201.8
3202.6
3202.6
3204.3
3204.8
3206.1
3208.1
3209.7
3214.5
3214.6
3215.7
3215.7
3218.1
3219
3219.1
3219.2
3219.2
3219.6
3220.9
3222.5
3222.7
3224.4
3224.9
3226
3226.4
3228.1
3228.8
3229.2
3230.2
3230.4
3230.7
3230.8
3231.7
3232.2
3232.3
1
0
1
0
1
0
1
1
1
0
0
1
1
1
0
1
1
0
1
0
1
1
1
1
1
1
1
1
0
1
0
1
1
0
1
1
1
0
1
1
1
0
0
0
0
1
1
1
0
0
0
0
1
0
1
1
0
1
1
0
0
1
1
0
0
1
0
1
1
0
1
0
0
0
1
1
1
0
0
0
1
1
0
1
1
0
1
0
1
0
1
1
0
1
0
1
1
1
0
1
1
0
1
1
1
1
1
0
1
1
1
1
0
0
1
1
1
0
0
1
0
1
0
1
1
1
1
0
0
1
1
0
1
1
1
1
0
0
1
1
0
0
0
1
1
1
0
0
0
0
1
1
1
1
0
1
1
1
0
1
1
1
1
1
1
1
1
0
0
1
1
1
0
1
1
1
1
1
0
1
1
0
0
1
1
0
1
0
0
1
0
1
1
1
0
1
1
1
0
1
1
0
1
0
0
1
0
0
1
1
0
1
0
0
0
1
1
0
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
1
1
0
0
0
0
0
0
0
0
1
0
0
0
1
1
1
1
1
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
1
1
1
1
1
1
0
0
0
0
1
0
1
0
0
0
1
1
0
1
0
1
1
1
1
1
1
1
0
0
1
1
0
0
1
1
1
0
0
0
1
0
0
1
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
1
0
0
0
1
1
1
1
1
0
0
1
1
0
0
3232.6
3233
3233.7
3234.1
3234.2
3234.4
3235.4
3235.4
3235.5
3235.7
3235.8
3237.3
3237.6
3237.8
3238
3238.9
3239
3239.3
3241
3241.1
3241.5
3242
3242.3
3242.7
3243.7
3244.7
3245
3245.4
3245.6
3246.3
3246.3
3246.6
3247
3247.1
3247.3
3247.3
3247.3
3247.7
3248
3249.1
3250.5
3250.6
3250.7
1
0
1
0
1
1
0
1
1
1
1
1
0
1
1
0
1
1
0
1
0
1
0
0
1
1
1
1
1
1
1
0
0
0
1
1
0
1
0
0
0
0
1
1
0
0
0
0
1
0
0
0
0
1
1
1
1
1
1
0
1
1
0
0
0
0
1
1
1
0
1
1
0
0
1
0
0
1
0
1
1
1
0
0
0
0
0
1
0
0
1
1
1
1
0
1
0
1
1
0
1
0
1
1
1
0
1
1
1
1
1
1
0
0
0
0
1
1
1
0
1
0
0
0
1
0
1
1
0
1
1
0
1
1
0
0
1
0
0
0
0
1
0
1
1
0
1
0
1
1
1
0
0
1
0
1
1
1
0
0
1
1
1
0
0
1
0
1
1
1
1
1
1
0
1
1
1
1
1
1
0
0
1
0
0
0
1
1
0
1
1
0
1
0
0
0
0
0
1
0
1
1
1
1
0
1
1
1
1
1
0
1
1
0
0
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
0
0
1
0
0
0
1
1
0
0
0
0
0
0
0
0
0
1
0
1
1
0
0
0
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0
0
1
0
0
0
1
1
0
1
0
1
1
0
0
0
0
0
0
1
1
1
1
0
1
0
1
1
1
0
1
1
1
1
1
1
1
1
0
1
0
1
0
0
0
0
1
0
1
0
1
0
1
0
0
0
0
0
1
0
0
0
1
0
0
1
0
1
0
1
0
1
0
0
1
1
1
1
0
0
0
1
1
1
3250.7
3250.9
3251
3251.3
3251.9
3252.5
3252.7
3252.9
3253.2
3253.6
3254.3
3254.3
3254.4
3254.4
3254.8
3254.8
3255.6
3255.9
3256.2
3256.3
3256.9
3256.9
3257.2
3257.9
3258.3
3258.8
3258.8
3259.7
3259.8
3260.1
3260.1
3260.3
3260.4
3260.4
3261.7
3261.7
3261.9
3262.1
3262.3
3262.6
3262.6
3263.6
3263.6
0
0
0
1
0
1
0
0
1
1
1
0
0
1
0
1
0
1
0
1
1
1
1
0
1
0
1
1
1
1
0
1
1
1
0
1
1
0
1
1
0
0
1
1
0
1
1
1
0
1
0
1
0
0
0
0
0
1
1
0
0
1
0
1
0
1
0
0
1
1
0
1
1
1
0
0
1
0
0
0
0
1
1
1
0
1
1
1
0
0
1
1
1
1
0
1
0
0
0
0
1
1
1
1
0
0
1
1
0
1
0
0
0
1
1
1
1
1
0
0
1
0
1
1
0
1
1
0
0
1
0
1
0
0
0
1
0
1
1
0
0
1
1
0
0
0
0
0
1
1
1
0
1
0
1
1
0
0
1
0
1
1
1
0
0
0
1
0
1
1
1
1
0
1
1
1
1
0
1
1
0
0
0
1
0
1
1
0
0
1
1
1
0
0
0
0
1
0
1
1
1
0
0
1
0
1
1
0
0
1
1
1
0
1
0
1
0
0
1
0
1
1
1
1
0
1
1
1
1
1
1
1
0
1
0
0
1
1
0
0
1
1
1
0
1
1
1
0
0
0
0
0
0
0
1
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
0
1
1
1
1
0
0
1
0
1
0
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
0
0
1
0
0
0
1
0
1
1
0
0
0
1
0
0
1
0
0
0
0
1
1
0
1
0
0
1
0
0
1
1
1
0
0
0
0
1
1
0
0
1
0
1
1
1
1
1
0
0
0
0
1
1
0
1
0
1
1
0
0
0
0
0
0
0
1
0
0
0
1
1
0
0
1
1
0
0
0
0
1
3263.7
3263.8
3264
3264.2
3265.7
3265.9
3266
3266.6
3266.9
3267.1
3267.4
3267.4
3267.5
3268.2
3268.7
3269.1
3269.4
3269.6
3269.8
3270.1
3270.2
3270.4
3270.9
3271
3271
3271.1
3271.2
3272.3
3272.7
3272.9
3272.9
3273
3273
3273.3
3273.4
3274.2
3274.3
3274.3
3274.4
3274.4
3274.5
3274.5
3274.6
1
0
1
1
0
0
0
0
0
0
1
0
0
1
0
1
0
0
1
1
0
1
0
0
1
0
1
0
0
0
0
1
0
0
0
1
1
0
1
0
1
1
0
1
1
1
1
1
0
0
1
0
1
0
1
0
0
0
0
1
0
1
1
1
1
1
0
0
0
0
1
0
0
0
0
1
1
1
0
1
1
0
0
1
0
1
1
1
1
0
1
0
1
0
0
0
0
0
1
1
0
0
1
1
0
0
1
1
0
1
1
0
0
1
1
1
0
0
1
1
0
0
1
0
0
1
0
1
1
0
1
0
0
0
1
0
1
1
1
0
1
1
1
1
0
0
0
0
0
1
1
1
1
0
0
1
0
0
1
0
0
0
1
0
1
0
0
0
0
1
1
1
1
1
0
0
1
0
0
1
0
0
1
0
0
1
1
0
0
1
0
1
0
1
1
1
0
1
0
1
0
1
0
0
0
1
0
0
0
1
1
0
0
1
1
1
0
0
0
0
0
0
0
1
1
1
0
1
1
1
1
0
1
1
1
1
1
1
1
0
1
0
1
1
0
1
0
1
1
1
1
0
1
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
0
1
0
1
1
1
1
1
1
1
1
1
0
1
0
1
1
1
1
1
0
1
1
1
1
1
0
1
1
0
1
1
0
1
0
1
0
1
1
0
1
0
0
1
0
1
0
0
0
0
1
0
1
0
1
1
1
1
0
0
1
0
0
1
0
0
1
1
0
0
0
0
0
1
1
1
0
1
1
1
1
0
1
1
1
1
1
1
0
1
0
1
1
1
1
0
0
1
0
0
0
1
0
1
0
1
0
1
3275.4
3275.7
3276
3276.5
3276.8
3276.9
3277.9
3278
3278.4
3278.5
3278.8
3278.9
3279
3279.4
3279.5
3279.8
3280
3280.8
3281.4
3281.9
3282.4
3282.4
3282.9
3283.2
3283.7
3283.9
3284.3
3284.3
3284.8
3284.9
3285
3285.2
3285.8
3286.1
3286.6
3286.6
3286.8
3287
3287.4
3287.5
3287.7
3287.9
3288.2
0
1
0
1
1
1
1
0
0
1
0
1
1
0
0
1
0
0
1
1
0
1
0
0
1
0
1
0
0
1
1
1
1
0
0
0
1
0
1
1
1
0
1
0
1
0
1
1
0
0
1
0
1
1
0
1
0
1
1
1
0
0
1
0
1
1
1
0
0
0
1
0
0
1
1
0
0
1
0
0
0
0
0
1
1
0
1
0
0
0
0
1
1
1
0
1
1
1
1
1
0
1
0
0
0
0
0
1
1
0
0
0
0
0
1
1
0
0
0
1
0
0
1
0
1
1
1
1
0
1
0
1
1
0
1
0
1
1
1
0
0
0
0
1
1
1
0
1
1
0
0
0
0
0
1
1
0
0
1
1
0
0
1
1
1
0
0
1
1
1
0
0
0
0
0
0
1
0
1
0
1
1
0
0
1
1
1
0
0
1
1
1
1
0
1
1
0
0
1
1
0
0
1
0
1
0
0
1
1
0
1
1
0
0
0
0
0
0
1
0
1
1
0
0
0
0
1
1
0
0
1
0
0
1
1
1
1
0
0
1
1
1
1
1
1
1
1
0
0
1
1
1
1
0
1
1
1
0
1
1
1
0
1
0
0
1
1
0
1
0
0
1
1
0
1
1
0
0
1
0
1
1
1
1
0
1
1
0
0
1
1
1
1
0
0
1
0
0
0
1
1
1
0
0
1
1
0
0
1
1
1
0
1
0
1
1
0
0
1
0
0
1
1
1
1
0
0
1
1
0
1
1
0
0
0
0
1
1
0
0
0
1
0
1
1
0
0
0
1
0
0
1
1
0
0
0
0
1
1
0
0
0
0
0
1
0
1
0
1
1
1
1
1
1
1
1
1
1
1
0
1
0
0
1
1
1
1
3288.5
3288.7
3288.8
3289.1
3289.7
3289.8
3290.1
3290.4
3290.6
3290.8
3291
3291
3291.6
3291.8
3292.1
3292.3
3292.4
3292.5
3292.8
3293.4
3293.5
3293.7
3293.7
3294.9
3295.3
3295.6
3295.6
3296.2
3297.6
3298.1
3298.6
3298.7
3299
3299.1
3299.1
3299.6
3299.7
3300.1
3300.5
3300.5
3300.6
3301
3301.6
0
0
1
1
0
0
0
1
1
1
0
0
0
0
1
1
0
1
0
0
0
1
0
0
0
0
1
0
1
1
1
0
0
1
0
1
1
0
1
0
0
1
0
0
0
1
1
1
1
0
1
0
1
1
0
0
1
1
0
0
0
1
0
1
1
0
1
0
1
0
1
1
0
1
1
1
0
1
1
1
0
1
0
0
0
1
1
1
1
0
1
0
0
1
1
1
0
1
0
1
0
1
1
0
1
0
1
1
1
0
0
1
0
0
1
1
0
0
1
0
0
1
0
1
0
0
0
0
0
0
1
1
0
1
1
1
0
1
1
0
0
0
1
0
0
1
0
0
1
1
1
0
1
0
0
1
1
0
0
0
0
0
1
0
0
1
1
1
0
0
0
0
1
0
1
1
0
1
1
1
0
1
0
1
1
0
0
1
1
1
1
0
1
0
0
1
0
1
1
0
1
0
1
1
0
0
0
0
1
1
0
0
1
0
0
0
1
1
0
0
1
0
1
0
0
1
1
0
1
0
0
1
1
0
0
1
0
0
0
0
1
0
0
0
1
1
0
0
1
0
1
0
1
1
1
1
0
1
1
0
0
1
1
0
1
0
0
0
1
0
1
0
1
0
0
0
1
1
0
0
1
1
1
0
0
1
0
0
0
1
1
0
1
0
0
0
0
1
1
1
1
0
1
0
0
0
1
0
1
1
0
0
1
0
1
1
1
0
1
0
1
0
1
1
0
1
1
1
1
1
0
1
0
1
0
1
0
1
1
0
1
0
0
1
1
0
1
1
1
0
1
1
0
0
0
0
0
0
1
0
0
0
1
1
0
0
1
1
0
0
0
1
0
0
0
0
1
0
0
0
0
1
0
1
1
1
1
3301.6
3301.7
3302.1
3302.3
3302.4
3302.5
3302.5
3302.7
3302.8
3302.9
3303.6
3303.7
3304.1
3304.1
3304.1
3304.2
3304.4
3304.5
3304.8
3304.9
3305.8
3305.9
3305.9
3305.9
3306.7
3306.7
3306.8
3306.9
3307.1
3307.2
3307.4
3307.5
3308
3309.3
3309.4
3309.7
3309.8
3310.7
3311.2
3311.6
3311.8
3313.2
3313.5
0
1
0
1
1
0
1
1
0
1
1
1
1
0
1
0
1
1
1
0
0
1
0
1
0
0
1
1
0
1
0
0
1
1
1
0
0
0
1
0
1
0
0
1
0
1
0
0
0
0
1
0
1
0
1
0
0
1
0
1
0
0
1
0
1
0
1
1
1
0
1
0
1
0
1
0
1
0
1
0
1
0
1
1
1
0
1
0
0
1
1
1
1
0
0
0
1
1
1
0
1
0
1
1
1
1
1
1
0
1
0
0
0
0
1
1
1
0
0
1
0
1
1
0
0
1
0
1
0
1
1
0
1
1
0
0
0
1
1
1
1
0
1
0
0
1
0
0
0
1
0
0
1
1
1
1
1
1
0
0
0
1
0
1
1
0
0
0
0
1
1
1
1
0
1
0
1
0
1
0
0
0
0
0
1
0
1
0
1
0
0
0
1
1
1
0
0
0
1
1
0
0
1
0
1
0
0
1
0
1
0
1
1
0
1
1
1
1
0
0
0
0
0
0
1
1
0
1
1
0
0
0
1
0
0
0
1
0
1
0
1
1
1
1
1
1
0
0
0
0
0
1
0
1
1
0
1
1
0
0
1
0
0
1
0
1
1
0
0
0
0
0
0
1
0
0
0
1
0
0
1
0
1
0
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
1
1
0
0
1
0
0
0
0
1
0
0
0
1
0
0
1
1
1
0
1
0
1
0
0
1
0
0
0
1
0
0
0
1
1
1
1
1
1
0
0
0
0
1
1
1
0
1
1
0
1
1
1
1
0
1
0
0
0
1
1
0
1
0
1
1
1
1
0
1
1
0
1
0
0
0
0
0
0
0
1
0
0
0
0
0
3313.7
3314.2
3314.6
3315.4
3315.6
3315.7
3315.8
3316.6
3316.8
3316.8
3317.3
3317.6
3317.8
3318
3318.2
3318.3
3318.4
3318.5
3318.8
3319
3319.3
3319.3
3319.9
3319.9
3320.3
3320.4
3320.4
3321.1
3321.2
3321.3
3321.5
3321.8
3321.8
3321.9
3322.2
3322.2
3322.3
3322.5
3322.8
3323
3323.6
3323.7
3324.4
0
0
1
1
1
1
0
1
0
0
0
1
1
1
1
0
0
1
0
0
0
0
1
1
1
0
0
0
0
0
1
1
1
0
0
0
1
0
1
0
1
1
1
1
1
1
1
1
0
0
0
0
0
1
1
0
0
0
0
1
0
1
0
0
0
1
1
1
1
1
0
0
1
0
1
0
1
0
0
0
1
0
1
1
1
0
0
1
0
0
0
0
0
1
0
1
0
1
1
1
0
1
0
1
1
1
1
0
1
0
1
1
1
0
1
1
0
1
1
0
1
0
0
0
0
1
0
1
0
1
0
1
0
0
0
1
1
0
1
1
1
0
1
0
0
0
0
1
1
1
0
0
0
1
1
0
0
1
1
1
0
0
0
0
0
1
0
1
1
1
0
0
1
0
0
0
1
1
1
1
0
0
1
1
0
0
1
1
0
1
0
1
1
1
0
1
0
1
1
0
0
1
1
1
0
1
1
1
0
0
0
0
1
0
1
1
1
0
1
1
1
1
0
1
1
1
0
0
0
1
1
1
0
1
1
0
0
0
1
0
1
1
0
0
0
0
0
1
0
0
1
1
0
0
0
0
1
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
1
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
1
0
0
0
0
0
0
1
1
1
0
0
1
0
0
1
1
0
0
1
1
1
0
1
1
0
0
0
0
1
1
0
1
1
1
0
1
1
0
0
1
1
0
1
0
1
1
0
1
0
0
0
0
0
0
1
1
1
1
1
1
0
1
1
1
1
1
1
1
0
1
0
1
1
1
1
1
0
0
1
1
1
1
0
0
1
1
0
0
1
1
0
3324.9
3325
3325.3
3325.8
3326.1
3326.8
3326.9
3327.7
3328.1
3329.4
3329.9
3330.3
3330.4
3330.5
3330.6
3331
3331.5
3331.9
3331.9
3331.9
3332
3332.1
3332.7
3333.4
3333.5
3333.5
3334
3334.1
3334.2
3334.4
3334.6
3334.8
3334.9
3335.4
3335.6
3336
3336.8
3336.9
3337.1
3337.3
3337.4
3337.5
3337.7
0
1
0
0
1
0
1
0
0
0
1
0
1
1
1
1
0
1
0
1
0
0
0
0
0
1
0
0
1
1
0
0
0
0
1
0
0
1
0
0
1
1
1
0
1
1
0
1
1
1
0
1
1
0
1
0
0
1
1
0
1
0
0
0
0
0
1
0
0
1
0
0
1
1
1
1
0
0
0
1
1
1
0
0
1
0
0
0
1
1
0
0
0
0
1
0
1
0
1
0
0
1
0
1
0
1
1
1
1
0
1
0
1
1
0
1
1
0
1
1
0
1
1
0
1
0
0
0
0
1
1
0
0
1
0
0
1
0
1
1
1
0
0
0
1
1
0
0
0
1
1
0
1
1
1
1
0
0
0
0
0
1
0
1
0
0
1
1
1
0
0
0
1
0
1
0
0
1
1
0
0
1
0
0
1
0
0
0
0
1
0
0
0
1
1
0
0
1
0
1
0
0
1
0
1
0
0
0
1
1
0
1
1
0
0
0
0
0
1
1
1
0
1
1
0
0
1
0
1
1
0
1
0
0
0
0
0
0
1
1
0
0
1
1
0
0
0
0
1
0
0
1
0
1
1
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
1
1
0
0
1
0
0
0
0
0
0
1
0
0
1
0
1
0
1
0
0
0
0
1
1
0
0
1
1
1
1
0
0
0
0
0
1
1
0
0
0
0
1
0
0
0
0
0
1
0
1
0
0
1
1
1
1
1
0
1
0
1
1
1
0
1
1
1
0
1
1
1
1
0
1
1
1
1
0
1
0
3338.4
3338.5
3338.6
3338.6
3338.8
3338.9
3340.4
3340.9
3341.1
3341.3
3342.5
3342.6
3343.2
3343.5
3344.3
3345
3345.6
3345.8
3346.4
3346.6
3346.9
3347
3347.3
3348.1
3348.6
3349
3349.1
3349.2
3349.5
3349.6
3349.7
3349.7
3349.9
3349.9
3350
3350.3
3350.7
3351.1
3351.3
3351.9
3352.3
3352.4
3352.6
0
0
1
1
0
1
0
0
0
1
1
0
0
0
1
0
0
1
0
0
0
0
0
0
1
0
0
1
1
0
0
0
1
1
1
0
0
0
0
0
0
0
1
1
1
1
1
1
1
0
0
0
0
1
1
1
1
0
1
0
1
1
0
0
0
0
1
0
1
1
0
1
0
0
1
0
1
1
0
1
0
1
0
1
1
0
0
1
0
0
1
0
0
0
0
0
0
0
0
0
1
0
1
1
1
1
1
0
1
1
0
1
0
0
0
0
1
1
0
0
0
0
1
0
0
0
0
0
0
1
0
1
0
0
0
1
1
0
0
0
1
1
0
0
0
1
0
1
0
1
0
0
0
1
1
0
0
1
1
0
0
0
0
0
1
0
1
1
0
1
1
0
1
0
0
1
0
1
1
0
0
1
0
1
0
0
0
1
1
0
1
0
0
1
1
0
0
0
1
1
0
1
0
1
0
0
1
0
0
0
1
1
0
0
0
1
1
0
0
0
1
0
0
1
1
0
0
0
1
0
1
0
0
0
0
0
1
0
0
0
0
1
0
0
0
1
0
0
0
0
1
1
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
1
0
0
1
1
0
1
0
1
1
0
0
0
0
0
0
1
1
1
0
0
1
1
1
0
1
0
1
0
0
1
0
0
0
1
1
0
0
0
1
1
1
0
0
1
0
0
0
1
0
0
0
0
1
0
1
1
1
0
1
1
1
0
1
1
1
1
1
1
1
1
0
0
1
1
1
0
1
0
0
1
1
3352.6
3352.7
3352.9
3353
3353.3
3353.3
3353.6
3353.8
3354.4
3354.6
3355.3
3355.3
3356.8
3357.3
3357.8
3357.9
3359.2
3360.3
3361.8
3361.9
3361.9
3362.2
3363.4
3364.2
3364.3
3365
3365
3365.1
3366
3366.1
3366.2
3366.3
3367.2
3367.7
3367.7
3368.3
3368.9
3369
3369
3369.4
3370.2
3370.3
3371.2
1
0
0
0
0
0
0
0
0
1
1
1
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
1
1
1
0
0
0
1
0
1
0
0
1
1
1
0
0
1
1
1
0
1
0
1
0
1
0
0
1
1
1
0
1
0
0
1
1
0
1
0
0
1
1
0
0
1
1
1
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
1
0
0
0
0
0
1
0
0
1
0
1
0
0
0
1
0
0
0
0
0
0
0
0
1
0
1
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
0
1
0
1
0
1
1
0
0
0
0
1
0
0
1
0
1
1
1
0
0
0
0
0
0
1
0
1
0
0
0
0
1
1
0
0
0
1
0
0
0
1
1
0
0
0
0
0
0
0
0
1
0
0
0
1
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
1
0
1
0
0
1
1
1
1
0
1
1
0
1
0
0
0
1
0
0
0
0
0
1
0
1
0
0
0
1
0
0
1
0
0
1
0
1
1
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
1
0
1
1
0
1
1
1
1
1
1
1
1
0
1
1
1
1
1
1
1
1
3371.8
3372.2
3374
3374.7
3375.4
3376.1
3376.5
3377.3
3378
3379.9
3380
3380.6
3380.7
3381
3381
3381.5
3382.6
3382.7
3384
3384.4
3384.4
3384.5
3384.9
3385.1
3386.6
3387.1
3389.3
3391.8
3394.8
3395.2
3396.1
3396.9
3397.7
3399.5
3399.6
3403.2
3403.6
3411.9
3412.1
3412.5
3414.4
3427
3427.1
Table S9 The statistical fit of all possible linear combinations of factors in the large data set that
excluded OD ratios. The fit of models was statistically compared using maximum likelihood
statistics according to the Akaike Information Criterion (AIC). Models are arranged in the order
of the best-fitting models (lowest AIC) to poorer fitting models (higher AIC). The inclusion or
absence of explanatory variables from models is shown by 1 and 0, respectively.
Tissue 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 RNA conc. 0 1 0 0 1 0 1 0 1 1 0 0 1 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 r26S:18S RIN 1 1 1 1 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 1 0 1 1 1 1 0 1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 Platform 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 µg RNA Bases sequenced sequenced AIC 1 0 9174 1 0 9177.3 1 0 9188.1 1 1 9189.9 1 0 9191.4 1 0 9193.1 1 1 9193.5 0 0 9193.7 0 0 9194.2 1 0 9196.5 1 1 9204.4 0 1 9206.8 1 1 9208 0 0 9208.4 0 0 9208.5 0 1 9209.7 1 1 9210.7 0 0 9213.2 0 0 9213.8 1 1 9214.3 1 0 9215.1 1 0 9218.2 0 1 9221.9 0 1 9224.6 0 1 9228.1 0 1 9231 1 1 9235 0 0 9236.6 0 0 9237.4 1 1 9238.6 0 1 9254.8 0 1 9256.6 1 0 9261.5 1 0 9263.7 1 1 9273 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 0 1 0 1 1 0 0 0 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 1 1 0 1 0 1 1 0 0 0 1 0 0 1 1 0 1 0 1 1 0 1 0 0 1 1 0 0 1 0 0 1 1 1 0 0 1 1 1 0 0 1 1 1 1 1 1 0 0 0 0 1 0 1 1 1 1 0 1 1 1 1 1 0 0 1 0 1 0 0 1 0 0 1 0 1 1 1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 0 1 0 1 0 0 1 0 1 1 1 0 1 1 0 0 0 0 1 1 1 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 0 0 1 0 1 0 1 1 0 1 0 1 1 1 1 0 0 1 0 0 0 1 0 1 1 0 1 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 1 0 1 0 1 0 0 1 0 1 0 0 1 0 0 0 1 1 1 0 1 0 1 1 1 0 1 0 1 1 0 0 9276.3 9276.4 9277.2 9277.7 9279.5 9280.5 9280.8 9283 9284.1 9286.3 9288.6 9289.8 9292.4 9293.1 9294.1 9294.2 9294.6 9295.9 9296.3 9296.8 9297.6 9298.1 9299.8 9299.9 9302 9303.3 9303.4 9303.8 9305.5 9307.9 9308.9 9310.1 9310.3 9311.3 9314.5 9315 9316.5 9317.1 9317.5 9317.7 9320.6 9321.8 9322.6 0 1 1 0 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 1 1 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 0 0 1 1 0 0 0 0 0 1 0 0 1 0 0 1 1 1 1 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 1 1 1 0 0 1 0 1 0 1 1 1 1 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 0 1 1 0 0 0 1 0 1 1 0 1 0 0 0 1 1 1 1 0 0 1 1 0 9323.3 9324.1 9325.5 9326 9327.1 9328.4 9330.1 9333.1 9335.9 9339.2 9341.5 9341.9 9343.7 9346.5 9346.8 9347.3 9365 9367.2 9382.5 9385.1 9388 9391.6 9397 9397.3 9398.1 9400 9400.4 9403.1 9405.1 9406.4 9408.6 9409.1 9412.7 9412.9 9413.7 9417.2 9417.6 9420.2 9421 9421.6 9425.9 9429 9438.1 0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 1 1 1 9439.9 9451.5 9452.9 9454.6 9465.6 9467 

Documentos relacionados

MASTER ARM invitee matrix - LIST OF PARTICIPANTS_16Sept06

MASTER ARM invitee matrix - LIST OF PARTICIPANTS_16Sept06 MASTER ARM invitee matrix - LIST OF PARTICIPANTS_16Sept06 12-16 September 2006 Sao Paulo, Brazil

Leia mais