- BIOEN FAPESP

Division 1: Improvements in the feedstock: building a better cane plant for
energy – EnergyCane; sugarcane agriculture; other feedstocks
BIOEN Workgroup on sugarcane genomics
Glaucia Mendes Souza
Instituto de Química - Universidade de São Paulo
Michel Vincentz
Instituto de Biologia- UNICAMP
Renato Vicentini
Instituto de Biologia- UNICAMP
Anete Pereira de Souza
Instituto de Biologia- UNICAMP
Fabio Nogueira
Instituto de Biologia - UNESP
Marie-Anne Van Sluys
Instituto de Biociências - Universidade de São Paulo
Claudia Monteiro-Vitorello (ESALQ - Bioinf)
Joao Paulo Kitajima (IEAE-SP – Bioinf)
Nathalia de Setta (UFABC)
Cushla Metcalfe (PD FAPESP)
Guilherme M. Q. Cruz (DD FAPESP)
Edgar Andres Ochoa (DD CNPq)
Andreia Prata (DD FAPESP)
Mayra Kuroki (MS-CNPq)
Tatiane Correa (TS-USP)
Jonas Gaiarsa (DD-FAPESP) – Infra Bioinf
International Partners
Angelique D’Hont (CIRAD, Fr)
Helene Bergers (CNRGV, Fr)
Ray Ming (U. Illinois, US)
A Paterson (U Georgia, US)
Van Sluys Workshop BIOEN 2012
Renato Vicentini (CBMEG-UNICAMP)
Fabio Nogueira (IB-UNESP Botucatu)
Cristina
Michel Vincentz (CBMEG- UNICAMP)
Mariane Vilela
Luis Eduardo Del Bem
Anete Pereira de Souza (CBMEG-UNICAMP)
Claudio Benicio
Danilo Sforca
Monica
Monalisa Sampaio (UFSCar – Breeding Pgm)
Fernanda
Antonio Augusto Garcia (ESALQ)
Helaine Carrer (ESALQ)
Maria Lucia Carneiro (ESALQ)
Alessandra Palhares
Daniel Scherer Moura (ESALQ)
Marcio Castro Silva Filho (ESALQ)
Glaucia M Souza (BIOEN coord)
Milton Nishiyama Jr
Carolina Lembke
Marcos Buckeridge (INCT-CTBE coord)
Adriana Grandis
Amanda de Souza
Eveline Tavares
Deciphering the sugarcane genome sequence
Goals: Genome Sequence
1.  Sugarcane genome draft > BIOEN 1000 BACs.
Achievements:
  517 BACs sequenced:
  R570 [358 - assembled]
  SP80-3280 [157 – assembled]
  24 BACs need resequencing
  48 BACs/run need 10 months to conclude
  300 BAC manuscript under construction…
2.  Transposable element complement.
Achievements:
> 60% of known repeats are defined
> 4 papers published
3.  Develop a bioinformatic pipeline for BioEn integrative platform.
SP80-3280 BAC Library Construction and Screening
480 positive clones
SP80-3280 BAC Library
BAC library size: 221,184 clones
Average clone size: 110 kb
Coverage of the full genome (10 Gb): 2.4X
Screening tools: 3D pools for ½ of the library, and macroarrays
Storage: 576 plates of 384 wells
SP80-3280 BAC Library Screening
•  20 probes
•  4 membranes
•  55,296 clones in duplicate per membrane
Used for the FAPESP/BIOEN Sugarcane Sequencing Program
BACs mapped > Sorghum
[Figure: BACs mapped onto sorghum chromosomes 1 to 10, with per-region counts, sugarcane linkage-group labels (LGI, LGII, LGIII, LGVI, LGVII, LGVIII, U2/U3) and a rust locus; scale bar 10 Mb]
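The figures quoted for the SP80-3280 BAC library can be cross-checked with a few lines of arithmetic. The sketch below only reuses the numbers given on the slide (221,184 clones, 110 kb average clone size, 10 Gb genome, 576 plates of 384 wells, 4 membranes of 55,296 clones). The variable names are ours, and 1 kb = 1,000 bp is an assumption.

```python
# Cross-check of the SP80-3280 BAC library figures (values taken from the slide).
clones = 221_184
avg_insert_bp = 110_000        # 110 kb, assuming 1 kb = 1,000 bp
genome_bp = 10_000_000_000     # 10 Gb

# Genome coverage = total cloned bases / genome size.
coverage = clones * avg_insert_bp / genome_bp

# Storage and screening capacity should both match the number of clones.
wells = 576 * 384
membrane_clones = 4 * 55_296

print(f"coverage = {coverage:.2f}X")
print(f"wells = {wells}, clones on membranes = {membrane_clones}")
```

Under these assumptions the coverage rounds to the 2.4X stated on the slide, and both the plate capacity and the four membranes account for exactly the 221,184 clones.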
Glaucia & Milton
Pipeline for the Identification of putative promoter regions
Gene prediction: AUGUSTUS
Alignment tool: sim4
Query: sugarcane and sorghum genes
Subject: sugarcane genome (BACs)
Developed a tool to obtain and store, for each predicted gene:
- Upstream and downstream gene regions
- Sugarcane mapped genes (SAS)
- Sorghum mapped genes
There are three main steps to get the upstream and downstream regions:
1 - If a SAS is aligned to a predicted gene, the regions for that predicted gene are returned.
2 - Otherwise, the regions are returned for the SAS itself, mapped in a region with no predicted gene.
3 - Otherwise, if the SAS is not mapped in the genome, its best sorghum gene match is used, and the regions are returned for the predicted gene to which that sorghum gene aligns.
Glaucia & Milton
Sucest-Fun - Cane Genome Analysis
Number of predicted genes: 1,525
Number of predicted genes with 3’ and 5’ regions:
5’ >= 500 bp and 3’ >= 500 bp: 1,474
5’ >= 500 bp and 3’ < 500 bp: 27
5’ < 500 bp and 3’ >= 500 bp: 24
5’ < 500 bp and 3’ < 500 bp: 0
Summary of the analysis of all predicted genes in the sugarcane genome (BACs) and the sizes of their upstream and downstream regions
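The three-step fallback described for the promoter pipeline can be sketched as a small lookup function. This is a minimal illustration, not the group's tool: the function name, the dictionary layout and the identifiers (`SAS_A`, `gene_1`, `Sb_gene_X` and so on) are all hypothetical, and the alignments from sim4 and AUGUSTUS are assumed to have been parsed into plain dictionaries beforehand.

```python
from typing import Optional, Tuple


def region_source(
    sas: str,
    sas_to_gene: dict,      # SAS -> predicted gene it aligns to (step 1)
    sas_to_locus: dict,     # SAS -> genomic locus with no predicted gene (step 2)
    sas_to_sorghum: dict,   # SAS -> best sorghum gene match (step 3)
    sorghum_to_gene: dict,  # sorghum gene -> predicted gene it aligns to (step 3)
) -> Optional[Tuple[str, str]]:
    """Return (source_type, identifier) whose flanking regions should be reported."""
    # Step 1: the SAS aligns to a predicted gene.
    if sas in sas_to_gene:
        return ("predicted_gene", sas_to_gene[sas])
    # Step 2: the SAS maps to the genome, but outside any predicted gene.
    if sas in sas_to_locus:
        return ("sas_locus", sas_to_locus[sas])
    # Step 3: the SAS is unmapped; fall back to its best sorghum match.
    sorghum = sas_to_sorghum.get(sas)
    if sorghum is not None and sorghum in sorghum_to_gene:
        return ("predicted_gene_via_sorghum", sorghum_to_gene[sorghum])
    return None


# Hypothetical toy inputs, for illustration only.
sas_to_gene = {"SAS_A": "gene_1"}
sas_to_locus = {"SAS_B": "BAC_7:1200-3400"}
sas_to_sorghum = {"SAS_C": "Sb_gene_X"}
sorghum_to_gene = {"Sb_gene_X": "gene_9"}

for sas in ("SAS_A", "SAS_B", "SAS_C", "SAS_D"):
    print(sas, region_source(sas, sas_to_gene, sas_to_locus, sas_to_sorghum, sorghum_to_gene))
```

Returning a tag along with the identifier keeps track of which step supplied the regions, which may be useful when the three sources are not equally reliable.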
Manuscript in preparation:
“1191 thousand genes defining the expressing sugarcane genome landscape”
[Figure: histogram with x-axis values 0 to 17 and y-axis values 0 to 50; axis labels not recoverable]
For 273 BACs:
261 BACs with genes
12 BACs without genes (3 centromeric)
1191 protein-coding genes
ORFeome (55.6 million reads)
(5-day-old germinating bud)
1037 ++ signal
154 no signal
• Annotation
• > Kita & Claudia
• General features & TEs
• > Nathalia, Cushla, Guilherme
• SSRs
• > Benicio & Anete
• Orthology
• > Dudu, Renato & Michel
• sRNAs patterns
• > Fabio & Renato
Identification and characterization of small RNAs and targets in sugarcane
Research Team
Fabio TS Nogueira (UNESP-IB-Botucatu)
Fausto AO Moreira (ESALQ)
Geraldo F F Silva (ESALQ)
Eder Silva (UNESP-IB-Botucatu)
Edna GO Moreia (UNESP-IB-Botucatu)
Collaborators:
Helaine Carrer (ESALQ)
Renato Vicentini (UNICAMP)
Michel Vincentz (UNICAMP)
Marie-Anne Van Sluys (USP)
Lázaro Peres (ESALQ)
Project main goals
•  Developing a database of sugarcane small RNAs and targets; OK! (http://sysbiol.cbmeg.unicamp.br/SCmiRNA)
•  Monitoring small RNA and target gene expression during early sugarcane development; OK!
•  Cloning and sequencing populations of small RNAs from axillary buds using next-generation sequencing; OK!
•  Analyzing functions of selected small RNAs and targets. NOT QUITE!
Major Results:
1) Identification of sugarcane microRNA precursors and targets (Zanca et al. 2010, BMC Plant Biol)
2) Identification of sugarcane small RNAs associated with transposable elements (TEs) (Domingues et al. 2012, BMC Genomics)
3) Identification of sugarcane small RNAs associated with vegetative bud outgrowth (Ortiz-Morea et al. 2012, submitted)
[Figure: % of unique sRNA sequences by length (nts); panel labeled "microRNA target"]
sRNA distribution
•  Coding: 199,064
•  Non-coding: 1,653,491
[Figure: sRNA distribution along two sequences. Panel a: rDNA features (ETS, 18S, ITS1, 5.8S, ITS2, 25S, NT), coordinate axis 20,000 to 120,000. Panel b: transposable elements (CRM_1 to CRM_5, Tat_1 to Tat_5, SCEN_1 to SCEN_4, COPIA), coordinate axis 20,000 to 140,000.]
Anete & Benicio
Method to identify SSRs
- Data analyzed: 280 sugarcane BAC sequences (size ~32 Mbp)
- To find SSRs in BACs: MISA program (Perl script)
  Identification of simple and compound SSRs
  Motif classes: di- to hexanucleotides
- Distribution of SSRs associated with predicted genes
  Input files: SSR locations and gene predictions in BACs
  Location: intergenic and intragenic regions (intron and exon)
  Program: in-house script
Anete & Benicio
Simple Sequence Repeats (SSR) in sugarcane R570 BACs
Number of SSRs identified = 4,342
Intergenic region = 3,601
Intragenic region = 741
  intronic = 566
  exonic = 175
Sugarcane Energetic Balance: A Systems Approach Towards Understanding Regulation of Sucrose Metabolism and Sugar Signaling
Dr. Renato Vicentini (Principal Investigator)
Research Goals
•  Elucidate which genes in sugarcane leaves are responsive to changes in the sink:source ratio.
•  Develop accurate predictive methods capable of scaling from genotype to phenotype, and identify causality between the sucrose biosynthesis pathway and contrasting phenotypes.
•  Develop metabolic models of the sucrose biosynthesis pathway based on knowledge accumulated in the literature.
•  Understand the diversification of sugar-induced gene expression programs among angiosperms, with a focus on the sucrose metabolism network and transcription factors.
Renato Vicentini
Initial Results
•  Manipulation of Sink Capacity - The genotype with the lowest sucrose content shows the highest photosynthetic rate, especially on the third day of the sink-source treatment.
•  Phylexpress - We developed a bioinformatics tool for large-scale orthology establishment that helps in understanding the evolutionary plasticity of genetic networks.
•  Sugarcane Gene Content - Using this tool, we found that more than ten thousand sugarcane coding genes remain undiscovered, and revealed more than 2,000 ncRNAs conserved between sugarcane and sorghum (Vicentini R. et al. 2012, Tropical Plant Biology).
•  Sugarcane co-expression network - The sugarcane meta-network was generated for co-expressed gene clusters, resulting in 85 clusters with 381 edges. We also found miRNAs and siRNAs potentially involved in the regulation of this network.
[Figure: MapMan general metabolism map for all orthologs identified in grass databases.]
[Figure: ncRNAs showing examples of phase-distributed sRNAs.]
We detect that many clusters include genes involved in the same biological process.
Laboratory of Plant Genetics
http://www.cbmeg.unicamp.br/node/330
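The SSR survey described in the Anete & Benicio slides (motif search in BAC sequences, then assignment of each SSR to an intergenic, intronic or exonic region) can be illustrated with a small sketch. It is not MISA and not the group's in-house script: it finds only simple perfect repeats with di- to hexanucleotide motifs and does not detect compound SSRs. The minimum repeat counts, the function names and the toy sequence and coordinates are all assumptions made for the example.

```python
import re

# Minimum number of repeat units per motif length (illustrative thresholds, not MISA's settings).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (start, end, motif, n_repeats) for perfect di- to hexanucleotide repeats.

    Coordinates are 0-based with an exclusive end.
    """
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" or "AAA").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


def classify(ssr_start, ssr_end, exons, introns):
    """Label an SSR as exonic, intronic or intergenic by interval overlap (exons checked first)."""
    def overlaps(intervals):
        return any(ssr_start < end and start < ssr_end for start, end in intervals)

    if overlaps(exons):
        return "exonic"
    if overlaps(introns):
        return "intronic"
    return "intergenic"


# Hypothetical toy sequence and gene model, for illustration only.
seq = "GGCC" + "AT" * 7 + "GGTTCCAA" + "CAG" * 5 + "TT"
exons = [(20, 30)]
introns = [(30, 60)]

for start, end, motif, n in find_ssrs(seq):
    print(motif, n, start, end, classify(start, end, exons, introns))
```

Checking exons before introns means an SSR that spans an exon-intron boundary is counted as exonic; a real analysis would need to state its own rule for such cases.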
Energetic Homeostasis and Sugar Signaling: Diversification of the Molecular Mechanisms Involved in the Control of the Energetic Balance in Angiosperms
Team: PI - Michel Vincentz / PhDs - Luiz Eduardo Vieira Del Bem / Gustavo Turqueto Duarte / Juliana Cristina Baptista / Américo José Carvalho Viana / Raphael Ricon / Post Docs - Cleverson Carlos Matiolli / David Newman
Main Objectives:
1 - Compare the transcript profiles regulated by sugars in angiosperms
2 - Evaluate the importance of post-transcriptional control (mRNA stability) in glucose signaling and in the integration of glucose and the stress-related hormone abscisic acid
3 - Define the function of selected basic leucine zipper transcription factors related to glucose-regulated processes
4 - Define the mannose signal transduction mechanism
Main Results:
Publications
1. R. Vicentini, L. E. V. Del Bem, M. A. Van Sluys, F. T. S. Nogueira & M. Vincentz (2012) Gene Content Analysis of Sugarcane Public ESTs Reveals Thousands of Missing Coding-Genes and an Unexpected Pool of Grasses Conserved ncRNAs. Tropical Plant Biology 5:199–205. DOI 10.1007/s12042-012-9103-z
2. Matiolli CC, Tomaz JP, Duarte GT, Prado FM, Del Bem LE, Silveira AB, Gauer L, Corrêa LG, Drumond RD, Viana AJ, Mascio PD, Meyer C, Vincentz M. The Arabidopsis bZIP gene AtbZIP63 is a sensitive integrator of transient ABA and glucose signals. Plant Physiol 2011 Aug 15. doi: 10.1104/pp.111.181743
3. Del Bem, Luiz EV; Vincentz, M. Evolution of xyloglucan-related genes in green plants. BMC Evolutionary Biology (Online), v. 10, p. 341, 2010.
Submitted
1. Amanda Bortolini Silveira, Charlotte Trontin, Sandra Cortijo, Joan Barau, Luiz Eduardo Vieira Del Bem, Olivier Loudet, Vincent Colot, Michel Vincentz. Extensive natural epigenetic variation at a de novo originated gene
2. Gustavo Turqueto Duarte, Cleverson Carlos Matiolli, Bikram Datt Pant, Armin Schlereth, Wolf-Rüdiger Scheible, Mark Stitt, Michel Vincentz. Involvement of microRNA pathway in glucose-mediated control of Arabidopsis early seedling development
In Prep.
1. Luiz Eduardo V. Del Bem, Renato Vicentini dos Santos and Michel Vincentz. Phylexpress: a high-throughput phylogenetic method for establishment of orthology and expression data comparison in an evolutionary perspective
SUCEST Annotation
Top 5 matches, 40% or higher similarity
Blast2GO: 954 genes had at least 1 GO term assigned in one of 3 categories: molecular_function; cellular_component; biological_process
237 genes had no GO annotation
Over 120 genomic versions using 17 cultivars validate this duplication.
[Figure: primer labels M13F, 234F, 692F, 2381F, 1079R, 1564R, 2995F, 2995R, 3601R, 4500R, 6333R, 6959R]
[Figure: panels labeled DAPI, CENT, DEL, Maximus, Ale. N. Setta et al., in press]
Turning insertions into molecular markers…
[Figure: Ivana element at locus 44D02, scored as absent or present relative to the flanking genomic region, in R570 and a panel of RB, IAC, SP, CB, CP, Co, NCo, POJ and other cultivars, plus Saccharum spontaneum, S. officinarum and S. barberi accessions]
[Figure: Ivana - locus 44D02; chart with y-axis from 0.0 to 12.0 and legend "alleles not occupied" / "alleles occupied"]
Cushla Metcalfe, Monalisa Sampaio & Fernanda Zatti (UFSCar)
Using TaqMan qPCR to examine copy number polymorphism at a single locus
Using TaqMan qPCR to examine copy number polymorphism at specific loci
So far, each cultivar has a unique “TE code” derived from the combination of presence and absence alleles at 2 loci.
[Figure: panels for Locus 44D02 and Locus 15015]
CONCLUSIONS:
•  The SP80-3280 BAC library is ready and validated (thanks to Anete's group!)
•  The sRNA library is contributing to the understanding of sugarcane bud development. It also gives insight into gene regulation and gene regions (thanks to Fabio's group)
•  Promoter search uses combined AUGUSTUS prediction and SAS (thanks to Glaucia's group!)
•  Model Arabidopsis plants are contributing to the understanding of sugar signaling and energetic homeostasis (thanks to Michel's group)
•  An integration database is being developed to understand regulatory network changes (thanks to Renato's group)
•  “1000 sugarcane proteins genome landscape!” Paving the path to the development of a sugarcane genome assembly and annotation pipeline by defining transposable element content and manually curating the annotation of 1191 genes (thanks to GaTE Lab, Nathalia (UFABC), Claudia (ESALQ) and Kitajima)