SNP discovery from amphidiploid species and transferability across the Brassicaceae


Jacqueline Batley
University of Queensland, Australia
j.batley@uq.edu.au

Outline
- Objectives
- Brassicas
- Genome sequencing
- SNP discovery
- SNP validation
- Cross-species transferability
- Application
- Future work

Objectives
- Develop a bioinformatics tool for SNP discovery and annotation
- Establish cost-effective discovery and validation of SNPs within the amphidiploid B. napus
- Assess association of SNPs with genes for agronomic traits
- Assess the extent of LD within B. napus
- Assess genetic diversity of important agronomic genes within cultivated Brassica spp. and wild relatives
- Establish a strategy for SNP discovery from other large and complex genomes

Methodology
- Paired-end sequence from parents of mapping populations
- SNP discovery
- Genotyping using GoldenGate and Infinium assays
- SNPs genetically and physically mapped
- Cross-species amplification to other Brassicaceae members

Brassicas

Diversity genomics
- Characterising genomic and phenotypic diversity in cultivated and wild plant species and their pathogens (Brassicaceae, Leptosphaeria maculans)
- Investigating genetic variation in crops and wild relatives
- Investigating the evolution of plant-pathogen interactions
- Identifying novel genes and genetic markers for traits of interest, such as disease resistance

Genetic diversity
- Germplasm collections are valuable gene pools
- Assessing genetic and genomic diversity within these collections can:
  - assign lines and populations to diverse groups
  - study the evolutionary history of wild relatives
  - verify pedigrees and fill in the gaps in incomplete pedigree or selection history
  - monitor changes in allele frequencies in cultivars or populations
  - help narrow the search for new alleles at loci of interest

Domestication bottlenecks
- B. napus (canola), B. juncea (mustard) and B. carinata are allopolyploids.
- Rare natural polyploids incorporate only limited genetic diversity from their progenitor diploids.
- The progenitors B. rapa, B. nigra and B. oleracea and their wild relatives hold wide genetic diversity, offering options to enhance canola and mustard.
- A range of strategies is available to realise the genetic potential of the Brassicaceae.

Sequence data
Illumina GAIIx and HiSeq data for:
- 8 B. napus cultivars
- 2 B. rapa cultivars
- B. oleracea
- 3 Brassicaceae
Funding for 100+ Brassicas

Brassica genome sequencing
- B. rapa ssp. pekinensis var. Chiifu
- 10 chromosomes, ~550 Mbp
- The Multinational Brassica genome sequencing committee originally agreed on a BAC-by-BAC sequencing approach
- >100,000 BAC end sequences
- >600 BACs sequenced
- Genome sequenced using Illumina GAIIx

B. rapa SNP discovery and genotyping
- Illumina paired-end sequence from parents of mapping populations
- SNP discovery
- Genotyping using GoldenGate
- Physical mapping
- Cross-species amplification to other Brassicaceae members

SNP validation

Genotyping
- Illumina GoldenGate system
- 384 SNPs
- 2 B. rapa mapping populations
- Parents of B. napus mapping populations
- Selection of wild Brassicaceae

SNP Validation
Pool        Filtering criteria  SNPs in GoldenGate oligo pool  Conversion
SNP Pool 1  Strictest           ~320                           94%
SNP Pool 2  Less strict         ~50                            80%
SNP Pool 3  Lenient             ~15                            30%
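As a rough worked example, the pool sizes and conversion rates on these slides give an idea of how many of the assays would be expected to convert overall. The sketch below assumes that the three conversion rates belong to the three pools in the order listed, and it treats the approximate ("~") pool sizes as exact; both are assumptions made only for illustration.

```python
# Approximate pool sizes and conversion rates as listed on the slides.
# Assumption: rates map to pools in the order shown; "~" sizes are treated as exact.
pools = {
    "Pool 1 (strictest)": (320, 0.94),
    "Pool 2 (less strict)": (50, 0.80),
    "Pool 3 (lenient)": (15, 0.30),
}


def expected_conversions(pools):
    """Return (expected converting SNPs, total SNPs, overall conversion rate)."""
    total = sum(n for n, _ in pools.values())
    converted = sum(n * rate for n, rate in pools.values())
    return converted, total, converted / total


converted, total, overall = expected_conversions(pools)
print(f"{converted:.1f} of {total} SNPs expected to convert ({overall:.1%})")
```

Under these assumptions the total is close to the 384-SNP GoldenGate pool, and the overall rate is dominated by the strictly filtered pool.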

[Figures: SNP genotyping]

Genetic diversity
- Assess relationships within the Brassicaceae
- Correlate this with morphological and interspecific hybridisation data

[Figures: Brassicaceae diversity]

B. napus SNP discovery
- Custom algorithm developed for SNP discovery from Illumina data for amphidiploid species
- Distinguishes between inter- and intra-genomic SNPs

The SGSautoSNP algorithm
- The reference is not considered in SNP discovery; it is only used to bring the reads together.
- SNPs are called from these reads, which differs from most other SNP callers.
1. Coverage must be at least 4.
2. SNP score must be at least 2.
   Example: SP1 = 6*A, AP1 = 1*G, M2P = 1*G => SNP score = 2
3. No conflict within a variety, i.e. all bases in each cultivar must be the same.
   Example: Junior with 3*A and 1*T => conflict
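The three rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SGSautoSNP implementation. The function name `call_snp` and the input layout are hypothetical. The SNP score is read here as the number of reads supporting the less-supported allele, an interpretation inferred from the slide's example (6 reads of A against 2 reads of G gives a score of 2). Pooling reads across cultivars for the coverage check and restricting calls to two alleles are further assumptions.

```python
from collections import Counter

MIN_COVERAGE = 4   # rule 1: total reads at the position (assumed pooled across cultivars)
MIN_SNP_SCORE = 2  # rule 2: reads supporting the less-supported allele (inferred reading)


def call_snp(reads_by_cultivar):
    """Apply the three slide rules to one reference position.

    reads_by_cultivar maps a cultivar name to the string of bases its reads
    show at this position, e.g. {"SP1": "AAAAAA", "AP1": "G"}.
    Returns (is_snp, snp_score, reason).
    """
    # Rule 3: every read within a cultivar must show the same base.
    for cultivar, bases in reads_by_cultivar.items():
        if len(set(bases)) > 1:
            return False, 0, f"conflict within {cultivar}"

    # Pool reads across cultivars and count the support for each allele.
    allele_reads = Counter()
    for bases in reads_by_cultivar.values():
        allele_reads.update(bases)

    coverage = sum(allele_reads.values())
    if coverage < MIN_COVERAGE:
        return False, 0, "coverage below 4"
    if len(allele_reads) != 2:
        # Assumption: only sites with exactly two alleles are reported.
        return False, 0, "not biallelic"

    score = min(allele_reads.values())
    if score < MIN_SNP_SCORE:
        return False, score, "SNP score below 2"
    return True, score, "ok"


# The slide's two examples:
print(call_snp({"SP1": "AAAAAA", "AP1": "G", "M2P": "G"}))
print(call_snp({"Junior": "AAAT", "SP1": "GG"}))
```

The reference base never enters the function, which mirrors the slide's point that the reference only brings the reads together.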

[Figure: Output visualisation]

B. napus SNP discovery
    XA_0011r 1252 1252 3 S=G=2;M1=G=3;Sr=X=0;A=G=3;J=T=3;M2=G=1;Bn=X=0;E=X=0; T;G;
    XA_0011r 1379 1379 5 S=T=2;M1=T=3;Sr=X=0;A=T=1;J=C=3;M2=X=0;Bn=X=0;E=C=2; C;T;
    XA_0011r 2036 2036 4 S=G=1;M1=G=2;Sr=X=0;A=G=1;J=T=8;M2=T=3;Bn=X=0;E=T=6; T;G;
    XA_0011r 4921 4921 2 S=X=0;M1=X=0;Sr=X=0;A=T=8;J=X=0;M2=X=0;Bn=X=0;E=C=2; C;T;
    XA_0011r 5070 5070 4 S=X=0;M1=G=2;Sr=X=0;A=G=2;J=A=6;M2=X=0;Bn=X=0;E=X=0; A;G;
    XA_0011r 5273 5273 3 S=C=4;M1=C=5;Sr=X=0;A=C=6;J=G=2;M2=X=0;Bn=X=0;E=G=1; C;G;
    XA_0011r 5442 5442 8 S=T=1;M1=X=0;Sr=X=0;A=T=7;J=C=5;M2=X=0;Bn=C=1;E=C=3; C;T;
    XA_0011r 5512 5512 7 S=G=3;M1=G=3;Sr=X=0;A=G=5;J=A=4;M2=X=0;Bn=A=2;E=A=1; A;G;
    XA_0011r 5976 5976 11 S=T=8;M1=T=1;Sr=X=0;A=T=2;J=C=6;M2=X=0;Bn=C=2;E=C=3; C;T;
    XA_0011r 5992 5992 10 S=A=9;M1=A=1;Sr=X=0;A=A=3;J=G=5;M2=X=0;Bn=G=2;E=G=3; A;G;
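The rows above can be read programmatically. The sketch below is a hypothetical parser, not part of SGSautoSNP. The column meanings are inferred from the rows themselves: contig, start, end, SNP score, per-cultivar calls written as cultivar=base=reads, and the two alleles. It also assumes that `X=0` marks a cultivar with no reads at the position.

```python
def parse_snp_row(row):
    """Parse one output row into (contig, position, score, calls, alleles)."""
    contig, start, _end, score, calls_field, alleles_field = row.split()
    calls = {}
    for item in calls_field.strip(";").split(";"):
        cultivar, base, reads = item.split("=")
        if base != "X":  # assumption: X=0 means no reads for this cultivar
            calls[cultivar] = (base, int(reads))
    alleles = alleles_field.strip(";").split(";")
    return contig, int(start), int(score), calls, alleles


def min_allele_reads(calls):
    """Reads supporting the less-supported allele, summed across cultivars."""
    totals = {}
    for base, reads in calls.values():
        totals[base] = totals.get(base, 0) + reads
    return min(totals.values())


row = ("XA_0011r 1252 1252 3 "
       "S=G=2;M1=G=3;Sr=X=0;A=G=3;J=T=3;M2=G=1;Bn=X=0;E=X=0; T;G;")
contig, pos, score, calls, alleles = parse_snp_row(row)
print(contig, pos, score, min_allele_reads(calls), alleles)
```

For this row, G is supported by 9 reads and T by 3, so the recomputed value matches the 3 in the fourth column. The same check holds for the other rows shown, which supports reading that column as the SNP score.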


B. napus SNP discovery
Base change  Type          Number
A>G          transition    105045
C>T          transition    105513
A>C          transversion  42480
A>T          transversion  49287
C>G          transversion  29828
G>T          transversion  42217

Base change  Type          Number
A>G          transition    24207
C>T          transition    24375
A>C          transversion  10158
A>T          transversion  12254
C>G          transversion  6621
G>T          transversion  9918
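A common summary of counts like these is the transition/transversion (Ts/Tv) ratio. The sketch below computes it for the two sets of counts on the slide. The slide does not label what distinguishes the two tables, so the names `first_table` and `second_table` are placeholders.

```python
TRANSITIONS = {"A>G", "C>T"}  # purine<->purine or pyrimidine<->pyrimidine changes


def ts_tv(counts):
    """Return (transitions, transversions, Ts/Tv ratio) for a table of counts."""
    ts = sum(n for change, n in counts.items() if change in TRANSITIONS)
    tv = sum(n for change, n in counts.items() if change not in TRANSITIONS)
    return ts, tv, ts / tv


# Counts copied from the slide; the table names are placeholders.
first_table = {"A>G": 105045, "C>T": 105513, "A>C": 42480,
               "A>T": 49287, "C>G": 29828, "G>T": 42217}
second_table = {"A>G": 24207, "C>T": 24375, "A>C": 10158,
                "A>T": 12254, "C>G": 6621, "G>T": 9918}

for name, table in (("first", first_table), ("second", second_table)):
    ts, tv, ratio = ts_tv(table)
    print(f"{name}: {ts + tv} SNPs, Ts/Tv = {ratio:.2f}")
```

Both tables show more transitions than transversions, with ratios of roughly 1.29 and 1.25.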

[Figure: B. napus SNP density]

B. napus SNP validation
- 24/25 SNPs correctly predicted through validation by PCR and sequencing
- 20/22 SNPs correctly predicted through GoldenGate
- Range of sequence coverage and confidence scores

Gene discovery
- Finding the genes for the traits
- Integration of genetic data with genomic data
- Mapping of QTL regions to genomic data
- Annotation

[Figure: Gene discovery - application. Genetic map (10 cM scale) aligned to the physical map (1 Mbp scaffolds); marker labels: OI09 A06, Na12 E11, BRAS023, BRMS040, CB10439, CB10278, BRMS036, BRMS075, Na12 A02, BRMS005, KBRH143H15, RA2 A05]

[Figure: Scaffold and Marker Assembly for A7; labels: Chromosome, Marker, Scaffold]

CMap3D: Duran et al. (2010) Bioinformatics 26: 273-274

Identification of Candidate Blackleg Resistance Genes
TNL (gene number)  Scaffold
1                  3
2                  3
3                  3
4                  3
5                  3
6                  3
7                  12
8                  12
9                  12
10                 12
11                 3
12                 3
13                 3
14                 3
15                 12
16                 12
17                 19
18                 19
19                 19
20                 19
21                 19

[Figure: TNL6 sequence and protein alignment of B. rapa, B. napus 1 and B. napus 2]

Gene  Mutation   Species              Predicted  Number of reads  Sequence verified
TNL1  18,240     B. rapa (reference)  G          N/A
                 B. napus 1           G          3                G
                 B. napus 2           C          1                C
TNL5  5,208,963  B. rapa (reference)  C          N/A
                 B. napus 1           C          1                C
                 B. napus 2           T          4                T
TNL5  5,209,056  B. rapa (reference)  A          N/A
                 B. napus 1           G          1                G
                 B. napus 2           A          5                A
TNL5  5,209,772  B. rapa (reference)  A          N/A
                 B. napus 1           A          1                A
                 B. napus 2           T          6                T
TNL5  5,207,023  B. rapa (reference)  G          N/A
                 B. napus 1           T          4                T
                 B. napus 2           G          1                G
TNL6  5,891,882  B. rapa (reference)  T          N/A
                 B. napus 1           T          4                T
                 B. napus 2           C          3                C

Among the protein differences, a change in charge was the most common.


[Figure: Gene discovery; labels: Primer, PCR, Gene/EST, genomic sequence, Known (Arabidopsis), Unknown (Brassica)]

TAGdb: http://flora.acpfg.com.au/tagdb/
Marshall, D.J., et al. (2010) Plant Methods 6:19

[Figure: TAGdb output]

Sym genes
- Brassicas cannot form symbiotic associations with rhizobia or mycorrhizae, but they contain homologues of many genes involved in these processes.
- What is the diversity of, and selection pressure on, these genes across the Brassicaceae?
- What are these proteins doing? General pathogen/microbial perception and response?
- Figure labels: e.g. LjNUP85, LjNUP133; TAGdb results; e.g. NFR1, NFR5; 9 Arabidopsis homologues; e.g. LjPOLLUX
Ferguson et al., 2010

Sequencing SYM genes in Brassicas: NSP1 (Nodulation Signalling Pathway 1)
- BrNSP1 and BoNSP1 vs MtNSP1 = 57% CDS similarity
- AtNSP1 vs MtNSP1 = 58% CDS similarity
- BrNSP1 vs AtNSP1 = 83.8% CDS similarity
- BoNSP1 vs AtNSP1 = 83.7% CDS similarity
- BrNSP1 vs BoNSP1 = 98% CDS similarity
Ferguson et al., 2010
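The slides do not say how the CDS similarity percentages were calculated. One common way to obtain a figure of this kind is percent identity over a pairwise alignment, sketched below as an assumption rather than as the method used in the talk. The function name and the toy sequences are hypothetical, and the inputs must already be aligned to equal length by an external aligner.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns where two pre-aligned sequences match.

    Columns that are gaps in both sequences are skipped; a gap opposite
    a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # no information in a column that is gapped in both
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real NSP1 sequence).
print(percent_identity("ATGCCGTA", "ATGACG-A"))
```

Other conventions exist, such as ignoring all gapped columns or scoring with a substitution matrix, and they give different percentages from the same alignment.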

Sequencing SYM genes in Brassicas: NSP1 (Nodulation Signalling Pathway 1)
- High conservation in the GRAS domain.
- Residues important for NSP1 function in Lotus japonicus are conserved in the Brassicaceae.
Ferguson et al., 2010

Sequencing SYM genes in Brassicas: NSP2 (Nodulation Signalling Pathway 2)
- BrNSP2 vs BoNSP2 = 98% CDS similarity
- BrNSP2 vs AtNSP2 = 78.2% CDS similarity
- BoNSP2 vs AtNSP2 = 78.5% CDS similarity
- BrNSP2 and BoNSP2 vs MtNSP2 = 55% CDS similarity
Ferguson et al., 2010

Sequencing SYM genes in Brassicas: NSP1 (Nodulation Signalling Pathway 1)
- The alanine residue important for the NSP1-NSP2 interaction in Lotus japonicus is not conserved in the Brassicaceae, but is conserved in rice.
- Rice NSP1 and NSP2 are functional in nodulation in transgenic Lotus japonicus.
Ferguson et al., 2010

Sequencing SYM genes in Brassicas: POLLUX
- One copy on each of the A and C genomes: BrPOLLUX (A) and BoPOLLUX (C) are 98% similar.
Ferguson et al., 2010

- BrPOLLUX CDS is 69.4% similar to LjPOLLUX CDS; BoPOLLUX = 69%, AtPOLLUX = 61%.
- 85.6% similarity between BrPOLLUX and AtPOLLUX, and 85.4% between BoPOLLUX and AtPOLLUX.
- Currently sequencing POLLUX in other Brassicaceae members.
- Least similarity is in the N-terminal transit peptide.

[Figure: POLLUX transmembrane prediction (Geneious Pro, Biomatters), consistent with cation channel function]

Future work
- SNP identification and genotyping of cultivated and wild Brassicaceae
- Large-scale SNP discovery and genotyping for fine mapping and LD studies
- Identify which Brassicaceae to sequence
- Use next-generation sequencing data, molecular markers and morphological variation to study diversity across Brassica species and wild relatives

Summary
- Next-generation sequencing data are suitable for gene, promoter and SNP discovery in non-sequenced and orphan species
- SNPs can be applied to gene discovery and evolution studies in crop species and wild relatives
- High-throughput genotyping can be used for fine mapping and LD studies

Acknowledgements
Emma Campbell, Christina Delay, Megan McKenzie, Reece Tollenaere, Joanne McLanders, Manuel Zander, Alice Hayward, Paul Berkman, Chris Duran, Kaitao Lai, Michal Lorenc, Sahana Manoli, Adam Skarshewski, Lars Smits, Jiri Stiller, David Edwards, Bob Redden, Harsh Raman, Xiaowu Wang